Finding The First Non-NULL Value In A Window

Bert Wagner shows off the FIRST_VALUE window function and walks us through a case it struggles with:

The SQL Server FIRST_VALUE function makes it easy to return the “first value in an ordered set of values.”

The problem is that if that first value happens to be a NULL, there is no easy, built-in way to skip it.

While a UserVoice item exists to add the ability to ignore nulls (go vote!), today, we’re going accomplish that end result with some alternative queries.

Click through for the demo, as well as a video version of the post.

GOTO And Labels In T-SQL

Ryan Desmond demonstrates the purpose of GOTO in T-SQL:

So I was playing around at work today and decided for whatever reason to see how I could get the code I was writing to fire off only in certain situations.

If it’s Sunday maybe, or if this is in a particular environment, or if a record in an admin table was something specific.  I’m not sure how I’ll use this but I stumbled on Labels and decided to play with them.

Ok, so how to get to know labels.  Well, in order to get them to work sometimes I have to create labels that are based on some criteria.

I do try to avoid these as much as possible, but they are valid syntax and I’ve seen a couple of cases where it makes sense to use GOTO.

Window Functions Have Defaults, Too

Steve Jones reminds us that when running a window function, there is a default window in place:

What I want to do is compare the passing yards each year with the most current value for that player, showing the plus or minus. This means that for Aaron Rodgers, who threw for 1675 yards in 2017, I’d want to show this for the first few years of his career:

This shows me an easy view of the years where he was better in his career than he is now. Last year was likely a down year because of injury, but we’ll see this year.

In any case, if I run this query using LAST_VALUE() for the final year of his career, I don’t get the right results.

It’s good to keep in mind the full syntax for a window function for just this reason.

Joins When No Join Types Are Valid

Kevin Feasel

2018-08-08

Syntax

Hugo Kornelis has a brain-teaser for us:

The query below can be executed in any version of the AdventureWorks sample database. Don’t bother understanding the logic, there is none. It is merely constructed to show how SQL Server handles what appears to be an impossible situation.

If you look at the descriptions of the various join operators in the Execution Plan Reference, you will see that this query poses the optimizer for what appears to be an insolvable problem: none of the join operators can be used for this query!

But it’s possible, and Hugo explains exactly what happens, as well as places where the optimizer could be better at solving the impossible (or at least marginally difficult).

Implementation Matters: CTEs In Postgres And SQL Server

Kevin Feasel

2018-08-07

Syntax

Brent Ozar looks at a couple of places where Postgres and SQL Server differ in implementation details:

In SQL Server, if you write this query:

SQL Server builds a query plan for the entire operation at once, and passes the WHERE clause filter into the CTE. The resulting query plan is efficient, doing just a single clustered index seek.

In Postgres, CTEs are processed separately first, and subsequent WHERE clauses aren’t applied until later. That means the above query works just fine – but performs horribly. You’ll get much better results if you include your filters inside each CTE, like this:

That’s less than ideal.

The comments are valuable here as well.

The Value Of CROSS APPLY

Kendra Little talks about one of my favorite T-SQL operators:

Here’s my top 3 favorite uses for CROSS APPLY and OUTER APPLY:

  1. APPLY is fantastic for calling table valued functions. I didn’t include questions about those in the quiz, simply for the purposes of keeping the code simple, and because I wanted the quiz to be about thinking through how apply works — but it’s still the #1 use.

  2. Another thing I might use it for is when a query needs a correlated subquery — somewhat like an inline function.

  3. And I also like it for queries that have a calculation that needs to be done and which is referenced in multiple columns in the query, or perhaps also in a predicate and the select. You can perform the computation once in the apply and then reference it multiple times. That way if you have to change the formula later on, you only have to change it in once place, plus I find it’s easier to read in some cases.

Item number three is something I learned from an older blog post Kendra wrote and regularly use APPLY for this purpose.

Creating An Inline Table-Valued Function In SQL Server

Jeanne Combrinck looks at inline table-valued functions in SQL Server:

Lets start off with what is a table-valued function (TVF)? A TVF is a dynamic table produced at the time of execution, depending on parameters. Like a view, a TVF creates a result set only when it’s executed, but, unlike a view, it can be parameterized.

You get two types of TVFs, an Inline Table-Valued Function (ITVFs) and Multi-statement Table-Valued Function (MTVFs). I find them easy to remember, think of the “I” in ITVF as 1 (single statement) and the “M” in MTVF as “many” (multiple statements).

As you can imagine, a TVF produces a result set that can be used as a virtual table or view. Yes, you can actually select data from a TVF, or join it with some other tables, views, or even other TVFs. In this post I will go into more detail on ITVFs.

In my experience, the best part about using inline table-valued function is the 3-9x performance improvement you get when removing them and inlining their code.  It’s a great idea but the performance cost is just too high for me.

COUNT And NULL Values

Kenneth Fisher digs into the COUNT() function and sees how it deals with NULL values:

Count the number of values
1
2
3
SELECT COUNT(FieldName) FROM TableName;
-- or
SELECT COUNT(ALL FieldName) FROM TableName;

The ALL argument is the default and is unnecessary (I didn’t even know it existed until I started this post). Here you are counting the number of non NULL values in FieldName. So in a column with (1, NULL, 1, 2, 3, NULL, 1) you’ll get a count of 5. You do get a nice warning (depending on your ANSI_WARNINGS setting) if there was a NULL value though.

By the way, the ALL operator isn’t useful there, but can be useful along with its counterparts SOME and ANY.  I rarely keep them in my mind, so I instead tend to write EXISTS and NOT EXISTS statements which have operate on an equivalent function.

Grouping And Aggregating In SQL, R, And Python

Dejan Sarka has a few examples of aggregation in different languages, including SQL, R, and Python:

The query calculates the coefficient of variation (defined as the standard deviation divided the mean) for the following groups, in the order as they are listed in the GROUPING SETS clause:

  • Country and education – expression (g.EnglishCountryRegionName, c.EnglishEducation)
  • Country only – expression (g.EnglishCountryRegionName)
  • Education only – expression (c.EnglishEducation)
  • Over all dataset- expression ()

Note also the usage of the GROUPING() function in the query. This function tells you whether the NULL in a cell comes because there were NULLs in the source data and this means a group NULL, or there is a NULL in the cell because this is a hyper aggregate. For example, NULL in the Education column where the value of the GROUPING(Education) equals to 1 indicates that this is aggregated in such a way that education makes no sense in the context, for example aggregated over countries only, or over the whole dataset. I used ordering by NEWID() just to shuffle the results. I executed query multiple times before I got the desired order where all possibilities for the GROUPING() function output were included in the first few rows of the result set. Here is the result.

GROUPING SETS is an underappreciated bit of SQL syntax.

Docker Commands For DBAs

Marek Masko has some commands which will help DBAs get familiar with Docker:

Stop container

To stop container:

  • docker stop <container ID>

To stop all running containers:

  • docker stop $(docker ps -a -q)

Most of the commands are straightforward but it’s nice to have a reference guide.

Categories

January 2019
MTWTFSS
« Dec  
 123456
78910111213
14151617181920
21222324252627
28293031