Press "Enter" to skip to content

Category: Syntax

T-SQL Tips Regarding Subqueries

Itzik Ben-Gan provides quality information on working with subqueries in SQL Server:

In this plan you see a Nested Loops (Left Semi Join) operator, with a scan of the clustered index on Customers as the outer input and a seek in the index on the customerid column in the Orders as the inner input. You also see an outer reference (correlated parameter) based on the custid column in Customers, and the seek predicate Orders.customerid = Customers.custid.

So why are you getting the plan in Figure 1 and not the one in Figure 2? If you haven’t figured it out yet, look closely at the definitions of both tables—specifically the column names—and at the column names used in the query. You will notice that the Customers table holds customer IDs in a column called custid, and that the Orders table holds customer IDs in a column called customerid. However, the code uses custid in both the outer and inner queries. 

Itzik covers three specific scenarios, all of which can cause trouble to database developers who haven’t been burned yet. And sometimes even those who have.

Comments closed

Breaking Up Queries with UNION ALL

Bert Wagner takes us through a scenario where it can be faster to combine queries with UNION ALL rather than using IN:

Even though this query reads the whole clustered index to get the Benefactor rows, the total number of logical reads is still smaller than the seek/key lookup pattern seen in the combined query with IN(). This UNION ALL version gives SQL Server the ability to build a hybrid execution plan, combining two different techniques to generate a plan with fewer overall reads.

Click through for the example.

Comments closed

Window Functions with IGNORE NULLs

Lukas Eder walks us through a bit of functionality I wish we had in SQL Server:

On each row, the VALUE column should either contain the actual value, or the “last_value” preceding the current row, ignoring all the nulls. Note that I specifically wrote this requirement using specific English language. We can now translate that sentence directly to SQL:

last_value (t.value) ignore nulls over (order by d.value_date)

Since we have added an ORDER BY clause to the window function, the default frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW applies, which colloquially means “all the preceding rows”. (Technically, that’s not accurate. It means all rows with values less than or equal to the value of the current row – see Kim Berg Hansen’s comment)

Only a few database products have this and SQL Server is not one of them.

Comments closed

The Importance of Aliasing in Subqueries

Gail Shaw explains an unexpected result when writing a statement with a subquery:

The column name in the temp table is missing an I, probably just a typo, but it has some rather pronounced effects.

The obvious next question is why the select with the subquery in it didn’t fail, after all, the query asks for ClientID from #TempClients, and there’s no such column. However there is a ClientID column available in that query, and it’s in the Orders table. And that’s a valid column for the subquery, because column binding order, when we have subqueries, is first to tables within the subquery, and then, if no match is found, to tables in the outer query.

I know that the first time I experienced this, I thought it was a bug as well. But no, it’s by design and Gail explains why.

Comments closed

Request: Add Support for Row Pattern Recognition

Itzik Ben-Gan would like to see Row Pattern Recognition make it into T-SQL:

The ISO/IEC 9075:2016 standard (aka SQL:2016) introduces support for Row Pattern Recognition (RPR) in SQL. Similar to using regular expressions to identify patterns in a string, RPR allows you to use regular expressions to identify patterns in a sequence of rows.

To me, it’s the next step in the evolution of window functions. If you think that window functions are profound and useful, RPR is really going to bake your noodle.

RPR has limitless practical applications, including identifying patterns in stock market activity, handling time series, fraud detection, material handling, shipping applications, DNA sequencing, gaps and islands, top N per group, and many others. 

I’ve voted it up and recommend you do so too. This is a great way to think of streams of data sitting in a database. If you have business use cases where this could help, adding those as comments would be great too.

Comments closed

Auto-Escaping XML Characters

Emanuele Meazzo shows how you can auto-escape XML characters using T-SQL:

Recently I had to look up the definition for a bunch of SQL objects and didn’t want to manually retrieve them manually in SSMS (with Create Scripts) or Visual Studio (by searching the object name in my TFS repository).

Since lazyness and automation are the basis of a well done engineering work, I wanted to create a list, where I could basically click on the object that I needed and see the definition right away, without any tool or having to code something externally, of course.

Click through for the solution, which is short and sweet.

Comments closed

Defending ANSI-89 Syntax

Allan Hirt prefers ANSI-89 join syntax:

Pedro Lopes who is now on the SQL Server dev team wrote the blog post “T-SQL Misconceptions – JOIN ON vs. WHERE” which does a good job of explaining the “it depends” around the newer style of joins. Why link this article? Microsoft deprecated the OUTER JOIN operators in SQL Server 2008. Two other sources of information on this are here and here (the first is Ward Pond’s old technet blog, and sadly will probably go away soon). If you’re keeping score at home, WHERE clauses are not deprecated except if you’re using *= and =*). The changes people made were wholly unnecessary and as the author, the newer stuff is harder to decipher than what I originally did. They were putting their own biases onto things.

I personally do not like ANSI-89 syntax because it blurs the lines between filters and join criteria. Those are separate things serving different purposes and keeping them separate clarifies queries more than it obscures. Allan’s example doesn’t have any filters but in a more complex scenario with several filters and several join criteria, it can require extra care figuring out what’s going on, especially with multi-column join criteria and filters mixed in (meaning it’s not join criteria and then filters, but a mishmash of the two).

Comments closed

Pivoting Performance Counter Data

Dave Bland shows how you can build a dynamic pivot to see performance counter data over a stretch of time:

The next step is to write the code to capture the counter values and insert the data it the temporary table created above.  Because we need to capture the values over a period of time, the WAITFOR DELAY is used.  In this case the delay is 10 seconds, although you can change this to suit your needs. Of course, don’t forget to increment the counter variable. You will need to determine what counters you would like to capture.  Notice in the WHERE clause, we are looking for an instance_name of ” or ‘_total’.  This will allow the code to only capture one row for each counter.  The number 10 is the number of times we want to capture the counter values.  If you want to capture the data more frequently, simpley modify the number of seconds in the WAITFOR DELAY.  Here is link to my post in this topic, WAITFOR.

Dynamic pivoting in SQL is unnecessarily difficult, especially compared to languages like R.

Comments closed

Calculating Weighted Averages in SQL

Lukas Eder shows how you can calculate weighted averages using SQL:

As can be seen, this schema is slightly denormalised as the number of lines per transaction are precalculated in the transactions.lines column. This will turn out to be quite useful for this calculation, but it isn’t strictly necessary.

Now, in the previously linked Stack Overflow question, a report was desired that would calculate:
– An aggregation of sums as provided by the line items
– An aggregation of averages as provided by the transactions

As Lukas points out, doing this in two queries is easy, but doing it in one is sublime.

Comments closed

Performance Testing Simple Scalar UDF Functions

Wayne Sheffield tries out a simple scalar UDF in SQL Server 2019 to see how it performs:

I recently published a post detailing the new Scalar UDF Inlining feature in SQL 2019 here. That post introduced the new feature in a way that I used to compare performance to the other function types, continuing the performance evaluation of functions that I had previously posted here and here. In the Scalar UDF Inlining post, I used a function to strip all non-numeral values from a string, and to return the result. This used the FOR XML output option.

In thinking about how scalar functions are commonly used, I’ve decided to revisit this feature with a simpler function. I will still compare it to all the other types of functions to see how Scalar UDF Inlining compares to the others.

Wayne’s results are music to the product team’s ears, I’m sure.

Comments closed