Press "Enter" to skip to content

Category: T-SQL

Top Value per Group: Window Function or APPLY

Erik Darling hits one of my favorite topics:

The first rule of rewrites is that they have to produce the same results, of course. Logical equivalency is tough.

In today and tomorrow’s posts I’m going to compare a couple different scenarios to get the top value.

There are additional ways to rewrite queries like this, of course, but I’m going to show you the most common anti-pattern I see, and the most common solution that tends to work better.

Click through to see when each works better.

Comments closed

TRANSLATE() and REPLACE() in SQL Server

Kevin Wilkie compares a couple of functions:

There is another function within SQL Server that many people think does the same thing, but with a slight nuance.

Sometimes, you just need to change out one character with another. For example, you need to make a “(” into a “[” to make everything consistent.

I’ve probably used REPLACE() two (or three?) orders of magnitude more often than TRANSLATE() but that’s mostly because I keep forgetting what the latter does.

Comments closed

Filtered Indexes and Functions

Eitan Blumin looks at filtered indexes:

In fact, absolutely no functions of any kind can be used within the WHERE clause of a filtered index. Not even schema-bound user-defined scalar functions.

Unfortunately, as stated in the Microsoft Docs page about Filtered Indexes, the WHERE clause of a filtered index can only support simple comparison operators.

Well, it’s not entirely true, as you CAN actually use some functions, but on two conditions:

Read the whole thing. Eitan lays out one limitation of filtered indexes and provides a couple of potential workarounds.

Comments closed

Solutions for Matching Supply with Demand

Itzik Ben-Gan continues reviewing solutions to a tricky problem:

Last month I covered a solution based on interval intersections, using a classic predicate-based interval intersection test. I’ll refer to that solution as classic intersections. The classic interval intersections approach results in a plan with quadratic scaling (N^2). I demonstrated its poor performance against sample inputs ranging from 100K to 400K rows. It took the solution 931 seconds to complete against the 400K-row input! This month I’ll start by briefly reminding you of last month’s solution and why it scales and performs so badly. I’ll then introduce an approach based on a revision to the interval intersection test. This approach was used by Luca, Kamil, and possibly also Daniel, and it enables a solution with much better performance and scaling. I’ll refer to that solution as revised intersections.

Read on for one class of solution which performed quite well.

Comments closed

Dynamic SQL No-Go

Kenneth Fisher can’t go in dynamic SQL and neither can you:

This is one of those things that when I look back on it seems really obvious. Note: If at the end of this it isn’t overly obvious to you that’s ok too. I do a lot of dynamic SQL and GO is one of my favorite commands.

Read on to understand why. I was going to “One minor clarification…” Kenneth about it being an SSMS command (implying that it’s not available elsewhere) but he successfully parried the attack en passant.

1 Comment

Handling Optional Parameters in SQL Server

Erik Darling embraces dynamic SQL:

You’ve got tables. So many tables And columns. So many columns.

Users — demanding as they are — might wanna see, search, and order by all sorts of things in those many columns in those many tables. Writing a query to do that is hard at first, especially if you’re afraid of dynamic SQL.

Read on for reasons why dynamic SQL is usually the right answer here and check out a video as well.

Comments closed

Quickly Finding Row Counts for SQL Server Tables

Jeff Iannucci doesn’t have time for a table scan:

Have you ever had to find the number of rows in a user table, and then wrote a little “SELECT COUNT(*) FROM tblWhatever” and hit execute…and waited…and waited…and waited some more? And then started wondering what was going on?

If so, this post is for you.

Read on for a stored procedure which gets a nice estimate of the total number of rows. I tend to have a form of the underlying query saved as a snippet so I can use it easily. One thing to keep in mind is that these stats-based counts can be wrong. It’s rare and typically has to do with page splits duplicating values, but on a very large, very busy table you might occasionally be off by a few rows. I might posit in return that on such a table, if precision is that important, the amount of time between querying the stats and doing something with it will probably also cause you to see a difference in row counts anyhow.

Comments closed

An Introduction to ScriptDOM

Mala Mahadevan gives us an idea of what ScriptDOM is:

I’ve been meaning to get a series of blog posts started on this topic. A twitter conversation from yesterday finally pushed me to it. Last year, I was tasked with finding a linting tool for the large t-sql code base we have at work. I looked into several tools – tsqllint, Sonarqube and several others. We ran into similar issues on all of them. Someone else defining rules for us didn’t work.
One tool called it wrong to use more than 3 tables in a query – we had several with 6-7, maybe even more. Another started to point out lack of indexes on temp tables as a problem (the rule was designed for table scripts but worked the same on code). Getting new rules that we wanted – such as not having unnaming primary keys on temp tables (Querystore doesn’t like them) or having our naming standards enforced meant extra work on someone else’s code. Our rules were custom to our environment. There were generic best practices for sure, such as finding the SELECT * or NOLOCK hints, but those were a small subset of what we needed. I then started looking for a tool with which I could make a custom linter. That’s when I discovered ScriptDOM, which has been around for a really long time with few people knowing or using it. It took me some time to understand how to put this to use. But after I figured it out it was really easy. Now I have a fairly robust, custom linter in place written in PowerShell and integrated well into our Azure DevOps Build process. It is easy to use and it is owned by us.

Read on to see what ScriptDOM can do and stay tuned to learn more.

Comments closed

TRY_CAST and TRY_PARSE

Joe Obbish shows the difference between two functions:

There’s a lot of guidance out there that states that TRY_CAST is a faster, more modern version of TRY_PARSE and that TRY_PARSE should only be used if you need to set the optional culture parameter. However, the two functions can return different results in some cases, even without the culture parameter.

That guidance is blatantly wrong. TRY_CAST() and TRY_PARSE() both came out in SQL Server 2012. TRY_PARSE() uses .NET to perform parsing, which is going to have some edge case differences, especially around cultures and localization. TRY_CAST() is CAST() in an error-safe wrapper. If anything, TRY_CAST() is the “old” version and TRY_PARSE() the “new” version, with scare quotes in place because they both came out at the same time.

Both of them are useful, though I do agree with Joe’s advice of avoiding TRY_PARSE(), at least for larger datasets. If you’re parsing a single date or a small table of dates, TRY_PARSE() does an excellent job because TRY_PARSE('13/01/2019' AS DATE USING 'fr-fr') is not something you can easily do with TRY_CAST() in a US locale.

Comments closed

Solutions for Matching Supply with Demand

Itzik Ben-Gan has some solutions to show:

This month, I’m going to start exploring the submitted solutions, roughly, going from the worse performing to the best performing ones. Why even bother with the bad performing ones? Because you can still learn a lot from them; for example, by identifying anti-patterns. Indeed, the first attempt at solving this challenge for many people, including myself and Peter, is based on an interval intersection concept. It so happens that the classic predicate-based technique for identifying interval intersection has poor performance since there’s no good indexing scheme to support it. This article is dedicated to this poor performing approach. Despite the poor performance, working on the solution is an interesting exercise. It requires practicing the skill of modeling the problem in a way that lends itself to set-based treatment. It is also interesting to identify the reason for the bad performance, making it easier to avoid the anti-pattern in the future. Keep in mind, this solution is just the starting point.

Click through for a solution which is straightforward but slow.

Comments closed