Press "Enter" to skip to content

Category: T-SQL

Quickly Finding Distinct Values with T-SQL

Paul White does some amazing things with T-SQL, news at 11:

I will be looking for distinct values in the BountyAmount column of the dbo.Votes table, presented in bounty amount order ascending. The Votes table has just under 53 million rows (52,928,720 to be exact). There are just 19 different bounty amounts, including NULL.

The Stack Overflow 2013 database comes without nonclustered indexes to minimize download time. There is a clustered primary key index on the Id column of the dbo.Votes table. It comes set to SQL Server 2008 compatibility (level 100), but we will start with a more modern setting of SQL Server 2017 (level 140):

Getting the query down from 10.5 seconds to 1ms is crazy.

Comments closed

Supporting Multiple NULL Values with a Unique Constraint

Itzik Ben-Gan walks us through a workaround in T-SQL:

Suppose that you have a database in a platform that supports the standard unique constraint and you need to migrate that database to SQL Server. You may face issues with the enforcement of unique constraints in SQL Server if the unique columns support NULLs. Data that was considered valid in the source system may be considered invalid in SQL Server. In the following sections I’ll explore a number of possible workarounds in SQL Server.

I use a simplified version of this as an interview question, so it’s nice to see an entire article from Itzik on the topic, including a couple solutions way outside the box.

Comments closed

Selecting From a Stored Procedure

Madhivanan shows us how to query the results of a stored procedure:

One of my friends asked me “Is it possible to query Stored Procedure resultset like a table. ie select * from (EXEC Stored_procedure)?”

Well. Querying the resultset from the Stored Procedure like Table can be done using OPENROWSET function

This is a fairly novel approach to the problem. In the past, I’ve inserted the results of a stored procedure into a temp table, but you can only do that if the procedure itself doesn’t call INSERT INTO ... EXEC ....

Comments closed

From IDENTITY to Sequences

Andy Levy recommends checking out sequences:

The SEQUENCE object eliminates all of this. It gives you a simple way to just grab a new number and increment for the next caller. It’s very fast as there’s no visible table I/O, and it’s unaffected by rollbacks.

And it’s so much easier to use! You just ask for the next number in the sequence!

I like sequences, though the interesting thing is that 2020 me has created them a lot less frequently than 2012 me was sure I would. I’m glad they’re in the product, however.

One thing I should point out is that sequences are like identity columns in that you can have gaps due to user behavior, such as rolling back transactions. If you absolutely need gap-free sets of numbers, you’re back to Andy’s Method One, except everything has to be serializable and wrapped in explicit transactions.

Comments closed

LAST_VALUE() and Windows

Jeet Kainth explains the importance of specifying your window when using LAST_VALUE():

To return the actual last value, add the additional clause ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. This ensures that for each row the LAST_VALUE() function looks at all rows; from the very first row to the very last row.

Click through for the example. Remember that the default is going to the current row, not the entire data set.

Comments closed

Dealing with Large SQL Scripts

Kevin Chant has some advice when you have to deal with a giant SQL script:

If you have been given a script that is thousands of lines long from a developer, the first thing I would ask is if they can split it up.

I say this because a lot of developers who write long scripts tend to have come from various backgrounds. Hence, some of them are used to developing on other programming languages.

So, they do not always appreciate that SQL is a set-based language. In addition, they do not always appreciate SQL Server is optimised for set based queries.

Sometimes you can break these scripts down, though there are of course good ways and bad ways to do so.

Comments closed

Fun with Filtering Between Start and End Dates

Brent Ozar shows why the StartDate + EndDate pattern is not great for filtering:

If all you need to do is look up the memberships for a specific UserId, and you know the UserId, then it’s a piece of cake. You put a nonclustered index on UserId, and call it a day.

But what if you frequently need to pull all of the memberships that were active on a specific date? That’s where performance tuning gets hard: when you don’t know the UserId, and even worse, you can’t predict the date/time you’re looking up, or if it’s always Right Now.

This is where I advocate pivoting to a series of event records, so instead of a start date and end date, you have an event type (started, expired, cancelled, etc.) and a date. There are other alternatives as well, but it’s a good thought exercise.

Comments closed

Describing the First Result Set

Phil Factor dives into sys.dm_exec_describe_first_result_set_for_object():

I’ve been working on a project unkindly nicknamed ‘The Gloop’ because the code is a bit amorphous. Basically, it is an approach to documenting SQL Server databases, using the facilities provided such as the metadata views and DMFs. Although it is relatively simple to record the result returned by a table-valued function I’d rather neglected the stored procedures because there was no metadata that could produce the first result set produced by a procedure or trigger.

I’d been silly because there is, of course, an Execution system DMF that does it: sys.dm_exec_describe_first_result_set_for_object(). it takes as its parameter the object_id of a procedure or trigger and describes the first result metadata for the module with that ID. It has the same result set definition as sys.dm_exec_describe_first_result_set.

I’m going to have to try this out, but I have problems with sp_describe_first_result_set and how it blows up if you use a temp table in the stored procedure whose result set you’re grabbing.

Comments closed

String Aggregation with T-SQL

Andy Levy talks about one of my favorite functions in SQL Server 2017:


This is another in a group of several posts on modernizing T-SQL code with new features and functionality available in SQL Server.

SQL Server 2016 gave us the STRING_SPLIT() function, but what about the reverse – compiling a set of values into one delimited string? We only had to wait 15 months for the release of SQL Server 2017, and the STRING_AGG() function.

I had the STUFF() / FOR XML PATH trick memorized for quite some time, but that was always a solution which felt like it worked on accident. Even if the new solution weren’t faster than the old, I’d still use it.

Comments closed

NULL Handling Features not in T-SQL

Itzik Ben-Gan continues a series complexity around NULL:

When using the offset window functions LAG, LEAD, FIRST_VALUE and LAST_VALUE, sometimes you need to control the NULL treatment behavior. By default, these functions return the result of the requested expression in the requested position, irrespective of whether the result of the expression is an actual value or a NULL. However, sometimes you want to continue moving in the relevant direction, (backward for LAG and LAST_VALUE, forward for LEAD and FIRST_VALUE), and return the first non-NULL value if present, and NULL otherwise. The standard gives you control over this behavior using a NULL treatment clause with the following syntax:

offset_function(<expression>) IGNORE_NULLS | RESPECT NULLS OVER(<window specification>)

There are three good examples of functionality around handling NULL which the current implementation of T-SQL is missing.

Comments closed