Press "Enter" to skip to content

Category: T-SQL

Disorderly Queries

Chad Callihan wants you to think about that ORDER BY clause:

I recently came across a scenario where an application process was not performing correctly on one database but was working fine on others. The process should have been completing in seconds but was taking minutes with no indication of activity. After some investigation, I found that the process was stuck waiting on a SELECT statement to complete. Even worse, it was holding an exclusive lock on a table which was then blocking new information from processing.

One part of the SELECT query that stood out was that it was ordering by a date field. Considering what the process was doing, there was no need to have the data ordered. Generally, it’s better to have the data sorted on the application side instead of SQL Server but in this case not even that was necessary.

There are definitely good cases where you need to use ORDER BY in a database—such as paging scenarios. But if you don’t need things in a particular order, Chad shows that you can potentially save a good deal on performance without an explicit ordering.

Comments closed

Multi-Pathed Queries

Guy Glanster needs a multi-tool procedure:

This stored procedure, which I created in the AdventureWorks2017 database, has two parameters: @CustomerID and @SortOrder. The first parameter, @CustomerID, affects the rows to be returned. If a specific customer ID is passed to the stored procedure, then it returns all the orders (top 10) for this customer. Otherwise, if it’s NULL, then the stored procedure returns all orders (top 10), regardless of the customer. The second parameter, @SortOrder, determines how the data will be sorted—by OrderDate or by SalesOrderID. Notice that only the first 10 rows will be returned according to the sort order.

So, users can affect the behavior of the query in two ways—which rows to return and how to sort them. To be more precise, there are 4 different behaviors for this query:

1. Return the top 10 rows for all customers sorted by OrderDate (the default behavior)
2. Return the top 10 rows for a specific customer sorted by OrderDate
3. Return the top 10 rows for all customers sorted by SalesOrderID
4. Return the top 10 rows for a specific customer sorted by SalesOrderID

Let’s test the stored procedure with all 4 options and examine the execution plan and the statistics IO.

This is quite common for reporting procedures and Guy shares several patterns, some of which work better than others.

Comments closed

Building a Function to Get the Next Date by Date Name or Offset

Louis Davidson has a function for us:

As I have been building my Twitter management software, I have been doing a lot more ad-hoc, repetitive coding using T-SQL directly. When I was generating new tweets for upcoming days, a bit of the process that got old quick was getting the date for an upcoming day (the primary key for my tweet table is the date, the type of the tweet, and a sequence number). After having to pick the date of next Tuesday… I had to write some more code (because a true programmer doesn’t do repetitive work when code can be written… even if sometimes the code doesn’t save you time for days or weeks.

So this following function was born from that need, and it is something I could imagine most anyone using semi-regularly, especially when testing software. 

This is definitely fancy. My inclination would be to create a calendar table, as that’ll solve this particular issue as well as other complex variants (like, I want the next Tuesday which doesn’t fall on a holiday).

Comments closed

Index Creation with DROP_EXISTING

Monica Rathbun takes us through the DROP_EXISTING option when modifying an index:

When you are making changes to an existing Non-Clustered index SQL Server provides a wide variety of options. One of the more frequently used methods is DROP EXISTING; in this post you will learn all about that option. This option automatically drops an existing index after recreating it, without the index being explicitly dropped. Let us take a moment understand the behavior of this choice.

What I really want is DROP_IF_EXISTS. I want idempotent commands: if I run it once or a thousand times, I end up in the same state whether there was an index there at the start or not (or if attempt #793 failed due to running out of sort space in tempdb or something, leaving me with no index). DROP_EXISTING is only idempotent if the index already existed, but then you have to ask, why is it important if an index of that name is already there? The important part of the statement is that I want an end state which includes this index in this form.

Comments closed

Looping with WHILE(TRUE)

Eitan Blumin won’t be getting off this bus:

This is one of those things that I would have never expected to adopt, and yet here I am! Not just loving “infinite” loops, but also confident enough that I can convince you to feel the same!

It may sound very weird to most people at first, since it’s such an unorthodox way of using WHILE loops, and probably goes against what you consider to be coding best practices and standards.

I use this a fair amount as well for the reasons Eitan describes. The one thing to keep an eye out for, though, is that you are actually progressing toward an end goal. In other words, if you’re using WHILE(TRUE) to loop through batches of rows in a table by, say, minimum ID, make sure you increment that minimum ID or else you’ll be spinning your wheels for a while and not realize it. Not that I’ve ever done that before, of course…

Comments closed

The Sequence Operator

Hugo Kornelis digs into the sequence operator:

The Sequence operator reads all rows from all its inputs, in order, and without modification. It returns all rows it reads from its last input only.

This operator is typically used in execution plans where some branches are needed to do preparation work (e.g. to invoke a Table-valued Function operator) and then other branches produce the final output, or multiple branches need to do work in sequence (such as in wide data modification plans). The operators on the non-last inputs often do not produce any output at all, but even if they do the rows returned are discarded. Only rows returned from the last input are returned to the parent of the Sequence operator.

Note that this is quite difference from sequences of numbers in SQL Server, which act similarly to identity integers.

Comments closed

Azure SQL Database Startup Time

John McCormack has a tip for us:

The traditional methods used for to find the start up time for SQL Server don’t work in Azure SQL DB.

I searched high and low to find this and thought I’ve got to share, and hopefully make it search engine friendly. A traditional google or bing search wasn’t bringing up the best way to find this out. I saw a lot of complicated queries to pull data, convert it and estimate start up time using functions and all kinds of magic. 

Click through for the one-liner script.

Comments closed

More Number Series Generators

Itzik Ben-Gan continues a series:

This is the second part in a series about solutions to the number series generator challengeLast month I covered solutions that generate the rows on the fly using a table value constructor with rows based on constants. There were no I/O operations involved in those solutions. This month I focus on solutions that query a physical base table that you pre-populate with rows. For this reason, beyond reporting the time profile of the solutions like I did last month, I’ll also report the I/O profile of the new solutions. Thanks again to Alan Burstein, Joe Obbish, Adam Machanic, Christopher Ford, Jeff Moden, Charlie, NoamGr, Kamil Kosno, Dave Mason, John Nelson #2 and Ed Wagner for sharing your ideas and comments.

Read on for three more solutions, as well as a re-evaluation of the solutions in the first article.

Comments closed

Indicators of Schema Issues

Erik Darling has a good list of schema-related issues:

Something is broken in the way that you store data.

You’re overloading things, and you’re going to hit big performance problems when your database grows past puberty.

Most of what he’s describing in this post is a failure of atomicity, which implies a failure to achieve first normal form. Mind you, all of these functions are perfectly reasonable as part of data loading, and many of them are perfectly reasonable in the SELECT clause of a query (though that’s still a sign of failure of atomicity), but once you start throwing them into the WHERE clause, we’ve got problems.

Comments closed