Press "Enter" to skip to content

Category: T-SQL

More Number Series Generators

Itzik Ben-Gan continues a series:

This is the second part in a series about solutions to the number series generator challenge. Last month I covered solutions that generate the rows on the fly using a table value constructor with rows based on constants. There were no I/O operations involved in those solutions. This month I focus on solutions that query a physical base table that you pre-populate with rows. For this reason, beyond reporting the time profile of the solutions like I did last month, I’ll also report the I/O profile of the new solutions. Thanks again to Alan Burstein, Joe Obbish, Adam Machanic, Christopher Ford, Jeff Moden, Charlie, NoamGr, Kamil Kosno, Dave Mason, John Nelson #2 and Ed Wagner for sharing your ideas and comments.

Read on for three more solutions, as well as a re-evaluation of the solutions in the first article.
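For reference, here is a minimal sketch of the constants-based, on-the-fly approach from last month's installment (my own illustration, not necessarily Itzik's exact code): cross joins over a table value constructor multiply a handful of constant rows into tens of thousands, and ROW_NUMBER assigns the sequence, with no base table and no I/O.

WITH
  L0 AS (SELECT 1 AS c FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),
                                    (1),(1),(1),(1),(1),(1),(1),(1)) AS D(c)),  -- 16 rows
  L1 AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),                        -- 256 rows
  L2 AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),                        -- 65,536 rows
  Nums AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n FROM L2)
SELECT n
FROM Nums
ORDER BY n
OFFSET 0 ROWS FETCH NEXT 1000 ROWS ONLY;  -- first 1,000 numbers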


Indicators of Schema Issues

Erik Darling has a good list of schema-related issues:

Something is broken in the way that you store data.

You’re overloading things, and you’re going to hit big performance problems when your database grows past puberty.

Most of what he’s describing in this post is a failure of atomicity, which implies a failure to achieve first normal form. Mind you, all of these functions are perfectly reasonable as part of data loading, and many of them are perfectly reasonable in the SELECT clause of a query (though that’s still a sign of failure of atomicity), but once you start throwing them into the WHERE clause, we’ve got problems.
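As a hypothetical illustration (the table and column names are mine, not Erik's), here is what that failure of atomicity looks like once it reaches a WHERE clause: the parsing function has to run against every row, so the optimizer can't seek an index on the column.

-- Overloaded column: 'East|C0042' packs region and customer into one value.
SELECT OrderID
FROM dbo.Orders
WHERE LEFT(RegionCustomer, CHARINDEX('|', RegionCustomer) - 1) = N'East';  -- scan, not seek
-- Stored atomically, the predicate is sargable and can use an index on Region:
-- WHERE Region = N'East';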


Random Number Generation in T-SQL

Chad Callihan generates random numbers:

The first way to generate a random number is by using the SQL Server RAND function. This will return a float value. 

Both of the techniques Chad shows are examples of generating uniform distributions—distributions in which any value is just as likely as any other. There are plenty of places in which a uniform is great: drawing by lot is one of them. But when you’re generating artificial data, the results tend to look unrealistic because not many natural phenomena follow uniform distributions.

If you’re interested in generating numbers which tend to look more realistic when generating artificial data, I have a post on generating random numbers built on a normal distribution.
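For a rough sense of the contrast, here is my own sketch using the Box-Muller transform (not necessarily the approach in the linked post):

-- Uniform: every value in [0, 1) is equally likely.
SELECT RAND() AS uniform_value;

-- Normal-ish: combine two uniform draws via Box-Muller; mean 50 and standard deviation 10 are arbitrary.
DECLARE @u1 float = 1e0 - RAND(),  -- shift into (0, 1] so LOG() never sees zero
        @u2 float = RAND();
SELECT 50 + 10 * SQRT(-2 * LOG(@u1)) * COS(2 * PI() * @u2) AS normal_value;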


Performance Comparison of ISNULL and COALESCE

Erik Darling notes the edge cases where ISNULL() can be faster than COALESCE():

Sometimes there are very good reasons to use either coalesce or isnull, owing to them having different capabilities, behaviors, and support across databases.

But isnull has some particular capabilities that are interesting, despite its limitations: only two arguments, specific to SQL Server, and uh… well, we can’t always get three reasons, as a wise man once said.

There is one thing that makes isnull interesting in certain scenarios. Let’s look at a couple.

Read the whole thing. You (hopefully) won’t be in this situation often, but if you do happen to find yourself in it for whatever reason (and you can’t re-write the query to something better), it’s good to keep this in mind.
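One capability difference that is easy to demo (my example; Erik's post covers the performance angle): ISNULL takes its return type from the first argument, while COALESCE determines its type from all arguments by data type precedence.

DECLARE @v varchar(5) = NULL;
SELECT ISNULL(@v, 'too long to fit')   AS isnull_result,    -- silently truncated to varchar(5): 'too l'
       COALESCE(@v, 'too long to fit') AS coalesce_result;  -- full string; the type comes from the longer literal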


Faster String Concatenation with SQL Server

Steve Stedman has some tips for people who need to combine strings:

What the previous example shows is that the longer the string gets, the slower the concatenation is. So instead we declare a second VARCHAR(MAX) variable called @stringBuilder, and each time through the loop we concatenate to that, then every thousandth time through the loop we take the @stringBuilder variable and concatenate it on to the @bigString, then clear out the @stringBuilder variable. This keeps the @stringBuilder variable relatively short, and reduces the number of concatenations to the @bigString to roughly 1/1000th the original.

Click through for a demo of the process. I don’t think I’ve been in too many situations where string concatenation was a performance killer in SQL Server, but it’s good to know.
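If it helps to picture it, here is a minimal sketch of the batching pattern Steve describes (the variable names come from the quote; the loop itself is my own illustration):

DECLARE @bigString varchar(max) = '',
        @stringBuilder varchar(max) = '',
        @i int = 1;

WHILE @i <= 100000
BEGIN
    SET @stringBuilder = @stringBuilder + 'x';           -- keep appending to the short variable
    IF @i % 1000 = 0
    BEGIN
        SET @bigString = @bigString + @stringBuilder;    -- flush to the big string every 1,000 iterations
        SET @stringBuilder = '';
    END;
    SET @i += 1;
END;

SET @bigString = @bigString + @stringBuilder;            -- flush whatever is left
SELECT LEN(@bigString) AS final_length;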


The Pain of SELECT *

Grant Fritchey strongly recommends against SELECT *:

Quite a few years ago, I wrote a post about SELECT * and performance. That post had a bit of a click-bait title (freely admitted). I wrote the post because there was a really bad checklist of performance tips making the rounds (pretty sure it’s still making the rounds). The checklist recommended a whole bunch of silly stuff. One silly thing it recommended was to simply substitute ALL columns (let me emphasize that again, name each and every column) instead of SELECT * because “it was faster”.

My post, linked above, showed that this statement was nonsense. Let’s be clear, I’m not a fan of SELECT *. Yes, it has some legitimate functionality. However, by and large, using SELECT * causes performance problems.

I’ll use SELECT * for one-off queries (well, something like SELECT TOP(100) * but same difference), but it’s a really bad practice to include that in application code for the reasons Grant mentions.


Using AT TIME ZONE

Chad Callihan walks us through using AT TIME ZONE in SQL Server:

Dealing with time zones in general can be a headache. Thankfully, SQL Server 2016 included a new clause to make working with time zones more manageable. Let’s look at a couple examples:

I’m in Eastern Standard Time so we’ll start here. Keep in mind that when no offset information is included, SQL Server is going to assume that the date value is in the target time zone.

This works best when your dates are stored in UTC, but Chad does show how to convert between two other time zones. This does not perform all that well when you need to convert a lot of rows, but if you’re doing one or two conversions, it’ll do just fine.
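Two quick examples of the clause in action (the specific values are mine; Eastern as in the quote):

-- No offset on the input, so SQL Server assumes it's already Eastern and just attaches the offset:
SELECT CAST('2021-06-01 12:00:00' AS datetime2)
       AT TIME ZONE 'Eastern Standard Time' AS assumed_eastern;

-- A UTC-stored value converted to Eastern: declare the source zone first, then the target:
SELECT (CAST('2021-06-01 12:00:00' AS datetime2) AT TIME ZONE 'UTC')
       AT TIME ZONE 'Eastern Standard Time' AS utc_to_eastern;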


Defining Deferred Compilation

Gail Shaw explains the concept of deferred compilation:

When I talked about row estimations for table variables, I mentioned ‘deferred compile’, but didn’t give a whole lot of details. What, then, is a deferred compilation? Let’s start with how batches work normally.

T-SQL is an interpreted language. While we talk about compiles, they’re not compilations in the sense of what happens to C++. There’s no conversion of the script to a machine language or intermediate language which is used from that point onwards. Every time a batch executes, it has to be parsed, bound and have an execution plan generated or fetched from cache.

Click through to learn what it does mean.
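A small repro sketch of the effect (my example, not Gail's): with deferred compilation in SQL Server 2019, the statement that reads the table variable is compiled when it first executes, so its cardinality estimate reflects the rows actually loaded rather than the old fixed guess of 1.

DECLARE @t table (id int NOT NULL);

INSERT INTO @t (id)
SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;

SELECT COUNT(*) AS row_count
FROM @t;   -- compare the estimated rows on the table variable scan in the actual execution plan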


One Reason to Avoid SELECT *

Andy Levy has a new reason for us:

I got a merge request a while back that included a change I wasn’t expecting from one of my developers. All they were doing was renaming a column on a table. We had names settled a while ago. What’s up with that?

Turns out, when I created the table, I named one of the fields BrithYear. This merge request corrected it to BirthYear. The misspelling slipped past me and past everyone else; it was only discovered because, when this developer was building a new query into their code, they spelled the field as one would expect and the query threw an error.

There’s many a reason not to use SELECT * in application code; this is one I don’t think I’d heard before.
