Category: T-SQL

Building Test Data Following A Normal Distribution In T-SQL

Published 2019-01-03 by Kevin Feasel

In order to show you the solution, I want to build up a reasonable sized sample. Any solution looks great when reading five records, but let’s kick that up a notch. Or, more specifically, a million notches: I’m going to use a CTE tally table and load 5 million rows.
I want some realistic looking data, so I’ve adapted Dallas Snider’s strategy to build a data set which approximates a normal distribution.
Because this is a little complicated, I wanted to take the time and explain the data load process in detail in its own post, and then apply it in the follow-up post. We’ll start with a relatively small number of records for this demonstration: 50,000. The reason is that you can generate 50K records almost instantaneously but once you start getting a couple orders of magnitude larger, things slow down some.

If you do custom data generation for lower environments, I’d recommend checking this out. Your production data probably doesn’t follow a normal distribution exactly, but a normal distribution is probably closer to reality than the uniform distribution you get with functions like RAND().

Comments closed

Diving Into OPTION(RECOMPILE)

Published 2019-01-02 by Kevin Feasel

Arthur Daniels explains some of the nuance behind OPTION(RECOMPILE) on T-SQL statements:

SQL Server will compile an execution plan specifically for the statement that the query hint is on. There’s some benefits, like something called “constant folding.” To us, that just means that the execution plan might be better than a normal execution plan compiled for the current statement.
It also means that the statement itself won’t be vulnerable to parameter sniffing from other queries in cache. In fact, the statement with option recompile won’t be stored in cache.

Click through for a couple of demos as well as a discussion of positives and negatives regarding its use.

Comments closed

Preventing Execution With PARSEONLY And NOEXEC

Published 2018-12-31 by Kevin Feasel

Solomon Rutzky shows us a way to prevent accidental full script execution:

There are times when I am working on a SQL script that really shouldn’t be executed all at once. Sometimes it’s a series of examples / demos for a presentation or forum answer. Other times it’s just a temporary need while I’m in the process of creating a complex script, but once the script is completed and tested then it should run all at once. In either case, I have accidentally hit F5 too many times when I thought that a certain section of code was highlighted (so only that section would execute) but in fact nothing was highlighted so the script started executing from the very top, and either ran until completion or until I was able to cancel it (if it ran long enough for me to have time to understand what was happening and hit the “cancel” button).
So I needed some way of ensuring that a script would not execute if no section was highlighted.

Read on to learn about PARSEONLY and NOEXEC.

Comments closed

Using IDENTITY In A SELECT Statement

Published 2018-12-20 by Kevin Feasel

Kenneth Fisher shares something he learned recently about the IDENTITY function:

Now, looking at it a bit more closely you’ll see that this is a function call, not just a property. Now, in my research for this post I did find where I’d mentioned this function briefly in my somewhat comprehensive identity post. Technically I didn’t mention so much as it was mentioned to me in the comments so I added it to the list. I guess I either didn’t look at it closely enough at the time or it’s just one of those cases where I forgot. Either way, it’s worth highlighting now.

Click through to learn more.

Comments closed

Comparing Data With CHECKSUM

Published 2018-12-20 by Kevin Feasel

David Fowler shows how to use CHECKSUM and CHECKSUM_AGG to compare data:

There are times when we need to compare two tables and figure out if the data matches. I often see a number of ways of doing this suggested, most are quite slow and inefficient. I’d quite like to share a quick and slightly dirty way of doing this using the CHECKSUM and CHECKSUM_AGG functions.
CHECKSUM()
Just a reminder that CHECKSUM() will generate a checksum for an entire row or selection of columns in the row.
CHECKSUM_AGG()
Will generate a checksum for a dataset.

David then has a couple of examples showing these in action.

Comments closed

Using DISTINCT With XML Data

Published 2018-12-17 by Kevin Feasel

Dave Bland has a workaround for a limitation with processing XML in SQL Server:

Since I was working on an example for my next performance class, I decided to use an example from that session. In the query below the qp.* part of the column list will return a column named query_plan which is an XML data type.
SELECT DISTINCT pa.,cp.,qp.*
FROM
(SELECT plan_handle
FROM sys.dm_exec_cached_plans) cp
CROSS APPLY sys.dm_exec_query_plan (cp.plan_handle ) qp
CROSS APPLY sys.dm_exec_plan_attributes(cp.plan_handle) As pa
WHERE is_cache_key = 1 AND pa.attribute = ‘set_options’
However, when I add the DISTINCT keyword I get the error below.

Read on for a workaround for this.

Comments closed

Finding The Closest Numeric Match

Published 2018-12-13 by Kevin Feasel

Itzik Ben-Gan has a T-SQL puzzle for us:

As you can see, both T1 and T2 have a numeric column (INT type in this example) called val. The challenge is to match to each row from T1 the row from T2 where the absolute difference between T2.val and T1.val is the lowest. In case of ties (multiple matching rows in T2), match the top row based on val ascending, keycol ascending order. That is, the row with the lowest value in the val column, and if you still have ties, the row with the lowest keycol value. The tiebreaker is used to guarantee determinism.

Click through for the details as well as several solutions.

Comments closed

Finding The SQL Server Port With T-SQL

Published 2018-12-07 by Kevin Feasel

Jack Vamvas shows us how to find the port SQL Server is listening on using T-SQL:

Question: Without going into the SQL Server Configuration manager via the GUI is there a command oriented method to extract the port number SQL Server is listening on?

Answer: There are a few different methods to extract the port number without going into the Configuration interface.
Method 1 – use xp_readererrorlog

Read on for an explanation of this technique as well as links to a couple other methods. I hadn’t thought about using the error log as a source, but it works.

Comments closed

Continuing The Advent Of Code In T-SQL

Published 2018-12-06 by Kevin Feasel

Wayne Sheffield has a few more posts in the Advent of Code series. His latest edition:

In Day 5, we find ourselves working with the polymers of the new Santa suit. A polymer (the input file), consists of units, represented by upper and lower case letters. Adjacent units of the same letter, but of different polarity (case), cancel each other out. This may lead to other units that can then cancel each other out. The goal is to reduce the polymer to as small as possible, and report back the reduced size.

Tasks:

Perform a case-sensitive search/replace for each letter of the alphabet. The search is for a pair of the same letter, where one is upper case, and the other is lower case.

Recursively perform this operation until the string can no longer be reduced.

In my opinion, the key part to this is that the operation needs to be performed recursively. I can think of only two ways to recursively perform an operation in SQL Server:

A recursive common table expression (cte).

Using a WHILE loop.

I don’t like using either of these mechanisms in SQL Server – they both perform operations in a “Row-By-Agonizing-Row” method, instead of a more set-based approach. However, set-based recursion usually performs extremely poorly. So, I’m going to use a while loop.

The recursion requirement does limit things a bit; otherwise I could see putting something together with the LEAD() window function.

Comments closed

Understanding Recursive CTEs Via Execution Plans

Published 2018-12-04 by Kevin Feasel

Hugo Kornelis shows us how SQL Server’s database engine implements recursive common table expressions:

I am pretty sure that (almost) everyone reading this blog knows that a CTE (Common Table Expression) is an independent subquery that can be named and then referenced (multiple times if needed) in the main query. This makes CTEs an invaluable tool to increase the readability of complex queries. Almost everything we can do with a CTE can equally well be done by using subqueries directly in the query but at the cost of losing readability.

However, there is also one feature that is actually unique to CTEs: recursion. This blog post is not about what this feature is, what it does, or how to write the code. A brief description is given as part of the complete explanation of CTEs in Books Online, and many bloggers have written about recursive CTEs as well.

For this post, I’ll assume the reader knows what a recursive CTE is, and jump into the execution plan to learn how this recursion is actually implemented.

This is (as usual) a great article, and helps explain why recursive CTEs can be slow.

Comments closed

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31