Press "Enter" to skip to content

Category: T-SQL

Spheres In SQL Server

Slava Murygin continues his quest to build a graphics engine with spatial data:

Couple of years ago I came up with an algorithm of drawing an ellipse using SQL Server spatial geometry: http://slavasql.blogspot.com/2015/02/drawing-ellipse-in-ssms.html

I’ve used that algorithm to make a sphere and as in my previous blog of drawing 3D Cube I use external procedure to simplify the process.
This time instead of temporary stored procedure I’m using a function to generate Geometrical content.

This has been an enjoyable series so far, showing how to build different shapes using spatial queries.

Comments closed

ARITHABORT And ANSI_WARNINGS

Shane O’Neill looks at what the ARITHABORT and ANSI_WARNINGS settings do in SQL Server:

So, like a dog when it sees a squirrel, when I found out about the problems with ARITHABORT and ANSI_WARNINGS I got distracted and started checking out what else I could break with it. Reading through the docs, because I found that it does help even if I have to force myself to do it sometimes, I found a little gem that I wanted to try and replicate. So here’s a reason why you should care about setting ARITHABORT and ANSI_WARNINGS on.

These are two settings where the default value makes a lot of sense.

Comments closed

Conditional Counts

Mark Broadbent shows that COUNT has a few tricks up its sleeve:

When I came to compare the results against aggregated data that I had, I noticed that the values were off and it became fairly obvious that the transactional data also contained refunds and rebates (positive values but logically reflected as negative by the Transaction_Type status) and these were not just causing inaccuracies for the SUM on Sales_Value, but were also causing the COUNT for Number_Of_Sales to be wrong. In other words, refunds and rebates must be removed from the SUM total and not aggregated in the Number_Of_Sales columns. Now at this stage, you might be thinking that we can do this by a simple WHERE clause to filter them from the aggregates, but not only is it wrong to “throw away” data, I realised that my target tables also contained aggregate columns for refunds and rebates.

I have only used the SUM(CASE) method that Mark shows.  It’s interesting that COUNT(CASE) can work, but I agree that it is probably more confusing, if only because it’s so rare.

Comments closed

Using The COMPRESS Function In SQL Server

Kendra Little explains the COMPRESS() function in SQL Server 2016:

One cool little feature in SQL Server 2016 is COMPRESS(). It’s a TSQL function available in all editions that shrinks down data using the GZIP algorithm (documentation).

Things to know about COMPRESS():

  • Compressed data is in the VARBINARY(max) data type

  • You get the data “back to normal” by  using the DECOMPRESS function – which also outputs VARBINARY(max)

  • You can’t use columns of the VARBINARY(max) type in an index key column– but it may be useful to use the column as a filter in a filtered index, in some cases

COMPRESS() uses standard GZip compression, so you could use methods other than DECOMPRESS() to inflate the data—for example, bring the compressed data out to your application and use language-specific GZip libraries to decompress the data.  Read the whole thing.

Comments closed

MERGE With Deletion

Kevin Wilkie shows an example of deleting data as part of a merge operation:

The last time we were together, we learned how to use the MERGE statement when we wanted to insert rows that didn’t exist and update rows that didn’t. This time we’re going to add onto that. We’re adding the seldom used, but delightfully potent – delete rows that no longer exist in the original table.

MERGE is an enticing but dangerous piece of syntax.  It looks so nice until you realize how many bugs and oddities there are in the command.

Comments closed

Time Slicing T-SQL

Bill Fellows has some fancy window function footwork to split data into time slices:

That looks like a lot, but it really isn’t. Starting from the first inner most query, we select the top 24 rows from sys.all_objects and use the ROW_NUMBER function to generate us a monotonically increasing set of values, thus 1…24. However, since the allowable range of hours is 0 to 23, I deduct one from this value (A). I repeat this pattern to generate minutes (B) except we get the top 60. Since I want 15 second intervals, for the seconds query, I only get the top 4 values. I deduct one so we have {0,1,2,3} and then multiply by 15 to get my increments (C). If you want different time slices, that’s how I would modify this pattern.

Only works on 2012 or later, but it’s a fancy way of slicing into 15-second (or however you define it) chunks.

Comments closed

Prioritizing Rows In A Union

Daniel Hutmacher shows several ways to combine data from multiple sources, prioritized by source:

For the sake of simplicity, I’ll assume that our example data has a clustered index on the “primary key” that we want to use to determine which rows have already been included in the set – in our example, the primary key is (FirstName, LastName). The data I’m using is a fictional example, but here’s the jist:

  • #Employees has about 33 000 rows.
  • #Customers has about 44 000 rows.
  • #Passengers has about 500 000 rows.

The data is constructed in a way that these queries should return 530 000 rows, so we’re looking at some overlap but far from totally overlapping rows.

Example #492,806 that T-SQL is not a true 4th Generation Language, that how you write the query can greatly matter for performance.

Comments closed

Reading Drive Files Recursively From T-SQL

Slava Murygin shows how to use a common table expression to read a recursive drive listing:

In this blog I’ll use undocumented in SQL Server extended stored procedure “xp_dirtree“.
That procedure is well described in Patrick Keisler’s blog.

Briefly, “xp_dirtree” extended procedure returns tree of sub-directories from given folder and has three parameters:
1. Starting or root folder
2. Depth level – determines how deep SQL Server will dig into a sub-directory structure. If it set to zero or omitted whole tree will be returned.
3. If omitted or set to zero will return only folders .If set to “1” will return files as well.

If you want to be really mean to somebody, create a shortcut to the parent directory within a directory.

Comments closed

Batch Execution Mode And Window Functions

Chris Adkin shows how taking advantage of batch execution mode on rowstore tables can lead to faster performance as degree of parallelism increases:

The SQL Server execution engine fundamentally acts like a cursor, control flow is exerted from the root node of the plan down to right most child node or iterators. The (logical) flow of data through the plan is in the opposite direction.

I say ‘Logical’ because in practise the run time uses buffers in order to minimise the data movement whilst executing the plan. However, up until SQL Server 2012, which first introduced batch mode, execution plans were executed by iterators processing data in a row by agonising row manner. Batch mode changes all of this, if we can ferry rows around in batches, this reduces the number of CPU cycles it takes to process an iterator in the plan (providing it supports batch mode). Also by sizing batches such that they fit inside the level 2 cache of the CPU, we gain even more performance by minimizing CPU last level cache misses or worse still main memory.

Adkin credits Niko Neugebauer for the insight and shows how you can use this on normal rowstore tables.

Comments closed