CROSS APPLY Replacing REPLACE

Kevin Feasel

2017-08-16

T-SQL

Bert Wagner shows off a good use of the APPLY operator:

Here we only have 4 nested REPLACE functions. My shameful record is 29. I’m not proud of it, but sometimes it’s the only way to get things done.

Not only are these nested REPLACE() functions difficult to write, but they are difficult to read too.

Instead of suffering through all of that ugly nesting, what you can do instead is use CROSS APPLY:

Click through for the example.  This is one of several great uses for the APPLY operator.

Substrings: Powershell Versus T-SQL

Shane O’Neill contrasts the SUBSTRING function in T-SQL with Powershell’s Substring method:

The main difference that I can see when using SUBSTRING() in SQL Server versus in PowerShell is that SQL Server is very forgiving.

If you have a string that is 20 characters longs and you ask for everything from the 5th character to the 100th character, SQL Server is going to look at this, see that the string does not go to the 100th character, and just give you everything that it can.

It’s a small difference but an important one.

Spheres In SQL Server

Slava Murygin continues his quest to build a graphics engine with spatial data:

Couple of years ago I came up with an algorithm of drawing an ellipse using SQL Server spatial geometry: http://slavasql.blogspot.com/2015/02/drawing-ellipse-in-ssms.html

I’ve used that algorithm to make a sphere and as in my previous blog of drawing 3D Cube I use external procedure to simplify the process.
This time instead of temporary stored procedure I’m using a function to generate Geometrical content.

This has been an enjoyable series so far, showing how to build different shapes using spatial queries.

ARITHABORT And ANSI_WARNINGS

Kevin Feasel

2017-08-07

T-SQL

Shane O’Neill looks at what the ARITHABORT and ANSI_WARNINGS settings do in SQL Server:

So, like a dog when it sees a squirrel, when I found out about the problems with ARITHABORT and ANSI_WARNINGS I got distracted and started checking out what else I could break with it. Reading through the docs, because I found that it does help even if I have to force myself to do it sometimes, I found a little gem that I wanted to try and replicate. So here’s a reason why you should care about setting ARITHABORT and ANSI_WARNINGS on.

These are two settings where the default value makes a lot of sense.

Calculating Pi In SQL Server

Kevin Feasel

2017-08-04

T-SQL

Slava  Murygin has fun calculating Pi in SQL Server:

It goes like:
N1 BaseType : numeric
N1 Precision : 10
N1 Scale : 0

Interesting to know. SQL Server interprets integers higher than 2,147,483,647 as Numeric, not as BIGINT !!!

Slava digs up a couple T-SQL edge cases worth noting.

Conditional Counts

Kevin Feasel

2017-08-04

T-SQL

Mark Broadbent shows that COUNT has a few tricks up its sleeve:

When I came to compare the results against aggregated data that I had, I noticed that the values were off and it became fairly obvious that the transactional data also contained refunds and rebates (positive values but logically reflected as negative by the Transaction_Type status) and these were not just causing inaccuracies for the SUM on Sales_Value, but were also causing the COUNT for Number_Of_Sales to be wrong. In other words, refunds and rebates must be removed from the SUM total and not aggregated in the Number_Of_Sales columns. Now at this stage, you might be thinking that we can do this by a simple WHERE clause to filter them from the aggregates, but not only is it wrong to “throw away” data, I realised that my target tables also contained aggregate columns for refunds and rebates.

I have only used the SUM(CASE) method that Mark shows.  It’s interesting that COUNT(CASE) can work, but I agree that it is probably more confusing, if only because it’s so rare.

Using The COMPRESS Function In SQL Server

Kendra Little explains the COMPRESS() function in SQL Server 2016:

One cool little feature in SQL Server 2016 is COMPRESS(). It’s a TSQL function available in all editions that shrinks down data using the GZIP algorithm (documentation).

Things to know about COMPRESS():

  • Compressed data is in the VARBINARY(max) data type

  • You get the data “back to normal” by  using the DECOMPRESS function – which also outputs VARBINARY(max)

  • You can’t use columns of the VARBINARY(max) type in an index key column– but it may be useful to use the column as a filter in a filtered index, in some cases

COMPRESS() uses standard GZip compression, so you could use methods other than DECOMPRESS() to inflate the data—for example, bring the compressed data out to your application and use language-specific GZip libraries to decompress the data.  Read the whole thing.

MERGE With Deletion

Kevin Wilkie shows an example of deleting data as part of a merge operation:

The last time we were together, we learned how to use the MERGE statement when we wanted to insert rows that didn’t exist and update rows that didn’t. This time we’re going to add onto that. We’re adding the seldom used, but delightfully potent – delete rows that no longer exist in the original table.

MERGE is an enticing but dangerous piece of syntax.  It looks so nice until you realize how many bugs and oddities there are in the command.

Time Slicing T-SQL

Kevin Feasel

2017-07-31

T-SQL

Bill Fellows has some fancy window function footwork to split data into time slices:

That looks like a lot, but it really isn’t. Starting from the first inner most query, we select the top 24 rows from sys.all_objects and use the ROW_NUMBER function to generate us a monotonically increasing set of values, thus 1…24. However, since the allowable range of hours is 0 to 23, I deduct one from this value (A). I repeat this pattern to generate minutes (B) except we get the top 60. Since I want 15 second intervals, for the seconds query, I only get the top 4 values. I deduct one so we have {0,1,2,3} and then multiply by 15 to get my increments (C). If you want different time slices, that’s how I would modify this pattern.

Only works on 2012 or later, but it’s a fancy way of slicing into 15-second (or however you define it) chunks.

Prioritizing Rows In A Union

Daniel Hutmacher shows several ways to combine data from multiple sources, prioritized by source:

For the sake of simplicity, I’ll assume that our example data has a clustered index on the “primary key” that we want to use to determine which rows have already been included in the set – in our example, the primary key is (FirstName, LastName). The data I’m using is a fictional example, but here’s the jist:

  • #Employees has about 33 000 rows.
  • #Customers has about 44 000 rows.
  • #Passengers has about 500 000 rows.

The data is constructed in a way that these queries should return 530 000 rows, so we’re looking at some overlap but far from totally overlapping rows.

Example #492,806 that T-SQL is not a true 4th Generation Language, that how you write the query can greatly matter for performance.

Categories

September 2017
MTWTFSS
« Aug  
 123
45678910
11121314151617
18192021222324
252627282930