Press "Enter" to skip to content

Curated SQL Posts

Compressing LOB Data

Andy Mallon wants to compress LOB data:

The COMPRESS() function takes an input of string or binary data, and applies the gzip algorithm to it. It returns a value of type varbinary(max). In essence, instead of storing string or binary data, you can gzip it up and store it in a varbinary(max) column. There’s also a DECOMPRESS() function for when you are reading the data and need to unzip it.

This costs some CPU, but gzip can save quite a bit of space. How much space, and whether it’s worth the CPU cost will vary depending on your data and workload. In this blog post, we’ll take a look at one table. We’ll look at the space savings we get out of using COMPRESS(), and we’ll look at the effort necessary to implement it.

Read on for Andy’s test and thoughts.

Comments closed

Plan Explorer Supports SSMS 2016

SQL Sentry Plan Explorer is now compatible with SQL Server 2016:

We’ve been busy modifying SQL Sentry Plan Explorer so it will work seamlessly with the changed add-in model in the new Management Studio. (As a bit of background, SSMS is now based on the Visual Studio 2015 shell, and some work is required to transition existing add-ins.)

I am happy to announce that, as of today, Plan Explorer 2.8 (build 10.1.94), with add-in support for SSMS 2016, is now available!

Huzzah.  That’s two plug-ins down and two more to go before I’m 100% ready for SSMS 2016.

Comments closed

Primary And Foreign Keys

Shane O’Neill discusses primary key and foreign key constraints:

If you check the two definitions for Primary key and Foreign key you’ll see that, even though they are both called keys, they serve two different purposes; namely identifying rows and enforcing links.

And those two purposes are not mutually exclusive!

A column/column combo that identifies a row can also be used to enforce a link back to another table (or itself, as shown above with Foreign keys).

The assumption, that if you were one then you couldn’t be the other, was incorrect. If your business rules call for it, don’t let a column being one type of key stop it from being the other.

Not included but still important are unique constraints.

Comments closed

Finding File Growth

Andy Galbraith shows how to use extended events to track file growth events:

…but what does this have to do with Extended Events?

As I started out, I was looking for something slick I do with a trace that I could replicate in Extended Events, and this was a great candidate.

The catch as I discovered, is that while file growths are captured in the default trace, they are *not* in the system health session…

Fortunately, you can create your own XEvent and Andy gives us the details.

Comments closed

Creating R Code

Ginger Grant introduces us to Microsoft R:

Microsoft has not one version of R, they have two but two. These two different versions are needed because they have two different purposes in mind. Microsoft R Open, is open source and fully R compatible and is faster than open source R because they rewrote a number of the algorithms to include multi-threaded math libraries. If you want to run R code on SQL Server, this is the not the version you want to use. You want to use the non-open source version designed to run on R Server, which is included with SQL Server 2016, Microsoft RRE Open. This version will run R code not only in memory but swap to disk, to create code which can access SQL Server data without needing to create a file, and can run code on the server from the client. The version of RRE Open which is included in SQL Server 2016 is 8.0.3.

She follows this up with a demo program to pull data from a SQL Server table and generate a histogram.  If you have zero R experience, there’s no time like the present to get started.

Comments closed

Alerting On Drastic Changes

Rob Collie has a post on using Power BI to spot outliers:

The basic idea here is “alert me if something has changed dramatically.”  If there’s a corner of my business that has spiked or crashed in a big way, I want to know.  If something has dramatically improved in a particular region, I may want to dive into that and see if it’s something we can replicate elsewhere.  And if something has fallen off a cliff, well, I need to know that for obvious reasons too.  And both kinds of dramatic change, positive and negative, can easily be obscured by overall aggregate values (so in some sense this is a similar theme to “Sara Problem?”)

So the first inclination is to evaluate distance from average performance.  And maybe that would be fine with high-volume situations, but when we’re subdividing our business into hundreds or perhaps thousands of micro-segments, we end up looking at smaller and smaller sample sizes, and “normal” variation can be legitimately more random than we expect.

This looks really cool.  If you read the comments, Rob notes that performance does break down at some point.  If you start hitting that point, I’d think about shifting this to R.

Comments closed

Implementing SoundEx

Dror Helper shows how to implement SoundEx in C#:

It’s fairly easy to follow the steps of the algorithm (as defined by Wikipedia):

  1. Retain the first letter of the name and drop all other occurrences of a, e, I, o, u, y, h, w.

  2. Replace consonants with digits as follows (after the first letter):

    • b, f, p, v → 1

    • c, g, j, k, q, s, x, z → 2

    • d, t → 3

    • l → 4

    • m, n → 5

    • r → 6

  3. If two or more letters with the same number are adjacent in the original name (before step 1), only retain the first letter; also two letters with the same number separated by ‘h’ or ‘w’ are coded as a single number, whereas such letters separated by a vowel are coded twice. This rule also applies to the first letter.

  4. If you have too few letters in your word that you can’t assign three numbers, append with zeros until there are three numbers. If you have more than 3 letters, just retain the first 3 numbers.

SQL Server also supports SOUNDEX as a built-in function.

Comments closed

Waiting For SP1

Guy Glantser hates “wait for SP1” advice:

Historical Facts

Throughout my career I have never seen an RTM version that was substantially less stable then the following SP1. Sure, there were bugs and issues. Sometimes there were critical bugs and issues. But there were just as much bugs and issues in SP1 and in SP2, and so on. I haven’t conducted a thorough research, so I don’t have a statistical proof, but these are the facts, at least from my experience.

I’d add one more thing:  pre-release versions of SQL Server run in production as part of Microsoft TAP (older link, and I think RDP and TAP have merged together at this point, but I don’t have those inside details).  These are real companies with real workloads running pre-RTM versions of SQL Server.  I work for a company which is in the program, and we were running our data warehouse on CTP 3 and then RCs.  By the time RTM hits the shelves, there’s already been a good deal of burn-in.

Comments closed

Choose Your Edition

Grant Fritchey explains the differences between different editions of SQL Server:

SQL Server Enterprise Edition is the high end. Here is where you need to go to multi-terrabytes in size and you have massive transaction loads. You’re looking at very sophisticated availability and disaster recovery. Again, the name gives it away. You’re generally only going to this edition when you’re working at an enterprise level of scale and architecture. Since you’re just getting started, don’t worry about this.

My version of the story is, “Buy Enterprise Edition.  Don’t cheap out because you’ll regret it later.”  Grant’s version is much more thorough.

Comments closed

Using Diagnostics Trace With Power BI

Chris Webb uses Diagnostics.Trace to track process runtime in Power BI:

To sum up, the workflow for tuning your query is:

  • Make some changes to the LongQuery query that hopefully make it faster

  • Update the Trace Message parameter with some notes about which version of the LongQuery query it is that you’ll be testing

  • Click the Refresh Preview button for the Diagnostics query to test how long LongQuery now runs for

  • Refresh, or load, the query that reads the data from the trace logs so you can see how all of your changes have affected query execution times

I give it two months before the Power BI team releases a change to make this easier…

Comments closed