Curated SQL Posts

Why Use SSDT?

Ed Elliott has a three-part series on database projects in SQL Server Data Tools.

Part 1:  What is SSDT?

The SSOX or SQL Server Object Explorer is a cool utility that lets you connect to a live database and do things to it like debug stored procedures or update individual objects. It also lets you see a view of your projects after all references have been resolved, so if you use “Same Database” references you can see how your end project will end up – really useful.

Part 2:  Deploying projects

Using the DacServices via whatever method you want (schema compare, sqlpackage, powershell, something else?) really makes it simple to spend your time writing code and tests rather than manual migration steps. It constantly amazes me how well rounded the deployment side of things is. Every time I use something obscure, something other than a table or procedure, I half expect the deployment to fail, but it just always works.

Over the last couple of years I must have created hundreds if not thousands of builds, all with their own release scripts, across tens of databases in different environments, and I haven’t yet been able to break the SSDT deployment bits without it actually being my fault or something stupid like a merge that goes haywire (that’s one reason to have tests).

Part 3:  the .Net APIs

The ScriptDom can be used in two ways. The first is to pass it some T-SQL (be it DDL or DML), and it will return a representation of the T-SQL in objects which you can examine and do things to.

The second way it can be used is to take objects and create T-SQL.

I know what you are thinking: why would I bother? It seems pretty pointless to me. Let me assure you that it is not pointless. The first time I used it for an actual issue was where I had a deployment script with about 70 tables in it. For various reasons we couldn’t guarantee that the tables existed (some tables were moved into another database), so the answer would have been to either split the tables into 2 files or manually wrap if exists around each table’s deploy script. Neither of these options was particularly appealing at that point in the project with the time we had to deliver.

This is a great series with a lot of informative links.

Always Encrypted

Warner Chaves has a video introducing Always Encrypted:

This is the big difference of this new feature, that the operations to encrypt/decrypt happen on the client NOT on SQL Server. That means that if your SQL Server is compromised, the key pieces to reveal the data are NOT with the server. This means that even if your DBA wants to see the data, if they don’t have access to the CLIENT application then they won’t be able to see the values.

Always Encrypted strikes me as something that will be incredibly useful for 2-3% of the population, somewhat painful for 3-5% of the population, and completely ignored by the rest.  I’m currently on the fence about whether, three years from now, I will consider “completely ignored by the rest” to be a shame.

Anglicize Values

Dave Mattingly shows an easy way to anglicize values:

If your customer’s name is “José” but you search for “Jose”, you won’t (by default) find him.

Here’s a simple way to take care of that in your SQL database, without changing the data that you have.

If a particular system only needs to support one language (e.g., English), this can be helpful, at least until somebody throws in Chinese or Hebrew characters.  That said, supporting Unicode is the best move when available.
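
The linked post handles this inside the database. The sketch below is not that method; it is only a rough Python illustration of the same idea, comparing values with accents stripped while leaving the stored text alone. The function names are made up for this example.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Return text with combining marks removed, e.g. 'José' -> 'Jose'."""
    # NFD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep every other character untouched.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_insensitive_equal(a: str, b: str) -> bool:
    """Compare two strings ignoring accents and case."""
    return strip_accents(a).casefold() == strip_accents(b).casefold()


print(strip_accents("José"))                     # Jose
print(accent_insensitive_equal("José", "jose"))  # True
print(strip_accents("北京") == "北京")             # True: nothing to anglicize here
```

The last line shows the limit mentioned above: characters with no Latin base letter pass through unchanged, so this only helps with accented Latin text.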

Check Your CHECKDBs

Richie Lee has a script to check the last known CHECKDB run date:

One of the most important duties of a DBA is to ensure that CHECKDB is run frequently to ensure that the database is both logically and physically correct. So when inheriting an instance of SQL, it’s usually a good idea to check when CHECKDB was last run. And ironically enough, it is actually quite difficult to get this information quickly, especially if you have a lot of databases that you need to check. The obvious way is to run DBCC DBINFO against the specific database. This returns more than just the last time CHECKDB was run, and it is not especially clear which row returned tells us the last CHECKDB (FYI the Field is “dbi_dbccLastKnownGood”.)

It’s a bit of a shame that this information isn’t made available in an easily-queryable DMV.
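
To make the "which row is it?" problem concrete, here is a small Python sketch that picks the dbi_dbccLastKnownGood row out of tabular DBCC DBINFO output. It is not Richie Lee's script. The sample rows are invented, and the four-column shape (ParentObject, Object, Field, VALUE) is an assumption about how the result set comes back when requested as a table.

```python
from datetime import datetime

# Hypothetical rows shaped like tabular DBCC DBINFO output:
# (ParentObject, Object, Field, VALUE). The values are made up.
sample_rows = [
    ("DBINFO STRUCTURE:", "DBINFO", "dbi_dbid", "7"),
    ("DBINFO STRUCTURE:", "DBINFO", "dbi_dbccLastKnownGood", "2016-01-15 02:00:13.450"),
    ("DBINFO STRUCTURE:", "DBINFO", "dbi_cmptlevel", "120"),
]


def last_known_good(rows):
    """Return the dbi_dbccLastKnownGood value as a datetime, or None if absent."""
    for _parent, _obj, field, value in rows:
        if field == "dbi_dbccLastKnownGood":
            return datetime.strptime(value, "%Y-%m-%d %H:%M:%S.%f")
    return None


print(last_known_good(sample_rows))
```

Running the same filter over the output for every database on an instance would give one date per database, which is the list you actually want when inheriting a server.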

Multiple Common Table Expressions

Steve Jones shows how to chain Common Table Expressions:

In this way I can more easily see in the first example I’m joining two tables/views/CTEs together. If I want to know more about the details of one of those items, I can easily look up and see the CTE at the beginning.

However when I want multiple CTEs, how does this work?

The answer is simple but powerful.  Once you’ve read up on CTEs, you start to see the power of chaining CTEs.  And then you go CTE-mad until you see the performance hit of the monster you’ve created.  Not that I’ve ever done that…nope…
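
For anyone who hasn't seen the syntax, here is a minimal sketch of chaining: one WITH keyword, CTEs separated by commas, and a later CTE reading from an earlier one. It runs against SQLite through Python purely so that it is self-contained; the WITH ... , ... form is the same in T-SQL. The table and its data are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer TEXT, amount INTEGER);
    INSERT INTO orders VALUES ('ann', 10), ('ann', 30), ('bob', 5), ('cy', 50);
""")

query = """
WITH totals AS (                -- first CTE: one row per customer
    SELECT customer, SUM(amount) AS total
    FROM orders
    GROUP BY customer
),
big_spenders AS (               -- second CTE: reads from the first
    SELECT customer, total
    FROM totals
    WHERE total >= 40
)
SELECT customer, total
FROM big_spenders
ORDER BY customer
"""

rows = conn.execute(query).fetchall()
print(rows)  # [('ann', 40), ('cy', 50)]
```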

Common Table Expressions

Aaron Bertrand shows us Common Table Expressions:

A CTE is probably best described as a temporary inline view – in spite of its official name, it is not a table, and it is not stored (like a #temp table or @table variable). It operates more like a derived table or subquery, and can only be used for the duration of a single SELECT, UPDATE, INSERT, or DELETE statement (though it can be referenced multiple times within that statement).

This is a great article on CTEs; give it a read, even if you’re familiar with them.
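
Two points from that quote are easy to demonstrate: a CTE can be referenced more than once inside its statement, and it is gone once that statement ends. The sketch below uses SQLite from Python as a stand-in for SQL Server so that it runs anywhere; the table and data are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (day INTEGER, value INTEGER);
    INSERT INTO readings VALUES (1, 10), (2, 14), (3, 9);
""")

# The CTE "r" is referenced twice in the same statement (a self-join).
deltas = conn.execute("""
    WITH r AS (SELECT day, value FROM readings)
    SELECT cur.day, cur.value - prev.value AS change
    FROM r AS cur
    JOIN r AS prev ON prev.day = cur.day - 1
    ORDER BY cur.day
""").fetchall()
print(deltas)  # [(2, 4), (3, -5)]

# The CTE does not outlive its statement: a later query cannot see "r".
try:
    conn.execute("SELECT * FROM r")
    cte_still_visible = True
except sqlite3.OperationalError:
    cte_still_visible = False
print(cte_still_visible)  # False
```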

Improving DAX Compression

Matt Allington shows how reducing cardinality helps reduce data sizes with DAX:

With both of these concepts combined, the file size was reduced from the original 264 MB to 238 MB, a reduction of almost 10%.  You can see where the space savings have come from by comparing the before and after column sizes in the 2 tables below.  The SalesValueExTax column (65MB) was replaced with the Margin column (44MB) and the CostValue column (63MB) was replaced with the CostPerCase column (50MB).

Check it out, as well as the memory tool.
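
As a quick check on the numbers quoted above, this snippet redoes the arithmetic using only the figures from the excerpt. The variable names are mine.

```python
# File sizes quoted in the excerpt, in MB.
before_mb, after_mb = 264, 238
file_saving_mb = before_mb - after_mb
file_saving_pct = 100 * file_saving_mb / before_mb

# Column sizes quoted in the excerpt, in MB: (original column, replacement column).
column_swaps = {
    "SalesValueExTax -> Margin": (65, 44),
    "CostValue -> CostPerCase": (63, 50),
}
column_savings = {name: old - new for name, (old, new) in column_swaps.items()}

print(file_saving_mb, round(file_saving_pct, 1))  # 26 9.8
print(column_savings)
```

So "almost 10%" holds up at about 9.8%. The two column swaps account for 21 MB and 13 MB respectively; those are column sizes rather than file size, so they need not add up to the 26 MB file saving.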

Using GeoJSON Data

Jovan Popovic shows how to use data in GeoJSON format.

First, building data in GeoJSON format from a spatial type:

The geometry object holds the type of the spatial data and its coordinates. The “properties” object can hold various custom properties such as address line, town, postcode and other information that describes the object. SQL Server stores spatial information as geometry or geography types, and also stores additional properties in standard table columns.

Since GeoJSON is JSON, it can be formatted using the new FOR JSON clause in SQL Server.

In this example, we are going to format the content of the Person.Address table, which has a spatial column SpatialLocation, as GeoJSON using the FOR JSON clause.
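
To show the target shape without a SQL Server 2016 instance to hand, here is a Python sketch that builds the same kind of GeoJSON from plain rows. It is not the FOR JSON query from the post. The row values and the dictionary keys standing in for table columns are invented; only the GeoJSON layout (a Feature with geometry and properties, coordinates ordered longitude then latitude) comes from the format itself.

```python
import json

# Hypothetical rows standing in for an address table with a point location.
rows = [
    {"AddressLine1": "1 Example St", "City": "Sampletown", "PostalCode": "00001",
     "Longitude": -122.25, "Latitude": 47.5},
    {"AddressLine1": "2 Example Ave", "City": "Sampletown", "PostalCode": "00002",
     "Longitude": -122.5, "Latitude": 47.75},
]


def to_feature(row):
    """Turn one row into a GeoJSON Feature with a Point geometry."""
    return {
        "type": "Feature",
        # GeoJSON positions are [longitude, latitude], in that order.
        "geometry": {"type": "Point",
                     "coordinates": [row["Longitude"], row["Latitude"]]},
        # Everything that is not spatial goes into "properties".
        "properties": {"address": row["AddressLine1"],
                       "town": row["City"],
                       "postcode": row["PostalCode"]},
    }


def to_feature_collection(rows):
    """Wrap all rows in a GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection",
            "features": [to_feature(r) for r in rows]}


collection = to_feature_collection(rows)
print(json.dumps(collection, indent=2))
```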

Then, converting GeoJSON to Geography types:

The new OPENJSON function in SQL Server 2016 enables you to parse and load GeoJSON text into SQL Server spatial types.

In this example, I will load GeoJSON text that contains a set of bike share locations in Washington DC. The GeoJSON sample is provided by Esri and can be found at https://github.com/Esri/geojson-layer-js/blob/master/data/dc-bike-share.json
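
The reverse direction can be sketched the same way. The Python below walks a GeoJSON FeatureCollection and emits well-known text (WKT) points, the sort of text a spatial type can be constructed from. It is a stand-in for the OPENJSON approach, not a copy of it. The inline sample is invented and is not the Esri file, and the "name" property is an assumption made for this example.

```python
import json

# Hypothetical inline sample in GeoJSON format (not the Esri bike share data).
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-77.05, 38.9]},
     "properties": {"name": "Station A"}},
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [-77.0, 38.85]},
     "properties": {"name": "Station B"}}
  ]
}
"""


def features_to_wkt(text):
    """Return (name, WKT point) pairs for every Point feature in the text."""
    doc = json.loads(text)
    result = []
    for feature in doc["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Point":
            continue  # this sketch only handles points
        lon, lat = geometry["coordinates"][:2]
        # WKT points are written as "POINT (x y)", i.e. longitude then latitude.
        result.append((feature["properties"].get("name"), f"POINT ({lon} {lat})"))
    return result


points = features_to_wkt(geojson_text)
print(points)
```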

Check them out.

Parallel Horizontal

Erik Darling looks at operators which result in serial plans:

In the past, there were a number of things that caused entire plans, or sections of plans, to be serial. Scalar UDFs are probably the first one everyone thinks of. They’re bad. Really bad. They’re so bad that if you define a computed column with a scalar UDF, every query that hits the table will run serially even if you don’t select that column. So, like, don’t do that.

What else causes perfectly parallel plan performance plotzing?

Commenting on one of his comments, I can name at least one good reason to use a table variable.

VLFs

Tom Roush talks VLFs, changes in DBCC LOGINFO, and Availability Groups:

Turns out SQL 2008R2 (where the original script worked) returns different fields than 2012 and 2014 (where it didn’t).

I figured I didn’t want to find out which version of the script to use every time I needed to run it on a server, so I told the script to figure that out by itself, and then run the appropriate hunk of code (example below)

This is a good explanation of how to back out of a complex situation.
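
The "let the script figure out the version" idea can be sketched in a few lines. This is not Tom Roush's script, just a Python illustration of the dispatch. It relies on DBCC LOGINFO gaining a leading RecoveryUnitId column in SQL Server 2012 (major version 11), and on the version arriving as a dotted string such as 10.50.x for 2008 R2. The function name and the sample version strings are made up.

```python
# DBCC LOGINFO column layout before SQL Server 2012.
LEGACY_COLUMNS = ["FileId", "FileSize", "StartOffset", "FSeqNo",
                  "Status", "Parity", "CreateLSN"]
# SQL Server 2012 and later put RecoveryUnitId in front.
MODERN_COLUMNS = ["RecoveryUnitId"] + LEGACY_COLUMNS


def loginfo_columns(product_version: str):
    """Pick the DBCC LOGINFO column layout for a dotted product version string."""
    major = int(product_version.split(".")[0])
    return MODERN_COLUMNS if major >= 11 else LEGACY_COLUMNS


# Hypothetical version strings for 2008 R2, 2012 and 2014.
for version in ("10.50.6000.34", "11.0.5058.0", "12.0.2000.8"):
    print(version, len(loginfo_columns(version)), "columns")
```

A temp table used to capture the output needs one definition per layout, which is why the check has to happen before anything else runs.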
