Minimal Logging With Columnstore

Niko Neugebauer continues his columnstore series by looking at columnstore insert logging in SQL Server 2016 versus 2014:

Ladies and gentlemen! That’s quite a difference to SQL Server 2014!
We better check the total length of the transaction log to see the final result: 384.032 bytes! Ok, that is significantly more than for the rowstore heap table for sure, but what about the comparison to the SQL Server 2014 ? Did this minimal logging bring any improvement ?
Well … 🙂
In SQL Server 2014 we had 1.255.224 bytes spent on the transaction log – meaning over 1.2 MB, meaning around 3 times more, for the Delta-Store insertion! For such a simple table, this is a huge improvement, but let’s take a look at the total length of the transaction log entries in both environments (SQL Server 2014 & SQL Server 2016)

This is worth a careful read.  If you’ve spent time working with 2014 clustered columnstore indexes, there are a few changes which might affect you.  The most interesting thing for me was that the deltastore is no longer page compressed.

Nested Variables In DAX

Chris Webb shows how to create nested variables inside DAX:

There aren’t any performance benefits to doing this, although of course it helps with code readability and organisation (thanks to Marius for confirming this).

Even so, click through to see an example of how to do this.

Automatic Variables In Powershell

Constantine Kokkinos has a list of automatic variables in Powershell:

Been a few days of learning since I last wrote one of these, but I have come back to the automatic variables page on the PowerShell documentation enough times that I think I should just blog the important parts for myself.

  • $?TRUE/FALSE if the last thing you did succeeded.

  • $_ – Something everyone uses in posh, current pipeline object.

  • $Args – all the undeclared params passed to a function, try to avoid.

  • $Error – the array of error objects that represent a stack of the most recent errors. use $Errors0 to get the most recent error.

Read on for more variables.

Upgrading SSMS

Melissa Coates argues that you should upgrade to the latest version of Management Studio:

SSMS is supported for managing SQL Server 2008 through 2016 (except for SSIS instances which sadly still require a version-specific SSMS at the time of this writing). If you manage numerous servers on different versions, this unification is fantastic. There is partial support for managing pre-2008 instances. And, of course as you’d expect, the newest SSMS release supports various new features in SQL Server 2016 such as Query Statistics, Live Query Plans, Compare Showplan, Security Policies for row-level security, and so on with all the new 2016 goodies we have.

SSMS also supports managing Azure components such as Azure SQL Database and Azure SQL Data Warehouse, as well as hybrid cloud features such as StretchDB and Backup to URL. This additional unification is really, really handy.

I have a copy of SSMS 16 for reading Query Store, but not all of my plugins have been updated yet, so I’m still living in SSMS 2014 for now.

Pre-Model Scripts

Richie Lee discusses what makes for a good pre-model script in an SSDT project:

Idempotent: big word, and is used in maths. But don’t let that put you off. What it means in this context is that a script can be re-run and the same result happens. So in the context of altering a primary key, the pseudo-code would look something like this:

if database exists then
if table exists then
if primary key name eq “oldname” then
drop primary key then
add primary key with “newname”

I’m a huge fan of idempotent scripts.  Releases become so much easier when you can guarantee that you can re-run the scripts, and idempotence is a necessary property of continuous integration.

Tests Are Hard

Cody Konior explains how testing is harder than it seems in the demos:

We should have 4 backup jobs

Definitely 4, not 3, and 5 is right out.

Unless of course someone builds a 500GB database with SQL CLR UDTs which slow down DBCC to a crawl, so you move the backups and maintenance for that database into their own jobs or steps with different flags, and possibly start staggering some stuff so it finishes on the weekend without dripping into Monday.

Then you’ve got more. But just kind of sweep them under the carpet. We have file system backup checks so that covers most of it…

These are just administrative test examples; when you have to start testing queries and procedures, it’s a whole new level of fun, as it feels like you’re building a castle on sand.

Elasticsearch Write Operations

Kunal Kapoor has a presentation on Elasticsearch write operations (inserts, updates, and deletes) and explains what’s going on:

In this presentation, we are going to discuss how Elasticsearch handles the various operations like insert, update, delete. We would also cover what is an inverted index and how segment merging works.

Click through for the slides; they helped me firm up a few thoughts I had about Elasticsearch.

R Graph Gallery

David Smith points out the new R Graph Gallery:

Once upon a time, there was the original R Graph Gallery, by Romain François. Sadly, it’s been unavailable for several years. Now there’s a new R Graph Gallery to fill the void, created by Yan Holtz. It contains more than 200 data visualizations categorized by type, along with the R code that created them.

You can browse the gallery by types of chart (boxplots, maps, histograms, interactive charts, 3-D charts, etc), or search the chart descriptions. Once you’ve found a chart you like, you can admire it in the gallery (and interact with it, if possible), and also find the R code which you can adapt for your own use. Some entries even include mini-tutorials describing how the chart was made. You can even submit your own graph, if you’d like to have it displayed in the gallery as well.

Looks like a good place to go to get some inspiration.

Accidental DBAs

Charity Majors on the Accidental DBA phenomenon:


My friend Laine and I are writing a book for people on the data side, called “Database Reliability Engineering“, which is aimed at generalist engineers who want to learn how to deal with data responsibly and effectively.

(Actually that’s a good point, I am supposed to be pitching this book! — which is really mostly Laine with a smidgen of me but it’s going to be super awesome.  Consider this your sales pitch.)

So first, as an accidental DBA, you should obviously buy this book  :).  Second: stateful services require a different mindset[*].  It’s cool that you are running your own databases!  But reading post mortems like this where the conclusion is “MongoDB sucks” makes me fucking grind my teeth.

The theme of the story is a Mongo upgrade gone south, but this is a post about principles.  And rainbows.

Extended Properties

Phil Factor has a detailed article on extended properties:

Extended properties are easy to read, thanks to a useful system view. Sys.extended_properties. However, they are a pain to create, update and delete; they rely on special stored procedures that have a syntax that isn’t at all intuitive for those of us without mutant mental powers. They have a limit of 7,500 characters but are actually stored in a SQL_variant so that DateTime, Approximate numeric, exact numeric, character, Unicode and binary information can be stored in it. Most of us use some sort of tool such as SSMS to maintain this documentation rather than to do it via SQL. The SQL is cumbersome.

Extended properties was an interesting idea but there was so little tooling available to make them really useful.  I don’t see that changing.


October 2016
« Sep