Press "Enter" to skip to content

Curated SQL Posts

The Value Of Sparse Columns

Erin Stellato discusses sparse columns:

In conclusion, we see a significant reduction in disk space and IO when sparse columns are used, and they perform slightly better than non-sparse columns in our simple data modification tests (note that retrieval performance should also be considered; perhaps the subject of another post).

Sparse columns have a lot of potential value, but in my experience, they fall short in one huge way:  you cannot compress tables with sparse columns.  Given that both sparse columns and data compression are things which benefit from scale, it’s important to make the right choice upfront.

Comments closed

Lineage Improvements

Andy Leonard shows LineageId improvements over the years:

SSIS 2016 CTP3.3 offers a solution. First, there are now two new columns in the SSIS Data Flow Component Error Output – ErrorCode – Description and ErrorColumn – Description:

The new columns provide extremely useful (plain language) error metadata that will, in my opinion, greatly reduce the amount of time required to identify data-related load failures in the Data Flow Task.

But that’s not all. If you configure a log to capture the DiagnosticEx event, you will receive a message that provides the Data Flow column ID. To have a look, add a new log to an SSIS package that contains a configured Data Flow Task. On the Details tab, select the DiagnosticEx event:

Backtracing LineageId was a painful experience for me, so I’m happy that they’re making this better.

Comments closed

Copying SSIS Packages

Andy Galbraith shows how to copy SSIS packages using DTUTIL:

A frequent need when performing a server migration is to copy the SSIS packages from one server to a new server.  There are a couple of different ways to do this, including a wizard in SSMS. (See  The catch to this is that these are manual and they only move one package at a time.

I recently had to migrate a server with over twenty packages, and I knew I didn’t want to click-click-click over and over again.  🙂

The best answer would be to have your packages safe and secure in source control, but sometimes that’s not an option.

Comments closed

Bulk Loading Text Files With Line Feeds

Steve Jones runs into a scenario in which he wants to bulk load a file not in standard Windows CRLF format:

That’s not good. I suspected this was because of the format of the file, so I added a row terminator.

BULK insert MyTable
from ‘C:\SampleFiles\input.txt’
with ( ROWTERMINATOR = ‘\r’)

That didn’t help. I suspected this was because of the terminators for some reason. I also tried the newline (\n) terminator, and both, but nothing worked.

Since I was worried about formatting, I decided to look at the file. My first choice here is XVI32, and when I opened the file, I could see that only a line feed (0x0A) was used.

Little annoyances like this make me more appreciative of Integration Services (and its mess of little annoyances…).

Comments closed

Runaway Queries

Aaron Bertrand talks about “runaway” queries:

Sometimes people will start a query, wait five seconds, and then declare that this must be a runaway query. Sometimes a query’s runtime can vary based on other things going on in the system, so four seconds on one run and six seconds on another is not necessarily “running forever.” Be sure that you’ve given a query adequate time to start returning results before giving up on it, and remember that Management Studio might seem “stuck” before it starts to render any grid results, especially if the resultset is large. If you feel you’ve waited a reasonable amount of time, and you’ve tried both Results to Grid and Results to Text, then…

This is a big question and can take years of experience to get correct.  Aaron’s post is introductory-level on purpose and does a great job of answering the initial “what are some things I can try?” question after you determine that yes, there is a problem.

Comments closed

Plan Explorer Row Estimates

Joey D’Antoni shows how in SQL Sentry Plan Explorer, estimated counts can differ from what SSMS shows:

I quickly honed into the fact that the bad query was doing a lazy spool with 6 BILLION rows versus the 229 million in the good query. However, my friend who was looking at Management Studio was asking where I was getting those numbers. This is what the XML from the original plan showed:

<RelOp AvgRowSize=”51″ EstimateCPU=”0.00889634″ EstimateIO=”0.01″ EstimateRebinds=”0″ EstimateRewinds=”139581″ EstimatedExecutionMode=”Row” EstimateRows=”48868″
LogicalOp=”Lazy Spool” NodeId=”55″ Parallel=”true” PhysicalOp=”Table Spool” EstimatedTotalSubtreeCost=”1242.86″>

This is a helpful feature in scenarios like this, where operator weight is skewed because it only shows a single run but in reality happens more than once.

Comments closed

Embedded Power BI Dashboard

Rob Collie has a great example of what Power BI can do with his embedded, “real-time” (realish-time, at the very least) analytics dashboard:

Yeah, that’s a DAX-powered, Power BI dashboard, right here in our website – a website that runs on WordPress, which is Linux for crying out loud.  Don’t know what Linux is?  No worries, just translate it as “there’s zero Microsoft software behind, and yet – BAM!  Power BI, right here!”

And the dashboard in question is a near-real-time view of the traffic on this very site!  Check back in an hour and you will be able to “see” yourself on the map (especially easy if you use one of the “rarer” browsers.)

Check out the technical walkthrough if you’re interested in doing something similar yourself.

Comments closed

Shrinking TempDB

Tara Kizer shows how to shrink a recalcitrant tempdb:

I came across this solution recently when I had to shrink tempdb. I tried shrinking each of the 8 data files plus CHECKPOINTs, repeatedly. It would not budge. I almost threw in the towel and emailed out that the space issue would be fixed during our next patching window, but then I found David Levy’s reply. DBCC FREEPROCCACHE worked like a charm.

Word of warning:  understand what FREEPROCCACHE does before running it.  In an emergency like the scenario Tara describes, the benefit outweighs the cost, but do be aware that there is a cost.

Comments closed

Columnstore Index Compression Delay

Niko Neugebauer found a “compression delay” option on columnstore indexes in SQL Server 2016 CTP 3.2:

In SQL Server 2016 the OLTP Systems have received a significant improvement – support for the Columnstore Indexes (disk-based Nonclustered Columnstore & In-memory based Clustered Columnstore).
In both cases we have as the base the underlying OLTP-style table, with a Delta-Store object (or Tail Row Group for InMemory tables), that will hold the new data being inserted or updated by the final users. The data that is being frequently updated in OLTP-style systems is called Hot Data. The data that just being inserted into your table is definitely a Hot Data.
The important moment for the table is when the data becomes Cold or mostly infrequently read-accessed, and meaning that it can be compressed into Columnstore format.

This does seem interesting and can be very helpful in using columnstore indexes across different data patterns.

Comments closed