Columnstore Parallelism

Sunil Agarwal shows how clustered columnstore indexes take advantage of parallelism:

Do you need to be concerned that a delta rowgroup is scanned single threaded? The answer is NO for two reasons: (a) most columnstore indexes have very few delta rowgroups, and (b) if you have multiple delta rowgroups, they can be scanned in parallel with one thread per delta rowgroup.

I have a beef with (a), at least for SQL Server 2014, but that’s a story for another day.

Sunil has a follow-up post on parallel bulk import:

Recall that on rowstore tables (i.e. the tables organized as rows not as columnstore), SQL Server requires you to specify TABLOCK for parallel bulk import to get minimal logging and locking optimizations. One key difference for tables with clustered columnstore index is that you don’t need TABLOCK for getting locking/logging optimizations for bulk import. The reasons for this difference in behavior is that each bulk import thread can load data exclusively into a columnstore rowgroup. If the batch size < 102400, then the data is imported into a delta rowgroup otherwise a new compressed rowgroup is created and the data is loaded into it. Let us take two following interesting cases to show this bulk import behavior. Assume you are importing 4 data files, each with one bulk import thread, concurrently into a table with clustered columnstore index

The “don’t use TABLOCK” advice is an interesting contrast with rowstore tables, where TABLOCK is required to get the same optimizations.
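The batch-size rule in the quote can be sketched in a few lines of Python. This only illustrates the decision as described (fewer than 102400 rows goes to a delta rowgroup, otherwise to a new compressed rowgroup). It is not how SQL Server implements it, and the four batch sizes are made-up values standing in for the four data files.

```python
# Threshold taken from the quoted post; everything else here is illustrative.
DELTA_THRESHOLD = 102_400


def rowgroup_target(batch_size: int) -> str:
    """Return which kind of rowgroup a bulk import batch would land in."""
    if batch_size < 0:
        raise ValueError("batch_size must be non-negative")
    return "delta" if batch_size < DELTA_THRESHOLD else "compressed"


def plan_import(batch_sizes):
    """Pair each import thread (numbered from 1) with its batch size and target."""
    return [
        (thread, size, rowgroup_target(size))
        for thread, size in enumerate(batch_sizes, start=1)
    ]


if __name__ == "__main__":
    # Hypothetical batch sizes for four concurrent bulk import threads.
    for thread, size, target in plan_import([10_000, 102_399, 102_400, 500_000]):
        print(f"thread {thread}: {size} rows -> {target} rowgroup")
```

Because each thread's batch maps to its own rowgroup, the threads do not contend for the same target, which is the reason the quote gives for not needing TABLOCK.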

NOCOUNT Performance Differences

Aaron Bertrand looks into whether SET NOCOUNT ON provides a performance improvement:

For years, I had been operating under the impression that SET NOCOUNT ON; was a critical part of any performance strategy. This was based on observations I had made in, arguably, a different era, and that are less likely to manifest today.

Check out the comments as well.  This is an interesting conundrum as there’s a lot of ingrained knowledge that SET NOCOUNT ON is faster (and I admit that I thought I remembered it being the case when going through loops), but people have had limited success in coming up with a scenario in which it makes an appreciable difference.

SSRS Report Automation

Jeffrey Verheul has a post showing how to automate Reporting Services report generation in .NET:

But SSRS can also have text-fields as input for your report. These can also be added to the URL. Just like the parameters above, you just add the parameter name and value to the URL: “http:// [servername] :80/ReportServer/Pages/ReportViewer.aspx?%2fTest%2fTestReport&From=2015-12-01&To=2015-12-08&FreeText=This is a test…&rs:Command=Render”.

After some testing I’ve found out that you can use any character in the text parameter you want to, except for the &-sign. If you use that, SSRS will think it’s a parameter or command and won’t accept the URL. And there’s also the (browser) limitation of the URL length. Testing proves that the limit is 7926-7931 characters. If your URL is below 7926 characters, it works like a charm. If you go above that (between 7926 and 7931) the behavior of SSRS gets buggy, and above 7931 characters SSRS will throw an exception.

The trick here is that SSRS has a nice web service, so once you’re familiar with it, generating reports is easy.
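One possible way around the &-sign problem when building such a URL from code is to percent-encode each parameter value, so that a literal & becomes %26. The Python sketch below assumes the report server decodes percent-encoded values in the usual way, which the quoted post does not confirm. The server name and the helper names are hypothetical, and the 7926-character figure is simply the limit reported in the post.

```python
from urllib.parse import quote

# Length below which the quoted post reports URLs working reliably.
TESTED_URL_LIMIT = 7926


def build_report_url(server: str, report_path: str, params: dict) -> str:
    """Build a ReportViewer URL with every name and value percent-encoded."""
    base = f"http://{server}:80/ReportServer/Pages/ReportViewer.aspx"
    # The report path goes first, with "/" encoded as in the quoted example.
    parts = [quote(report_path, safe="")]
    for name, value in params.items():
        parts.append(f"{quote(name, safe='')}={quote(str(value), safe='')}")
    parts.append("rs:Command=Render")
    return base + "?" + "&".join(parts)


def is_within_tested_limit(url: str) -> bool:
    """True if the URL is shorter than the limit reported in the post."""
    return len(url) < TESTED_URL_LIMIT


if __name__ == "__main__":
    url = build_report_url(
        "reports.example.com",  # hypothetical server name
        "/Test/TestReport",
        {"From": "2015-12-01", "To": "2015-12-08", "FreeText": "R&D test"},
    )
    print(url)
    print(is_within_tested_limit(url))
```

The only & characters left in the result are the ones separating parameters, so the free-text value can no longer be mistaken for a new parameter or command.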

Maintaining SSISDB

Ginger Grant shows us how to manage SSISDB:

A client asked me recently why he should back up the SSISDB database. While you can recreate everything inside of the SSISDB, it will take time and you will have to remember exactly how all of your variables were set. Restoring the backup decreases this issue and having a backup allows a server to be redeployed quickly. When you do back up the database, make sure that you remember to back up the database certificate, which is created when the SSISDB is created as well, as you will need this to do a restore. By default, the recovery model of the SSISDB is set to Full. If the packages in SSISDB are changing minute by minute, full would make sense, but given that an SSISDB contains packages which are run on a scheduled basis, most likely the changes made are infrequent. Change the recovery model to simple.

SSISDB is a real database, just like ReportServer, so don’t neglect it just because you didn’t create it.

Specifying Statistic Row Counts

Daniel Hutmacher shows how to fake row counts using the UPDATE STATISTICS command:

In this case, the only change was that the query went parallel, which introduces a few more operators. However, a lot can change in a query when you scale the volume, and quite often, the entire layout of the plan can change dramatically.

Note that the query optimizer not only considers the rowcount, but often also the page count (which translates to how many megabytes of data need to be moved), so you may do well to include WITH PAGECOUNT as well.

This isn’t something I’ve ever done, but it could be interesting in some scenarios, such as finding out how an application will run as the database grows.
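To make the rowcount and page count relationship concrete, here is a rough Python sketch that estimates a page count for a target row count and prints the matching UPDATE STATISTICS statement. It assumes about 8096 usable bytes per 8 KB page and ignores per-row overhead, so the result is a ballpark figure only. The table name and average row size are hypothetical, and faking statistics like this belongs on a test system, not production.

```python
import math

# Approximate usable bytes in an 8 KB data page (assumption; ignores row overhead).
PAGE_DATA_BYTES = 8096


def estimate_page_count(row_count: int, avg_row_bytes: int) -> int:
    """Estimate how many data pages row_count rows of avg_row_bytes would fill."""
    if avg_row_bytes <= 0:
        raise ValueError("avg_row_bytes must be positive")
    rows_per_page = max(1, PAGE_DATA_BYTES // avg_row_bytes)
    return math.ceil(row_count / rows_per_page)


def fake_stats_statement(table: str, row_count: int, avg_row_bytes: int) -> str:
    """Build an UPDATE STATISTICS statement with faked row and page counts."""
    page_count = estimate_page_count(row_count, avg_row_bytes)
    return (
        f"UPDATE STATISTICS {table} "
        f"WITH ROWCOUNT = {row_count}, PAGECOUNT = {page_count};"
    )


if __name__ == "__main__":
    # Hypothetical table: ten million rows averaging 100 bytes each.
    print(fake_stats_statement("dbo.Orders", 10_000_000, 100))
```

Running the script only prints the statement; executing it against a database is a separate, deliberate step.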

ASYNC_NETWORK_IO Waits

Dave Ballantyne discusses the ASYNC_NETWORK_IO wait stat:

Simply put ASYNC_NETWORK_IO waits occur when SQL Server is waiting on the client to consume the output that it has ‘thrown’ down the wire.  SQL Server cannot run any faster, it has done the work required and is now waiting on the client to say that it has done with data.

Naturally there can be many reasons why the client is not consuming the results fast enough. A slow client application, network bandwidth saturation, and too wide or too long result sets are the most common, and in this blog I would like to show you how I go about diagnosing and demonstrating these issues.

Dave goes on to explain this using Management Studio examples, but the information also applies to other client applications.

DBCC Run Frequency

Erik Darling talks about DBCC CHECKDB frequency:

The shorter the period of time you keep backups, the more often you need to run DBCC CHECKDB. If you keep data for two weeks, weekly is a good starting point. If you take weekly fulls, you should consider running your DBCC checks before those happen. A corrupt backup doesn’t help you worth a lick. Garbage backup, garbage restore. If your data only goes back two weeks, and your corruption goes back a month, best of luck with your job search.

Erik provides some good guidelines on where to begin, but as always, your answer will depend upon your particular circumstances.

Faking Reads And Writes

Kendra Little shows us how to fake reads and writes:

Trainers and speakers need the code they write to be predictable, re-runnable, and as fast as possible. Faking writes can be useful for speakers and teachers who want to be able to generate some statistics in SQL Server’s index dynamic management views or get some query execution plans into cache. The “faking” bit makes the code re-runnable, and usually a bit faster. For writes, it also reduces the risk of filling up your transaction log.

I didn’t invent either of the techniques used below. Both patterns are very common and generic, and so simple that no origin is known.

This isn’t applicable to everyone, but if you’re giving a presentation and want to simulate data access, these are good techniques.

DBCC Scales Better In 2016

The CSS SQL Engineers have a new series called “It Just Runs Faster.”  In their first post, they discuss DBCC improvements:

Internally DBCC CHECK* uses a page scanning coordinator design (MultiObjectScanner.)  SQL Server 2016 changes the internal design to (CheckScanner), applying no lock semantics and a design similar to those used with In-Memory Optimized (Hekaton) objects, allowing DBCC operations to scale far better than previous releases.

Making DBCC checks significantly faster for large databases makes administration that much easier.  I’m looking forward to seeing this.  James Anderson, however, is concerned that things might be worse at the extreme low end.

Blocked Process Report Viewer

Michael J Swart has updated the Blocked Process Report Viewer:

The viewer can consume blocked process report events captured by any extended events session as long as that session has a target of ring_buffer or event_file. For example, if you set up your extended events session the way Jeremiah Peschka did in Finding Blocked Processes and Deadlocks using SQL Server Extended Events. Then you would use the viewer like this:

Check out the report.


February 2016