2016-06-24 – Curated SQL

I will share a little secret with you – it’s all about the Batch Execution Mode in SQL Server 2014: all those Hash Match iterators are running in Batch Mode, even though we are not using Columnstore Index anywhere.
In SQL Server 2016 this old (since 2012) functionality has been removed, and once you are running your queries under compatibility level 130 (SQL Server 2016), the queries that were taking advantage of it will run significantly slower.
There is a fast & brutal solution for that problem – set your compatibility level to 120, but do not go there until you have understood all the implications: some of the most important and magnificent improvements for the Batch Execution Mode are functioning only if your database is set to compatibility level 130: single threaded batch mode, batch sorting, window functions, etc.
From what I know, there is no way you can have all of those functionalities working together under the same hood and enjoy the old way of getting Batch Execution Mode without the presence of the Columnstore Index.

The conclusion is a bit of a downer. Read the whole thing.


Service Broker Networking

Published 2016-06-24 by Kevin Feasel

Colleen Morrow discusses endpoints and routes in Service Broker:

One of the first questions you might ask when distributing Service Broker solutions across multiple machines is “how does SQL Server know where the other service is?” And that’s where routes come in. When we distribute a Service Broker solution, we use routes to tell SQL Server the server name and endpoint port of a remote service on the network.
For example, in our taxes solution, we would create a route in the Taxpayer database that points to the IRS service, and a route in the IRS database that points to the Taxpayer service.

Good stuff. A big part of Service Broker’s value is its ability to communicate across servers, not just databases on the same instance.


Taxi Rides

Published 2016-06-24 by Kevin Feasel

Mark Litwintschik has an ongoing taxi ride data analysis series. This time, he gives PostgreSQL a run:

For this workload the reporting speeds don’t line up well with the price differences between the RDS instances. I suspect this workload is biased towards R’s CPU consumption when generating PNGs rather than RDS’ performance when returning aggregate results. The RDS instances share the same number of IOPS each which might erase any other performance advantage they could have over one another.
As for the money spent importing the data into RDS I suspect scaling up is more helpful when you have a number of concurrent users rather than a single, large job to execute.

This is an interesting series Mark has going.


Data Protection Manager

Published 2016-06-24 by Kevin Feasel

Tom Roush discusses gotchas around Microsoft’s Data Protection Manager:

You’ve got DPM installed, and for the most part, configured. It’s working, but you have transaction log drives filling up on some of your servers, and it’s not really clear why.
Wanna know why?
Here’s the answer:
It’s because the UI is very unclear, because the documentation is unclear (there was a hint of it on page 83), and because the things that would be obvious to a DBA simply aren’t mentioned.

Tom has a very detailed post on the topic, making it a must-read if you use this tool.


Hack Those P Values!

Published 2016-06-24 by Kevin Feasel

Ned Bicare provides us a sure-fire method for getting our academic papers published:

“If you torture the data long enough, it will confess.”

This aphorism, attributed to Ronald Coase, has sometimes been used in a disrespectful manner, as if it were wrong to do creative data analysis.
In fact, the art of creative data analysis has experienced despicable attacks over the last years. A small but annoyingly persistent group of second-stringers tries to denigrate our scientific achievements. They drag psychological science through the mire.

Ned has a great tool to play around with as well, letting us Statistics our way to academic success.


Going From Trace To Extended Events

Published 2016-06-24 by Kevin Feasel

Erin Stellato shows that SQL trace events map pretty closely to Extended Events:

Every time I talk about migrating from Profiler and Trace to Extended Events (XE), I seem to add something to my presentation. It’s not always intentional, and I don’t know if it’s because I just can’t stop tweaking my demos and contents, or something else, but here in London at IEPTO2 this week, I added a query that helps you see what event in Extended Events is comparable to the event you’re used to using in Trace/Profiler. While most of the events in XE have a name similar to the event in Trace (e.g. sp_statement_completed for SP:StmtCompleted), some mappings aren’t so intuitive. For example, SP:Starting in Trace is module_start in XE, and SP:Completed in Trace is module_end in XE. That’s not so bad, but if you’re ever monitoring file growths, note that the database_file_size_change event in XE is the event for the following four events in trace: Data File Auto Grow, Data File Auto Shrink, Log File Auto Grow, and Log File Auto Shrink.

This is a helpful query to keep around until you get really familiar with Extended Events.
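
To make the shape of that mapping concrete, here is a minimal sketch of a lookup table built only from the event names mentioned in the excerpt above. It is not Erin's query, which pulls the mapping from SQL Server itself; the names `TRACE_TO_XE`, `xe_event_for`, and `trace_events_for` are made up for illustration.

```python
# Hand-built from the excerpt only; a real server exposes far more events.
TRACE_TO_XE = {
    "SP:StmtCompleted": "sp_statement_completed",
    "SP:Starting": "module_start",
    "SP:Completed": "module_end",
    # Four Trace events collapse into a single Extended Events event.
    "Data File Auto Grow": "database_file_size_change",
    "Data File Auto Shrink": "database_file_size_change",
    "Log File Auto Grow": "database_file_size_change",
    "Log File Auto Shrink": "database_file_size_change",
}


def xe_event_for(trace_event):
    """Return the XE event name for a Trace event, or None if it is not in the table."""
    return TRACE_TO_XE.get(trace_event)


def trace_events_for(xe_event):
    """Return every Trace event (sorted by name) that maps to the given XE event."""
    return sorted(t for t, x in TRACE_TO_XE.items() if x == xe_event)


file_size_sources = trace_events_for("database_file_size_change")
```

The reverse lookup is the useful part when you are reading an XE session and want to know which of your old Trace events it covers.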


Database Snapshots

Published 2016-06-24 by Kevin Feasel

Kenneth Fisher discusses database snapshots:

Here is where it starts getting interesting. A snapshot initially takes up little to no space. As changes are made to the source database the snapshot grows in size. In fact the snapshot is the size of all of the pages changed in the source database since the creation of the snapshot. Basically as a page is changed in the source database a copy of the original page is made and stored in the snapshot, but only the first time. (Note: The files used to store these pages are called sparse files.) This means that if you change the same page over and over again it will only be written to the snapshot once. It then logically follows that the largest a snapshot can get is the size of the source database at the time the snapshot was taken. Since most of the time we change a very small portion of the database at any given point in time this means that snapshots tend to be much smaller than the source database. In fact you could load millions of rows into the source database (assuming they are mostly/all in new pages) and it will have little to no effect on the size of the snapshot.

My favorite use of database snapshots was so developers could test their changes in QA and then revert back to a pre-snapshot environment. That way, they could preserve data for future runs.
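
The sizing rules Kenneth describes follow from copy-on-first-write, which can be sketched as a toy model. This is only an illustration of the reasoning, not how SQL Server implements snapshots: pages are plain dictionary entries, and `SnapshotSketch` and `write_page` are hypothetical names.

```python
class SnapshotSketch:
    """Toy copy-on-write model of a database snapshot (illustration only)."""

    def __init__(self, source_pages):
        # Page IDs that existed in the source when the snapshot was created.
        self.pages_at_creation = set(source_pages)
        # Original page contents preserved so far (stands in for the sparse file).
        self.preserved = {}

    def before_write(self, source_pages, page_id):
        """Preserve the original page, but only the first time it is changed."""
        if page_id in self.pages_at_creation and page_id not in self.preserved:
            self.preserved[page_id] = source_pages[page_id]

    def size(self):
        """Number of pages the snapshot is currently storing."""
        return len(self.preserved)


def write_page(source_pages, snapshot, page_id, new_content):
    """Modify (or add) a source page, letting the snapshot save the original first."""
    snapshot.before_write(source_pages, page_id)
    source_pages[page_id] = new_content


source = {1: "a", 2: "b", 3: "c"}
snap = SnapshotSketch(source)
size_at_creation = snap.size()            # nothing has changed yet

write_page(source, snap, 1, "a2")
write_page(source, snap, 1, "a3")         # same page again: no second copy
size_after_repeated_updates = snap.size()

for new_id in range(4, 1004):             # bulk load into brand-new pages
    write_page(source, snap, new_id, "new row data")
size_after_bulk_load = snap.size()

write_page(source, snap, 2, "b2")
write_page(source, snap, 3, "c2")
max_size = snap.size()                    # every original page has now been copied
```

Changing page 1 twice stores one copy, the thousand new pages store nothing, and once every original page has been touched the snapshot stops growing at the size the source had when the snapshot was taken.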


Power BI Sequences

Published 2016-06-24 by Kevin Feasel

Chris Webb shows how to create specific numerical and character sequences in Power BI:

It’s also possible to use this technique to create lists of characters. For example, the expression:
{"a".."z"}
Returns a list containing all of the lowercase letters of the alphabet.

There are a few interesting ways of generating sequences in M, some of them (as the first commenter notes) akin to Python’s sequence methods.
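
For readers coming from the Python side of that comparison, here is a rough equivalent of the character list above. It is a sketch, not a translation of Chris's examples; `char_range` is a made-up helper, and the numeric list assumes an M expression such as {1..10}, which is not in the excerpt.

```python
def char_range(first, last):
    """Return the characters from first to last, inclusive, like M's {"a".."z"}."""
    return [chr(code) for code in range(ord(first), ord(last) + 1)]


# Counterpart of the M expression {"a".."z"}.
lowercase = char_range("a", "z")

# Counterpart of an M numeric list such as {1..10}. M's ".." includes both
# endpoints, while Python's range() stops before its end value, hence the 11.
one_to_ten = list(range(1, 11))
```

The off-by-one difference in the end value is the main thing to watch when moving between the two.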


