Press "Enter" to skip to content

Curated SQL Posts

Plan Comparison

Max Vernon looks at plan comparison in SQL Server Management Studio:

The table has 10,000,000 rows. I’ve create a non-clustered columnstore index on the table, which I’ll talk about in a future post. I’ve included it here because it provides a succinct difference in the two plans.

To compare the plans visually, side-by-side, you need to save the first plan by right-clicking on the plan window, clicking “Save Execution Plan As…”, and specifying a filename. Next, right-click on the plan window, and choose “Compare Showplan”:

I’ve only used this once or twice, but it is an interesting feature.

Comments closed

Hadoop In The Cloud

Peter Coates talks about pros and cons to Hadoop in the cloud:

Hadoop was developed for deployment over Linux running on bare metal. Cloud deployment implies virtual machines, and for Hadoop it’s a huge difference.

As detailed in other articles (for instance, Your Cluster Is an Appliance or Understanding Hadoop Hardware Requirements), bare-metal deployments have an inherent advantage over virtual machine deployments. The biggest of these is that they can use direct attached storage, i.e., local disks.

Not every Hadoop workload is storage I/O bound, but most are, and even when Hadoop seems to be CPU bound, much of the CPU activity is often either directly in service of I/O, i.e., marshaling, unmarshaling, compression, etc., or in service of avoiding I/O, i.e., building in-memory tables for map-side joins.

Read the whole thing.

Comments closed

String Splitting And Concatenation

Aaron Bertrand and Steve Hughes talk about string splitting, and Aaron also discusses string concatenation.  First Aaron:

That may not look like a massive simplification, but don’t forget about all the logic buried behind the table-valued function in the first example. And if you’re like several shops I know, if you look across your codebase and see all the messy uses you have for either of these methods, the benefits should be even more clear – and testing should bear that the performance savings compared to traditional, expensive methods are the sweetest part of the deal.

And Steve:

The STRING_SPLIT function will return a single column result set. The column name is “value”. The datatype will be NVARCHAR for strings that are NCHAR or NVARCHAR. VARCHAR is used for strings that are CHAR or VARCHAR types.

These two functions are small, but come in handy quite frequently.

Comments closed

Galaxy Classification With SQL Server

David Smith points out a nice Microsoft demo for classifying galaxies using SQL Server:

The SQL Server Blog has since published a step-by-step tutorial on implementing the galaxy classifier in SQL Server (and the code is also available on GitHub). This updated version of the demo uses the new MicrosoftML package in Microsoft R Server 9, and specifically the rxNeuralNet function for deep neural networks. The tutorial recommends using the Azure NC class of virtual machines, to take advantage of the GPU-accelerated capabilities of the function, and provides details on using the SQL Server interfaces to train the neural netowrk and run predictions (classifications) on the image database. For the details, follow the link below.

If you’re going to get into SQL Server R Services at any level of seriousness, I highly recommend R Tools for Visual Studio, as it will make building those external stored procedure calls much easier.

Comments closed

AT TIME ZONE And Reports

Rob Farley shows how to use AT TIME ZONE without sacrificing performance:

Because how am I supposed to know whether a particular date was before daylight saving started or after? I might know that an incident occurred at 6:30am in UTC, but is that 4:30pm in Melbourne or 5:30pm? Obviously I can consider which month it’s in, because I know that Melbourne observes daylight saving time from the first Sunday in October to the first Sunday in April, but then if there are customers in Brisbane, and Auckland, and Los Angeles, and Phoenix, and various places within Indiana, things get a lot more complicated.

To get around this, there were very few time zones in which SLAs could be defined for that company. It was just considered too hard to cater for more than that. A report could then be customised to say “Consider that on a particular date the time zone changed from X to Y”. It felt messy, but it worked. There was no need for anything to look up the Windows registry, and it basically just worked.

But these days, I would’ve done it differently.

Now, I would’ve used AT TIME ZONE.

Read on for the scenario.

Comments closed

Instant File Initialization In DMVs

Rodney Landrum shows off a couple new columns in SQL Server 2016 SP1 DMVs:

Microsoft announced many new features in SQL Server 2016 SP1 and the fanfare was mostly centered around the Enterprise features now available in SQL Server 2016 Standard Edition.  Many may have missed some hidden gems in the announcement.  Two of these are columns added to the existing DMVs, sys.dm_server_services and sys.dm_os_sys_info. The columns provide information for two specific features that previously had to be gathered by opening gpedit.msc and/or scrolling through SQL error logs. I am referring to Lock Pages in Memory and Instant File Initialization (enabled via Perform Volume Maintenance Tasks privilege).

It is now possible to simply query the DMVs to determine if these are being used for the running SQL Server instance.

Click through for the details.

Comments closed

Session Context

Ewald Cress looks at SESSION_CONTEXT() as a replacement for CONTEXT_INFO():

SESSION_CONTEXT() brings two major innovations. Firstly, it replaces a 128-byte scalar payload with a key-value structure that can accommodate 256kB of data. You can really go to town filling this thing up.

The second change is less glamorous, but possibly more significant: it is possible to set an entry to read-only, meaning that it can safely be used for the kind of contextual payload you don’t want tampered with. This makes me happy, not because I currently have a great need for it, but because it neatly ties in with things I have been thinking about a lot lately.

Read on for more.

Comments closed

Calculated Dimensions

Ginger Grant shows how calculated dimensions can solve the classic role-playing dimension problem in Analysis Services Tabular:

Working with role playing dimensions, which are found when you have say multiple dates in a table and you want to relate them back to a single date table, have always been problematic in SQL Server Analysis Services Tabular. Tabular models only allow one active relationship to a single column at a time. The picture on the left shows how tabular models represent a role playing dimension, and the model on the right is the recommended method for how to model the relationships in Analysis Services Tabular as then users can filter the data on a number of different date tables.

The big downside to this is one has to import the date table into the model multiple times, meaning the same data is imported again and again. At least that was the case until SQL Server 2016 was released. This weeks TSQL topic Fixing Old Problems with Shiny New Toys is really good reason to describe a better way of handling this problem.

Read on for how to implement calculated dimensions.

Comments closed