
Month: December 2016

Operator Elapsed Time

Kendra Little shows off a really cool feature in SQL Server 2016 & 2014 SP2:

SQL Server now shows Actual Elapsed CPU Time and Actual Elapsed Time (duration) for each operator in an Actual Execution Plan

For SQL Server 2016 and 2014 SP2 and later, actual execution plans contain a bunch of new information on each operator, including how much CPU it burns, how long it takes, and how much I/O that operator performs. This was a little hard to use for a while because the information was only visible in the XML of the execution plan.
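As a quick sketch of where those counters live, you can capture an actual plan for any query and inspect its XML; the attribute names below are what the plan XML exposes in these builds, while the demo query itself is just a stand-in:

-- Capture an actual plan; the query here is arbitrary.
SET STATISTICS XML ON;
SELECT COUNT(*) FROM master.sys.objects;
SET STATISTICS XML OFF;
-- In the returned plan XML, each operator's <RunTimeInformation> element
-- holds per-thread counters such as ActualElapsedms and ActualCPUms:
-- <RunTimeCountersPerThread Thread="0" ActualRows="..."
--                           ActualElapsedms="..." ActualCPUms="..." />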

Check out Kendra’s post for more details, including a couple caveats.


New T-SQL Features

Slava Murygin looks at some new functions in the vNext CTP 1.1:

Since Microsoft introduced XML support in SQL Server, the most common string concatenation technique was use of “XML PATH('')” like this:

SELECT SUBSTRING(
    (SELECT ', ' + name FROM master.sys.tables
     FOR XML PATH(''))
    , 3, 8000);
GO

Now you can aggregate your strings by using function “STRING_AGG”:

SELECT STRING_AGG(name, ', ') FROM master.sys.tables;

Read on for the other three.  This aggregation function alone would make some of my code a lot simpler and easier to explain to junior database developers; I just want it to perform well is all.
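Because STRING_AGG is a regular aggregate, it also composes with GROUP BY.  Here is a minimal sketch (the join against sys.columns is my own illustration, not from Slava’s post):

-- One comma-separated column list per table in master:
SELECT t.name AS table_name,
       STRING_AGG(c.name, ', ') AS column_list
FROM master.sys.tables AS t
JOIN master.sys.columns AS c
    ON c.object_id = t.object_id
GROUP BY t.name;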


Finding Database Backup History

Jason Brimhall has a script to determine when your databases have been backed up:

I have also set the script to accept a database name parameter. If a name is provided, then only the backup history for that database is returned. If the parameter is left NULL, then the backup history for all databases will be returned. Additionally, I added a number of days parameter to limit the scope of the report to a specific range of days.

Among the data points returned in this script, you will note there is the duration of the backup, the date, and even the size of the backup. All of these attributes can help me to forecast future storage requirements both for the backup storage as well as for the data volume. Additionally, by knowing the duration of the backup and the trend of that duration, I can adjust maintenance schedules accordingly.
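Jason’s full script is in his post.  As a rough sketch of where that raw material comes from, the duration, date, and size all live in msdb’s backup history tables; the parameter names here simply mirror the behavior he describes:

-- Illustrative parameters: NULL means all databases.
DECLARE @DatabaseName sysname = NULL,
        @NumDays int = 30;

SELECT bs.database_name,
       bs.backup_start_date,
       DATEDIFF(SECOND, bs.backup_start_date, bs.backup_finish_date) AS duration_seconds,
       bs.backup_size / 1048576.0 AS backup_size_mb,
       bs.type  -- D = full, I = differential, L = log
FROM msdb.dbo.backupset AS bs
WHERE (@DatabaseName IS NULL OR bs.database_name = @DatabaseName)
  AND bs.backup_start_date >= DATEADD(DAY, -@NumDays, GETDATE())
ORDER BY bs.backup_start_date DESC;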

These types of reports are quite useful.  Aside from giving you information on database backups, they should also be a reminder to have a process which deletes older backup history over time so that your msdb database doesn’t grow to an unsustainable size.
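The built-in option for that cleanup is sp_delete_backuphistory, which trims all history older than a given date; how aggressively you schedule it is up to you:

-- Remove backup history older than the supplied date:
EXEC msdb.dbo.sp_delete_backuphistory @oldest_date = '20160101';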


The Story Behind SQL Server On Linux

Slava Oks is back blogging and gives us some insight into how SQL Server on Linux came to be:

To give you an idea of the effort involved, the SQL Server RDBMS and other services that ship with it in the SQL Server product suite account for more than 40 million lines of C++ code. Even though SQL Server has a resource management layer called SQLOS, the codebase bleeds Win32 semantics throughout. This means a pure port could take years just to get compiling and booting let alone figuring out things like performance and feature parity with SQL Server on Windows. In addition, doing a porting project while other SQL Server innovation is happening in the same codebase would have been a daunting task and keep the team in a close to endless catch-up game.

In conclusion, even though the potential job-offer intrigued me, it felt like an impossible task for one to take on.

It’s a great story, one which I never would have thought possible six years ago.


New PowerShell And SQL Server Previews For Linux

Max Trinidad notes that there are new versions of Powershell and SQL Server previews available for Linux users:

To download the latest PowerShell Open Source just go to the link below:

https://github.com/PowerShell/PowerShell

Just remember to remove the previous version and any existing folders, as this will be resolved later.

To download the latest SQL Server vNext, just check the following Microsoft blog post, as the new CTP 1.1 includes versions for both Windows and Linux:

SQL Server next version Community Technology Preview 1.1 now available

Max has additional links and resources in that post as well.


Connecting Apache Drill To Power BI

Bryan Smith shows how to connect Apache Drill to Power BI:

Clicking Next takes me to the From ODBC dialog.  Here, I click on the Advanced options item, ignoring the Data Source Name (DSN) drop-down, and enter a connection string with the appropriate substitution for the host parameter:

driver={MapR Drill ODBC Driver};connectiontype=Direct;host=maprcluster-3xrrusnk-node0.westus.cloudapp.azure.com;port=31010;authenticationtype=No Authentication

Notice the connection string employs a Direct connection type, indicating that the app will speak directly to one of the nodes in the cluster (as identified by the host parameter) and not to the ZooKeeper service. ZooKeeper is in use on the cluster but is not exposed externally, given the network security group changes made during my earlier deployment.  Even if ZooKeeper were exposed, it tracks the nodes of the cluster using their internal names so that any app outside the virtual network containing the cluster would not be able to leverage the information in ZooKeeper to form a connection.  The only option that works here is the Direct connection type.

It’s worth reading the whole thing, as well as checking out the UserVoice suggestion for implementing full Apache Drill support.


Debugging Spark In HDInsight

Sajib Mahmood gives various methods for debugging Spark applications running on an HDInsight cluster:

Spark Application Master

To access the Spark UI for the running application and get more detailed information on its execution, use the Application Master link and navigate through the different tabs containing more information on jobs, stages, executors, and so on.

These methods also apply for on-prem Spark clusters, although the resource locations might be a little different.


Parallel Stats Sampling

SQL Scotsman shows which statistics-building operations are parallel and which are single-threaded:

“Starting with SQL Server 2016, sampling of data to build statistics is done in parallel, when using compatibility level 130, to improve the performance of statistics collection. The query optimiser will use parallel sample statistics whenever a table size exceeds a certain threshold.”

As per the previous demos, prior to SQL Server 2016, the gathering of sampled statistics is a serial, single-threaded operation.  Only statistics gathered using a full scan would qualify for parallelism.  Now, in SQL Server 2016, all automatic and manual statistics operations qualify for parallelism.
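As a small sketch of what a qualifying manual operation looks like (dbo.BigTable and its statistics object are hypothetical stand-ins):

-- Parallel sampled statistics require compatibility level 130:
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 130;

-- A manual, sampled statistics update; on a sufficiently large table,
-- the sampling scan can now run in parallel:
UPDATE STATISTICS dbo.BigTable (IX_BigTable_SomeColumn)
    WITH SAMPLE 10 PERCENT;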

He also has a neat trick for invalidating stats on a large table, so check out this article-length blog post.


Understanding Azure SQL Elastic Pool

Vincent-Philippe Lauzon explains how SQL Elastic Pools work and why we might want to use them in Azure:

Along came Elastic Pool.  Interestingly, Elastic Pools brought back the notion of centralized compute shared across databases.  Unlike SQL Server on premises, though, that compute doesn’t sit with the server itself but with a new resource called an elastic pool.

This allows us to provision a certain amount of compute, i.e. DTUs, to a pool and share it across many databases.
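On the T-SQL side, assigning a database to a pool is a one-liner.  A minimal sketch, run against the logical server’s master database and assuming an already-created pool named MyPool:

-- Create a new database directly inside an elastic pool:
CREATE DATABASE SmallDb01
    ( SERVICE_OBJECTIVE = ELASTIC_POOL ( name = MyPool ) );

-- Or move an existing standalone database into the pool:
ALTER DATABASE SmallDb02
    MODIFY ( SERVICE_OBJECTIVE = ELASTIC_POOL ( name = MyPool ) );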

His example uses a large number of small databases, where the total load is never the sum of the individual expected loads.  Another reason to use a pool is for cross-database queries in Azure.
