Press "Enter" to skip to content

Curated SQL Posts

What’s New With In-Memory OLTP In SQL Server 2019

Ned Otter gives us two things to look forward to with SQL Server 2019:

So far, there’s been only one publicly announced enhancement for In-Memory OLTP in SQL 2019: system tables in TempDB will be “Hekatonized”. This will forever solve the issue of system table contention in TempDB, which is a fantastic use of Hekaton. I’m told it will be “opt in”, so you can use this enhancement if you want to, but you can also back out of it, which would require a restart of the SQL Server service.

But there’s at least one other enhancement that’s not been announced, although the details of its implementation are not yet known.

Read on to learn about this not-yet-announced update.

Comments closed

Deep Dive On Index Spools

Hugo Kornelis takes a look at index spools:

The Index Spool operator is one of the four spool operators that SQL Server supports. It retains a copy of all data it reads in an indexed worktable (in tempdb), and can then later return subsets of these rows without having to call its child operators to produce them again.

The Index Spool operator is quite similar to Table Spool, except that Index Spool indexes its data, giving it the option to return a subset, and Index Spool lacks the option to read data from a spool created by another operator. The other two spool operators are quite different: Row Count Spool is optimized for specific cases where the rows to be returned are empty, and Window Spool is used to support the ROWS and RANGE specifications of windowing functions.

Eager and Lazy Spool operators rank high on my list of “troublesome when I see them” operators.  The reason is not so much that eager or lazy spools are inherently bad—they’re not, as they are efficient ways to perform a particular query given the constraints of that query—but if I see one of them in conjunction with a slowly-performing query, it’s a good sign that I want to optimize away the need for spooling.

Comments closed

Optimizing M Function Calls With Function.ScaleVector()

Chris Webb shows us how we can batch calls to M-driven web services:

One of the most common issues faced when calling web services in M is that the the easiest way of doing so – creating a function that calls the web service, then calling the function once per row in a table using the Invoke Custom Function button – is very inefficient. It’s a much better idea to batch up calls to a web service, if the web service supports this, but doing this forces you to write more complex M code. It’s a problem I wrestled with last year in my custom connector for the Cognitive Services API, and in that case I opted to create functions that can be passed lists instead (see here for more information on how functions with parameters of type list work); I’m sure the developers working on the new AI features in dataflows had to deal with the same problem. This problem is not limited to web services either: calculations such as running totals also need to be optimised in the same way if they are to perform well in M. The good news is that there is a new M function Function.ScalarVector() that allows you to create functions that combine the ease-of-use of row-by-row requests with the efficiency of batch requests.

As Chris notes in the post and in the comments, this is mostly useful when you can batch together individual calls to improve performance.  For functions which operate serially (like opening Excel workbooks), you won’t see much (if any) gain.

Comments closed

Monitoring When Databases Go Offline

Jason Brimhall shows how you can create an extended event to track whenever databases go offline:

The other day, I shared an article showing how to audit database offline events via the default trace. Today, I will show an easier method to both audit and monitor for offline events. What is the difference between audit and monitor? It largely depends on your implementation, but I generally consider an audit as something you do after the fact. Monitor is a little more proactive.

Hopefully, a database being taken offline is a known event and not a surprise. Occasionally there are gremlins, in the form of users with too many permissions, that tend to do very strange things to databases and database servers.

Click through for the script.

Comments closed

Working With The Databricks API Via Powershell

Gerhard Brueckl has a Powershell module for interacting with Databricks, either Azure or AWS:

As most of our deployments use PowerShell I wrote some cmdlets to easily work with the Databricks API in my scripts. These included managing clusters (create, start, stop, …), deploying content/notebooks, adding secrets, executing jobs/notebooks, etc. After some time I ended up having 20+ single scripts which was not really maintainable any more. So I packed them into a PowerShell module and also published it to the PowerShell Gallery (https://www.powershellgallery.com/packages/DatabricksPS) for everyone to use!

This looks like a pretty good module if you work with Databricks.

Comments closed

Management Studio 18 Preview 5 Released

Dinakar Nethi announces a new public preview of SQL Server Management Studio 18:

We are very excited to announce the release of Public Preview 5 of SQL Server Management Studio (SSMS) 18.0. This release has a number of new features and capabilities and several bug fixes across SQL Server Management Objects (SMO), UI, etc.

You can download SSMS 18.0 Public Preview 5 here.

The most interesting thing in it for me is probably the menu item for CREATE OR ALTER with scripts.

Comments closed

Avoid Key Lookups On Clustered Columnstore Indexes

Joey D’Antoni points out a potential big performance problem with clustered columnstore indexes:

In the last year or so, with a large customer who makes fairly heavy use of this pattern, I’ve noticed another concern. Sometimes, and I can’t figure out what exactly triggers it, the execution plan generated, will do a seek against the nonclustered index and then do a key lookup against the columnstore as seen below. This is bad for two reasons–first the key lookup is super expensive, and generally columnstores are very large, secondly this key lookup is in row execution  mode rather than batch and drops the rest of the execution plan into row mode, thus slowing the query down even further.

Joey also has a UserVoice item as well, so check it out.

Comments closed

Managing Powershell Core On Non-Windows Machines

Max Trinidad shows us how to grab the latest version of Powershell Core if you aren’t using Windows:

So, if PowerShell Core isn’t available in the package repository, with a few steps you can download and install PowerShell. But, the first thing I do is to remove it before installing.

Ubuntu

## - When PowerShell Core isn't available in their repository: (download and execute install)
cd Downloads
wget https://github.com/PowerShell/PowerShell/releases/download/v6.1.1/powershell_6.1.1-1.ubuntu.18.04_amd64.deb
sudo dpkg -i powershell_6.1.1-1.ubuntu.18.04_amd64.deb

## - When available in Apt/Apt-Get repository:
sudo apt install -y powershell #-> Or, powershell-preview

Click through for demos of CentOS (or any other yum-based system) and MacOS X.

Comments closed

Power BI Request: Subtotal Details At The Bottom Of A Section

Imke Feldmann points out a problem with trying to use Power BI to generate financial reports:

Although this might not be what the inventors of Power BI had in mind, large lots of folks are trying to create classical financial statements in it. And putting aside the afford that might go into getting the numbers right, there is still a major drawback to swallow:

Click through for a depiction of the problem and then go vote for this on Power BI Ideas.

Comments closed

Using Query Store To Diagnose Implicit Conversion Issues

Tom Norman shares a case study of using Query Store to fix a nasty implicit conversion problem:

A while ago, we contracted with a third party to start using their software and database with our product.  We put the database in Azure but within a year, the database grew to over 250 gigs and we had to keep raising the Azure SQL Database to handle performance issues.  Due to the cost of Azure, a decision was made to bring the database back on-premise.  Before putting the database on the on-premise SQL Server, the server was running with eight CPUs.  In production, we are running SQL Server 2016 Enterprise Edition. When we put the vendor database into production, we had to dramatically increase our CPUs in production, ending up with twenty-eight CPUs. Even with twenty-eight CPUs, during most of the production day, CPUs were running persistently at seventy-five percent. But why?

Tom takes us from symptom (high CPU utilization) to diagnosis and is able to provide the third-party vendor enough information to improve their product.

Comments closed