Press "Enter" to skip to content

Author: Kevin Feasel

Memory Requirements For Columnstore Rebuild/Reorg

Niko Neugebauer looks at memory requirements for rebuilding and reorganizing columnstore indexes:

To spare you all the “wows” and “how cans”: Microsoft was well aware of this problem and delivered a solution with Cumulative Update 3 for SQL Server 2016 Service Pack 1, “FIX: SQL Server 2016 consumes more memory when you reorganize a columnstore index”: a new trace flag, 6404 (documented in the link and thus presumably supported), which allows you to lower the memory requirements for the ALTER INDEX … REORGANIZE command.
Let’s put it to the test by once again running the setup workload for the FactOnlineSales_Reindex table and then executing the following command, enabling Trace Flag 6404 and then reorganising our Clustered Columnstore Index:
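The command itself didn’t make it into this excerpt; as a minimal sketch of the idea (the table comes from Niko’s setup workload, but the index name here is a placeholder):

-- Enable trace flag 6404 globally to lower the memory grant for REORGANIZE
DBCC TRACEON (6404, -1);

-- CCI_FactOnlineSales_Reindex is a placeholder clustered columnstore index name
ALTER INDEX CCI_FactOnlineSales_Reindex
ON dbo.FactOnlineSales_Reindex
REORGANIZE;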

This is a rather interesting post and once again makes me wish that clustered columnstore indexes could be rebuilt online.


Side Effects Of Selects

Paul Randal describes a few things that can change behind the scenes when you run a SELECT query:

Statistics Update

If the database property Auto Update Statistics is set to True, when a query is being compiled and a necessary statistic is determined to be out-of-date, it will be automatically updated before optimization continues, thus changing the database. Your SELECT statement could cause this to happen. Additionally, if the Auto Update Statistics Asynchronously property is enabled, the statistic will be automatically updated, but after the optimization process (so the compiling query doesn’t have to wait).
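Both properties are set per database; a minimal sketch of inspecting and enabling them, with AdventureWorks standing in for your database name:

-- Check the current settings
SELECT name, is_auto_update_stats_on, is_auto_update_stats_async_on
FROM sys.databases
WHERE name = N'AdventureWorks';

-- Enable automatic (and asynchronous) statistics updates
ALTER DATABASE AdventureWorks SET AUTO_UPDATE_STATISTICS ON;
ALTER DATABASE AdventureWorks SET AUTO_UPDATE_STATISTICS_ASYNC ON;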

Read on for a few more activities.


Trace Flags Used With Query Store

Erin Stellato describes two Query Store trace flags:

Microsoft maintains a list of supported trace flags and I noticed that there are two new ones related to Query Store: 7745 and 7752.  The descriptions for these Query Store Trace Flags are pretty straight-forward, but for those of you not familiar with Query Store, I thought I’d provide some context and details.
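For reference, a minimal sketch of turning both flags on globally for a test (in practice you would more likely set them as startup parameters, -T7745 -T7752):

-- 7745: do not flush Query Store data to disk on shutdown
--       (faster shutdowns, at the risk of losing the most recent Query Store data)
DBCC TRACEON (7745, -1);

-- 7752: load Query Store asynchronously at database startup
DBCC TRACEON (7752, -1);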

Click through for the descriptions of these two trace flags.


Using sparklyr

Hossein Falaki and Xiangrui Meng show how to use sparklyr on a Databricks Spark cluster:

We collaborated with our friends at RStudio to enable sparklyr to seamlessly work in Databricks clusters. Starting with sparklyr version 0.6, there is a new connection method in sparklyr: databricks. When calling spark_connect(method = "databricks") in a Databricks R Notebook, sparklyr will connect to the spark cluster of that notebook. As this cluster is fully managed, you do not need to specify any other information such as version, SPARK_HOME, etc.

I’d lean toward sparklyr over SparkR because of the former’s tidyverse-centric view.


Play Axis Custom Visual

Devin Knight continues his Power BI custom visuals series:

In this module you will learn how to use the Play Axis Power BI Custom Visual.  The Play Axis visual works like a dynamic slicer that animates your other report visuals without needing to click every time you want to change your filter value.

This is a valuable custom visual when dealing with time series data, but as Devin shows, you can iterate through other sets, like a set of employee names.


Linux Administrative Basics For The SQL Server DBA

David Klee continues his SQL Server on Linux series with a discussion of basic Linux installation and usage:

You’ll want to learn the syntax for one of the console-based text editors. My personal favorite is ‘vi‘. It’s quick and streamlined, but it does have a significant learning curve. Emacs is another editor that works great. Many others are out there, and your options open up even more if you’re using a GUI. You’ll need an editor to edit configuration files.

The folder structure of Linux is one of the biggest changes. Whereas Windows is based on an arbitrary drive-letter assignment system that dates back to the DOS era, Linux is based on a tree structure. All folders and files descend from a single point, ‘/’ or the root folder. Certain folders from Windows, such as C:\Windows, C:\Users\username, or %WINDOWSTEMP%, map to certain folders within the Linux operating system.

This is really high-level stuff; if you’re looking at administering a Linux box in a production environment, I’d highly recommend taking the time to learn Linux in detail.


Using WITH With OPENJSON

Jovan Popovic points out the performance difference in using the WITH clause in an OPENJSON query:

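Jovan’s queries didn’t survive in this excerpt; a minimal sketch of the comparison, assuming a hypothetical table dbo.Orders with a JSON array stored in an NVARCHAR(MAX) column named OrderInfo:

-- Without WITH: OPENJSON returns generic key/value/type columns, and every
-- JSON_VALUE call re-parses the JSON text (double parsing)
SELECT JSON_VALUE(j.[value], '$.Item')     AS Item,
       JSON_VALUE(j.[value], '$.Quantity') AS Quantity
FROM dbo.Orders
CROSS APPLY OPENJSON(OrderInfo) AS j;

-- With WITH: the schema is declared up front, so OPENJSON returns typed
-- columns in a single parsing pass
SELECT j.Item, j.Quantity
FROM dbo.Orders
CROSS APPLY OPENJSON(OrderInfo)
     WITH (Item     NVARCHAR(100) '$.Item',
           Quantity INT           '$.Quantity') AS j;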

Here are the results of the queries:

SQL Server Execution Times (without the WITH clause):
 CPU time = 656 ms, elapsed time = 651 ms.

SQL Server Execution Times (with the WITH clause):
 CPU time = 204 ms, elapsed time = 197 ms.

As you can see, the WITH clause tells OPENJSON to return properties from the JSON array directly, without a second parsing pass. Query performance might improve roughly threefold if you avoid the double parsing.

That’s a pretty big difference when you specify the relevant data model elements.


Making Calls With IoT Hub

Rolf Tesmer combines Azure IoT Hub with Twilio to make phone calls based on incoming messages:

When the IoT Hub is created you will get an endpoint hosted in Azure.  This is the target for the JSON events being generated from the mobile device.

Azure IoT Hubs are more complex than Azure Event Hubs, perform many more device-based functions, and have stronger security capabilities. Operationally, however, they work pretty much the same.

If you want to learn more about the differences between the two Hubs, this is a great article: https://docs.microsoft.com/en-us/azure/iot-hub/iot-hub-compare-event-hubs

It’s a neat tutorial for a fun weekend project.


Reporting Services Report Schedules

Jason Brimhall has a doozy of a query for figuring out SQL Server Reporting Services report schedules:

In pulling the data together from the two sources, I opted to return two result sets. Not just two disparate result sets, but rather two result sets that each pertained to both the agent job information as well as the ReportServer scheduling data. For instance, I took all of the subscriptions in the ReportServer and joined that data to the job system to glean information from there into one result set. And I did the reverse as well. You will see when looking at the query and data. One of the reasons for doing it this way was to make this easier to assimilate into an SSRS style report.

There’s a 680-line script ahead.
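As a heavily trimmed sketch of the core idea (SSRS creates one SQL Agent job per schedule, named after the schedule’s GUID; this assumes the default ReportServer catalog database and is nowhere near as thorough as Jason’s script):

-- Simplified sketch: match each subscription's schedule to its Agent job
SELECT c.Name        AS ReportName,
       s.Description AS SubscriptionDescription,
       j.name        AS AgentJobName,
       j.enabled     AS JobEnabled
FROM ReportServer.dbo.Subscriptions AS s
    INNER JOIN ReportServer.dbo.[Catalog] AS c
        ON c.ItemID = s.Report_OID
    INNER JOIN ReportServer.dbo.ReportSchedule AS rs
        ON rs.SubscriptionID = s.SubscriptionID
    INNER JOIN msdb.dbo.sysjobs AS j
        ON j.name = CONVERT(NVARCHAR(128), rs.ScheduleID);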


Scaling CheckTable With Respect To CPUs

Lonny Niederstadt has a couple of posts on scaling DBCC CHECKTABLE based on degree of parallelism.  First, he looks at running the command with physical_only:

So we can use this formula when dop, elapsed_ms, and cpu_ms are known:

DOP * elapsed_ms = cpu_ms + idle_ms

That allows the 8 checktable operations to be summarized in this graph.  From DOP 1 to DOP 8, the cpu_ms of the operation is extremely steady.  From DOP 1 to DOP 4, there are significant decreases in elapsed time as DOP increases.  After DOP 4, the reduction in elapsed time is slight.  Throughout the tested range, idle_ms increased at a nearly linear rate.
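To make the formula concrete, a tiny worked example with invented numbers:

-- idle_ms = DOP * elapsed_ms - cpu_ms
DECLARE @dop        INT = 4,      -- degree of parallelism
        @elapsed_ms INT = 10000,  -- wall-clock duration of the check
        @cpu_ms     INT = 36000;  -- CPU time summed across workers

SELECT @dop * @elapsed_ms - @cpu_ms AS idle_ms;  -- 4 * 10000 - 36000 = 4000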

In his second post, he looks at full CHECKTABLE runs and not just physical_only:

So the good news for today is that checktable operations without the physical_only option scale farther/better on my test tables than checktable with physical_only.  While with physical_only, scaling benefits in elapsed time are primarily seen only up to DOP 4; without physical_only, elapsed-time benefits from increasing DOP extend at least to DOP 8.
And we saw that the shape of the scalability graphs is pretty volatile 🙂  That’s largely because modest changes in elapsed time are multiplied by DOP in this calculation to arrive at the idle_ms number – that idle_ms number is the one that changes shape most readily.

These are prologue posts to a discussion on the OLEDB wait type.
