Press "Enter" to skip to content

Curated SQL Posts

Offline Installation Of SQL Server 2017 ML Services

Jan Mulkens shows how to install SQL Server 2017 Machine Learning Services when the server hosting SQL Server doesn’t have outbound internet access:

That’s when you remember it… Your server isn’t connected to the internet!
Pretty normal, but in your enthusiasm you completely forgot that SQL Server needs to download some binaries for the R and Python components you so desperately want on your precious machine!

Luckily, the installer comes to your rescue and shows you where to download those binaries it needs.
Turns out, however… this link is only for one R component, and the installer won’t let you pass to the next screen!

Read on for the answer.
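As a rough sketch of where this ends up (Jan’s post has the full walkthrough): download the component CAB files on a connected machine, copy them over, and point setup at that folder. For an unattended install that might look like the following (paths, instance name, and feature list are illustrative; /MRCACHEDIRECTORY is the setup parameter that names the offline CAB folder):

setup.exe /q /ACTION=Install /INSTANCENAME=MSSQLSERVER ^
    /FEATURES=SQLEngine,ADVANCEDANALYTICS,SQL_INST_MR,SQL_INST_MPY ^
    /IACCEPTSQLSERVERLICENSETERMS /IACCEPTROPENLICENSETERMS ^
    /IACCEPTPYTHONLICENSETERMS /MRCACHEDIRECTORY=C:\Setup\MLCabs

In a GUI install, the equivalent is browsing to that folder on the screen that shows the download links.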


Actual Execution Plan Enhancements

Pedro Lopes points out some additional data available in the properties section when you generate an actual execution plan:

Looking at the actual execution plan is one of the most used performance troubleshooting techniques. Having information on elapsed CPU time and overall execution time, together with session wait information in an actual execution plan allows a DBA to use showplan to troubleshoot issues away from the server, and be able to correlate and compare different types of waits that result from query or schema changes.

A few months ago we exposed in SSMS some of the per-operator statistics, such as CPU and elapsed time per thread. More recently, we have introduced overall query CPU and elapsed time tracking for statistics showplan xml (both in ms). These can be found in the root node of an actual plan, available using the latest versions of SSMS v17 when used with SQL Server 2012 SP4, SQL Server 2016 SP1, and SQL Server 2017. For SQL Server 2014 it will become available in a future Service Pack.

Also be sure to check out Geoff Patterson’s Connect item asking that the execution plan results show the top ten waits in descending order rather than ascending order.  That’s the appropriate ordering in my mind:  show me the most important things first.
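If you want to see these fields yourself, capture an actual plan and inspect the properties of the root node; a minimal sketch (the table name is hypothetical):

SET STATISTICS XML ON;
SELECT COUNT(*) FROM dbo.SomeLargeTable;   -- any non-trivial query will do
SET STATISTICS XML OFF;

-- In the returned showplan XML, the root QueryPlan element now carries
-- something along these lines (values illustrative, times in ms):
--   <QueryTimeStats CpuTime="512" ElapsedTime="1987" />
--   <WaitStats>
--     <Wait WaitType="SOS_SCHEDULER_YIELD" WaitTimeMs="43" WaitCount="218" />
--   </WaitStats>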


Mislabeled Column In dm_os_sys_memory

Lonny Niederstadt points out that the definition of a column in the sys.dm_os_sys_memory DMV is incorrect:

Based on the column names and values above, it seems natural to think:
total_page_file_kb – available_page_file_kb = used page file kb
11027476 kb – 3047668 kb = 7979808 kb

Holy cow! Is my laptop using nearly as much paging space as there is RAM on the laptop??
Weird. If something forced that much paging space use relative to RAM on the laptop… I certainly wouldn’t expect system_memory_state_desc = ‘Available physical memory is high’!!
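To reproduce the head-scratcher on your own instance, the naive math looks like this (a quick sketch; the subtraction is the misleading part):

SELECT total_physical_memory_kb,
       available_physical_memory_kb,
       total_page_file_kb,
       available_page_file_kb,
       total_page_file_kb - available_page_file_kb AS seemingly_used_page_file_kb,
       system_memory_state_desc
FROM sys.dm_os_sys_memory;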

Read on for an explanation of what the columns actually mean.


The Difficulties Of Memory-Optimized Tables

Michael J. Swart relays a cautionary tale about using In-Memory OLTP:

We’re leaving the feature behind for a few reasons. There’s an assumption we relied on for the sardine servers: Databases that contain no data and serve no activity should not require significant resources like disk space or memory. However, when we turned on In Memory OLTP by adding the filegroup for the memory-optimized data, we found that the database began consuming memory and disk (about 2 gigabytes of disk per database). This required extra resources for the sardine servers. So for example, 1000 databases * 2Gb = 2Tb for a server that should be empty.

Another reason is that checkpoints began to take longer. Checkpoints are not guaranteed to be quick, but on small systems they take a while which impacts some of our Continuous Integration workflows.
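For context, “turning on” the feature is nothing more than adding the special filegroup, which is the point at which the per-database overhead begins; a minimal sketch (the database name and path are hypothetical):

ALTER DATABASE SardineDb
    ADD FILEGROUP imoltp_fg CONTAINS MEMORY_OPTIMIZED_DATA;

ALTER DATABASE SardineDb
    ADD FILE (NAME = 'imoltp_dir', FILENAME = 'D:\Data\SardineDb_imoltp')
    TO FILEGROUP imoltp_fg;

Worth remembering: once added, a memory-optimized filegroup cannot be removed without dropping the database.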

Read the whole thing.  This technology definitely does not fit all use cases, and there are some painful limitations.  If it does fit, however, you’ll wonder how you lived without it.


Handling IoT Traffic With SQL Server

Perry Skountrianos builds a reference architecture for handling nearly one and a half million rows per second with SQL Server:

The following sample demonstrates the high scale and performance of SQL Database, with the ability to insert 1.4 million rows per second by using a non-durable memory-optimized table to speed up data ingestion, while managing the In-Memory OLTP storage footprint by offloading historical data to a disk-based Columnstore table for real time analytics. One of the customers already leveraging Azure SQL Database for their entire IoT solution is Quorum International Inc., who was able to double their key database’s workload while lowering their DTU consumption by 70%.
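The shape of that pattern is roughly two tables: a non-durable memory-optimized table on the hot ingest path and a disk-based clustered columnstore table for history. A minimal sketch (names, types, and bucket count are illustrative, and the database needs a MEMORY_OPTIMIZED_DATA filegroup first):

-- Hot path: SCHEMA_ONLY durability means inserts do no log or checkpoint I/O
CREATE TABLE dbo.SensorReadings_Ingest (
    SensorId    INT       NOT NULL,
    ReadingTime DATETIME2 NOT NULL,
    Reading     FLOAT     NOT NULL,
    INDEX ix_sensor HASH (SensorId) WITH (BUCKET_COUNT = 1000000)
) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);

-- Cold path: clustered columnstore for the real-time analytics queries
CREATE TABLE dbo.SensorReadings_History (
    SensorId    INT       NOT NULL,
    ReadingTime DATETIME2 NOT NULL,
    Reading     FLOAT     NOT NULL,
    INDEX cci CLUSTERED COLUMNSTORE
);

A background task then periodically moves rows from the ingest table into the history table; losing the hot tail on a restart is the trade-off SCHEMA_ONLY makes for speed.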

If you hit on the right scenario, memory-optimized tables can be great.


Reviewing The 2017 Data Breach Investigations Report

Jen Underwood picks out some interesting tidbits from the Verizon 2017 Data Breach Investigations Report:

Each year Verizon, in conjunction with the VERIS Community Database initiative, releases the annual data breach investigations report. This year’s report is based on analysis of 42,068 security incidents, including 1,935 confirmed data breaches. Within this free report, readers are provided incident analysis universally and by industry, detailed insights, and tips to mitigate cyber security threats. For data professionals, the data breach report is one of those “must at least skim” resources for understanding the changing nature of the threats you are most likely to face, to help you prepare for and prevent them.

Click through for Jen’s summary, and I recommend you check out the report as well.


TensorFlow Tutorial

Ashish Bakshi has a TensorFlow tutorial:

As shown in the image above, tensors are just multidimensional arrays that allow you to represent data having higher dimensions. In general, in Deep Learning you deal with high-dimensional data sets, where dimensions refer to different features present in the data set. In fact, the name “TensorFlow” is derived from the operations which neural networks perform on tensors. It’s literally a flow of tensors. Now that you have understood what tensors are, let us move ahead in this TensorFlow tutorial and understand: what is TensorFlow?

The sample here is Python, though there is an R library as well.


Backup-Related Instance Settings

Monica Rathbun explains a few instance-level backup properties:

Default backup media retention in days. Now the first thing that comes to my mind is “hey, this is a cleanup job” SCORE! Thinking that maybe this will auto-delete old backups. After all, isn’t that what retention means? NOPE, not in this case.

In this case it’s just the number of days before backup media can be OVERWRITTEN. If the DBA goes to overwrite the media before those days have passed, it will give a warning message. You’ll note that in every backup action you do, the RETAINDAYS option is filled in. In this case it will always reflect 90 now that we have changed it. In general, this is a pointless option to me. I don’t normally OVERWRITE backup media. To me this was more relevant when tapes were used and disks were harder to come by, so I leave it alone.
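The two knobs Monica describes look like this in T-SQL (the database name and path are hypothetical; ‘media retention’ is an advanced option):

-- Instance-wide default, in days
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'media retention', 90;
RECONFIGURE;

-- Per-backup equivalent; trying to overwrite this media within 90 days
-- (e.g. with WITH INIT) draws the warning she mentions
BACKUP DATABASE MyDb
    TO DISK = N'X:\Backups\MyDb.bak'
    WITH RETAINDAYS = 90;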

Read on for more settings.


Interleaved Execution And Compatibility Levels

Arun Sirpal gives us some helpful information regarding interleaved execution in SQL Server 2017:

I have read-only T-SQL that references the MSTVF. I did have some code that used both data modifications and CROSS APPLY, but interleaved execution does not occur in those scenarios.

So on my SQL Server 2017 instance I set the database to 110 compatibility mode, turned query store on, and then executed my code.

Note that 110 is the compatibility mode for SQL Server 2012.  That becomes an important part of Arun’s story.
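The setup Arun walks through amounts to something like this (the database name is hypothetical); interleaved execution only kicks in once the database reaches compatibility level 140:

ALTER DATABASE MyDb SET COMPATIBILITY_LEVEL = 110;   -- SQL Server 2012 behavior
ALTER DATABASE MyDb SET QUERY_STORE = ON;
GO
-- ...run the MSTVF workload, then move up and compare:
ALTER DATABASE MyDb SET COMPATIBILITY_LEVEL = 140;   -- SQL Server 2017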


Columnstore Memory Pressure And Bulk Loading

Niko Neugebauer shows what happens when you try to bulk load data into a columnstore index but don’t have enough memory available:

After waiting for the 25 seconds (notice the difference between the request_time and grant_time is exactly 25 seconds), the engine decides to grant some minimum amount of memory anyway, allowing the process to carry on without being cancelled, but the penalisation is very heavy – the inserts will not go into the compressed row groups but into the Delta-Stores, making this operation not minimally logged and, in other words, painfully slow and inefficient.
To confirm the final results, let’s check on the Row Groups of our tables. Given that we have canceled the inserts into the first 2 tables, we expect 1 row group for the [dbo].[FactOnlineSales_Stage3] table and 1 row group for the [dbo].[FactOnlineSales_Stage4] table, corresponding to the 3rd and 4th threads of data loading.
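Both halves of that check are visible in DMVs; a quick sketch (the table name comes from Niko’s post):

-- While the insert waits: compare request_time against grant_time
SELECT session_id, request_time, grant_time,
       requested_memory_kb, granted_memory_kb
FROM sys.dm_exec_query_memory_grants;

-- Afterwards: did the rows land in compressed row groups or in delta-stores?
SELECT OBJECT_NAME(object_id) AS table_name,
       row_group_id, state_desc, total_rows, trim_reason_desc
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.FactOnlineSales_Stage3');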

As Niko points out, this could be the difference between a well-behaved, single compressed rowgroup load versus dumping a million rows into the deltastore.
