Press "Enter" to skip to content

Curated SQL Posts

T-SQL Tuesday 117 Roundup

Steve Jones has the roundup for this month’s T-SQL Tuesday:

The summary from my fourth T-SQL Tuesday hosting for #117. This time I was scrambling a bit, but since I’ve worked with a few customers in the last year that use MOT tables, I thought this might be a good topic.

Either everyone is on vacation or not many people think about them. Or maybe they aren’t interesting. In any case, here’s the roundup.

I’d like to say it’s because everyone’s on vacation, but I think In-Memory OLTP is an underutilized technology. Granted, there are reasons why it’s not used as much as it should be—early versions were too limiting and could have weird consequences on your servers—but if you’re on SQL Server 2017, it’s worth another look.

Comments closed

Enabling Database-Level Change Tracking

Tim Weigel continues a series on change tracking:

If you don’t provide a retention period, SQL Server’s default is 2 days. Auto-cleanup defaults to ON unless you tell it otherwise.

Easy!

The table level commands aren’t any more complicated. Before we get started, please note that change tracking requires a primary key on the table you want to track. This is reasonable – you need some kind of unique identifier to tell you which row has changed.

Read on for the scripts and further explanation.

Comments closed

Blank Rows and DAX

Alberto Ferrari explains how different DAX functions treat blank rows differently:

DAX offers two functions to retrieve the list of values of a column: VALUES and DISTINCT. The difference between the two is subtle. To understand it better, we first need to introduce the concept of the blank row. The blank row is a special row added to a table, in case the table is on the one-side of a strong relationship and the relationship is invalid. Let us elaborate on this.

Click through for the detailed explanation, along with plenty of examples.

Comments closed

Calculated Columns and Memory Usage in Analysis Services

Teo Lachev troubleshoots a customer issue:

Scenario: A client reports a memory spike during processing. They have a Tabular semantic model deployed to Azure Analysis Services. They fully process the model daily. The model normally takes less than 50 GB RAM but during processing, it spikes five times and Azure Analysis Services terminates the processing task complaining that it “reached the maximum allowable memory in our pricing tier”. Normally, fully processing the model should take about twice the memory but five times?

Teo gives us the explanation for this problem as well as a recommendation on how to fix it.

Comments closed

Waiting on Stats Refreshes

Erik Darling looks at a new wait type:

I mean ever really wondered just how long a query of yours waited on stats to automatically update before running?

Have you ever been absolutely, positively sure that those gosh dang darn stats updates were putting your query on the trolley to Slowsville?

Your wishes will be 100% granted in SQL Server 2019.

This is a wait type that I’d consider useful but hopefully uncommon.

Comments closed

Requirements for DevOps Success

Grant Fritchey lays out what you need to succeed at DevOps:

The project managers and others flip. Delivery is slipping. The amount of code being written has changed. Stuff is happening that wasn’t on the schedule. The implementation of DevOps is shut down quickly.

You have to get buy-in from management before you attempt to implement DevOps or it will fail. They have to understand what you’re doing, why you’re doing it, and the measurable benefits it will bring.

Click through for some good thoughts, none of which is “use this software.”

Comments closed

Calculating AUC in R

Andrew Treadway shows how you can calculate Area Under the Curve in R:

AUC is an important metric in machine learning for classification. It is often used as a measure of a model’s performance. In effect, AUC is a measure between 0 and 1 of a model’s performance that rank-orders predictions from a model. For a detailed explanation of AUC, see this link.

Since AUC is widely used, being able to get a confidence interval around this metric is valuable to both better demonstrate a model’s performance, as well as to better compare two or more models. For example, if model A has an AUC higher than model B, but the 95% confidence interval around each AUC value overlaps, then the models may not be statistically different in performance. We can get a confidence interval around AUC using R’s pROC package, which uses bootstrapping to calculate the interval.

There are plenty of ways to calculate this useful metric, but this is definitely one of the easier methods. H/T R-bloggers

Comments closed

Performance Tuning Neural Network Training

Sean Owen takes us through a few techniques for speeding up neural network model training:

Step #2: Use Early Stopping
Keras (and other frameworks) have built-in support for stopping when further training appears to be making the model worse. In Keras, it’s the EarlyStopping callback. Using it means passing the validation data to the training process for evaluation on every epoch. Training will stop after several epochs have passed with no improvement. restore_best_weights=True ensures that the final model’s weights are from its best epoch, not just the last one. This should be your default.

Sean focuses here on Keras + TensorFlow on Spark, but several of the tips are cross-product and generally applicable.

Comments closed

Azure Data Studio August Release

Alan Yu announces a new version of Azure Data Studio:

The key highlights to cover this month include:
– SandDance integration—A new way to interact with data
– Notebook improvements
– SQL Server Dacpac extension can support Azure Active Directory
– SQL Server 2019 extension
– Visual Studio Code merge 1.37
– Bug fixes

Being able to add a new cell inline is nice, especially when you’re dealing with larger notebooks.

Comments closed

Isolation Levels and Dynamic SQL

Max Vernon points out how transaction isolation levels work when combined with sp_executesql:

Imagine you have a piece of code where you don’t care about the downsides to the “read uncommitted” isolation level, and do your due diligence by adding SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED; at the start of your code. The code following that statement will run under that isolation level, as expected. However, if you call dynamic T-SQL with sys.sp_executesql, and set the isolation level there, the dynamic code will run under the READ UNCOMMITTED, however the isolation level will not be changed for the calling code. In other words, be careful about where you set the isolation level.

Click through for a demonstration of this.

Comments closed