Category: T-SQL

Exchange Spill Wait Stats

Erik Darling looks at exchange spills:

There are quite high waits on PAGEIOLATCH_EX, SLEEP_TASK, and SLEEP_BPOOL_STEAL rounding out the top five. This is quite interesting, because I’ve never explicitly thought of PAGEIOLATCH_EX waits in the context of exchange spills. Normally, I think of them when queries read pages from disk into memory for modification.

Going down the line, SLEEP_TASK is familiar from our time spent with hash spills, but SLEEP_BPOOL_STEAL is so far undocumented anywhere.

Erik also does the math on this query and recommends that you not write a query like this one.

Avoiding WHILE 1=1 Loops

Aaron Bertrand does not believe in the power of the infinite loop:

A short time ago a colleague had an issue with a Microsoft SQL Server stored procedure. They were using our recommended approach for batching updates, but there was a small problem with their code that led to the procedure “running forever.” I think we’ve all made a mistake like this at one point or another; here’s how I try to avoid the situation altogether.

The argument isn’t “don’t use WHILE loops” or “don’t use batching logic,” but instead to ensure that you have a break condition somewhere. It’s reasonable to ask for an end state before you begin processing something, after all.
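
As a minimal sketch of that idea (not necessarily the pattern from Aaron's post), a batched update can make its exit conditions explicit rather than relying on WHILE 1=1 plus a BREAK. The table dbo.Orders, its columns, the batch size, and the iteration cap are all hypothetical names and values chosen for illustration.

```sql
-- Hypothetical table, columns, and limits; adjust to your own schema.
DECLARE @BatchSize     int = 5000,
        @RowsAffected  int = 1,     -- seeded non-zero so the loop runs at least once
        @Iterations    int = 0,
        @MaxIterations int = 1000;  -- safety cap (assumed value)

-- The loop ends when a batch touches no rows OR the safety cap is reached.
WHILE @RowsAffected > 0 AND @Iterations < @MaxIterations
BEGIN
    UPDATE TOP (@BatchSize) dbo.Orders
    SET    IsArchived = 1
    WHERE  IsArchived = 0            -- rows already updated no longer qualify
      AND  OrderDate < '20200101';

    SET @RowsAffected = @@ROWCOUNT;  -- capture immediately after the UPDATE
    SET @Iterations  += 1;
END;
```

The WHERE clause has to exclude rows that have already been processed; otherwise @@ROWCOUNT never reaches zero and only the iteration cap stops the loop.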

LAG() vs All

Kathi Kellenberger shows the power of the LAG() function in T-SQL:

Microsoft introduced the first window (aka, windowing or windowed) functions with SQL Server 2005. These functions were ROW_NUMBER, RANK, DENSE_RANK, NTILE, and the window aggregates. Many folks, including myself, used these functions without realizing they were part of a special group. In 2012, Microsoft added several more: LAG and LEAD, FIRST_VALUE and LAST_VALUE, PERCENT_RANK and CUME_DIST, PERCENTILE_CONT, and PERCENTILE_DISC. They also added the ability to do running totals and moving calculations.

These functions were promoted as improving performance over older techniques, but that isn’t always the case. There were still performance problems with the aggregate functions introduced in 2005 and four of the functions introduced in 2012. In 2019, Microsoft introduced Batch Mode on Row Store, available on Enterprise and Developer Editions, that can improve the performance of window aggregates and the four statistical functions from 2012.

I started writing this article to compare some window function solutions to traditional solutions. I found that there were so many ways to write a query that includes a column from another row that this article is dedicated to the window functions LAG and LEAD.

In these sorts of circumstances, LAG() is extremely efficient at its job. Click through to see just how efficient.
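
For readers who have not used it, here is a small sketch of LAG() pulling a column from the previous row. The table variable @Sales and its sample rows are made up for illustration and are not taken from Kathi's article.

```sql
-- Hypothetical sample data held in a table variable so the script is self-contained.
DECLARE @Sales table
(
    SaleDate date          NOT NULL PRIMARY KEY,
    Amount   decimal(10,2) NOT NULL
);

INSERT INTO @Sales (SaleDate, Amount)
VALUES ('20240101', 100.00),
       ('20240102', 120.00),
       ('20240103',  90.00);

-- LAG() reads Amount from the prior row in SaleDate order.
-- The first row has no prior row, so both computed columns are NULL there.
SELECT SaleDate,
       Amount,
       LAG(Amount) OVER (ORDER BY SaleDate)          AS PreviousAmount,
       Amount - LAG(Amount) OVER (ORDER BY SaleDate) AS ChangeFromPrevious
FROM   @Sales
ORDER BY SaleDate;
```

The older alternatives, such as a self-join or a correlated subquery, need to find the previous row for every row they return, which is where LAG() tends to save work.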

Modifying Base Tables from Table Expressions

Itzik Ben-Gan continues a series on table expressions:

This article is the eleventh part in a series about table expressions. So far, I’ve covered derived tables and CTEs, and recently started the coverage of views. In Part 9 I compared views to derived tables and CTEs, and in Part 10 I discussed DDL changes and the implications of using SELECT * in the view’s inner query. In this article, I focus on modification considerations.

As you probably know, you’re allowed to modify data in base tables indirectly through named table expressions like views. You can control modification permissions against views. In fact, you can grant users permissions to modify data through views without granting them permissions to modify the underlying tables directly.

You do need to be aware of certain complexities and restrictions that apply to modifications through views. Interestingly, some of the supported modifications can end up with surprising outcomes, especially if the user modifying the data isn’t aware they’re interacting with a view. You can impose further restrictions to modifications through views by using an option called CHECK OPTION, which I’ll cover in this article. As part of the coverage, I’ll describe a curious inconsistency between how the CHECK OPTION in a view and a CHECK constraint in a table handle modifications—specifically ones involving NULLs.

As always, it’s well worth the read.
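
To make the CHECK OPTION behavior concrete, here is a rough sketch using a hypothetical table and view (dbo.DemoOrders and dbo.EastOrders are names invented for this example, not objects from Itzik's article). GO is the batch separator used by SSMS and sqlcmd, and it is needed because CREATE VIEW must be the first statement in its batch.

```sql
-- Hypothetical demo objects; run in a scratch database.
CREATE TABLE dbo.DemoOrders
(
    OrderId int         NOT NULL PRIMARY KEY,
    Region  varchar(10) NULL
);
GO

-- WITH CHECK OPTION rejects modifications through the view
-- that would produce rows the view's filter does not return.
CREATE VIEW dbo.EastOrders
AS
SELECT OrderId, Region
FROM   dbo.DemoOrders
WHERE  Region = 'East'
WITH CHECK OPTION;
GO

-- Succeeds: the new row satisfies the view's filter.
INSERT INTO dbo.EastOrders (OrderId, Region) VALUES (1, 'East');

-- Fails: the row would not be visible through the view.
INSERT INTO dbo.EastOrders (OrderId, Region) VALUES (2, 'West');

-- Also fails: Region = 'East' evaluates to UNKNOWN for NULL, and the
-- view filter only keeps rows where the predicate is TRUE.
INSERT INTO dbo.EastOrders (OrderId, Region) VALUES (3, NULL);
```

By contrast, a table-level constraint such as CHECK (Region = 'East') would accept the NULL row, because a CHECK constraint only rejects rows where the predicate evaluates to FALSE.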

More on SARGability

Erik Darling continues a series on SARGability. First up, max datatype columns aren’t going to cut it:

No matter how much you scream, holler, and curse, when you try to filter data in a column with a max type, that predicate can’t be pushed to when you touch the index.

Leaving aside that max data types can’t be in the key of an index, because that would be insane, even if you stick them in the includes you’re looking at a lot of potential bulk, and not the good kind that makes you regular.

Read on for an example of Erik’s point, and then go to the next post, which covers fixing functions:

The bottom line on scalar UDFs is that they’re poison pills for performance.

They’re bad enough in the select list, but they get even worse if they appear in join or where clause portions of the query.

The example query we’re going to use doesn’t even go out and touch other tables, which can certainly make things worse. It does all its processing “in memory”.

Both of these are worth checking out.

When in Doubt, Stop Counting

Chad Callihan looks at the SET NOCOUNT ON option:

You may have a stored procedure that completes in an acceptable amount of time for the dozen or so times a day it gets called. Maybe it returns results in a few seconds and that makes the users calling it happy enough that you can move on to more pressing matters. But what about a stored procedure being called millions of times a day? The definition of acceptable can be drastically different when you consider the speed and traffic that type of stored procedure produces. When every millisecond matters, it’s worth checking to see what your setting is for SET NOCOUNT.

Click through for a demo and what you can realistically expect from SET NOCOUNT ON. This works best with big loops, and incidentally, one pattern I like to use is to combine SET NOCOUNT ON with an occasional RAISERROR('%i iterations run...', 10, 1, @loopvar) WITH NOWAIT. That way, you can still see progress on the screen, but instead of printing results every single run, you might see one every 100 runs.
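
The pattern described above might look something like this. The loop body and the total of 1,000 iterations are placeholders; only SET NOCOUNT ON and the RAISERROR call come from the text.

```sql
SET NOCOUNT ON;  -- suppress the "(n rows affected)" message for each statement

DECLARE @loopvar int = 0,
        @total   int = 1000;  -- hypothetical iteration count

WHILE @loopvar < @total
BEGIN
    -- ... the real work for one iteration would go here ...

    SET @loopvar += 1;

    -- Severity 10 is informational; WITH NOWAIT sends the message to the
    -- client right away instead of waiting for the output buffer to fill.
    IF @loopvar % 100 = 0
        RAISERROR('%i iterations run...', 10, 1, @loopvar) WITH NOWAIT;
END;
```

With these values, the Messages tab shows ten progress lines in total rather than one row-count message per statement.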

Optimizing String Split and Search

Daniel Hutmacher needs things to go faster:

One of the things that sp_ctrl3 does is plaintext database search. If you pass a string to the procedure that does not match an existing object, it’ll just perform a plaintext search of all SQL modules (procedure, views, triggers, etc) for that string. The search result includes line numbers for each result, so it needs to split each module into lines.

I’ve found that this takes a very long time to run in a database with large stored procedures, so here’s how I tuned it to run faster.

Read the whole thing.

Getting Distinct Values before STRING_AGG

Greg Dodd shows how to remove duplicate values from a list before passing them to the STRING_AGG() function:

SQL introduced the new STRING_AGG feature in SQL 2017, and it works just like it suggests it would: it’s an aggregate function that takes all of the string values and joins them together with a separator. To see how it works, I’m using the StackOverflow users table, and let’s say we want to create a list of Display Names and we’re going to group it based on Location:

Click through for two methods, one of which is considerably better than the other.
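
One common way to do this, which may or may not match either method in Greg's post, is to remove duplicates in a derived table and then aggregate. The table variable @Users and its rows are invented for this sketch; STRING_AGG itself requires SQL Server 2017 or later.

```sql
-- Hypothetical sample data standing in for a users table.
DECLARE @Users table
(
    DisplayName nvarchar(40)  NOT NULL,
    Location    nvarchar(100) NOT NULL
);

INSERT INTO @Users (DisplayName, Location)
VALUES (N'Alex', N'Sydney'),
       (N'Alex', N'Sydney'),   -- duplicate name in the same location
       (N'Sam',  N'Sydney'),
       (N'Kim',  N'Perth');

-- Deduplicate first, then aggregate the distinct names per location.
SELECT d.Location,
       STRING_AGG(d.DisplayName, N', ')
           WITHIN GROUP (ORDER BY d.DisplayName) AS DisplayNames
FROM  (SELECT DISTINCT Location, DisplayName
       FROM   @Users) AS d
GROUP BY d.Location;
```

Without the inner SELECT DISTINCT, the duplicated name would appear twice in the Sydney list.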

From API Call to ML Services Prediction

Tomaz Kastrun continues a series:

From the previous two blog posts:

Creating REST API for reading data from Microsoft SQL Server in web browser

Writing Data to Microsoft SQL Server from web browser using REST API and node.js

We have looked into the installation process of Node.js, setup of Microsoft SQL Server and made a couple of examples on reading the data from database through REST API and how to insert data back to database.

In this post, we will be looking at R predictions using API calls against a sample dataset.

Click through to see it in action.

Eliminating False Positives when Searching for NOLOCK

Aaron Bertrand carries a lantern around Athens, searching for an honest NOLOCK:

In Part 1 of this series, I showed how to identify “NOLOCK in DML” patterns within a single statement using a Visitor pattern. In Part 2, I showed how to find those patterns in procedure and trigger bodies across multiple databases and instances. I still need to show how to eliminate false positives without tedious visual inspection.

Read on to see how we can narrow down the likelihood of a false positive when searching for the worst sorts of NOLOCK queries.
