Category: T-SQL

Using Query Labels in Azure Synapse Analytics

Gauri Mahajan shows one of the pieces of functionality in Azure Synapse Analytics dedicated SQL pools that I’d like to see on-premises:

Azure Synapse supports a concept known as “query labels” that allows tagging any DDL or DML queries that are executed on the dedicated SQL pool. These labels can be queried using the dynamic management views (DMVs). One can use these labels to describe the purpose of the query or add any metadata to the query being executed and the same can be used later for instrumenting the queries, specifically to identify the queries that meet the desired search criteria. Let’s walk through a step-by-step exercise to understand this concept practically.

Click through for the process.
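
As a rough illustration of the idea in the excerpt, the sketch below tags a query with a label and then looks it up in a DMV. The table name dbo.FactOrders and the label text are hypothetical; OPTION (LABEL = ...) and sys.dm_pdw_exec_requests are the dedicated SQL pool features involved.

```sql
-- Tag a query with a label. dbo.FactOrders and the label text are
-- hypothetical; substitute your own table and naming scheme.
SELECT COUNT(*) AS order_count
FROM dbo.FactOrders
OPTION (LABEL = 'nightly_load: count orders');

-- Later, find requests that carried that label.
SELECT request_id,
       [status],
       submit_time,
       total_elapsed_time,
       [label],
       command
FROM sys.dm_pdw_exec_requests
WHERE [label] = 'nightly_load: count orders'
ORDER BY submit_time DESC;
```

A consistent prefix in the label (here, the hypothetical 'nightly_load:') makes it possible to search with LIKE for every query belonging to one process.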

Concatenating in SQL Server

Lee Markum takes us through a pair of very useful functions in SQL Server:

SQL Server concatenation methods have been enhanced in modern versions of SQL Server. SQL Server 2012 introduced the CONCAT() function. In SQL Server 2017 we get CONCAT_WS().

A common usage of concatenation, or joining column values together in a string, is combining a FirstName and LastName column into a FullName column.  Another common usage might be for creating an address column that pulls together building number, street, city and zip code.

Read on to learn more. CONCAT() and CONCAT_WS() are also extremely helpful for change detection in ETL processes. For example, you might have a queue table to process and only want to update records in which relevant source fields changed, ignoring any source fields which don’t exist in your destination. A combination of HASHBYTES() and CONCAT_WS() will do the trick quite nicely.
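
A minimal sketch of that change-detection pattern, assuming SQL Server 2017 or later and two hypothetical tables, staging.Customer (the queue) and dbo.Customer (the destination), that share a CustomerID key. Only the columns the destination cares about go into the hash.

```sql
-- Hypothetical tables and columns, for illustration only.
-- CONCAT_WS() skips NULL arguments, so ISNULL() keeps each column in a
-- fixed position. The trade-off is that NULL and '' hash the same.
UPDATE d
SET d.FirstName = s.FirstName,
    d.LastName  = s.LastName,
    d.City      = s.City
FROM dbo.Customer AS d
INNER JOIN staging.Customer AS s
    ON s.CustomerID = d.CustomerID
WHERE HASHBYTES('SHA2_256',
          CONCAT_WS('|', ISNULL(s.FirstName, ''), ISNULL(s.LastName, ''), ISNULL(s.City, '')))
   <> HASHBYTES('SHA2_256',
          CONCAT_WS('|', ISNULL(d.FirstName, ''), ISNULL(d.LastName, ''), ISNULL(d.City, '')));
```

The separator should be a character that cannot appear in the data; otherwise 'a|' plus 'b' and 'a' plus '|b' would produce the same input string.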

Deleting Duplicate Records

Chad Callihan shows one of the best ways to remove duplicate records from a table:

Have you ever needed to delete duplicate records from a table? If there is only one duplicate in a table then simply running something like DELETE FROM Table WHERE ID = @DupRecord will do the trick. What happens when the problem is found after multiple records are duplicated? Will tracking them all down be more time consuming? Here are a few different options for quickly clearing out duplicate records.

There’s the best way, and then there are the other ways.
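
One widely used pattern, which may or may not be the one the linked post ranks first, is to number the rows within each duplicate group and delete everything after the first. The table dbo.Contacts and its columns are hypothetical.

```sql
-- Hypothetical table: dbo.Contacts (ID, FirstName, LastName, Email).
-- Rows sharing FirstName, LastName, and Email count as duplicates;
-- the row with the lowest ID in each group is kept.
WITH numbered AS
(
    SELECT ROW_NUMBER() OVER (
               PARTITION BY FirstName, LastName, Email
               ORDER BY ID
           ) AS rn
    FROM dbo.Contacts
)
DELETE FROM numbered
WHERE rn > 1;
```

Swapping DELETE FROM numbered for SELECT * FROM numbered (with the columns added to the CTE) is a cheap way to preview what would be removed.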

Exchange Spill Wait Stats

Erik Darling looks at exchange spills:

There are quite high waits on PAGEIOLATCH_EX, SLEEP_TASK, and SLEEP_BPOOL_STEAL rounding out the top five. This is quite interesting, because I’ve never explicitly thought of PAGEIOLATCH_EX waits in the context of exchange spills. Normally, I think of them when queries read pages from disk into memory for modification.

Going down the line, SLEEP_TASK is familiar from our time spent with hash spills, but SLEEP_BPOOL_STEAL is so far undocumented anywhere.

Erik also does the math on this query and recommends that you not write a query like this one.

Avoiding WHILE 1=1 Loops

Aaron Bertrand does not believe in the power of the infinite loop:

A short time ago a colleague had an issue with a Microsoft SQL Server stored procedure. They were using our recommended approach for batching updates, but there was a small problem with their code that led to the procedure “running forever.” I think we’ve all made a mistake like this at one point or another; here’s how I try to avoid the situation altogether.

The argument isn’t “don’t use WHILE loops” or “don’t use batching logic,” but instead to ensure that you have a break condition somewhere. It’s reasonable to ask for an end state before you begin processing something, after all.
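
A sketch of a batched update with explicit exit conditions, not necessarily the approach in the linked post. The table dbo.Orders, its columns, and the safety cap are assumptions for illustration.

```sql
-- Hypothetical table: dbo.Orders (OrderDate, IsArchived).
DECLARE @BatchSize     int = 5000,
        @RowsAffected  int = 1,
        @Iterations    int = 0,
        @MaxIterations int = 1000;   -- safety cap, an assumption

-- The loop ends when a batch touches no rows, or when the cap is hit.
WHILE @RowsAffected > 0 AND @Iterations < @MaxIterations
BEGIN
    UPDATE TOP (@BatchSize) dbo.Orders
    SET IsArchived = 1
    WHERE IsArchived = 0            -- updated rows drop out of the filter
      AND OrderDate < '20200101';

    SET @RowsAffected = @@ROWCOUNT; -- capture immediately after the UPDATE
    SET @Iterations += 1;
END;
```

The predicate IsArchived = 0 matters as much as the loop condition: if updated rows still matched the WHERE clause, every batch would report rows affected and only the iteration cap would stop the loop.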

LAG() vs All

Kathi Kellenberger shows the power of the LAG() function in T-SQL:

Microsoft introduced the first window (aka, windowing or windowed) functions with SQL Server 2005. These functions were ROW_NUMBER, RANK, DENSE_RANK, NTILE, and the window aggregates. Many folks, including myself, used these functions without realizing they were part of a special group. In 2012, Microsoft added several more: LAG and LEAD, FIRST_VALUE and LAST_VALUE, PERCENT_RANK and CUME_DIST, PERCENTILE_CONT, and PERCENTILE_DISC. They also added the ability to do running totals and moving calculations.

These functions were promoted as improving performance over older techniques, but that isn’t always the case. There were still performance problems with the aggregate functions introduced in 2005 and four of the functions introduced in 2012. In 2019, Microsoft introduced Batch Mode on Row Store, available on Enterprise and Developer Editions, that can improve the performance of window aggregates and the four statistical functions from 2012.

I started writing this article to compare some window function solutions to traditional solutions. I found that there were so many ways to write a query that includes a column from another row that this article is dedicated to the window functions LAG and LEAD.

In these sorts of circumstances, LAG() is extremely efficient at its job. Click through to see just how efficient.
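
For readers who have not used it, a minimal example of pulling a column from the previous row with LAG(). The table dbo.Orders and its columns are hypothetical.

```sql
-- Hypothetical table: dbo.Orders (CustomerID, OrderDate, TotalDue).
-- For each order, show the same customer's previous order total.
SELECT CustomerID,
       OrderDate,
       TotalDue,
       LAG(TotalDue) OVER (
           PARTITION BY CustomerID
           ORDER BY OrderDate
       ) AS PreviousTotalDue   -- NULL for each customer's first order
FROM dbo.Orders
ORDER BY CustomerID, OrderDate;
```

LEAD() takes the same arguments and looks forward instead of back; both accept an optional offset and default value, as in LAG(TotalDue, 1, 0).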

Modifying Base Tables from Table Expressions

Itzik Ben-Gan continues a series on table expressions:

This article is the eleventh part in a series about table expressions. So far, I’ve covered derived tables and CTEs, and recently started the coverage of views. In Part 9 I compared views to derived tables and CTEs, and in Part 10 I discussed DDL changes and the implications of using SELECT * in the view’s inner query. In this article, I focus on modification considerations.

As you probably know, you’re allowed to modify data in base tables indirectly through named table expressions like views. You can control modification permissions against views. In fact, you can grant users permissions to modify data through views without granting them permissions to modify the underlying tables directly.

You do need to be aware of certain complexities and restrictions that apply to modifications through views. Interestingly, some of the supported modifications can end up with surprising outcomes, especially if the user modifying the data isn’t aware they’re interacting with a view. You can impose further restrictions to modifications through views by using an option called CHECK OPTION, which I’ll cover in this article. As part of the coverage, I’ll describe a curious inconsistency between how the CHECK OPTION in a view and a CHECK constraint in a table handle modifications—specifically ones involving NULLs.

As always, it’s well worth the read.
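
A small sketch of CHECK OPTION in action, using hypothetical object names. It shows only the basic behavior; the NULL-handling inconsistency mentioned in the excerpt is covered in the article itself.

```sql
-- Hypothetical demo objects.
CREATE TABLE dbo.DemoOrders
(
    OrderID   int  NOT NULL PRIMARY KEY,
    OrderDate date NOT NULL
);
GO

CREATE VIEW dbo.RecentDemoOrders
AS
SELECT OrderID, OrderDate
FROM dbo.DemoOrders
WHERE OrderDate >= '20210101'
WITH CHECK OPTION;
GO

-- Succeeds: the row satisfies the view's filter.
INSERT INTO dbo.RecentDemoOrders (OrderID, OrderDate)
VALUES (1, '20210615');

-- Fails: the row would not be visible through the view,
-- so CHECK OPTION rejects the modification.
INSERT INTO dbo.RecentDemoOrders (OrderID, OrderDate)
VALUES (2, '20190615');
```

Without WITH CHECK OPTION, the second insert would succeed and the new row would simply not appear when querying the view.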

More on SARGability

Erik Darling continues a series on SARGability. First up, max datatype columns aren’t going to cut it:

No matter how much you scream, holler, and curse, when you try to filter data in a column with a max type, that predicate can’t be pushed to when you touch the index.

Leaving aside that max data types can’t be in the key of an index, because that would be insane, even if you stick them in the includes you’re looking at a lot of potential bulk, and not the good kind that makes you regular.

Read on for an example of Erik’s point, and then go to the next post, which covers fixing functions:

The bottom line on scalar UDFs is that they’re poison pills for performance.

They’re bad enough in the select list, but they get even worse if they appear in join or where clause portions of the query.

The example query we’re going to use doesn’t even go out and touch other tables, which can certainly make things worse. It does all its processing “in memory”.

Both of these are worth checking out.
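
To make the scalar UDF point concrete, here is a sketch that is not taken from either post. The function dbo.OrderYear and the table dbo.DemoOrders (OrderID, OrderDate) are hypothetical.

```sql
-- Hypothetical scalar UDF wrapping a simple date calculation.
CREATE FUNCTION dbo.OrderYear (@d date)
RETURNS int
AS
BEGIN
    RETURN YEAR(@d);
END;
GO

-- Not SARGable: the function must run against each row's OrderDate,
-- so an index on OrderDate cannot be used for a seek.
SELECT OrderID
FROM dbo.DemoOrders
WHERE dbo.OrderYear(OrderDate) = 2021;

-- SARGable rewrite: the bare column is compared to a range,
-- which lets an index on OrderDate support a seek.
SELECT OrderID
FROM dbo.DemoOrders
WHERE OrderDate >= '20210101'
  AND OrderDate <  '20220101';
```

Both queries return the same rows; the difference shows up in the execution plan.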

When in Doubt, Stop Counting

Chad Callihan looks at the SET NOCOUNT ON option:

You may have a stored procedure that completes in an acceptable amount of time for the dozen or so times a day it gets called. Maybe it returns results in a few seconds and that makes the users calling it happy enough that you can move onto more pressing matters. But what about a stored procedure being called millions of times a day? The definition of acceptable can be drastically different when you consider the speed and traffic that type of stored procedure produces. When every millisecond matters, it’s worth checking to see what your setting is for SET NOCOUNT.

Click through for a demo and what you can realistically expect from SET NOCOUNT ON. This works best with big loops, and incidentally, one pattern I like to use is to combine SET NOCOUNT ON with an occasional RAISERROR('%i iterations run...', 10, 1, @loopvar) WITH NOWAIT. That way, you can still see progress on the screen, but instead of printing results every single run, you might see one every 100 runs.
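
A minimal version of that pattern, with a placeholder standing in for the real work inside the loop. The loop bound and the reporting interval of 100 are arbitrary.

```sql
SET NOCOUNT ON;

DECLARE @loopvar int = 1,
        @dummy   int;

WHILE @loopvar <= 1000
BEGIN
    -- Placeholder for the real per-iteration work.
    SET @dummy = @loopvar;

    -- Report progress every 100 iterations. Severity 10 is informational,
    -- and WITH NOWAIT sends the message to the client immediately.
    IF @loopvar % 100 = 0
        RAISERROR('%i iterations run...', 10, 1, @loopvar) WITH NOWAIT;

    SET @loopvar += 1;
END;
```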
