Press "Enter" to skip to content

Category: T-SQL

Strategies for De-Functionizing Your T-SQL

Tom Zika continues the 12-step program for scalar function abusers:

In the previous posts, we have learned why Scalar Functions (UDFs) are bad for parallelism and performance and what the options are for their removal.

The only remaining question is where to start. You’ve probably guessed there isn’t one true way to approach this. So instead, I offer several strategies that you can combine.

Plus Tom has an offer for you at the bottom of the post: conversion from Scalar UDFs to Inline Table-Valued Functions, which tend to perform much better.

Comments closed

Creating Repeatable Test Data

Louis Davidson repeats himself:

In order to test graph structures, I needed a large set of random data. In some ways, this data will resemble the IMDB database I will include later in this chapter, but to make it one, controllable in size and two, random, I created this random dataset. I loaded a set of values for an account into a table and a set of interests. I wanted then to be able to load a set of random data into edge, related one account to another (one follows the other), and then an account to a random set of interests.

In this article, I will discuss a few techniques I used, starting with the simplest method using ordering by NEWID(), then using RAND() to allow the code to generate a fixed set of data.

There’s a lot of code needed to do it but if it’s something you’ve got to do, that’s the cost of doing business.

Comments closed

Non-Updating Updates

Michael J. Swart has an update for us:

According to Microsoft’s documentation, an UPDATE statement “changes existing data in a table or view”. But what if the values don’t actually change? What if affected rows are “updated” with the original values? Some call these updates non-updating. This leads to a philosophical question: “If an UPDATE statement doesn’t change any column to a different value, has the row been updated?”

I answer yes to that question. I consider all affected rows as “updated” regardless of whether the values are different. I think of the UPDATE statement as more of an OVERWRITE statement. I also think of “affected rows” rather than “changed rows”. In most cases SQL Server thinks along the same lines.

I list some features and areas of SQL Server and whether non-updating updates are treated the same or differently than other updates:

Click through for multiple scenarios and to see whether SQL Server is smart enough to ignore non-updating records in an UPDATE statement.

Comments closed

MERGE in Dedicated SQL Pools

Emily Tehrani notes an addition to Azure Synapse Analytics:

We are thrilled to announce that the MERGE T-SQL command for Azure Synapse Dedicated SQL pools is now Generally Available! MERGE has been a highly requested addition to the Synapse T-SQL library that encapsulates INSERTs/UPDATEs/DELETEs into a single statement, drastically easing migrations and making table synchronization a breeze.

If you do decide to use this, I’d expect it to have the same bugs which make its use on-premises a mess. As always, MERGE responsibly.

Comments closed

Pagination in SQL Server

Chad Callihan is paging us:

Pagination (or paging) is the process used to return a set of results in pages. When it comes to pagination in SQL Server, you have a couple of different options. Today, we’ll look at three of them. The first being pagination with the use of FETCH/OFFSET, the second using a CTE, and the third with the use of ROW_NUMBER. Let’s look at a brief example of how each works.

For each example, I’m using @PageSize to determine how many results to return and @PageNumber to determine what page of results to return from the StackOverflow2013 database.

Chad has three techniques listed here. Pagination is one of those things which should be simple but all of the simple options have major performance drawbacks once you get to datasets whose sizes require pagination.

Comments closed

Modulus Computations on Large Numbers

Daniel Hutmacher does the math:

The modulus is the remainder of a division of two integers*. Suppose you divide 12 by 4, the result is 3. But divide 11 by 4, and the result is 2.75. This could also be expressed by saying that 11/4 is 2 with a remainder of 3. Computing that 3 is the work of the modulo operator, which in T-SQL is represented by the % operator.

Let’s explore how to compute the modulus of large numbers in SQL Server, and how this is useful in the real world.

Daniel’s example includes IBAN validation, though I think he’s secretly working on breaking asymmetric encryption…

Comments closed

Careful Batching

Michael J. Swart follows up on an older post:

When I wrote Take Care When Scripting Batches, I wanted to guard against a common pitfall when implementing a batching solution (n-squared performance). I suggested a way to be careful. But I knew that my solution was not going to be universally applicable to everyone else’s situation. So I wrote that post with a focus on how to evaluate candidate solutions.

But we developers love recipes for problem solving. I wish it was the case that for whatever kind of problem you got, you just stick the right formula in and problem solved. But unfortunately everyone’s situation is different and the majority of questions I get are of the form “What about my situation?” I’m afraid that without extra details, the best advice remains to do the work to set up the tests and find out for yourself.

Definitely read the original article first. My normal approach is the naive + index method, so I’ll have to try out Michael’s method as well next time I need to delete a chunk of records.

Comments closed

Deleting Rows in Order

David Fowler understand the order of things:

This was an interesting question that I was asked yesterday and something that I’d never really thought of before. Can you delete the top x number of rows based on an ORDER BY?

Why would you want to do that? Well let’s just assume that we have a ‘people’ table and some strange bug in the application has cause the need to delete the top 10 oldest females for whatever reason (I know, it’s very contrived example and not the situation that my colleague was facing but it’ll do to illustrate the point).

Click through for one way which doesn’t work and two ways which do.

Comments closed

Date Calculation (and Calendar Tables) in SQL Server

Aaron Bertrand makes the case for calendar tables:

In a previous tip, Simplify Date Period Calculations in SQL Server, I described how to easily calculate certain dates, like the first day of a given month, quarter, or year. Another common scenario is trying to find the last weekday of the month, or the nth Monday or Wednesday. In this tip, I will show ways to simplify these calculations, both with and without a calendar table.

Click through for Aaron’s approach to the problem. I have a blunter approach in creating an expansive calendar table and using it. You do the heavy lifting one time and are good for life on that server.

Comments closed