Press "Enter" to skip to content

Category: T-SQL

Batching Data Copy Operations in SQL Server

Jana Sattainathan performs a data migration:

The purpose of this post is to give you a skeleton of TSQL code on how to perform a copy of a large table (say from one database to another within the same SQL Server Instance) by breaking up the data into manageable chunks. I have blogged about how to break-up any Oracle or SQL Server table data into chunks in this blog post.

Read on to learn why you don’t want to perform the operation as a single query, and one alternative that’s available. The query is a good bit more complex, but that complexity is for a good cause.

Comments closed

Counting Path Elements in T-SQL

Steven Sanderson switches over to T-SQL for a bit:

Yesterday I was working on a project that required me to create a SQL query to generate a table of accounts receivables pathways. I thought it would be interesting to share the SQL code I wrote for this task. The code is as follows:

Click through for the code. I was playing a bit of code golf in my mind, seeing how I might modify things. One big example would be to replace the STUFF() + FOR XML PATH combo with STRING_AGG(), assuming the SQL Server instance is 2017 or later, or if the database is in Azure SQL DB or SQL MI. The count of a specific character is an interesting approach, and this is the first time I’ve had to wonder why there isn’t a helper function in T-SQL to do that. But considering that this is the first time I’ve had to ask that question, maybe that’s an answer in itself.

Comments closed

Replacing DISTINCT with EXISTS

Andy Brownsword makes a switch:

The DISTINCT clause in a query can help us quickly remove duplicates from our results. Sometimes it can be beneficial to stop and ask why. Why do we need to use the clause, why are we receiving duplicates from our data?

I see this typically due to a JOIN being used where we don’t really want all of those results. This could be a ‘does something exist’ check such as if a customer has ever ordered before. The issue comes when there are multiple rows returned like a frequent customer in this example.

As an alternative to this, Andy shows how you can use the EXISTS clause to find records matching some criterion.

Comments closed

Explaining FIRST_VALUE() and LAST_VALUE() Execution Plans

Hugo Kornelis wraps up a mini-series on window functions:

In part twenty-eight of the plansplaining series, I’ll wrap up the six-part mini-series on window functions. After covering the basicsfast-track optimizationwindow frames ending at UNBOUNDED FOLLOWINGwindow frames specified with RANGE instead of ROWS, and LAG and LEAD, we will look at the LAST_VALUE and FIRST_VALUE analytical functions, and find that a function we would have expected to be available as an internal aggregate function does not exist at all! We’ll also find out how SQL Server works around that.

Click through to unravel that mystery.

Comments closed

What ANSI_NULLS Does

Chad Callihan answers one of the mysteries of the universe:

Have you ever seen ANSI_NULLS being set to ON or OFF in a script or stored procedure? I’ve seen or heard it mentioned a few times recently and thought it would be worth demonstrating an example of what this setting does.

Read on for the answer, including a truth table. But definitely keep ANSI_NULLS set to ON, as several important bits of functionality in SQL Server expect it that way.

Comments closed

Pagination in Stored Procedures

Erik Darling hits on a bugbear of mine:

A common-enough practice to limit search results is to write paginated queries. You may give users the ability to specify how many rows per page they want to see, or you may have a set number of rows per page.

But the end goal is to allow users to very quickly get a smaller number of rows returned to them. Almost no one needs to get many thousands of rows back, unless they’re planning on exporting the data.

Pagination is such a common activity that I wish there were a way to say, “Here is my data. Hang onto it in memory and quickly retrieve a subset of rows upon request” without doing all kinds of shenanigans on my end. Something like a data snapshot that remains in memory as long as the session is active, until the service restarts, until there is memory pressure, or until the caller manually evicts the data. That would make OFFSET and FETCH really useful instead of barely usable for most pagination scenarios because you wouldn’t need to re-run the entire query for every offset/fetch page.

There are ways to make pagination efficient, but the efficient ones aren’t easy or intuitive.

Comments closed

New Challenge: 2D Interval Packing

Itzik Ben-Gan has a new challenge for the new year:

Packing intervals is a classic SQL task that involves packing groups of intersecting intervals to their respective continuous intervals. In mathematics, an interval is the subset of all values of a given type, e.g., integer numbers, between some low value and some high value. In databases, intervals can manifest as date and time intervals representing things like sessions, prescription periods, hospitalization periods, schedules, or numeric intervals representing things like ranges of mile posts on a road, temperature ranges, and so on.

An example for an interval packing task is packing intervals representing sessions for billing purposes. If you conduct a web search of the term packing intervals SQL, you’ll find scores of articles on the subject, including my own.

Read on for more information about this challenge and one solution to it.

Comments closed

Math Operations in T-SQL

Daniel Hutmacher builds a few functions:

As part of spending waaaaaay to much time trying to solve the 2023 Advent of Code challenges, I came across multiple instances where I had to dust off some old math that I hadn’t paid attention to since I went to school back in the 90ies.

So for my own convenience, and yours, I’ve built functions for some common math that you might perhaps encounter at some point. I found this whole experience to be a great way to familiarize myself with a lot of the new functionality in SQL Server 2022, including GENERATE_SERIES(), LEAST(), GREATEST() and more. The Github repo contains a SQL Server 2019 version where I’ve built drop-in versions of the 2022 functions, but they probably won’t perform as well as the built-in stuff.

Click through for demonstrations of determining whether something is a prime number, finding the greatest common divisor and least common multiple, factorization, factorials, and even a bit of combinatorics.

Comments closed