Category: Syntax

Tracking Progress on Deletion

Published 2023-12-27 by Kevin Feasel

Alternate title: How do I tell how far I am on that command?

This little command (frequently with a WHERE clause) is a great way to tell how far along you are on any given command. Specifically, I’ve been running a bunch of DELETEs recently. I’ve got something like 5-50 million rows of data to delete and I’d like to know how far along I am. A common way to do this would be just a simple count.

Click through for Kenneth’s technique, as well as good information from Jeff Moden in the comments.

Not mentioned in this post is that hopefully, your massive delete operations are running in batches, as trying to delete 5 million or more rows in a single DELETE command is…resource-intensive.

Comments closed

TRY: CAST, CONVERT, and PARSE

Published 2023-12-26 by Kevin Feasel

Andy Brownsword tries and tries and tries again:

In the previous post we looked at how ISNUMERIC and TRY_CAST work and why we may want to utilise the latter when building validation for our data. When SQL Server 2012 rolled around it wasn’t only TRY_CAST which was added, we also had TRY_CONVERT and TRY_PARSE introduced too.

Here we’re going to look at how those function and the differences in the outputs which they can provide. We’ll start with the sample data below as the basis for these demonstrations:

Andy focuses mostly on the behavioral differences, where I like TRY_PARSE() a lot, especially because you can control locale for dates and times. When it comes to performance, I’ve found TRY_CAST() and TRY_CONVERT() to be essentially the same performance-wise. Maybe there’s a tiny difference between the two but there’s no guarantee of it.

TRY_PARSE(), however, calls into the .NET CLI for each execution and will be considerably slower. If you’re parsing a few or a few dozen values, you probably won’t notice the difference. If you’re parsing tens of thousands of values (or more), I assure you that you’ll notice the difference.

Comments closed

Advent of Code Day 6

Published 2023-12-22 by Kevin Feasel

Kevin Wilkie continues the advent of code series. The first part builds a small tally table and a loop:

Today we’re going racing! Sadly, it’s so not F1 or NASCAR racing. Snail racing is more like it since we’re moving millimeters by the end, but at least we’re closer to getting snow back to the elves, so let’s go racing!

Given a few numbers that are times and current record distances, this actually doesn’t look too bad to work with. First, as always, we have to load our data into SQL Server. This time, I loaded all of it into one table.

The second part goes back to the big tally table:

Sadly, this does make our numbers rather large, so we’re back to using the big ole Tally table we created for Walking Through Advent of Code Day 5.

This time I made it a little simpler on myself and just removed all of the spaces myself and placed the data in variables (one for time and one for distance). I thought this was an excellent idea since only one number would come out of all of this work.

Comments closed

Validating Numbers in T-SQL

Published 2023-12-21 by Kevin Feasel

Andy Brownsword asks if this thing is a number:

Data validation is key when ingesting from external sources. As we can’t always be certain of data quality we inevitably find bad data which needs to be handled. Here I wanted to look at a couple of options for validating numeric data.

Here’s the scenario – we’ve got data which may have been received via a flat file or passed into our database, and it should be a numeric value. How can we weed out the valid from invalid data?

Read on for the wrong answer (at least, the wrong answer given our expectations as developers or data platform specialists), followed by a good answer.

Comments closed

Window Function Execution Plans with RANGE

Published 2023-12-20 by Kevin Feasel

Hugo Kornelis continues a series on explaining the execution plans for window functions:

This is part twenty-six of the plansplaining series. And already the fourth episode about window functions. The first of those posts covered basic window functions; the second post focused on fast-track optimization for running aggregates, and the third post explained how the optimizer works around the lack of execution plan support for UNBOUNDED FOLLOWING.

But all of those were about OVER specifications that use the ROWS keyword. Let’s now look at the alternative, the RANGE keyword.

Click through to see how the various options work with RANGE. By the way, I still want range intervals, like how Postgres implements them, where you can define an interval of X days/hours/minutes/whatever rather than a specific number of rows. Maybe one of these versions…

Comments closed

Fun with WAITFOR

Published 2023-12-14 by Kevin Feasel

Aaron Bertrand plays red light, green light:

WAITFOR is a very useful command to prevent further operations from occurring until a certain amount of time has passed (WAITFOR DELAY) or until the clock strikes a specific time (WAITFOR TIME). These commands are covered quite well (including pros and cons) in a previous tip, “SQL WAITFOR Command to Delay SQL Code Execution.”

WAITFOR DELAY is also commonly used to “let SQL Server breathe.” For example, when performing some operations in a loop, on a resource-constrained system, an artificial delay can potentially allow for more concurrency, even if it makes the intensive operation take longer.

But these commands can also be used to synchronize multiple query windows for local testing and debugging, and I thought I would share a recent example.

Click through for some of the ways you can use WAITFOR in your scripts.

Comments closed

Microsoft Fabric SQL Endpoints and REST API

Published 2023-12-12 by Kevin Feasel

Tomaz Kastrun continues a series on Microsoft Fabric. Day 6 covers the SQL Analytics endpoint:

SQL Analytics endpoint in lakehouse is a SQL-based experience for lakehouse delta tables. By using standard T-SQL language, you can write queries to analyze data in delta tables, create functions, procedures, views and even apply security over the objects. There are some of the functionalities missing from your standard T-SQL language, but the experience is the same.

Besides the SQL experience, you can always use the corresponding items in the workspace view of Lakehouse explorer, use SQL in notebooks, or simply use SQL analytics endpoint

Day 7 looks at what subset of T-SQL syntax you can use against SQL Analytics endpoints:

You get the gist, and there are some other limitations; computed columns, indexed views, any kind of indexes, partitioned tables, triggers, user-defined types, sparse columns, surrogate keys, temporary tables and many more. Essentially, all the commands that are not supported in distributed processing mode.

The most biggest annoyance (!) is case sensitivity! Ughh.. This proves that the SQL operates like API on top of delta tables, which is translated either into PySpark commands or not directly to Spark since Spark is not case-sensitive. So, the first one will work and the second statement will be gracefully terminated.

Day 8 covers the Lakehouse REST API:

Now that we explored the lakehouse through the interface and workspaces, let’s check today, how can we use REST API. Microsoft Fabric Rest API defines a unified endpoint for operations.

Comments closed

Pivoting and Unpivoting in T-SQL

Published 2023-12-11 by Kevin Feasel

Chad Callihan pivots to a new idea:

How can you PIVOT or UNPIVOT data in SQL Server? What do these terms mean in SQL Server?

Let’s take a look a few examples using PIVOT and UNPIVOT.

Admission time: I can never remember the syntax for PIVOT or UNPIVOT. I always need to look it up.

1 Comment

Continuing the Advent of Code

Published 2023-12-08 by Kevin Feasel

Kevin Wilkie has been busy. Here’s Day 1 Part 2:

Today, I want to review part 2 of Day 1 of the Advent of Code series. Hopefully, everyone was able to complete part 1 with no troubles, or at least understood what I did to get there.

For part 2, they added a slight wrinkle to the part 1 puzzle. They spell out the numbers into actual words! How do you find them as well as find the numbers? Well, my friend, let’s go through that process, shall we?

After that is Day 2 Part 1:

On day 2, we are asked to gather data from a series of games and to see which of those are possible given a specific number of dice for a few colors. Fun times!

And then there’s Day 2 Part 2:

Today, we’ll be working on the next in the series using the data and processes that we found yesterday in Day 2 Part 1 – found here.

Thankfully, we were smart when we began working through the data and we have the data for each of our dice in separate tables, so breaking the data apart has definitely paid off! Now we can do just a little bit of work with the data from yesterday and we’ll be ready to give the results!

Comments closed

Joining on Overlapping Date Ranges in T-SQL

Published 2023-12-07 by Kevin Feasel

Daniel Hutmacher crosses the streams:

You can get into a situation where you have two tables with values associated with date ranges. What’s worse, those date ranges don’t necessarily have to align, which can make joining them a seemingly complex task, but it is surprisingly simple when you learn how to think of overlapping date ranges, along with this relatively simple T-SQL join pattern.

This problem gets even more challenging if you have the possibility of multiple overlaps and you want to find the combination with the biggest overlap for each individual item.

Comments closed

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30