Press "Enter" to skip to content

Category: T-SQL

Continuing the Advent of Code

Kevin Wilkie has been busy. Here’s Day 1 Part 2:

Today, I want to review part 2 of Day 1 of the Advent of Code series. Hopefully, everyone was able to complete part 1 with no troubles, or at least understood what I did to get there.

For part 2, they added a slight wrinkle to the part 1 puzzle. They spell out the numbers into actual words! How do you find them as well as find the numbers? Well, my friend, let’s go through that process, shall we?

After that is Day 2 Part 1:

On day 2, we are asked to gather data from a series of games and to see which of those are possible given a specific number of dice for a few colors. Fun times!

And then there’s Day 2 Part 2:

Today, we’ll be working on the next in the series using the data and processes that we found yesterday in Day 2 Part 1 – found here.

Thankfully, we were smart when we began working through the data and we have the data for each of our dice in separate tables, so breaking the data apart has definitely paid off! Now we can do just a little bit of work with the data from yesterday and we’ll be ready to give the results!

Comments closed

Joining on Overlapping Date Ranges in T-SQL

Daniel Hutmacher crosses the streams:

You can get into a situation where you have two tables with values associated with date ranges. What’s worse, those date ranges don’t necessarily have to align, which can make joining them a seemingly complex task, but it is surprisingly simple when you learn how to think of overlapping date ranges, along with this relatively simple T-SQL join pattern.

This problem gets even more challenging if you have the possibility of multiple overlaps and you want to find the combination with the biggest overlap for each individual item.

Comments closed

Finding the First and Last Number of a String in SQL and Excel

Kevin Wilkie does some soul searching, or at least string searching:

To enjoy these puzzles, you will need to go to the Official Advent of Code website, sign up for their leaderboards and whatnot if you choose to, and then continue to 2023 and Day 1. Today, we’ll start with Day 1 – since it is the first of our programming puzzles and work our way up from there…

We’re asked – given a string – to find the first (and last) number in that string. We are then to concatenate them, add them all up, and provide the result. It should be pretty simple, but let’s see…

Click through for Kevin’s two answers.

Comments closed

Set-Based vs Row-Based Code Considerations

Kevin Hill explains a concept:

In SQL Server, the terms “set-based” and “row-based” refer to different approaches or styles of writing SQL code to manipulate data. These styles have implications for performance, readability, and the way queries are processed. Let’s explore the differences between set-based and row-based code:

Click through for Kevin’s thoughts. One thing I’d re-emphasize (because Kevin did make this point), especially for people coming to SQL Server from Oracle, is that set-based operations are going to be more efficient about 95-99% of the time than their row-based equivalents. Oracle has a large number of optimizations to make cursor-style code efficient and T-SQL has very few of those, as set-based is the more natural expression of SQL.

One quick example of this is, prior to SQL Server 2012 and its extended support of window functions, the fastest officially supported way to calculate a running total was to build a cursor. The other alternatives, including self-joins, were much less efficient. There was an unsupported but much faster technique that relied on a peculiarity of how SQL Server sorts clustered indexes (the “quirky update” method), but because it relied on internals that could change with any patch, it was a risky maneuver.

Comments closed

Microsoft Fabric and Tabular Editor

Johnny Winter is excited:

Why the excitement on my part? Well to take advantage of all the great features in Tabular Editor, you really need to be able to connect and write via XMLA, be that for doing CI/CD pipelines or by making edits directly on the dataset.

What great new features does Tabular Editor unlock that you can’t just do in the online Power BI modelling experience in Fabric… tons!

Read on to see how Tabular Editor plays with Microsoft Fabric.

Comments closed

Indexing for Substring Searches

Daniel Hutmacher prepares the bloom filter:

A question from a client got me thinking. Relational databases (at least the ones I know and love) can’t really index for queries that use LIKE queries for a substring of a column value. If you want to search for strings beginning with a given string, a regular rowstore index will have you covered. If you’re looking for entire words or sentences, a full text index might be a good call. But because of the very way indexes work, you’ll never get great performance searching for just arbitrary parts of a string.

So today I’ll put on my lab coat and do a little rocket surgery, just to prove to the world that it can be done.

The suffix tree approach was an interesting one. I’ve also seen people attack this problem using bloom filters (as I alluded to in the link text) and n-grams. A commenter does note n-grams (specifically, tri-grams) as a viable solution as well.

Comments closed

Self-Join Optimizations and Index Intersection

Daniel Hutmacher shows off a possibility:

This blog post started as a “what if” contemplation in my head: Suppose you have a reasonably large table with a clustered index and a number of non-clustered indexes. If your WHERE clause filters by multiple columns covered by those non-clustered indexes, could it potentially be faster to rewrite that WHERE clause to use those non-clustered indexes?

The answer might surprise you.

To be honest, when I learned about the concept originally, I expected that there would be a great deal of use cases for it. But SQL Server rarely comes up with this answer on its own and I think that’s because in most scenarios, we’d need to do additional sorting or other expensive operations to get the multiple indexes aligned just right to make this the faster option.

Comments closed

Combining Window Functions and GROUP BY

Andy Brownsword aggregates some data:

We revisited window functions last week for T-SQL Tuesday. As we’re in that area there’s another example I thought was worth exploring. Can we group data whilst applying window functions in the same query?

Andy comes up with a final query that works perfectly fine, but there’s actually an easier answer in terms of code readability: the DISTINCT operator.

SELECT DISTINCT
    FinancialQuarter,
    QuarterAvg = AVG(SalesValue) OVER (PARTITION BY FinancialQuarter),
    YearAvg = AVG(SalesValue) OVER (PARTITION BY FinancialYear)
FROM
    #MonthlySales;

The FinancialQuarter column is unique so we can perform the window operation for averaging sales value over financial quarter and then by financial year. To remove the “duplicate” rows, we run DISTINCT and get the same results.

That said, the execution plan for this is a little more complex, as we have to go through a lazy spool on two separate occasions rather than the one that Andy’s solution comes up with. For sufficiently large datasets, that could make a difference, so as usual, choose the option that works better for your situation.

Comments closed

T-SQL Tuesday 168 Round-Up

Steve Jones lagged a bit:

I didn’t get much of a chance to check out the posts as I was at the PASS Data Community Summit, but I came home and started to work through them.

This was the 8th one I’ve hosted, which makes sense as I’ve taken over managing the party from Adam Machanic and there have been a few places I’ve had to fill in for missing hosts. In any case, here’s the roundup. I’m going in order of the comments as I see them on the blog.

Click through for this month’s list of entrants.

Comments closed

Query Tuning via Window Function

Rob Farley eliminates a self-join:

Sometimes query tuning involves taking a different approach to a problem. And given that other tuning options might be creating index(es) or redesigning tables – both of which are much more permanent changes to an environment – rewriting a query can often be just right.

Window functions seem to pop up quite often when rewriting queries, and an example around this would be appropriate for this month’s T-SQL Tuesday, hosted by Steve Jones (@way0utwest at X/Twitter).

Read on for the all-too-common scenario and how Rob improves an existing query.

Comments closed