Press "Enter" to skip to content

Category: T-SQL

Indexing for Substring Searches

Daniel Hutmacher prepares the bloom filter:

A question from a client got me thinking. Relational databases (at least the ones I know and love) can’t really index for queries that use LIKE queries for a substring of a column value. If you want to search for strings beginning with a given string, a regular rowstore index will have you covered. If you’re looking for entire words or sentences, a full text index might be a good call. But because of the very way indexes work, you’ll never get great performance searching for just arbitrary parts of a string.

So today I’ll put on my lab coat and do a little rocket surgery, just to prove to the world that it can be done.

The suffix tree approach was an interesting one. I’ve also seen people attack this problem using bloom filters (as I alluded to in the link text) and n-grams. A commenter does note n-grams (specifically, tri-grams) as a viable solution as well.

Comments closed

Self-Join Optimizations and Index Intersection

Daniel Hutmacher shows off a possibility:

This blog post started as a “what if” contemplation in my head: Suppose you have a reasonably large table with a clustered index and a number of non-clustered indexes. If your WHERE clause filters by multiple columns covered by those non-clustered indexes, could it potentially be faster to rewrite that WHERE clause to use those non-clustered indexes?

The answer might surprise you.

To be honest, when I learned about the concept originally, I expected that there would be a great deal of use cases for it. But SQL Server rarely comes up with this answer on its own and I think that’s because in most scenarios, we’d need to do additional sorting or other expensive operations to get the multiple indexes aligned just right to make this the faster option.

Comments closed

Combining Window Functions and GROUP BY

Andy Brownsword aggregates some data:

We revisited window functions last week for T-SQL Tuesday. As we’re in that area there’s another example I thought was worth exploring. Can we group data whilst applying window functions in the same query?

Andy comes up with a final query that works perfectly fine, but there’s actually an easier answer in terms of code readability: the DISTINCT operator.

SELECT DISTINCT
    FinancialQuarter,
    QuarterAvg = AVG(SalesValue) OVER (PARTITION BY FinancialQuarter),
    YearAvg = AVG(SalesValue) OVER (PARTITION BY FinancialYear)
FROM
    #MonthlySales;

The FinancialQuarter column is unique so we can perform the window operation for averaging sales value over financial quarter and then by financial year. To remove the “duplicate” rows, we run DISTINCT and get the same results.

That said, the execution plan for this is a little more complex, as we have to go through a lazy spool on two separate occasions rather than the one that Andy’s solution comes up with. For sufficiently large datasets, that could make a difference, so as usual, choose the option that works better for your situation.

Comments closed

T-SQL Tuesday 168 Round-Up

Steve Jones lagged a bit:

I didn’t get much of a chance to check out the posts as I was at the PASS Data Community Summit, but I came home and started to work through them.

This was the 8th one I’ve hosted, which makes sense as I’ve taken over managing the party from Adam Machanic and there have been a few places I’ve had to fill in for missing hosts. In any case, here’s the roundup. I’m going in order of the comments as I see them on the blog.

Click through for this month’s list of entrants.

Comments closed

Query Tuning via Window Function

Rob Farley eliminates a self-join:

Sometimes query tuning involves taking a different approach to a problem. And given that other tuning options might be creating index(es) or redesigning tables – both of which are much more permanent changes to an environment – rewriting a query can often be just right.

Window functions seem to pop up quite often when rewriting queries, and an example around this would be appropriate for this month’s T-SQL Tuesday, hosted by Steve Jones (@way0utwest at X/Twitter).

Read on for the all-too-common scenario and how Rob improves an existing query.

Comments closed

Last Observation Carried Forward in SQL Server 2022

Barney Lawrence shows off a nice enhancement to T-SQL in SQL Server 2022:

With SQL Server 2022 came a much requested additional feature added from the SQL standards – IGNORE_NULLS. You can probably guess what it does. Drop in IGNORE_NULLS after your function and you can blur the non null values over those gaps giving us results like this:

Read on for the pre-2022 version of the query and what it does, versus the version with IGNORE_NULLS specified. This small flag is extremely helpful in time series statistical analysis and I’m glad it’s in SQL Server now.

Comments closed

A Good Use of LEAD()

Chad Callihan gets the lead out:

Imagine we want to find gaps in post history for users. If a user is posting every day, we’re happy and can assume they’re happy. If they’re only posting once every few weeks, we want to investigate why that is.

Read on to see how you can do this with the LEAD() function.

Comments closed

Execution Plans for Window Functions

Hugo Kornelis digs in:

In his invitation, Steve specifically asks for examples where Window functions provided a neat solution to a real world problem. Well, sorry Steve, but I am not going to do that. But your invitation did inspire to me write about the execution plans for these window functions. And there is, in fact, so much to write about it, that this is just the first part.

So that makes this post not only a T-SQL Tuesday contribution, but also part 23 in my ongoing plansplaining series: blog posts where I take an in depth look at execution plans to explain how exactly they work, and point out often overlooked details. In this post, I will look at the basics of window functions, as they have existed for over 10 years now. I will point out a few interesting performance gotchas. And there are some links to feedback items that you can upvote (if you are so inclined) to pressure Microsoft to make some changes to the optimizer and the execution engine, to get some small but still welcome performance gains.

Read on to get an in-depth understanding of how those window functions actually operate once the database engine gets ahold of your query.

Comments closed

Using the WINDOW Operator in SQL Server 2022

Steve Jones cuts out the duplication:

I am the host for T-SQL Tuesday this month, and I hope that a lot of people like the topic. This idea actually came to me earlier this year when I happened to see someone ask about a T-SQL problem and get an answer using a Window function. This person mentioned they hadn’t used the window function before, and I wondered how many people haven’t even tried using the OVER() clause with a window function.

Read on to see how you can use WINDOW to replicate window frames, which can be quite helpful for larger queries. It doesn’t change the performance profile but does make the code more readable.

Comments closed

Operations Auditing with LAG() and LEAD()

Deborah Melkin builds a report query:

I’ve been doing a lot of work recently where I have long running processes where I need to have visibility on where we are during the process as well as being able to provide a report we can use to find out if there were any errors, how long each step took, and other useful information that we can use later. My solution was to create an audit log table where I simply added a row before and after each step, recording the information I had available. When the processes are done, I create a query to run over the table returning the values I want to see.

Deb also throws in the only ordered set function in SQL Server, STRING_AGG().

Comments closed