The issue here is that SQL is a declarative language: unlike procedural languages, there is no guarantee on the ordering of the operations, because optimizers. And SQL Server decides to do something other than what we’d expect: it tries to evaluate the value “Apu” as a date. But by using a CASE expression we can force the optimizer to take the input and match it to the expression (in this case, when a value is a date then convert it to a date) before checking if the value is older than 7 days.
This does work most of the time, but there are exceptions, so as always, test your code.
1. It can let you access data in the columns of those tables, to use in predicates or expressions.
2. It can let you filter the data in the base table, by only allowing rows which match, such as when using an inner join or right outer join.
3. It can cause rows in the base table to be returned multiple times, if multiple rows in the joined table match a single row in the base table.
4. It can introduce NULL rows, if a full or right outer join is being done (or a left outer join with the base table second) and there are rows in the joined table that don’t match any rows in the base table.
This is a useful bit of T-SQL-specific syntax, but it’s a sharper edge than most UPDATE statements. For a look back in history, Hugo Kornelis wanted to deprecate this syntax with the release of SQL Server 2008 (though MERGE has its own bugs and “Won’t Fix” problems, so in retrospect, perhaps it’s best that we still have UPDATE FROM).
In this way I can more easily see in the first example I’m joining two tables/views/CTEs together. If I want to know more about the details of one of those items, I can easily look up and see the CTE at the beginning.
However when I want multiple CTEs, how does this work?
The answer is simple but powerful. Once you’ve read up on CTEs, you start to see the power of chaining CTEs. And then you go CTE-mad until you see the performance hit of the monster you’ve created. Not that I’ve ever done that…nope…
A CTE is probably best described as a temporary inline view – in spite of its official name, it is not a table, and it is not stored (like a #temp table or @table variable). It operates more like a derived table or subquery, and can only be used for the duration of a single SELECT, UPDATE, INSERT, or DELETE statement (though it can be referenced multiple times within in that statement).
This is a great article on CTEs; give it a read, even if you’re familiar with them.
Specifically, how is it evaluated when your where clause says “WHERE This AND That OR Something AND that”, without any clarifying parenthesis?
Let’s play around with this. The simplest test scenario is a SELECT 1. If I get a 1 back, that means my WHERE clause evaluated to true, right? Right.
Parentheses should clarify statements. If I see an “AND” and an “OR” in a WHERE clause, I want to see parentheses, even if you’ve gotten it right. It’s too easy to misinterpret precedence.
There is another consequence of not using the
NOEXPANDhint, which I mentioned in passing a couple of years ago in my article, Optimizer Limitations with Filtered Indexes:
NOEXPANDhints are needed even in Enterprise Edition to ensure the uniqueness guarantee provided by the view indexes is used by the optimizer.
If you use indexed views in your environment, read this article.
I try to parameterize as many stored procedures as possible. This not only minimizes the amount of procedures I need to maintain, it in my opinion is a much cleaner way to code. It disturbs me when I see multiple stored procedures that pull the exact same data, but may have slight differences between them. Whether it be a sort, a where clause, or even just an extra field or two that makes it different, some developers think you need a different procedure for each one . Why not consolidate and parameterize?
The next step might be using dynamic SQL to build a query if there’s as much overlap as we see in Monica’s example.
I actually use them this way on a somewhat regular basis when I don’t want to have to tie back into sys.databases to just get the database name. And also in the WHERE clause when I want to restrict based on a database name.
It’s important to know where you are. Also, where your keys are.
The point is that there is an awful lot more going on inside SQL Server than is exposed in execution plans. Hopefully some of the details discussed in this rather long article will be interesting or even useful to some people.
It is good to have expectations of performance, and to know what plan shapes and properties are generally beneficial. That sort of experience and knowledge will serve you well for 99% or more of the queries you will ever be asked to tune. Sometimes, though, it is good try something a little weird or unusual just to see what happens, and to validate those expectations.
Optimizing update queries seems trivial at first, but as Paul shows, we have a few more tools at our disposal than is apparent at first glance.
The random sample that TABLESAMPLE provides is based on the number of data pages, not the number of records. If you want the number of rows to be specifically limited you will need to use Top(n) as well. I’ve written all my samples based upon AdventureWorksDW so you can run them for yourself later. I’ve listed the variety of ways to call TABLESAMPLE and shown the number of records returned.
TABLESAMPLE is useful for spelunking, but is somewhat limited otherwise.