So there you have it. With XML tricks and window functions, we have more opportunities to remove the need for functions. To use this code, you’d just swap out the SELECT statement that supplied my samples to the routine for the lists that you want to deduplicate. Sure, this sort of job will never be quick, because there are still correlated subqueries in there to upset the CPU! I am intrigued that there are such different ways of solving this task in SQL Server. Are there yet other ways of doing it?
Cf. Aaron Bertrand’s tally table method. Bonus points if you’re mentally screaming “CLR!”
Michelle’s code uses INSERT…EXEC to populate a temporary table with the VLF info, and the addition of this extra column breaks the original script. Glenn’s versions of the scripts handle this issue easily since they are version-specific – in the SQL 2012/2014/2016 versions of the script, the temp table declaration is modified to include the extra RecoveryUnitID column, which allows the rest of the script to function as designed.
My problem is I wanted a version of the script that could be used across versions 2005+, and this presented a problem. At first I tried to add an IF…ELSE block to the start of the script to handle the differing CREATE TABLE statements:
This is a good example of working around a problem rather than simply giving up.
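As a rough sketch of one way to keep a single script working across versions (not necessarily the approach the linked post settles on): create the temp table once with the RecoveryUnitID column, then drop that column on versions before SQL Server 2012. The table name #VLFInfo, the column types, and the version test are assumptions made for illustration.

```sql
-- Sketch only: #VLFInfo and its column types are illustrative assumptions.
IF OBJECT_ID('tempdb..#VLFInfo') IS NOT NULL
    DROP TABLE #VLFInfo;

CREATE TABLE #VLFInfo
(
    RecoveryUnitID int,            -- extra first column returned on SQL Server 2012+
    FileID         int,
    FileSize       bigint,
    StartOffset    bigint,
    FSeqNo         bigint,
    [Status]       int,
    Parity         int,
    CreateLSN      numeric(38, 0)
);

-- ProductVersion looks like '11.0.2100.60'; PARSENAME(..., 4) returns the major version.
DECLARE @MajorVersion int;
SET @MajorVersion = CONVERT(int,
        PARSENAME(CONVERT(varchar(32), SERVERPROPERTY('ProductVersion')), 4));

-- Older versions do not return RecoveryUnitID, so remove it to match their output.
IF @MajorVersion < 11
    ALTER TABLE #VLFInfo DROP COLUMN RecoveryUnitID;

-- No column list, so the temp table shape must match the DBCC output exactly.
INSERT INTO #VLFInfo
EXEC ('DBCC LOGINFO');

SELECT COUNT(*) AS VLFCount FROM #VLFInfo;
```

Having only one CREATE TABLE statement avoids the compile-time complaint that two CREATE TABLE statements for the same temp table name in one batch tend to produce, even when they sit in separate IF…ELSE branches.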
I’ve long been a proponent of not caring about which naming standards you use, but I do find it very important that your standards follow these three basic rules:
The conventions make sense. You should be able to argue why the chosen convention is better than an alternative, and it can’t just be because you like it better. This doesn’t mean you have to win that argument, just that you should be arguing for something tangible.
The entire team is on board. You should all agree on a standard before implementation, and any changes you make over time should be by committee.
You don’t make exceptions. Even if you’re a one-person team, if you’re going to bother having a standard, it needs to be used consistently. It’s amazing how quickly exceptions can become the new rules.
If you want to talk subjectivity, I disagree with the idea that tables should be plural, as I tend to think in terms of an entity (e.g., Person) which contains attributes, rather than the collection of entities which contain a specific combination of attributes. Regardless, “set a standard and stick to it” is important advice.
Make sure that this is the ONLY code in your window or that you are protected by a RETURN or SET NOEXEC ON at the top of your screen. I have this put in place by default on new query windows. This protects you from running too much code by accident.
Make a habit of checking what instance you are connected to before running any ad-hoc code. Running code meant for a model or test environment in production can be a very scary thing.
This is good advice.
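A minimal sketch of both habits in a scratch query window, assuming a client such as SSMS that treats GO as a batch separator. The T-SQL setting that compiles statements without executing them is SET NOEXEC ON.

```sql
-- Guard at the top of the window: batches that follow are compiled, not executed.
SET NOEXEC ON;
GO

-- Highlight and run just this statement to confirm where you are connected.
SELECT @@SERVERNAME AS ServerName,    -- instance you are connected to
       DB_NAME()    AS DatabaseName;  -- current database
GO

-- SET NOEXEC OFF is still honored while the guard is on; use it to resume execution.
SET NOEXEC OFF;
GO
```

One difference worth knowing: RETURN only ends the current batch, so code after a later GO still runs, whereas SET NOEXEC ON stays in effect for the session until it is switched off.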
So, we can say without any doubt that COUNT(*) and COUNT(1) are equivalent.
Both of these are different from COUNT(SomeColumnName), though, which counts only the rows where that column is not NULL.
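A small illustration of the difference, using a made-up table variable with one nullable column:

```sql
-- Hypothetical sample data: three rows, one of which has a NULL in the column.
DECLARE @Sample TABLE (SomeColumnName int NULL);

INSERT INTO @Sample (SomeColumnName)
VALUES (10), (NULL), (30);

SELECT COUNT(*)              AS CountStar,    -- counts rows: 3
       COUNT(1)              AS CountOne,     -- counts rows: 3
       COUNT(SomeColumnName) AS CountColumn   -- counts non-NULL values: 2
FROM @Sample;
```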
The XML below has data nested at different levels, which requires the nodes method to join them together. The nodes method accepts an XQuery expression and returns a rowset. That rowset can then be used with CROSS APPLY to effectively link your way down.
nodes (XQuery) as Table(Column)
The tabular format I need requires data from 3 different levels of this XML gob and I need to wade through 5 “tables” to get there.
Shredding XML is something you occasionally need to do.
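To give a feel for the pattern, here is a minimal sketch that walks three levels of a small, invented XML document. The element names, attribute names, and aliases are assumptions for illustration and are not taken from the linked post.

```sql
-- Invented sample document: Customer -> Order -> Item.
DECLARE @x xml = N'
<Customers>
  <Customer name="Apu">
    <Order id="1">
      <Item sku="A1" qty="2" />
      <Item sku="B7" qty="1" />
    </Order>
    <Order id="2">
      <Item sku="C3" qty="5" />
    </Order>
  </Customer>
</Customers>';

-- Each nodes() call returns one row per matching node;
-- CROSS APPLY then shreds the children relative to the current parent row.
SELECT c.x.value('@name', 'varchar(50)') AS CustomerName,
       o.x.value('@id',   'int')         AS OrderID,
       i.x.value('@sku',  'varchar(10)') AS Sku,
       i.x.value('@qty',  'int')         AS Qty
FROM @x.nodes('/Customers/Customer') AS c(x)
CROSS APPLY c.x.nodes('Order')       AS o(x)
CROSS APPLY o.x.nodes('Item')        AS i(x);
```

This returns one row per Item (three rows here), each carrying the attributes of its parent Order and Customer.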
The issue here is that SQL is a declarative language: unlike procedural languages, there is no guarantee on the ordering of the operations, because optimizers. And SQL Server decides to do something other than what we’d expect: it tries to evaluate the value “Apu” as a date. But by using a CASE expression we can force the optimizer to take the input and match it to the expression (in this case, when a value is a date then convert it to a date) before checking if the value is older than 7 days.
This does work most of the time, but there are exceptions, so as always, test your code.
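A minimal sketch of the CASE guard described above, using an invented table variable and ISDATE as the "is this a date?" test:

```sql
-- Hypothetical sample data: one non-date string, one old date, and today's date.
DECLARE @Input TABLE (Val varchar(20));

INSERT INTO @Input (Val)
VALUES ('Apu'),
       ('20000101'),
       (CONVERT(varchar(8), GETDATE(), 112));  -- today as yyyymmdd

-- The conversion only happens inside the WHEN branch, so 'Apu' yields NULL
-- instead of a conversion error, and NULL never satisfies the comparison.
SELECT Val
FROM @Input
WHERE CASE WHEN ISDATE(Val) = 1 THEN CONVERT(datetime, Val) END
      < DATEADD(DAY, -7, GETDATE());
```

Only the '20000101' row comes back. On SQL Server 2012 and later, TRY_CONVERT(datetime, Val) is another option, since it returns NULL rather than raising an error when the conversion fails.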
1. It can let you access data in the columns of those tables, to use in predicates or expressions.
2. It can let you filter the data in the base table, by only allowing rows which match, such as when using an inner join or right outer join.
3. It can cause rows in the base table to be returned multiple times, if multiple rows in the joined table match a single row in the base table.
4. It can introduce NULL rows, if a full or right outer join is being done (or a left outer join with the base table second) and there are rows in the joined table that don’t match any rows in the base table.
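The four effects listed above can be seen with two tiny, made-up table variables:

```sql
-- Hypothetical data: base row 1 has two matches, row 2 has one, row 3 has none,
-- and the joined table has one row (BaseID 4) with no base row.
DECLARE @Base   TABLE (BaseID int, Name   varchar(10));
DECLARE @Joined TABLE (BaseID int, Detail varchar(10));

INSERT INTO @Base   VALUES (1, 'one'), (2, 'two'), (3, 'three');
INSERT INTO @Joined VALUES (1, 'a'), (1, 'b'), (2, 'c'), (4, 'orphan');

-- Inner join: exposes Detail (effect 1), drops base row 3 (effect 2),
-- and returns base row 1 twice (effect 3). Three rows in total.
SELECT b.BaseID, b.Name, j.Detail
FROM @Base AS b
INNER JOIN @Joined AS j ON j.BaseID = b.BaseID;

-- Full outer join: also keeps base row 3 with a NULL Detail, and adds a row
-- with NULL base columns for the unmatched 'orphan' (effect 4). Five rows in total.
SELECT b.BaseID, b.Name, j.Detail
FROM @Base AS b
FULL OUTER JOIN @Joined AS j ON j.BaseID = b.BaseID;
```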
This is a useful bit of T-SQL-specific syntax, but it’s a sharper edge than most UPDATE statements. For a look back in history, Hugo Kornelis wanted to deprecate this syntax with the release of SQL Server 2008 (though MERGE has its own bugs and “Won’t Fix” problems, so in retrospect, perhaps it’s best that we still have UPDATE FROM).
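To see where the sharp edge lies, here is a small sketch with invented table variables in which one target row matches two source rows:

```sql
-- Hypothetical data: target row 1 matches two source rows with different amounts.
DECLARE @Target TABLE (ID int PRIMARY KEY, Amount int);
DECLARE @Source TABLE (ID int, Amount int);

INSERT INTO @Target VALUES (1, 0), (2, 0);
INSERT INTO @Source VALUES (1, 10), (1, 99), (2, 20);

-- UPDATE ... FROM runs without complaint, but which source row "wins" for
-- ID 1 is not defined: Amount may end up as 10 or as 99.
UPDATE t
SET    t.Amount = s.Amount
FROM   @Target AS t
INNER JOIN @Source AS s ON s.ID = t.ID;

SELECT ID, Amount FROM @Target;
```

A MERGE statement with the same join raises an error when it tries to update the same target row more than once, which is the safety argument made for it.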
In this way I can more easily see in the first example I’m joining two tables/views/CTEs together. If I want to know more about the details of one of those items, I can easily look up and see the CTE at the beginning.
However, when I want multiple CTEs, how does this work?
The answer is simple but powerful. Once you’ve read up on CTEs, you start to see the power of chaining CTEs. And then you go CTE-mad until you see the performance hit of the monster you’ve created. Not that I’ve ever done that…nope…
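A minimal sketch of chaining, with made-up data: a single WITH keyword, CTEs separated by commas, and each CTE free to reference the ones defined before it.

```sql
-- Hypothetical order data supplied inline so the example is self-contained.
WITH Orders AS
(
    SELECT v.CustomerID, v.Amount
    FROM (VALUES (1, 50), (1, 70), (2, 20), (3, 200)) AS v(CustomerID, Amount)
),
CustomerTotals AS
(
    -- Builds on the first CTE.
    SELECT CustomerID, SUM(Amount) AS Total
    FROM Orders
    GROUP BY CustomerID
),
BigSpenders AS
(
    -- Builds on the second CTE.
    SELECT CustomerID, Total
    FROM CustomerTotals
    WHERE Total >= 100
)
SELECT CustomerID, Total
FROM BigSpenders
ORDER BY CustomerID;
```

This returns customers 1 (total 120) and 3 (total 200). Because each CTE is expanded into the final query rather than stored, long chains can get expensive quickly.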
A CTE is probably best described as a temporary inline view – in spite of its official name, it is not a table, and it is not stored (like a #temp table or @table variable). It operates more like a derived table or subquery, and can only be used for the duration of a single SELECT, UPDATE, INSERT, or DELETE statement (though it can be referenced multiple times within that statement).
This is a great article on CTEs; give it a read, even if you’re familiar with them.