The SWITCH statement can instantly ‘move’ data from one table to another table. It does this by updating some metadata, to say that the new table is now the owner of the data instead of the old table. This is very useful, as there is no physical data movement to cause the stresses mentioned earlier. There are a lot of rules enforced by SQL Server before it will allow this to work. Essentially, both tables must have the same columns with the same data types and NULL settings, they must be in the same filegroup, and the new table must be empty. See here for a more detailed look at these rules.
If you can take downtime, this is pretty easy. Otherwise, keeping the two tables in sync until the switchover occurs is a key problem to keep in mind.
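For reference, the switch itself is a one-line, metadata-only operation. A minimal sketch (the table names here are hypothetical):

```sql
-- Metadata-only "move": all rows in dbo.SalesStaging become rows of dbo.Sales.
-- Both tables must match in columns, data types, nullability, and filegroup,
-- and the target must be empty.
ALTER TABLE dbo.SalesStaging SWITCH TO dbo.Sales;

-- For partitioned tables, a single partition can be switched out the same way:
-- ALTER TABLE dbo.Sales SWITCH PARTITION 3 TO dbo.SalesArchive;
```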
Medians as a concept are simple enough: if you have a large set of values, you pick the middle one. The median, as opposed to the average, is useful for a number of reasons, one of them being that it reduces the effect of so-called outlier values.
The fact that SQL Server doesn’t have a fast, built-in median function surprises me, to be honest. The best alternative I’ve found was a CLR function in SQL#.
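For what it's worth, SQL Server 2012 and later do ship PERCENTILE_CONT, which can compute a median; it just isn't fast, which is presumably why the CLR function wins. A sketch (dbo.Sales and Amount are hypothetical names):

```sql
-- Median via PERCENTILE_CONT: flexible, but known to perform poorly at scale.
SELECT DISTINCT
    PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY Amount) OVER () AS MedianAmount
FROM dbo.Sales;
```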
So here is the thing: when you alter a column, you change all of its attributes at once. That means that if you don't specify a precision where one applies, you get the default. That's not exactly a common problem, though; usually the precision (or possibly the data type) is what you meant to change. The common mistake is not specifying the nullability.
When modifying DDL, make sure that you keep it consistent and complete.
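Here's a sketch of the trap (dbo.Demo is a hypothetical table; the exact outcome depends on the session's ANSI_NULL_DFLT setting, which typically defaults columns to nullable):

```sql
CREATE TABLE dbo.Demo (Val decimal(10, 2) NOT NULL);

-- Widens the column, but silently drops NOT NULL under the usual
-- ANSI_NULL_DFLT_ON setting:
ALTER TABLE dbo.Demo ALTER COLUMN Val decimal(18, 2);

-- Consistent and complete: restate every attribute you mean to keep.
ALTER TABLE dbo.Demo ALTER COLUMN Val decimal(18, 2) NOT NULL;
```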
There are many ways of doing it in SQL (and Rosetta Code is a good place to view solutions in various other languages). I believe that Peter Larsson holds the record for the fastest calculation of the Luhn test for a sixteen-digit credit card, with this code. As it stands, it isn't a general solution, but it can be modified for different lengths of bank card.
Phil has two interesting T-SQL functions in the code and wants to find more.
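For readers who want to see the shape of the algorithm rather than the record-setting version, here's a plain sketch of the Luhn test for a fixed sixteen-digit number as an inline table-valued function (dbo.LuhnCheck is a hypothetical name, not Larsson's code):

```sql
CREATE FUNCTION dbo.LuhnCheck (@CardNumber char(16))
RETURNS TABLE
AS
RETURN
SELECT IsValid = CASE WHEN SUM(v.Contribution) % 10 = 0 THEN 1 ELSE 0 END
FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),
             (9),(10),(11),(12),(13),(14),(15),(16)) AS p(n)
CROSS APPLY (SELECT Digit = CAST(SUBSTRING(@CardNumber, p.n, 1) AS tinyint)) AS d
CROSS APPLY (SELECT Contribution =
                 -- For a 16-digit number, odd positions from the left are even
                 -- positions from the right, so those digits get doubled;
                 -- subtracting 9 when the double exceeds 9 sums its two digits.
                 CASE WHEN p.n % 2 = 1
                      THEN d.Digit * 2 - CASE WHEN d.Digit >= 5 THEN 9 ELSE 0 END
                      ELSE d.Digit
                 END) AS v;
```

A quick check: SELECT IsValid FROM dbo.LuhnCheck('4111111111111111'); returns 1.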
However, I wanted to add some tests. Does this really work? What if I don't have a backslash? I thought the best way to do this was with a few tSQLt tests, which I quickly built. The entire process took 5-10 minutes, which isn't a lot longer than if I had been running random tests myself with a variety of strings.
The advantage of tests is that if I come up with a new case, or another potential bug, I copy the test over, change the string and I have a new test, plus all the regressions. I’m not depending on my memory to run the test cases.
I first put the code in a function, which makes it easier to test.
tSQLt is a great tool for database unit testing.
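If you haven't seen tSQLt, a test of the scenario Steve describes might look something like this (the function dbo.GetFileName and the test class name are hypothetical; the tSQLt procedures are real):

```sql
EXEC tSQLt.NewTestClass 'FileNameTests';
GO
CREATE PROCEDURE FileNameTests.[test returns whole string when no backslash]
AS
BEGIN
    -- Act: call the function under test with an edge-case input.
    DECLARE @Actual varchar(260) = dbo.GetFileName('file.txt');

    -- Assert: fails the test with a clear message if the values differ.
    EXEC tSQLt.AssertEqualsString @Expected = 'file.txt', @Actual = @Actual;
END;
GO
EXEC tSQLt.Run 'FileNameTests';
```

Copying a test and swapping the input string is all it takes to cover a new case.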
As a conclusion: you CAN use the OVER clause to do aggregation in the three following cases:
1. When the data set is extremely small and fits in just one 8 KB page;
2. When you want to hide your logic from any future developer, or even yourself, to make debugging and troubleshooting a nightmare;
3. When you really want to kill your SQL Server and its underlying disk system.
That conclusion's rather pessimistic for my tastes, mostly because Slava's trying to do the same thing with a window function that he's doing with a GROUP BY clause, and has multiple functions across different windows (including calculations). SUM() OVER() is powerful when you still need the disaggregated values alongside the aggregate; running totals are the classic example.
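To illustrate the case where it shines, here's a minimal running-total sketch (dbo.Sales, SaleDate, and Amount are hypothetical names):

```sql
-- Each row keeps its own Amount *and* gets the cumulative total to date,
-- something a GROUP BY cannot give you in a single pass.
SELECT
    SaleDate,
    Amount,
    RunningTotal = SUM(Amount) OVER (ORDER BY SaleDate
                                     ROWS BETWEEN UNBOUNDED PRECEDING
                                              AND CURRENT ROW)
FROM dbo.Sales;
```

The explicit ROWS clause matters for performance; the default RANGE frame can introduce an expensive on-disk spool.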
If the check constraint is trusted, it can be used by the query optimizer. For example, if a check constraint rules out values below 100 in a column and a query comes in looking for a value of 50, the query optimizer can use the constraint to answer the query without ever touching the table.
The query optimizer can only use the check constraint if it's trusted; otherwise, rows with values below 100 (to continue the example) could exist in the table, and the query would lose those records.
Dennes then goes on to show how constraints end up untrusted and how to fix the issue.
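The short version: a constraint re-enabled (or created) WITH NOCHECK stays untrusted until you re-validate it. A sketch, with hypothetical table and constraint names:

```sql
-- The first CHECK is the re-validation option; the second names the constraint.
-- SQL Server scans existing rows and, if they all pass, marks it trusted again.
ALTER TABLE dbo.Orders WITH CHECK CHECK CONSTRAINT CK_Orders_Amount;

-- Verify: is_not_trusted should now be 0.
SELECT name, is_not_trusted
FROM sys.check_constraints
WHERE name = 'CK_Orders_Amount';
```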
If I want to depend on these values being exactly the same, I can’t count on the default values.
Default constraints will fill in the correct value but, as Michael notes, "the correct value" is calculated each time. Also, note that his results are about a millisecond off, so if you're just using DATETIME, you'll observe the discrepancy less often, as DATETIME is only accurate to roughly 3 milliseconds. That's not a good reason to use DATETIME, though.
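A quick demonstration of the effect he describes (table and column names are hypothetical; the two defaults may or may not land on different values in any given run):

```sql
CREATE TABLE dbo.DefaultDemo
(
    Id       int IDENTITY PRIMARY KEY,
    CreatedA datetime2(7) NOT NULL DEFAULT SYSDATETIME(),
    CreatedB datetime2(7) NOT NULL DEFAULT SYSDATETIME()
);

INSERT dbo.DefaultDemo DEFAULT VALUES;

-- Each default is evaluated separately, so the difference can be nonzero.
SELECT CreatedA, CreatedB,
       DATEDIFF(MICROSECOND, CreatedA, CreatedB) AS DiffMicroseconds
FROM dbo.DefaultDemo;
```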
It has long been a habit of mine to name my constraints, and even if it weren't useful for database comparisons, it just helps me see the database structure that much more easily. As I have gained more experience writing SQL (and writing about SQL), I have grown to format my code habitually in a certain way, which makes it all the more interesting to me that I had never come across this scenario in which you would not want to name constraints.
Temp tables are special. There’s another reason to have non-named constraints on temp tables inside stored procedures: it allows for temp table reuse, as shown on slide 21 in this Eddie Wuerch slide deck from SQL Saturday 75 (incidentally, the first SQL Saturday I ever attended).
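The gist of that slide: a named constraint on a temp table defeats temp table caching (and, on older versions, could even collide across concurrent executions, since constraint names must be unique within tempdb), while a system-generated name does not. A sketch of the two patterns, inside a stored procedure:

```sql
-- This temp table's definition can be cached and reused across executions:
CREATE TABLE #Cached
(
    Id int NOT NULL PRIMARY KEY    -- unnamed; SQL Server generates the name
);

-- This one cannot be cached, because of the explicit constraint name:
-- CREATE TABLE #NotCached
-- (
--     Id int NOT NULL CONSTRAINT PK_NotCached PRIMARY KEY
-- );
```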
So there you have it. With XML tricks and window functions, we have more opportunity for kicking out any need for functions. To use this code, you'd just swap out the SELECT statement that supplied my samples to the routine for the lists that you want to deduplicate. Sure, this sort of job will never be quick, because there are still correlated subqueries in there to upset the CPU! I am intrigued that there are such different ways of solving this task in SQL Server. Are there yet other ways of doing it?
Cf. Aaron Bertrand’s tally table method. Bonus points if you’re mentally screaming “CLR!”
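For a taste of the tally table approach, here's a hedged sketch (not Aaron's exact code): split a delimited list with an inline numbers table, then keep the first occurrence of each item with ROW_NUMBER().

```sql
DECLARE @List varchar(100) = 'a,b,a,c,b';

WITH Tally AS
(
    -- Inline numbers table: 1 .. LEN(@List)
    SELECT TOP (LEN(@List)) n = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_objects
),
Items AS
(
    -- An item starts wherever a delimiter precedes it in the prefixed string.
    SELECT Item     = SUBSTRING(@List, n, CHARINDEX(',', @List + ',', n) - n),
           Position = n
    FROM Tally
    WHERE SUBSTRING(',' + @List, n, 1) = ','
)
SELECT Item
FROM (SELECT Item, Position,
             rn = ROW_NUMBER() OVER (PARTITION BY Item ORDER BY Position)
      FROM Items) AS d
WHERE rn = 1
ORDER BY Position;   -- returns a, b, c
```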