Names have been changed to protect the innocent.
In the above DELETE statement, which table will have rows deleted from it?
A: Table1
B: Table2
C: Both Table1 and Table2
D: Neither Table1 nor Table2
Got it in one. I like having this syntax available to me when I need it, even though it’s not ANSI standard.
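For readers who haven't seen it, T-SQL's joined DELETE looks roughly like this (table and column names here are invented for illustration):

```sql
-- Only the table named right after DELETE loses rows,
-- no matter how many tables participate in the join.
DELETE t1
FROM dbo.Table1 AS t1
JOIN dbo.Table2 AS t2
    ON t2.Table1ID = t1.Table1ID
WHERE t2.SomeFlag = 1;
```

The second FROM clause is the non-ANSI part; the ANSI way would be a subquery or EXISTS in the WHERE clause.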
CURRENT_TIMESTAMP is the ANSI equivalent of GETDATE(). ANSI is an acronym for the American National Standards Institute, and sometimes vendors will include ANSI functions in their products so they can say that they’re ANSI-compliant (which is not a bad thing, in most cases).
There are three main problems with CURRENT_TIMESTAMP:
- No brackets. It goes against the rules about functions. So much for standards!
- It’s functionally equivalent to GETDATE(), which uses DATETIME, which we previously identified is old and bad.
- It’s too similar to the poorly-named TIMESTAMP data type, which has nothing to do with dates and times and should be called ROWVERSION.
Bottom line: don’t use CURRENT_TIMESTAMP.
At one point I used CURRENT_TIMESTAMP instead of GETDATE() with portability in mind. Since then, my thoughts on code portability have changed and, regardless, as Randolph mentions, it’s better to use DATETIME2 functions to avoid the precision issues with DATETIME.
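To see the equivalence, and the precision difference alluded to above, you can run something like this (SYSDATETIME() is the built-in DATETIME2(7) counterpart to GETDATE()):

```sql
-- All three return the current date and time; the first two are
-- interchangeable DATETIME values, the third is DATETIME2(7),
-- accurate to 100 nanoseconds rather than ~3 milliseconds.
SELECT CURRENT_TIMESTAMP AS AnsiNoBrackets,
       GETDATE()         AS ClassicDatetime,
       SYSDATETIME()     AS HigherPrecision;
```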
Creating a database

Very easy to create something like the below – an S2 database.

CREATE DATABASE [MeeTwoDB]
(
    MAXSIZE = 5GB,
    EDITION = 'standard',
    SERVICE_OBJECTIVE = 'S2'
);

-- checking details
SELECT Edition = DATABASEPROPERTYEX('MeeTwoDB', 'Edition'),
       ServiceObjective = DATABASEPROPERTYEX('MeeTwoDB', 'ServiceObjective');
Read on for several more examples.
One day, it’s going to happen. I’m going to hear some crazy theory about how SQL Server works and I’m going to literally explode. Instead of some long silly rant with oddball literary & pop culture references you’ll get a screed the size of Ulysses (and about as much fun to read). However, for the moment, like Robin Williams describing a dance move, I’m going to keep it all inside. Here’s our query:
SELECT soh.SalesOrderNumber,
       sod.OrderQty,
       sod.UnitPrice,
       p.Name
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
    ON sod.SalesOrderID = soh.SalesOrderID
JOIN Production.Product AS p
    ON p.ProductID = sod.ProductID;
No, there’s no WHERE clause, because we have to compare this query to our view:
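The view itself isn’t shown in this excerpt, but the idea is a view wrapping the same three-table join, something like the following (the view name is my own invention):

```sql
-- A hypothetical view over the same AdventureWorks join as the query above.
CREATE OR ALTER VIEW Sales.vOrderDetails
AS
SELECT soh.SalesOrderNumber,
       sod.OrderQty,
       sod.UnitPrice,
       p.Name
FROM Sales.SalesOrderHeader AS soh
JOIN Sales.SalesOrderDetail AS sod
    ON sod.SalesOrderID = soh.SalesOrderID
JOIN Production.Product AS p
    ON p.ProductID = sod.ProductID;
```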
Grant used up much of his strategic reserve of GIFs in that post, so check it out.
Say you’ve got a table with millions or billions of rows, and you need to delete some rows. Deleting ALL of them is fast and easy – just do TRUNCATE TABLE – but things get much harder when you need to delete a small percentage of them, say 5%.
It’s especially painful if you need to do regular archiving jobs, like deleting the oldest 30 days of data from a table with 10 years of data in it.
The trick is making a view that contains the top, say, 1,000 rows that you want to delete:
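A minimal sketch of the pattern, with made-up table and column names:

```sql
-- A view with TOP and ORDER BY defines "the next 1,000 oldest rows".
CREATE OR ALTER VIEW dbo.Comments_ToDelete
AS
SELECT TOP 1000 *
FROM dbo.Comments
ORDER BY CreationDate;
GO

-- Deleting from the view deletes from the underlying table,
-- 1,000 rows per statement; run this in a loop until @@ROWCOUNT = 0.
DELETE dbo.Comments_ToDelete
WHERE CreationDate < DATEADD(DAY, -30, GETUTCDATE());
```

Because each statement touches only a small batch, the delete takes short locks and generates modest transaction log activity instead of one giant transaction.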
Read on for a demo.
Someone had tried to be clever. If you’ve been practicing SQL Server for a while, one look at the running code usually means one thing.
A scalar-valued function was running!
In this case, here’s what it looked like:
CREATE OR ALTER FUNCTION dbo.BadIdea ( @uid INT )
RETURNS BIGINT
WITH RETURNS NULL ON NULL INPUT, SCHEMABINDING
AS
BEGIN
    DECLARE @BCount BIGINT;

    SELECT @BCount = COUNT_BIG(*)
    FROM dbo.Badges AS b
    WHERE b.UserId = @uid
    GROUP BY b.UserId;

    RETURN @BCount;
END;
Someone had added that function as a computed column to the Users table:
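The ALTER TABLE itself isn’t shown in this excerpt, but a computed column over a scalar UDF generally looks like this (the column name is invented, and Id is assumed to be the Users key):

```sql
-- Every query touching this column now runs the UDF row by row,
-- and scalar UDFs can also force serial execution plans.
ALTER TABLE dbo.Users
    ADD TotalBadges AS dbo.BadIdea(Id);
```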
Spoilers: this was a bad idea.
This function is available starting with SQL Server 2016 and is currently only able to escape JSON characters. To me it’s not super useful just yet, but hopefully they will add more types soon.
I haven’t had the need to use STRING_ESCAPE yet, but one additional function I’d add is
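For reference, here is what STRING_ESCAPE does with the one type it currently supports:

```sql
-- 'json' is the only supported escaping type as of SQL Server 2016;
-- quotes, backslashes, and control characters get JSON-escaped.
SELECT STRING_ESCAPE(N'He said "hi" \ bye', 'json') AS Escaped;
-- Returns: He said \"hi\" \\ bye
```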
This week, I had a problem where I needed to find, and eliminate from my query results, data with overlapping ranges. I have written about this topic before, in my database design book, in regards to building a trigger to avoid overlapping ranges. But even though I have written on the topic there, I still use Google just like you to get quick help. (I use books when I want to learn or expand my knowledge on a topic in depth, and blogs when I need a simple answer to a simple or complex question.)
The problem at hand is most often associated with date-based data, such as effective dates for a row in a temporal/type 2 dimension table, or other cases like appointment times. But the algorithm is the same with numbers, and it is a bit easier to read since we don’t have the same issues with round-off and decimal places (the query is complex enough on its own to show in a blog post). From a progression of start and end values in each row, we are going to look at how to check that no two rows are in conflict (no range should contain another range’s values at all).
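As a rough illustration of the numeric version of the problem (the schema and names here are mine, not Louis’s), the classic pairwise check is that two ranges conflict when each starts at or before the other ends:

```sql
CREATE TABLE #Ranges (RangeId INT, StartValue INT, EndValue INT);
INSERT #Ranges VALUES (1, 1, 10), (2, 5, 15), (3, 20, 30);

-- The RangeId inequality avoids self-matches and duplicate pairs;
-- the two comparisons together detect any overlap or containment.
SELECT a.RangeId AS FirstRange,
       b.RangeId AS SecondRange
FROM #Ranges AS a
JOIN #Ranges AS b
    ON a.RangeId < b.RangeId
   AND a.StartValue <= b.EndValue
   AND b.StartValue <= a.EndValue;
-- Flags (1, 2): ranges 1-10 and 5-15 overlap; range 20-30 conflicts with nothing.
```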
This feels like the type of thing which could be rewritten with a window function to be a little smoother, but I’d have to think about it more. Louis does provide a good solution and explanation for a fairly common but tricky problem in T-SQL.
As you may already know, when SQL Server optimizes a query, it evaluates multiple candidate plans, and eventually picks the one with the lowest estimated cost. The estimated plan cost is the sum of all the operators’ estimated costs. In turn, each operator’s estimated cost is the sum of the estimated I/O cost and estimated CPU cost. The cost unit is meaningless in its own right. Its relevance is in the comparison that the optimizer makes between candidate plans. That is, the costing formulas were designed with the goal that, between candidate plans, the one with the lowest cost will (hopefully) represent the one that will finish more quickly. A terribly complex task to do accurately!
The more adequately the costing formulas take into account the factors that truly affect the algorithm’s performance and scaling, the more accurate they are, and the more likely it is that, given accurate cardinality estimates, the optimizer will choose the optimal plan. At any rate, if you want to understand why the optimizer chooses one algorithm versus another, you need to understand two main things: one is how the algorithms work and scale, and the other is SQL Server’s costing model.
So back to the plan in Figure 1; let’s try and understand how the costs are computed. As a policy, Microsoft will not reveal the internal costing formulas that they use. When I was a kid I was fascinated with taking things apart. Watches, radios, cassette tapes (yes, I’m that old), you name it. I wanted to know how things were made. Similarly, I see value in reverse engineering the formulas since if I manage to predict the cost reasonably accurately, it probably means that I understand the algorithm well. During the process you get to learn a lot.
Our query ingests 1,000,000 rows. Even with this number of rows, the I/O cost seems to be negligible compared to the CPU cost, so it is probably safe to ignore it.
As for the CPU cost, you want to try and figure out which factors affect it and in what way.
I give this my highest recommendation.
Let’s say that you execute the following T-SQL:

SET CONTEXT_INFO 1234;
SELECT CONVERT(INT, CONTEXT_INFO());

The result will be:

0
Why is that?
There is an answer which makes sense, but it’s easy to be caught unawares.
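A hint, assuming the usual explanation applies here: CONTEXT_INFO() returns a VARBINARY(128) value that is right-padded with zeros, and converting a long binary value to INT keeps only the rightmost four bytes, which are all padding. Trimming to the leftmost four bytes first recovers the value:

```sql
SET CONTEXT_INFO 1234;

-- Binary-to-binary conversion truncates on the right, so BINARY(4)
-- keeps the leftmost bytes, which is where SET CONTEXT_INFO put the value.
SELECT CONVERT(INT, CONVERT(BINARY(4), CONTEXT_INFO()));
-- Returns: 1234
```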