Itzik Ben-Gan has started a new series on T-SQL bugs, pitfalls, and best practices. The first post in this series is on non-deterministic behavior:
A nondeterministic function is a function that given the same inputs, can return different results in different executions of the function. Classic examples are SYSDATETIME, NEWID, and RAND (when invoked without an input seed). The behavior of nondeterministic functions in T-SQL can be surprising to some, and could result in bugs and pitfalls in some cases.
Many people assume that when you invoke a nondeterministic function as part of a query, the function gets evaluated separately per row. In practice, most nondeterministic functions get evaluated once per reference in the query.
This is the start to a great series.
The locking story is not the same as with the primary and unique key constraints. First, there’s one extra piece: the transition will block access to
dbo.LookupTableas well as the table we create the constraint on. That’s to keep us from deleting rows in our lookup table before the key is in place.
Second, the locks begin as soon as we hit F5. Even
SELECTstatements get blocked requesting a
LCK_M_SCH_Slock. Bad news, people.
So what can we do to get around this problem? Two routes: the ineffectual way and the ugly way.
Despite my being a ray of sunshine here, you should still check this out. It’s shorter than the average Russian novel, at least.
In a traditional gaps and islands problem, the goal is to identify groups of continuous data sequences (islands) and groups of data where the sequence is missing (gaps).
While many people encounter gaps and islands problems when dealing with ranges of dates, and recently I did too but with an interesting twist:
How do you determine gaps and islands of data that has overlapping date ranges?
Check out Bert’s explanation of the solution; it’s a good one.
I recently published a post detailing the new Scalar UDF Inlining feature in SQL 2019 here. That post introduced the new feature in a way that I used to compare performance to the other function types, continuing the performance evaluation of functions that I had previously posted here and here. In the Scalar UDF Inlining post, I used a function to strip all non-numeral values from a string, and to return the result. This used the FOR XML output option.
In thinking about how scalar functions are commonly used, I’ve decided to revisit this feature with a simpler function. I will still compare it to all the other types of functions to see how Scalar UDF Inlining compares to the others.
Wayne’s results are music to the product team’s ears, I’m sure.
Two different T-SQL methods were used for testing. The FOR XML PATH solution incorporates the reverse and stuff functions along with FOR XML PATH. The String_agg solution is much simpler, but is only available on SQL Server 2017 currently.
The FOR XML PATH method has been documented thoroughly over the years in many articles, so these explanations will not be repeated here. However, two points are worth mentioning. The first one is that the @strSeparator (in this case ‘ or ’) string will reside at the end of the constructed string (in this case @strFilter). Since the last occurrence of @strSeparator must be removed to prevent a syntax error, characters must be dropped from the end of the string. Although it may seem obvious that four characters (the length of @strSeparator) should be removed, actually only three need be removed because the end of the constructed string is “r” not “ “. Therefore, the length argument of the left function specification is the length of @strFilter minus three.
STRING_AGG a lot, primarily because I can remember it and I can explain it to people relatively new to SQL Server. Read on to see how they both compare in terms of performance.
Now you need to install ODBC Driver for CosmosDB on the computer where you have SQL Server installed. I’m using Microsoft Azure Cosmos DB ODBC 64-bit.msi for 64-bit Windows – 64-bit versions of Windows 8.1 or later, Windows 8, Windows 7, Windows Server 2012 R2, Windows Server 2012, and Windows Server 2008 R2.
Once you install this driver, you should setup ODBC source in system DSN and test the connection:
If you’re running SQL Server 2019, you can follow Jovan’s first two steps and then create an external data source and table with PolyBase to get to the same results.
In order to make this work, one of the considerations is to eliminate cross database dependencies, as you can’t reference objects that don’t exist in views, and even in stored procedures, which offer delayed resolution of objects, you can’t test the code without the database it is referencing.
In addition, and somewhat more important to the process, is dealing with three part names that reference the name of the database your object is in. During the comparison process, the database can be created with a name that is different from your target database to compare to (referred to as a shadow database.) So if you are in database X and have references to X.schema.table, but the database is generated as X_Shadow, the X. is now a cross database reference rather than the local reference you are desiring.
Four part names to linked servers are a different sort of nightmare, but one that is (hopefully) exceedingly rare. The queries presented will help with this as well.
Louis has a few scripts to help you find these. If your code is in source control already, you could also build a regular expression to search through it.
I tend to feel that a lot of people who use triggers don’t really understand them. That said, every now and again you have to deal with them. And in particular (for this post) you might need to disable and then re-enable them. Enabling and disable are identical commands in this case so I’m just going to use the DISABLEversion and you can just replace it with ENABLE as needed.
To be fair, the world might be a better place if we disabled a majority of triggers…
The realm of More Significant Changes is not where you often want to be. There’s a lot of scaffolding code you need to write. Basically, suppose you want to make a repair on the 5th story exterior of an 8-story building. You have a couple of options: the YOLO option is to kick everybody out of the building and have people rappel from the top of the building down to the 5th story to make their changes. They need all of the people out of the building because shut up it’s a strained analogy. This approach is kind of inconvenient: people have to stay out of your building until your people are done repairing the exterior. That’s blocking in the database world.
On the other side, you can build a bunch of scaffolding and attach it to the exterior of the building, perform your repairs, and tear down that scaffolding. While the scaffolding is up, people come and go like normal and don’t even think about it. As you tear the scaffolding down, you temporarily block the door for a moment as you’re finishing your work. This is much more convenient for end users and fits the “near-zero downtime” approach we’re taking.
Strained analogies aside, this is a long post on making a series of table-related changes without your end users noticing.
Versioning a procedure is pretty simple: you create a new procedure with alterations you want. Corporate naming standards where I’m at have you add a number to the end of versioned procedures, so if you have
dbo.SomeProcedure, the new version would be
dbo.SomeProcedure01. Then, the next time you version, you’ll have
dbo.SomeProcedure02and so on. For frequently-changing procedures, you might get up to version 05 or 06, but in practice, you’re probably not making that many changes to a procedure’s signature. For example, looking at a directory with exactly 100 procedures in it, I see 7 with a number at the end. Two of those seven procedures are old versions of procedures I can’t drop quite yet, so that means that there are only five “unique” procedures that we’ve versioned in a code base which is two years old. Looking at a different part of the code with 879 stored procedures, 95 have been versioned at least once in the 15 or so years of that code base’s existence. The real number is a bit higher than that because we’ve renamed procedures over time and renamings tend to start the process over as we might go from
dbo.SomeNewProcedurewhen we redesign underlying tables or make other drastic architectural changes.
The secret is, I’m always versioning.