This is going to be a bit of a brain storming post that comes from an interesting question that I was asked today…
“I’ve got a table with a ID code field, now some of the rows have a value in that field and some are NULL. How can I go about filling in those NULL values with a valid code but at the same time avoid introducing duplicates?”
Click through for David’s solution.
Pedro Lopes who is now on the SQL Server dev team wrote the blog post “T-SQL Misconceptions – JOIN ON vs. WHERE” which does a good job of explaining the “it depends” around the newer style of joins. Why link this article? Microsoft deprecated the OUTER JOIN operators in SQL Server 2008. Two other sources of information on this are here and here (the first is Ward Pond’s old technet blog, and sadly will probably go away soon). If you’re keeping score at home, WHERE clauses are not deprecated except if you’re using *= and =*). The changes people made were wholly unnecessary and as the author, the newer stuff is harder to decipher than what I originally did. They were putting their own biases onto things.
I personally do not like ANSI-89 syntax because it blurs the lines between filters and join criteria. Those are separate things serving different purposes and keeping them separate clarifies queries more than it obscures. Allan’s example doesn’t have any filters but in a more complex scenario with several filters and several join criteria, it can require extra care figuring out what’s going on, especially with multi-column join criteria and filters mixed in (meaning it’s not join criteria and then filters, but a mishmash of the two).
The next step is to write the code to capture the counter values and insert the data it the temporary table created above. Because we need to capture the values over a period of time, the WAITFOR DELAY is used. In this case the delay is 10 seconds, although you can change this to suit your needs. Of course, don’t forget to increment the counter variable. You will need to determine what counters you would like to capture. Notice in the WHERE clause, we are looking for an instance_name of ” or ‘_total’. This will allow the code to only capture one row for each counter. The number 10 is the number of times we want to capture the counter values. If you want to capture the data more frequently, simpley modify the number of seconds in the WAITFOR DELAY. Here is link to my post in this topic, WAITFOR.
Dynamic pivoting in SQL is unnecessarily difficult, especially compared to languages like R.
I’m getting close to the end of my series on near-zero downtime deployments. This latest post involves identity column changes:
There are some tables where you create an identity value and expect to cycle through data. An example for this might be a queue table, where the data isn’t expected to live permanently but it is helpful to have a monotonically increasing function to determine order (just watch out for those
wrap-aroundsand you’re fine). An example of reseeding is below:
DBCC CHECKIDENT('dbo.MyTable', RESEED, 1);
This operation needs to take a
LCK_M_SCH_Mlock, otherwise known as a schema modification lock. Any transactions which are writing to the table will block your transaction but so will any readers unless you have Read Committed Snapshot Isolation turned on or the reader is in
If you are using RCSI and don’t have extremely long-running transactions, this is an in-and-out operation, so even though there’s a little bit of blocking, it’s minimal.
Not all changes are this easy, though.
Itzik Ben-Gan has started a new series on T-SQL bugs, pitfalls, and best practices. The first post in this series is on non-deterministic behavior:
A nondeterministic function is a function that given the same inputs, can return different results in different executions of the function. Classic examples are SYSDATETIME, NEWID, and RAND (when invoked without an input seed). The behavior of nondeterministic functions in T-SQL can be surprising to some, and could result in bugs and pitfalls in some cases.
Many people assume that when you invoke a nondeterministic function as part of a query, the function gets evaluated separately per row. In practice, most nondeterministic functions get evaluated once per reference in the query.
This is the start to a great series.
The locking story is not the same as with the primary and unique key constraints. First, there’s one extra piece: the transition will block access to
dbo.LookupTableas well as the table we create the constraint on. That’s to keep us from deleting rows in our lookup table before the key is in place.
Second, the locks begin as soon as we hit F5. Even
SELECTstatements get blocked requesting a
LCK_M_SCH_Slock. Bad news, people.
So what can we do to get around this problem? Two routes: the ineffectual way and the ugly way.
Despite my being a ray of sunshine here, you should still check this out. It’s shorter than the average Russian novel, at least.
In a traditional gaps and islands problem, the goal is to identify groups of continuous data sequences (islands) and groups of data where the sequence is missing (gaps).
While many people encounter gaps and islands problems when dealing with ranges of dates, and recently I did too but with an interesting twist:
How do you determine gaps and islands of data that has overlapping date ranges?
Check out Bert’s explanation of the solution; it’s a good one.
I recently published a post detailing the new Scalar UDF Inlining feature in SQL 2019 here. That post introduced the new feature in a way that I used to compare performance to the other function types, continuing the performance evaluation of functions that I had previously posted here and here. In the Scalar UDF Inlining post, I used a function to strip all non-numeral values from a string, and to return the result. This used the FOR XML output option.
In thinking about how scalar functions are commonly used, I’ve decided to revisit this feature with a simpler function. I will still compare it to all the other types of functions to see how Scalar UDF Inlining compares to the others.
Wayne’s results are music to the product team’s ears, I’m sure.
Two different T-SQL methods were used for testing. The FOR XML PATH solution incorporates the reverse and stuff functions along with FOR XML PATH. The String_agg solution is much simpler, but is only available on SQL Server 2017 currently.
The FOR XML PATH method has been documented thoroughly over the years in many articles, so these explanations will not be repeated here. However, two points are worth mentioning. The first one is that the @strSeparator (in this case ‘ or ’) string will reside at the end of the constructed string (in this case @strFilter). Since the last occurrence of @strSeparator must be removed to prevent a syntax error, characters must be dropped from the end of the string. Although it may seem obvious that four characters (the length of @strSeparator) should be removed, actually only three need be removed because the end of the constructed string is “r” not “ “. Therefore, the length argument of the left function specification is the length of @strFilter minus three.
STRING_AGG a lot, primarily because I can remember it and I can explain it to people relatively new to SQL Server. Read on to see how they both compare in terms of performance.
Now you need to install ODBC Driver for CosmosDB on the computer where you have SQL Server installed. I’m using Microsoft Azure Cosmos DB ODBC 64-bit.msi for 64-bit Windows – 64-bit versions of Windows 8.1 or later, Windows 8, Windows 7, Windows Server 2012 R2, Windows Server 2012, and Windows Server 2008 R2.
Once you install this driver, you should setup ODBC source in system DSN and test the connection:
If you’re running SQL Server 2019, you can follow Jovan’s first two steps and then create an external data source and table with PolyBase to get to the same results.