Press "Enter" to skip to content

Category: T-SQL

Date And Time Functions To Avoid

Randolph West shares his thoughts on three functions he’d rather you avoid:

CURRENT_TIMESTAMP is the ANSI-equivalent of GETDATE(). ANSI is an acronym for the American National Standards Institute, and sometimes vendors will include ANSI functions in their products so they can say that they’re ANSI-compliant (which is not a bad thing, in most cases).

There are three main problems with CURRENT_TIMESTAMP:

  • No brackets. It goes against the rules about functions. So much for standards!
  • It’s functionally equivalent to GETDATE(), which uses DATETIME, which we previously identified is old and bad.
  • It’s too similar to the poorly-named TIMESTAMP data type, which has nothing to do with dates and times and should be called ROWVERSION.

Bottom line: don’t use CURRENT_TIMESTAMP.

At one point I used CURRENT_TIMESTAMP over GETDATE() with the thought of portability in mind.  Since then, my thoughts on code portability have changed and regardless, as Randolph mentions, it’s better to use DATETIME2 functions to avoid precision issues with DATETIME.

Comments closed

Useful Azure SQL Database T-SQL Statements

Arun Sirpal has a few T-SQL statements to help manage an Azure SQL Database database:

Creating a database

Very easy to create something like the below – a S2 database.

CREATE DATABASE [MeeTwoDB]
( MAXSIZE = 5GB, EDITION = 'standard', SERVICE_OBJECTIVE = 'S2' ) ;  

--checking details
SELECT Edition = DATABASEPROPERTYEX('MeeTwoDB', 'Edition'),
       ServiceObjective = DATABASEPROPERTYEX('MeeTwoDB', 'ServiceObjective')

Read on for several more examples.

Comments closed

Views Don’t Improve Performance

Grant Fritchey lays down the law on views:

One day, it’s going to happen. I’m going to hear some crazy theory about how SQL Server works and I’m going to literally explode. Instead of some long silly rant with oddball literary & pop culture references you’ll get a screed the size of Ulysses (and about as much fun to read). However, for the moment, like Robin Williams describing a dance move, I’m going to keep it all inside. Here’s our query:

No, no where clause because we have to compare this to this, our view:

Grant used up much of his strategic reserve of GIFs in that post, so check it out.

Comments closed

Deleting A Small Number Of Rows From A Big Table

Brent Ozar has a tip for deleting a relatively small percentage of data from a very large table:

Say you’ve got a table with millions or billions of rows, and you need to delete some rows. Deleting ALL of them is fast and easy – just do TRUNCATE TABLE – but things get much harder when you need to delete a small percentage of them, say 5%.

It’s especially painful if you need to do regular archiving jobs, like deleting the oldest 30 days of data from a table with 10 years of data in it.

The trick is making a view that contains the top, say, 1,000 rows that you want to delete:

Read on for a demo.

Comments closed

Scalar Function Blocking

Erik Darling notes that scalar functions can cause multi-table blocking:

Someone had tried to be clever. Looking at the code running, if you’ve been practicing SQL Server for a while, usually means one thing.

A Scalar Valued Function was running!

In this case, here’s what it looked like:

Someone had added that function as a computed column to the Users table:

Spoilers:  this was a bad idea.

Comments closed

Recently Added String Functions

Lori Brown covers a few string functions added to SQL Server in the past two versions:

STRING_ESCAPE (https://docs.microsoft.com/en-us/sql/t-sql/functions/string-escape-transact-sql)

This function is available starting with SQL 2016 and is currently only able to escape JSON characters. To me it’s not super useful just yet but hopefully they will add more types soon.

I haven’t had the need to use STRING_ESCAPE yet, but one additional function I’d add is CONCAT_WS.

Comments closed

Finding Overlapping Data Ranges

Louis Davidson shows how to find groups of data which overlap:

This week, I had a problem where I needed to find and eliminate from the results of my query, data with overlapping ranges. I have written about this topic before, in my database design book book, in regards to building a trigger to avoid overlapping ranges. But even though I have written on the topic there, I still use Google just like you to get quick help (I use books when I want to learn, or expand my knowledge on a topic in depth, blogs when I need a simple answer to a simple or complex question.)

The problem at hand is most often associated with date based data, such as effective dates for a row in a temporal/type 2 dimension table, or other cases like appointment times, etc. But the algorithm is the same with numbers and is a bit easier to read since we don’t have the same issues with roundoff and decimal places (the query is complex enough on its own to show in a blog post). From a progression of start and end values in each row, we are going to look at how to check to make sure that there are no two rows that are in conflict (no range should contain another ranges value at all).

This feels like the type of thing which could be rewritten with window function to be a little smoother, but I’d have to think about it more.  Louis does provide a good solution and explanation to a fairly common but tricky problem in T-SQL.

Comments closed

Reverse Engineering The Stream Aggregate Algorithm

Itzik Ben-Gan has started a series of articles on optimizing queries which use grouping and aggregating with a reverse-engineering of the stream aggregate algorithm:

As you may already know, when SQL Server optimizes a query, it evaluates multiple candidate plans, and eventually picks the one with the lowest estimated cost. The estimated plan cost is the sum of all the operators’ estimated costs. In turn, each operator’s estimated cost is the sum of the estimated I/O cost and estimated CPU cost. The cost unit is meaningless in its own right. Its relevance is in the comparison that the optimizer makes between candidate plans. That is, the costing formulas were designed with the goal that, between candidate plans, the one with the lowest cost will (hopefully) represent the one that will finish more quickly. A terribly complex task to do accurately!

The more the costing formulas adequately take into account the factors that truly affect the algorithm’s performance and scaling, the more accurate they are, and the more likely that given accurate cardinality estimates, the optimizer will choose the optimal plan. At any rate, if you want to understand why the optimizer chooses one algorithm versus another you need to understand two main things: one is how the algorithms work and scale, and another is SQL Server’s costing model.

So back to the plan in Figure 1; let’s try and understand how the costs are computed. As a policy, Microsoft will not reveal the internal costing formulas that they use. When I was a kid I was fascinated with taking things apart. Watches, radios, cassette tapes (yes, I’m that old), you name it. I wanted to know how things were made. Similarly, I see value in reverse engineering the formulas since if I manage to predict the cost reasonably accurately, it probably means that I understand the algorithm well. During the process you get to learn a lot.

Our query ingests 1,000,000 rows. Even with this number of rows, the I/O cost seems to be negligible compared to the CPU cost, so it is probably safe to ignore it.

As for the CPU cost, you want to try and figure out which factors affect it and in what way.

I give this my highest recommendation.

Comments closed

Query Tuning With The APPLY Operator

Daniel Janik walks through using the APPLY operator to tune a couple of queries:

Recently we were doing a project that heavily focused on query tuning and many tables had various outer joins. My co-worker pointed out that many of these could be converted to an apply rather than a join.

Apply gives you both CROSS and OUTER. Think of CROSS APPLY like an INNER JOIN and OUTER APPLY like an OUTER JOIN.

Let’s compare some code to see how APPLY stacks up.

I like the APPLY operator so much that I created an entire presentation on it.  It’s not a cure-all by any means, but if you understand the intent, you can find places where it improves your code significantly.

Comments closed