Category: Syntax

Finding Substrings in a String with T-SQL

Kevin Wilkie avoids a regex:

Continuing on with our series from last time – see here if you somehow missed it – let’s have some more fun with the different functions we can use with strings.

This time, let’s look at the different functions we can use to find a string within a string.

With T-SQL not natively supporting regular expressions—though you can use a CLR module to do this—click through to see what Kevin uses.
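
As a rough illustration of the built-in options, here is a minimal sketch using CHARINDEX(), PATINDEX(), and LIKE. The sample string is made up, and I am assuming these are among the functions Kevin covers rather than quoting his examples.

```sql
-- Hypothetical sample value; not taken from Kevin's post.
DECLARE @s varchar(100) = 'The quick brown fox jumps over the lazy dog';

SELECT
    -- 1-based position of the first match, or 0 when there is no match.
    CHARINDEX('fox', @s) AS FoxPosition,
    -- The optional third argument is the position at which to start searching.
    CHARINDEX('the', @s, 2) AS TheAfterStart,
    -- PATINDEX accepts LIKE-style wildcards; there are no digits here, so this is 0.
    PATINDEX('%[0-9]%', @s) AS FirstDigitPosition,
    -- LIKE answers "is it in there at all?" without returning a position.
    CASE WHEN @s LIKE '%lazy%' THEN 1 ELSE 0 END AS ContainsLazy;
```

Whether these searches are case-sensitive depends on the collation in effect, so the same query can behave differently from one database to the next.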

Implementing GREATEST in SQL Server 2019

Ronen Ariely is on a mission to be the greatest:

The function GREATEST returns the maximum value from a list of one or more expressions. It returns the data type with the highest precedence from the set of types passed to the function.

This function was added to Azure SQL. At this time, it is supported in Azure SQL Database, Azure SQL Managed Instance and Azure Synapse Analytics serverless. 

Unfortunately, it is not yet supported in on-premises SQL Server or in Synapse dedicated SQL pools.

Click through for a pair of alternative constructs while we wait for GREATEST on-premises.
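
For a sense of what such workarounds can look like, here is a minimal sketch of two common patterns: a table value constructor with MAX(), and a CASE expression. These are general techniques with made-up variables, and I am not claiming they are the exact pair Ronen shows.

```sql
-- Hypothetical values standing in for three columns of a row.
DECLARE @a int = 3, @b int = 7, @c int = 5;

-- Option 1: turn the values into rows and aggregate with MAX().
SELECT (SELECT MAX(v) FROM (VALUES (@a), (@b), (@c)) AS x(v)) AS GreatestValue;  -- 7

-- Option 2: compare the values directly with CASE.
-- This version assumes none of the inputs are NULL.
SELECT CASE
           WHEN @a >= @b AND @a >= @c THEN @a
           WHEN @b >= @c THEN @b
           ELSE @c
       END AS GreatestValue;  -- 7
```

The first option scales better as the list grows, since adding a value means adding one more row to the constructor instead of rewriting the comparisons.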

Use TOP instead of SET ROWCOUNT

Jared Poche explains why the TOP clause is superior to using SET ROWCOUNT:

I was presenting on how to use the TOP clause to break down large operations into short, fast, bite-sized operations. The mechanics are things I learned from writing processes that do garbage collection, backfill new columns, and anonymize PII data on existing tables. I’ve just posted the slides and example scripts here if you are interested.

ARE THEY THE SAME?

The question was whether the SET ROWCOUNT command would work just the same, and the answer is sometimes yes but largely no.

Read on to see what Jared means.
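
To make the batching idea concrete, here is a minimal sketch of a delete loop driven by TOP. The table dbo.EventLog, its CreatedAt column, and the batch size are all hypothetical, and this is not one of Jared's scripts.

```sql
-- dbo.EventLog and CreatedAt are hypothetical names used for illustration.
DECLARE @BatchSize int = 5000;
DECLARE @RowsAffected int = 1;

WHILE @RowsAffected > 0
BEGIN
    -- Each pass removes at most @BatchSize rows, keeping the transaction short.
    DELETE TOP (@BatchSize)
    FROM dbo.EventLog
    WHERE CreatedAt < DATEADD(YEAR, -1, SYSUTCDATETIME());

    -- Capture the count immediately; the loop ends once nothing is left to delete.
    SET @RowsAffected = @@ROWCOUNT;
END;
```

One point in favor of TOP that does not depend on Jared's argument: Microsoft's documentation warns that SET ROWCOUNT will stop affecting DELETE, INSERT, and UPDATE statements in a future release.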

Using GREATEST and LEAST in Azure SQL DB

Aaron Bertrand preps us for SQL Server 2022:

In an earlier tip, “Find MAX value from multiple columns in a SQL Server table,” Sergey Gigoyan showed us how to simulate GREATEST() and LEAST() functions, which are available in multiple database platforms but were – at least at the time – missing from Transact-SQL. These functions are now available in Azure SQL Database and Azure SQL Managed Instance, and will be coming in SQL Server 2022, so I thought it was a good time to revisit Sergey’s methods and compare.

Read on to see how the workaround compares.
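
If you have access to a platform that supports them, the functions themselves are about as simple as it gets. The literals below are my own and not from Aaron's tip.

```sql
-- Requires a platform with GREATEST/LEAST support, such as Azure SQL Database.
SELECT
    GREATEST(3, 7, 5)    AS GreatestValue,  -- 7
    LEAST(3, 7, 5)       AS LeastValue,     -- 3
    -- NULL arguments are ignored unless every argument is NULL.
    GREATEST(3, NULL, 5) AS IgnoresNull;    -- 5
```

The NULL handling is worth knowing about because some other platforms return NULL as soon as any argument is NULL.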

CONVERT and Binary Styles

Abayomi Obawomiye has style:

I recently had a requirement to load some data from a source table to another destination table. The destination columns were exactly the same as the source columns with the same data types and length. The only difference was that some columns on the destination table must be encrypted. The task was to use the SHA2_512 encryption algorithm to encrypt the “sensitive” data. I will talk more about the encryption algorithm in another post.

To achieve this, I needed to use the HASHBYTES function in SQL Server. The challenge was that this function used with the SHA2_512 encryption algorithm will return a fixed character length of 64 characters which will be longer than the character length on my destination table. As a result, SQL Server will throw a truncation error. I will demonstrate this below.

One really important point: SHA is not encryption; it’s a hash (which is why the function is HASHBYTES() instead of something like EncryptByKey() as column-level security uses). Hashes are intended to be a one-way trip, whereas encryption implies an ability to decrypt if you have the relevant key details. Here, the use looks to be obfuscating the text of sensitive data fields, perhaps for loading in a dev/test environment, and so the actions themselves are quite reasonable.

By the way, the styles Abayomi talks about are all listed in this Docs page. Turns out that if you’re using a money datatype, you can use CONVERT() to display the end result with commas.
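
Here is a minimal sketch of the styles in question, using a made-up input value rather than Abayomi's data. Note that SHA2_512 produces 64 bytes, which become 128 hexadecimal characters once converted to text.

```sql
-- Hypothetical input; not Abayomi's data.
DECLARE @value varchar(50) = 'sensitive value';
DECLARE @hash varbinary(64) = HASHBYTES('SHA2_512', @value);

SELECT
    DATALENGTH(@hash) AS HashLengthInBytes,            -- 64
    -- Binary style 1 keeps the 0x prefix: 2 + 128 characters.
    CONVERT(varchar(130), @hash, 1) AS HexWithPrefix,
    -- Binary style 2 drops the prefix: 128 characters.
    CONVERT(varchar(128), @hash, 2) AS HexNoPrefix,
    -- Money style 1 adds commas and keeps two decimal places.
    CONVERT(varchar(20), CAST(1234567.891 AS money), 1) AS MoneyWithCommas;  -- 1,234,567.89
```

If the destination column is narrower than the converted text, something has to give: widen the column or store the raw varbinary instead.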

PowerShell Functions and Return Type Oddities

Dave Mason goes down the rabbit hole:

As a result of some struggles trying to automate a process, I’ve learned some things about PowerShell. After getting to the bottom of a time-consuming problem, I thought it worth a blog post that might save someone else some time and heartache.

Let’s begin with this simple function named Get-RandomDate. It generates and returns a random date that is between today and X days ago. It has an input parameter $DaysAgo, which is of type [System.Int32]–it is a mandatory parameter.

That’s all straightforward, but then things get weird.

Ordered Set Functions in SQL Server

I continue a series on window functions in SQL Server:

As of SQL Server 2019, there is only one ordered set function: STRING_AGG(). I like STRING_AGG() a lot, especially because it means my days of needing to explain the STUFF() + FOR XML PATH trick to concatenate values together in SQL Server are numbered.

STRING_AGG() is interesting in that we categorize it as a window function and yet it violates my first rule of window functions: there isn’t an OVER() clause. Instead, it accepts but does not require a WITHIN GROUP() clause. Let’s see it in action.

Click through for a look at that, as well as a little hint that maybe we’ve seen ordered set functions before in a different guise.
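
For anyone who has not used it yet, here is a minimal sketch of STRING_AGG() with WITHIN GROUP. The inline rows are hypothetical and not the dataset from the post.

```sql
-- Hypothetical inline data standing in for a real table.
SELECT
    t.CustomerID,
    -- WITHIN GROUP controls the order of values inside the concatenated string.
    STRING_AGG(t.Product, ', ') WITHIN GROUP (ORDER BY t.Product) AS Products
FROM (VALUES
        (1, 'Widget'),
        (1, 'Anvil'),
        (2, 'Gadget')
     ) AS t(CustomerID, Product)
GROUP BY t.CustomerID;
-- Customer 1 returns 'Anvil, Widget'; customer 2 returns 'Gadget'.
```

Leave off WITHIN GROUP and the query still runs, but the order of values within each string is not guaranteed.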

FAST_FORWARD and Cursors

Joe Obbish skips past the commercials:

If you’re like me, you started your database journey by defining cursors with the default options. This went on until a senior developer or DBA kindly pointed out that you can get better performance by using the FAST_FORWARD option. Or maybe you were a real go-getter and found Aaron Bertrand’s performance benchmarking blog post on different cursor options. I admit that for many years I didn’t care to know why FAST_FORWARD sometimes made my queries faster. It had “FAST” in the name and that was good enough for me.

Recently I saw a production issue where using the right cursor options led to a 1000X performance improvement. I decided that ten years of ignorance was enough and finally did some research on different cursor options. This post contains a reproduction and discussion of the production issue.

I thought everybody knew how this works: the database streams the data tape from the supply pulley to the play shaft by using a sprocket to rotate the gear in the cassette at a fixed speed. The FAST_FORWARD cursor option engages the fast forward idler in the VCR database and causes rotation to occur more rapidly than normal.
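
Jokes aside, here is a minimal sketch of where the option goes in a cursor declaration. It loops over sys.objects purely for illustration; it does not reproduce Joe's production issue and makes no claim about which option is fastest for a given query.

```sql
DECLARE @name sysname;

-- FAST_FORWARD declares a forward-only, read-only cursor with optimizations enabled.
DECLARE object_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.objects;

OPEN object_cursor;
FETCH NEXT FROM object_cursor INTO @name;

-- @@FETCH_STATUS stays at 0 for as long as the previous fetch returned a row.
WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT @name;
    FETCH NEXT FROM object_cursor INTO @name;
END;

CLOSE object_cursor;
DEALLOCATE object_cursor;
```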

Statistical Window Functions in SQL Server

I continue a series on window functions in SQL Server:

CUME_DIST() doesn’t show 0 for the smallest record. The reason for this is in the definition: CUME_DIST() tells us how far along we are in describing the entire set—that is, what percentage of values have we covered so far. This percentage is always greater than 0. By contrast, PERCENT_RANK() forces the lowest value to be 0 and the highest value to be 1.

Another thing to note is ties. There are 117 values for customer 1 in my dataset. Rows 5 and 6 both have a percent rank of 0.0344, which is approximately rank 4 (remembering that we start from 0, not 1). Both rows 5 and 6 have the same rank of 4, and then we move up to a rank of 6. Meanwhile, for cumulative distribution, we see that rows 5 and 6 have a cumulative distribution of 6/117 ≈ 0.0513. In other words, PERCENT_RANK() ties get the lowest possible value, whereas CUME_DIST() ties get the highest possible value.

Click through for much more detail, including examples galore.
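
The tie behavior is easier to see on a tiny set. Here is a minimal sketch with four made-up values, two of them tied; this is not the dataset from the post.

```sql
-- Four hypothetical values with one tie.
SELECT
    v.Amount,
    PERCENT_RANK() OVER (ORDER BY v.Amount) AS PercentRank,
    CUME_DIST()    OVER (ORDER BY v.Amount) AS CumeDist
FROM (VALUES (10), (20), (20), (30)) AS v(Amount);
-- Amount 10: PercentRank 0,     CumeDist 0.25
-- Amount 20: PercentRank 0.333, CumeDist 0.75 (both tied rows)
-- Amount 30: PercentRank 1,     CumeDist 1
```

PERCENT_RANK() works out to (rank - 1) / (rows - 1), so the tied rows share the lower value of 1/3. CUME_DIST() counts every row at or below the current value, so the tied rows share the higher value of 3/4.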
