Category: T-SQL

Calling a REST Endpoint from Azure SQL Database

Meagan Longoria makes a call:

External REST endpoint invocation in Azure SQL DB went GA in August 2023. Whereas before, we might have needed an intermediate technology to make a REST call to an Azure service, we can now use an Azure SQL Database to call a REST endpoint directly.

One use case for this would be to retrieve a file from blob storage. I explain how to set this up below.

Read on to learn more about the process, as well as a few ideas on when you might use it.
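
To give a sense of the shape of the call, here’s a minimal sketch using sp_invoke_external_rest_endpoint. The storage account, container, file, and credential names are placeholders, and you’d first need the database scoped credential (for example, one backed by a managed identity) that Meagan walks through creating.

-- Sketch only: fetch a blob over REST directly from Azure SQL Database.
-- The URL and credential name below are placeholders, not values from the post.
DECLARE @response NVARCHAR(MAX);
DECLARE @ret INT;

EXEC @ret = sp_invoke_external_rest_endpoint
    @url = N'https://mystorageaccount.blob.core.windows.net/mycontainer/myfile.json',
    @method = 'GET',
    @credential = [https://mystorageaccount.blob.core.windows.net],
    @response = @response OUTPUT;

SELECT @ret AS ReturnCode, @response AS Response;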

Comments closed

Translating Numbers to Words in T-SQL

Sebastiao Pereira needs no number:

Sometimes, it is necessary to have numbers spelled out in words, like when writing a sentence (“Two hundred sixty-one victims were hospitalized.”). How do you express numbers in words with T-SQL code?

I could see a bit more recursion in the solution making it tidier, but in fairness, a bit more recursion would make it a lot slower in T-SQL.

For moderate difficulty mode, try it in German. For hard mode, switch to French.
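
To give a flavor of the decomposition involved (and why recursion is tempting), here’s a minimal sketch that handles integers 0 through 999 only. It is not Sebastiao’s solution, just the ones/tens/hundreds pattern that a fuller version repeats for each thousands group.

-- Sketch only: 0-999; larger numbers repeat this pattern per thousands group.
CREATE OR ALTER FUNCTION dbo.SmallNumberToWords (@n INT)
RETURNS VARCHAR(100)
AS
BEGIN
    IF @n = 0 RETURN 'zero';

    DECLARE @result VARCHAR(100) = '';

    -- hundreds place
    IF @n >= 100
        SET @result = CHOOSE(@n / 100, 'one','two','three','four','five','six','seven','eight','nine') + ' hundred ';

    DECLARE @rest INT = @n % 100;

    -- 1-19 have unique names; 20-99 are a tens word plus an optional ones word
    IF @rest BETWEEN 1 AND 19
        SET @result = @result + CHOOSE(@rest, 'one','two','three','four','five','six','seven','eight','nine','ten',
            'eleven','twelve','thirteen','fourteen','fifteen','sixteen','seventeen','eighteen','nineteen');
    ELSE IF @rest >= 20
    BEGIN
        SET @result = @result + CHOOSE(@rest / 10 - 1, 'twenty','thirty','forty','fifty','sixty','seventy','eighty','ninety');
        IF @rest % 10 > 0
            SET @result = @result + '-' + CHOOSE(@rest % 10, 'one','two','three','four','five','six','seven','eight','nine');
    END;

    RETURN RTRIM(@result);
END;
GO

SELECT dbo.SmallNumberToWords(261) AS Spelled;  -- two hundred sixty-one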

Comments closed

Removing Leading Zeroes from a String in T-SQL

Steve Stedman gets rid of leading zeroes:

When working with data in SQL Server, there may be times when you need to remove leading zeros from a string. This task can be particularly common when dealing with numerical data stored as strings, such as ZIP codes, product codes, or other formatted numbers. In this blog post, we’ll explore several methods to remove leading zeros in SQL Server.

I’m not sure I see the reason to use anything other than CAST() (or, better yet, TRY_CAST()), but Steve does show two other methods.
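
For reference, here’s a rough sketch of a couple of the options; the sample values are made up.

-- TRY_CAST works when the whole string is numeric (and returns NULL when it isn't)
SELECT TRY_CAST('0001234' AS INT) AS AsNumber;   -- 1234

-- PATINDEX + SUBSTRING works for any string: start at the first non-zero character.
-- Appending a non-zero character to the search handles the all-zeros case.
DECLARE @val VARCHAR(20) = '000123A45';
SELECT SUBSTRING(@val, PATINDEX('%[^0]%', @val + '.'), LEN(@val)) AS Trimmed;    -- 123A45

-- On SQL Server 2022 and later, TRIM can also do this directly:
-- SELECT TRIM(LEADING '0' FROM @val);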

2 Comments

Unicode Characters in SQL Queries

Chad Baldwin does some interesting things with Unicode characters:

I don’t want to get too far into the weeds explaining encodings, code points, etc. Mostly because you can just Google it, but also because it’s very confusing. Despite all the hours I’ve spent trying to learn about it, I still don’t get a lot of it. There’s also a lot of nuance regarding encodings when it comes to SQL Server, different collations, and different SQL versions. However, I did come across this blog post that seems to break it down well.

Click through for a few scenarios, including things like tracking progress or marking data in a manner that’s easier to see than "True" and "False" values in columns.
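
As a quick illustration of the kind of thing he covers, NCHAR() makes it easy to surface markers that stand out visually; the table and column names below are made up.

SELECT
    TaskName,
    CASE WHEN IsComplete = 1 THEN NCHAR(9989) ELSE NCHAR(10060) END AS Status,  -- U+2705 check mark / U+274C cross
    REPLICATE(NCHAR(9608), PercentDone / 10)                                    -- U+2588 full block
        + REPLICATE(NCHAR(9617), 10 - (PercentDone / 10)) AS ProgressBar        -- U+2591 light shade
FROM dbo.TaskList;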

Comments closed

Document Templates in SQL

Sebastiao Pereira fills in the blanks:

In some industries, there is a series of repeated content stored about a specific topic, i.e., contracts, terms and conditions, legal agreements, lease agreements, bills, etc., with only minor differences. With these documents, there is a need to just replace specific keywords. What is the most efficient way to maintain consistency, data accuracy, and optimize storage?

This is a technique I’ve used a few times, and there’s no real trick to it: just pick something as your tag identifier that will never appear in the text itself.
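
Here’s a minimal sketch of the idea; the template text, tag delimiters, and values are all made up for illustration.

DECLARE @template NVARCHAR(MAX) =
    N'This lease agreement is between {{LandlordName}} and {{TenantName}}, effective {{StartDate}}.';

SELECT REPLACE(REPLACE(REPLACE(@template,
    N'{{LandlordName}}', N'Contoso Properties'),
    N'{{TenantName}}',   N'Jane Doe'),
    N'{{StartDate}}',    N'2025-01-01') AS RenderedDocument;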

Comments closed

Execution Plans for Keyset Cursors

Hugo Kornelis talks about a cursor I’d never heard of before:

Welcome to plansplaining, part 32, where we once more look at cursors. We already discussed the basics, and looked at static cursors and dynamic cursors. It is now time to cast our gaze upon the keyset cursor. The keyset cursor is sort of in between the static cursor (which presents a snapshot of the data as it was when the cursor was opened and disregards future changes) and the dynamic cursor (that always shows the current data). To be precise, a keyset cursor sees changes made to already existing rows, but does not see new rows that were added after the cursor was opened.

Read on to learn more about it.
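
For reference, the declaration looks like any other cursor with the KEYSET keyword added; the table and column names here are made up.

DECLARE @OrderID INT, @OrderTotal DECIMAL(19, 4);

DECLARE cur CURSOR KEYSET FOR
    SELECT OrderID, OrderTotal
    FROM dbo.Orders
    WHERE OrderDate >= '2025-01-01';

OPEN cur;  -- the set of keys is materialized in tempdb at this point

FETCH NEXT FROM cur INTO @OrderID, @OrderTotal;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- changes to rows that existed at OPEN are visible on fetch;
    -- rows inserted after OPEN are not
    FETCH NEXT FROM cur INTO @OrderID, @OrderTotal;
END;

CLOSE cur;
DEALLOCATE cur;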

Comments closed

Performance Tuning XML Operations in SQL Server

Ed Pollack does a bit of tuning:

SQL Server provides a variety of ways to tune XML so that it provides consistent performance and consumes less space, all while ensuring efficient access to critical data.

At its core, the metadata-styled XML format runs counter to the data that SQL Server is optimized to manage. Therefore, additional features were added to SQL Server over time that allowed for XML data to be indexed and compressed.

While these features are critical for managing XML data as it becomes large, it is important to remember what XML is intended for and why it is (loosely) structured as it is. Many data professionals have used shortcuts when XML was small, such as storing and analyzing it in string format, only to be forced to reckon with performance challenges when scanning large strings becomes agonizingly slow.

Read on for the full article.
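
As a taste of the features involved, here’s a hedged sketch of XML indexing and (on SQL Server 2022 and later) XML compression; the table and index names are made up.

CREATE TABLE dbo.OrderDocuments
(
    OrderDocumentID INT IDENTITY(1,1) NOT NULL
        CONSTRAINT PK_OrderDocuments PRIMARY KEY CLUSTERED,
    OrderXml XML NOT NULL
);

-- A primary XML index shreds the XML so .value()/.exist()/.nodes() queries
-- don't have to re-parse the blob on every access
CREATE PRIMARY XML INDEX PXML_OrderDocuments_OrderXml
    ON dbo.OrderDocuments (OrderXml);

-- Secondary XML indexes target specific access patterns (PATH, VALUE, or PROPERTY)
CREATE XML INDEX IXML_OrderDocuments_OrderXml_Path
    ON dbo.OrderDocuments (OrderXml)
    USING XML INDEX PXML_OrderDocuments_OrderXml
    FOR PATH;

-- SQL Server 2022 adds XML compression for the column's storage
ALTER TABLE dbo.OrderDocuments REBUILD WITH (XML_COMPRESSION = ON);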

Comments closed

Approximate Percentiles in SQL Server 2022

Chad Callihan tries out a big improvement:

How do you go about finding the median percentile of a data set? What if you need the top x percentile? Both the APPROX_PERCENTILE_CONT and APPROX_PERCENTILE_DISC functions can be used to solve these questions.

Let’s look at how we can use each and what makes them unique.

The approximate percentiles are guaranteed to be accurate to within a certain percentage, something like 3-5%, if I remember correctly; it’s higher than HyperLogLog’s ~2.5% but not so large as to be of low value. If you’ve ever tried to calculate a median or other percentile like the 75th or 95th percentile, you might have used PERCENTILE_CONT() in the past. At least until you get a few million rows in the table, at which point you stop using it. My joke is, once you reach a certain table size, PERCENTILE_CONT() becomes so slow that it’s faster to install and configure SQL Server ML Services, learn R or Python, and send in the data to calculate a percentile than to wait for PERCENTILE_CONT() to complete.

The APPROX_PERCENTILE_* series is way, way faster. On reasonable-sized test cases of a couple million rows or so, my recollection is two orders of magnitude better performance, so long as you can deal with being off by a few percentage points. One of the best scenarios for something like this is calculating 95th percentile response times. Does it really matter that the actual response time was 187.5ms and SQL Server said 192.6 or 181.4? Probably not—you get a good idea of the magnitude, and that’s the important part here.
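
To make the comparison concrete, here’s a sketch against a made-up request log table; both functions are real, but the table and column names are placeholders.

-- Exact: PERCENTILE_CONT is a window function, so it needs an OVER clause
-- (and something like DISTINCT to collapse the per-row results)
SELECT DISTINCT
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY ResponseTimeMs) OVER () AS P95Exact
FROM dbo.RequestLog;

-- Approximate: an aggregate function, so it behaves like MAX() or AVG()
SELECT
    APPROX_PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY ResponseTimeMs) AS P95Approx
FROM dbo.RequestLog;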

Comments closed

UNISTR() and || in Azure SQL Database

Abhiman Tiwari announces a new function and a new operator:

We are excited to announce that the UNISTR intrinsic function and ANSI SQL concatenation operator (||) are now available in public preview in Azure SQL Database. The UNISTR function allows you to escape Unicode characters, making it easier to work with international text. The ANSI SQL concatenation operator (||) provides a simple and intuitive way to combine characters or binary strings. These new features will enhance your ability to manipulate and work with text data. 

Click through to learn more about both. Honestly, I’d rather stick with CONCAT() versus using || because of how CONCAT() handles NULL without me having to check every operand first.
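
As a quick sketch of both pieces (and the NULL behavior mentioned above), with made-up strings:

SELECT UNISTR(N'Caf\00E9') AS Unescaped;           -- 'Café'; \00E9 is the Unicode escape for é

SELECT N'Hello' || N', ' || N'world' AS Joined;    -- ANSI concatenation operator

-- The NULL-handling difference: || returns NULL if any operand is NULL,
-- while CONCAT() treats NULL operands as empty strings
SELECT N'Hello' || NULL AS WithOperator,           -- NULL
       CONCAT(N'Hello', NULL) AS WithConcat;       -- 'Hello'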

Comments closed

Generating Data in SQL Server based on Distributions

Rick Dobson builds some data:

I support a data science team that often asks for datasets with different distribution values in uniform, normal, or lognormal shapes. Please present and demonstrate the T-SQL code for populating datasets with random values from each distribution type. I also seek graphical and statistical techniques for assessing how a random sample corresponds to a distribution type.

This is an interesting article, though if you want a set-based version of generating data according to a normal distribution, I have a blog post where I translated the RBAR version into something that performs a bit better. Converting to log-normal form also makes a lot of intuitive sense.
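
For a rough idea of what the set-based approach can look like, here’s a sketch using the Box-Muller transform, not necessarily identical to either Rick’s version or mine; the mean, standard deviation, and row count are arbitrary.

DECLARE @mean FLOAT = 100.0, @stdev FLOAT = 15.0, @rows INT = 100000;

WITH Uniforms AS
(
    -- two independent uniform(0,1) draws per row
    SELECT TOP (@rows)
        RAND(CHECKSUM(NEWID())) AS u1,
        RAND(CHECKSUM(NEWID())) AS u2
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
SELECT
    -- Box-Muller: turn two uniform draws into one standard normal draw, then scale it;
    -- wrapping the whole expression in EXP() would give a lognormal sample instead
    @mean + @stdev * SQRT(-2.0 * LOG(1.0 - u1)) * COS(2.0 * PI() * u2) AS NormalSample
FROM Uniforms;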

Comments closed