Press "Enter" to skip to content

Category: Syntax

Row Pattern Recognition in PostgreSQL

Tatsuo Ishii makes me jealous:

Row pattern recognition (RPR) is a feature defined in the SQL standard. It allows to search for a sequence of rows by pattern.  Since I am working on this feature for PostgreSQL, I would like to give a brief introduction to RPR.

This is something I’ve wanted to see in SQL Server for the past few years. It’s not something people might use every day, but it solves a particular class of problem extremely well.

Comments closed

Execution Plans for Keyset Cursors

Hugo Kornelis talks about a cursor I’d never heard of before:

Welcome to plansplaining, part 32, where we once more look at cursors. We already discussed the basics, and looked at static cursors and dynamic cursors. It is now time to cast our gaze upon the keyset cursor. The keyset cursor is sort of in between the static cursor (which presents a snapshot of the data is at was when the cursor was opened and disregards future changes) and the dynamic cursor (that always shows the current data). To be precise, a keyset cursor sees changes made to already existing rows, but does not see new rows that were added after the cursor was opened.

Read on to learn more about it.

Comments closed

Enumerations and Ordering in Postgres

Christoph Schiessl sorts things out:

Custom ENUM types in PostgreSQL are an excellent tool for enforcing certain database constraints, but you must be careful if you use SELECT queries and want to ORDER BY these columns. Recently, I had to fix a bug whose root cause was a misunderstanding of this behavior. It’s just a contrived example, but imagine a table of people with their marital status, which is implemented as a custom ENUM type.

Read on to learn more about the misunderstanding and some of the unexpected trickiness involved in getting a good query plan.

Comments closed

SELECT FOR UPDATE in Postgres

Semab Tariq notes some syntax to assist with performing updates in a MVCC world:

In critical environments like banking, healthcare, and online retail, ensuring safe data modifications is crucial to prevent data corruption and maintain system integrity. PostgreSQL offers a robust solution for this with its row-level locking mechanism, which ensures that the data being modified is protected from concurrent changes. One key feature of PostgreSQL is the SELECT FOR UPDATE clause, which locks the selected rows against concurrent updates. In this blog, we will explore how to implement the SELECT FOR UPDATE clause in PostgreSQL and discuss its real-world use cases.

Read on to learn more about how it works.

Comments closed

Approximate Percentiles in SQL Server 2022

Chad Callihan tries out a big improvement:

How do you go about finding the median percentile of a data set? What if you need the top x percentile? Both the APPROX_PERCENTILE_CONT and APPROX_PERCENTILE_DISC functions can be used to solve these questions.

Let’s look at how we can use each and what makes them unique.

The approximate percentiles are guaranteed to be accurate to within a certain percentage, something like 3-5%, if I remember correctly–it’s higher than HyperLogLog’s ~2.5% but not so large as to be of low value. If you’ve ever tried to calculate a median or other percentile like the 75th or 95th percentile, you might have used PERCENTILE_CONT() in the past. At least until you get a few million rows in the table, at which point you stopped using it. My joke is, once you reach a certain table size, PERCENTILE_CONT() becomes so slow that it’s faster to install and configure SQL Server ML Services, learn R or Python, and send in the data to calculate a percentile than to wait for PERCENTILE_CONT() to complete.

The APPROX_PERCENTILE_* series is way, way faster. On reasonable-sized test cases of a couple million rows or so, my recollection is two orders of magnitude better performance, so long as you can deal with being off by a few percentage points. One of the best scenarios for something like this is calculating 95th percentile response times. Does it really matter that the actual response time was 187.5ms and SQL Server said 192.6 or 181.4? Probably not—you get a good idea of the magnitude, and that’s the important part here.

Comments closed

Updating Records in a Kusto Database

Vincent-Philippe Lauzon has an announcement:

Kusto databases, either in Fabric (KQL Database) or in Azure (Azure Data Explorer), are optimized for append ingestion.

In recent years, we’ve introduced the .delete command, allowing you to selectively delete records.

In February, we introduced the .update command in public preview.  This command allows you to update records by deleting existing records and appending new ones in a single transaction.

Today, the .update is Generally Available (GA)!

Click through for more details, including a link to the documentation, where you can see several examples of the syntax.

Comments closed

The CLEAN Block in Powershell

Mike Robbins takes us through some relatively new functionality:

PowerShell, a powerful scripting language and automation framework, provides features that enhance script development and execution. Among these features is the clean block, a lesser-known yet beneficial component in PowerShell functions. This article explores the clean block, its purpose, and how to use it effectively in PowerShell scripts.

Read on to learn more about the block and how it works.

Comments closed

Emulating the FILTER Clause in Oracle

Lukas Eder notes a bit of ANSI SQL:

The following aggregate function computes the number of rows per group which satifsy the FILTER clause:

SELECT  COUNT(*) FILTER (WHEREBOOK.TITLE LIKE'A%'),  COUNT(*) FILTER (WHEREBOOK.TITLE LIKE'B%'),  ...FROMBOOK

This is useful for pivot style queries, where multiple aggregate values are computed in one go. For most basic types of aggregate function, it can be emulated simply by using CASE expressions, because standard aggregate functions ignore NULL values when aggregating. 

Lukas shows how you can also implement this logic using JSON_TRANSFORM() though I think I’d just as soon stick with COUNT(CASE WHEN BOOK.TITLE LIKE 'A%' THEN 1 END) and try hard not to think about shredding JSON.

Comments closed

UNISTR() and || in Azure SQL Database

Abhiman Tiwari announces a new function and a new operator:

We are excited to announce that the UNISTR intrinsic function and ANSI SQL concatenation operator (||) are now available in public preview in Azure SQL Database. The UNISTR function allows you to escape Unicode characters, making it easier to work with international text. The ANSI SQL concatenation operator (||) provides a simple and intuitive way to combine characters or binary strings. These new features will enhance your ability to manipulate and work with text data. 

Click through to learn more about both. Honestly, I’d rather stick with CONCAT() versus using || because of how CONCAT() handles NULL without me having to check every operand first.

Comments closed

Using the CONVERT() Function in T-SQL

Joe Gavin shows how to use a function:

A common task while working with data in Microsoft SQL Server is converting from one data type to another. Most often, it’s done to change the way data is presented, but sometimes it is needed to make sure the right data types are being used for comparisons, joins, or sorting.

The SQL CONVERT function, which has long been part of the SQL language, and as the name implies, can be used to convert a value of one data type into a specified data type with optional formatting attributes. CONVERT gives you the ability to format, whereas the ISO Compliant CAST function does not.

My very strong recommendation for 99% or so of the audience: use TRY_CONVERT() instead. TRY_CONVERT() came out in SQL Server 2012 (sorry for the 1% stuck pre-2012) and has the same performance profile as CONVERT(), except that, when conversion fails, TRY_CONVERT() returns NULL rather than throwing an error.

There is also a TRY_CAST() that does exactly what you think it would.

Comments closed