Press "Enter" to skip to content

Category: Syntax

What’s Common in Regular Expressions

John Cook muses on regular expression libraries:

The most frustrating aspect of regular expressions is that implementations vary. Features supported in one tool may not be supported at all in another tool, or they may be supported with slightly different syntax.

I learned regular expressions in the context Perl, a maximalist regex environment. This led to frustration when features I expect to work are missing [1]. One way around this is to use Perl analogs of other tools, but this is very non-standard. I want to be able to send colleagues and clients code that works out of the box.

Click through for some thoughts about the lowest common denominator for what products tend to support around regex. This is one of several tricky things when working with regular expressions: you may know a great way to solve a specific class of problem, but does the particular engine you’re using actually support that method?

Leave a Comment

What “Filtering Early” Really Means

Louis Davidson lays out the facts:

Which brings me to the point. There is a myth that goes around that you need to place filters in your SQL statements as early in the statement as possible. Most of this is due to the wild misunderstanding of how a query is executed (versus how your query is processed, which I covered last week.) The actual issue here is that the concept of filtering early is actually true, but certainly not in the way it has been taught.

SQL is a fourth-generation language and implementations approach it. With fourth-generation languages, the actual query you write is not the thing that runs, and there is an entire process to interpret what you wrote and execute operations that meet the intent of your query in the most efficient manner.

Now, this is where someone chimes in and gives all of the circumstances in which T-SQL (or pick your variant) fails to live up to its fourth-generation heritage, such as particularly complex queries, nested views with multiple joins, you using mechanisms that force a specific plan, etc. This is because real life is messy, as Louis shows in some of the examples.

So what’s the point of the first paragraph, then? Because I never miss an opportunity to talk about language generations.

Leave a Comment

Postgres NULLs and NOT IN

Radim Marek lays out a common issue people experience in PostgreSQL:

NOT IN query can return the wrong answer without telling you. It is valid SQL, it runs without an error, and it hands back a perfectly well-formed result set that happens to be empty when it should not be. No warning, no hint, nothing in the logs: just zero rows where you expected hundreds, and a database that considers it correct.

Almost always the cause is a single NULL sitting somewhere you forgot to look, combined with two keywords you have typed a thousand times: NOT IN. None of it is a Postgres bug. This is exactly what the SQL standard mandates, implemented faithfully. That is precisely what makes it so easy to walk into, and why the planner could not safely optimize around it for the better part of Postgres’s history. It comes down to one if statement in the parser.

This is a Postgres-specific problem, as the same code runs successfully in SQL Server. But if you are working with Postgres, it’s good to keep track of this behavior, and Radim has solid advice for a proper workaround.

2 Comments

Query Execution vs Query Processing

Louis Davidson disambiguates a pair of terms:

There have been a lot of posts on LinkedIn of late about the “logical execution order” of a query that all really miss some really big points. I was corrected myself in some terminology because I mistook the term “processing” to mean the same as “order” in these discussions when I was explaining why logical execution order is not what people expected.

Click through for a good explanation, as well as a plan to have your company pay for your Disney World trip.

I also appreciate how Louis still calls it SQL Sentry Plan Explorer because that’s what it always will be in my heart.

Leave a Comment

ORDER BY COALESCE() in PostgreSQL (and SQL Server)

Laetitia Avrot digs in:

I was reading Markus Winand’s latest post on ORDER BY history last week. If you haven’t read it yet, go read it. Markus is one of the best writers on SQL standards, and this post is no exception.

One line stopped me cold. The compatibility table for “expressions on selected columns.” Postgres: partial. PostgreSQL 18: still partial.

That itch needed scratching.

The basic version of this is that you cannot use the alias of a computed expression in a function in the ORDER BY clause in either PostgreSQL or SQL Server. In other words, the following fails:

SELECT a + b AS x
FROM t
ORDER BY COALESCE(x, 0);

Read on for an explanation of why this is the case in PostgreSQL. I’d imagine that the reasoning is about the same for SQL Server.

Comments closed

The Pain of NULL

Louis Davidson explains the unknown:

There is no simpler topic in relational comparisons than three valued logic. I am being mostly facetious about this, but in reality, it seems so simple that people don’t think about how a NULL works, and make mistakes all of the time. I was reading a post about this the other day on LinkedIn (which by no means could one ever find again!) where one of the comments chastised the author of the post for not understanding “the fundamentals” of relational theory. The original poster wasn’t completely right (and my post may not be completely perfect either, though I will back most of what I write with code.)

In this post I want to point out a few of the key basics that one really should understand.

Click through for a primer on what NULL means and doesn’t mean. And by the time you’re done, I’d like to interest you in the power of 6th normal form, where you can effectively banish NULL into the abyss (at least until you join the bits back together).

Comments closed

Working with PIVOT and UNPIVOT in SQL Server

Ed Pollack explains a pair of operators:

There are few operators in T-SQL that cause developers to scramble for documentation more than PIVOT and UNPIVOT. Beyond documentation, transforming columns into rows (and vice-versa) can often be confusing and frustrating for those of us tasked with reformatting data for use by an application.

This article walks through PIVOT and UNPIVOT, providing examples of simple use cases for both – as well as some more complex scenarios we can run into in real-world data. These can be extraordinarily useful ways to reformat data efficiently and quickly with less code than the alternatives. So, there is no need to fear them again!

Click through for Ed’s article. I definitely don’t fear either PIVOT or UNPIVOT and they can be quite useful. But if you locked me in a room and I couldn’t leave until I came up with the proper syntax for both from memory, well, I’d be in that room for a while.

Comments closed

The Power of COALESCE()

Lukas Vileikis shows off a bit of ANSI SQL syntax:

When the need to deal with NULL values arises, multiple queries come onto the scene. The SQL COALESCE function is one of them. In simple terms, the SQL COALESCE function is a ‘fallback’ mechanism for missing data. Its only task is to return the first non-NULL value from a list of values.

I used to be a big believer in COALESCE() all of the time, but it turns out that ISNULL() is faster if you only have two things to compare. Granted, it’s not a huge difference in speed, as I recall, but the difference is there.

Comments closed

Avoid JOIN USING in SQL Scripts

Lukas Eder covers an esoteric bit of syntax:

Some SQL operators are as esoteric as they’re powerful. One of the oldest operator that you’ve likely hardly ever used in real world applications is NATURAL JOIN which is the default in relational algebra. We’ve covered a funky use-case for NATURAL JOIN earlier on this blog.

Click through for a similar example using JOIN USING. It’s best to be specific in your SQL queries, at least the permanent ones that you add to scripts you expect to re-run in the future or make part of applications.

Comments closed