
Curated SQL Posts

IF Statements and DAX

Marco Russo and Alberto Ferrari talk in hypotheticals:

DAX is a functional language. This means that – no matter how complicated it is – a measure is just ONE function call. Then, functions call other functions, creating the intricacies of a sophisticated DAX expression. However, there is always just one function at the top level. This is, at the same time, beautiful and painful, elegant and complex to understand. It is fair to say that being functional is what makes DAX so fascinating.

However, when a DAX formula is executed, it loses its functional nature. Indeed, in the end it needs to be transformed into a set of simpler queries executed by one of the engines of DAX: either the storage engine or the formula engine. During this step, the function execution is transformed, and it becomes much simpler.

Click through to see how the IF() function works in such a world.


The Downsides of SELECT FOR UPDATE in PostgreSQL

Laurenz Albe explains why SELECT FOR UPDATE is rarely the right call:

Recently, while investigating a deadlock for a customer, I was again reminded how harmful SELECT FOR UPDATE can be for database concurrency. This is nothing new, but I find that many people don’t know about the PostgreSQL row lock modes. So here I’ll write up a detailed explanation to let you know when to avoid SELECT FOR UPDATE.

Click through for the full explanation.


Result Set Caching in Microsoft Fabric Data Warehouse

Emily Tehrani makes an announcement:

Result Set Caching is now available in preview for Microsoft Fabric Data Warehouse and Lakehouse SQL analytics endpoint. This performance optimization works transparently to cache the results of eligible T-SQL queries. When the same query is issued again, it directly retrieves the stored result, instead of recompiling and recomputing the original query. This operation drastically cuts execution time for complex queries. The cache is then automatically managed on the user’s behalf. This lightweight performance boost is most beneficial for workloads, like reports, that issue many repetitive T-SQL queries to the DW and SQL analytics endpoint.

This is something I’ve wished we had on-premises for years and years, especially for data warehouses where you know the data only changes once every x hours or days. You can, of course, do this yourself with the cache-aside pattern and some caching solution, but that implies you have a layer between your end user and the data source that you fully control.
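To make the cache-aside idea concrete, here is a minimal Python sketch. It assumes an in-process dictionary as the cache and a fixed time-to-live; the `CacheAside` class, the `run_report_query` function, and the `monthly_sales` key are all hypothetical names for illustration, not part of Fabric or any caching product.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class CacheAside:
    """Tiny in-memory cache-aside helper with a fixed time-to-live (TTL).

    Hypothetical illustration only: a real deployment would more likely
    use a shared cache service instead of a per-process dictionary.
    """

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # Maps a cache key to (time stored, cached value).
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable, load: Callable[[], Any]) -> Any:
        """Return the cached value, calling load() on a miss or expiry."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1]  # Fresh hit: skip the data source entirely.
        value = load()  # Miss or stale: go to the data source.
        self._store[key] = (now, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Drop a key, for example right after a warehouse load finishes."""
        self._store.pop(key, None)


# Stand-in for an expensive warehouse query (assumed, not a real API).
query_runs = []


def run_report_query():
    query_runs.append(1)
    return [("2024-01", 100)]


cache = CacheAside(ttl_seconds=3600)
first = cache.get("monthly_sales", run_report_query)
second = cache.get("monthly_sales", run_report_query)  # served from cache
```

The catch is the same as in the paragraph above: this only helps if every report goes through a layer like `cache.get`, and you have to own invalidation yourself when the underlying data changes.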


Useful Query Store Metrics

Jared Poche gives us five:

Query Store is my favorite way to gather information about problem queries and plans, and I wanted to share some information on the useful metrics I use most.

The first two are obvious, but there’s a difference between them. The last two are not obvious but offer an unusual utility. I also wanted to explain why I use logical reads and mostly ignore physical reads.

Read on for Jared’s list.


Comparing Oracle and PostgreSQL Physical Architectures

Kellyn Gorman continues a series on learning PostgreSQL for Oracle DBAs:

In the previous post, I covered some high-level areas around installation and architecture, but for this post, we’re going to go a little deeper. For the seasoned Oracle DBA, this should feel like we’re stepping into a familiar landscape with just a few different rules. While both PostgreSQL and Oracle Database are robust, feature-rich systems, their physical architecture and internal mechanics diverge in key areas, especially around storage structures, memory architecture, and background processing.

In this post, we’ll break down these differences so Oracle DBAs can feel more comfortable with the shift when they transition between the two.

Click through to see how the two differ.


Azure Data Factory Publishing Everything instead of Incremental Changes

Ed Elliott troubleshoots an issue:

I recently encountered an interesting issue with ADF where the publish feature suddenly attempted to republish every single object, claiming they were new, despite having incrementally published changed objects for some time.

We were using the publish feature where you work on a branch until you are happy, then you raise a PR to main, merge to main, and then switch back to ADF and click publish to push the changes to the adf_publish branch.

Click through for the answer. I also love how Ed’s tl;dr is “too bad, read it anyhow.”


Azure Data Factory Data Flow Logging

Rayis Imayev does a bit of logging:

Azure Data Factory is no exception when it comes to logging options. All your debug or triggered pipeline executions, along with the parameters passed during execution, statuses, timings, durations, and more, can be monitored natively in Azure Data Studio. Once you immerse yourself in the realm of previously executed pipelines and start seeing all activities, passed input values, processed output results, and variables being transformed into something else that can only be understood by examining internal expressions and many other details, you begin to feel like an investigator meticulously analyzing everything.

Read on to see what kinds of logging options are available and how you can work with them.
