Curated SQL Posts

T-SQL Tuesday 184 Round-Up

Deborah Melkin casts a wide net:

There were a lot of themes that I noticed throughout everyone’s posts. First was the number of people who mentioned that mentoring doesn’t have to be formal or even a 1:1 relationship. Mentoring isn’t just for adults and careers, but for the next generation too. Mentoring has helped their careers or become part of a core tenet in their company and how they run their business. It’s a place to grow our community, and not just for those who look like us. We all talked about how we have grown from mentoring, not just as mentees but as mentors.

Click through for a dozen-and-a-half responses to the T-SQL Tuesday call.

Improving the Microsoft Fabric Copy Job

Krishnakumar Rukmangathan makes a copy:

Copy Job has been a go-to tool for simplified data ingestion in Microsoft Fabric, offering a seamless data movement experience from any source to any destination. Whether you need batch or incremental copying, it provides the flexibility to meet diverse data needs while maintaining a simple and intuitive workflow.

We continuously refine Copy Job based on customer feedback, enhancing both functionality and user experience. In this update, we’re introducing three key UX improvements designed to streamline your workflow and boost efficiency.

Read on for those three improvements.

An Explanation of PostgreSQL’s Citus Extension

Craig Kerstiens covers a misunderstood extension:

Citus is in a small class of the most advanced Postgres extensions that exist. While there are many Postgres extensions out there, few have as many hooks into Postgres or change the storage and query behavior in such a dramatic way. Most that come to Citus have very wrong assumptions. Citus turns Postgres into a sharded, distributed, horizontally scalable database (that’s a mouthful), but it does so for very specific purposes.

Read on to learn when Citus can work well, when it isn’t a good fit, and a few architecture and design recommendations around using the extension.
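
To make the "sharded, distributed" part concrete, here is a toy sketch of hash-based row routing, assuming a table distributed on a single column. It is not how Citus computes shard placement internally; the shard count, the `shard_for` and `route` helpers, and the use of CRC32 are all hypothetical stand-ins chosen for illustration.

```python
import zlib

SHARD_COUNT = 32  # hypothetical shard count, chosen only for this sketch


def shard_for(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a distribution-column value to a shard number.

    CRC32 is a stand-in hash; a real system uses its own hash function
    and shard ranges.
    """
    return zlib.crc32(key.encode("utf-8")) % shard_count


def route(rows, distribution_column):
    """Group rows by the shard their distribution-column value hashes to."""
    shards = {}
    for row in rows:
        shard = shard_for(str(row[distribution_column]))
        shards.setdefault(shard, []).append(row)
    return shards


if __name__ == "__main__":
    rows = [
        {"tenant_id": "acme", "order_id": 1},
        {"tenant_id": "acme", "order_id": 2},
        {"tenant_id": "globex", "order_id": 3},
    ]
    for shard, members in sorted(route(rows, "tenant_id").items()):
        print(shard, members)
```

Rows that share a distribution-column value always land on the same shard, which is why queries that filter on that column can stay on one node, while queries that do not filter on it may have to touch every shard.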

Understanding Availability Zones in Azure

Mika Sutinen explains some of the nuance around Azure availability zones:


Azure Availability Zones help provide resiliency to your database services within an Azure Region. I simply love how simple Microsoft has made building geographically dispersed database services. If you’ve ever designed and deployed multi-site, highly available database services on-premises, you know what I am talking about.

However, with the Availability Zones in Azure, there are a couple of things to know. I’ve learned my lessons the hard way, so in this post I am providing some tools and guidance on how to avoid some pitfalls when building multi-zone database services.

Click through for that guidance.

Linting SQL with SQLFluff

Josephine Bush busts out a linter:

I thought I didn’t care about linting, and lately, I haven’t written a lot of SQL, but for the SQL I do write, I have SQLFluff to help me format it. A friend of mine is big into SQLFluff and finally talked me into installing and using it. For more information about SQLFluff itself, visit here.

Josephine shows off some of the configuration for PostgreSQL as well as SQL Server’s T-SQL.

Regular Expression Matches in PostgreSQL

Tobias McNulty now has two problems:

regexp_matches() and regexp_match() are two similar string functions that support regular expression matching directly in the PostgreSQL database. regexp_matches() was added in PostgreSQL 8.3, and regexp_match() was added in PostgreSQL 10 (keep reading to see how ChatGPT struggled to answer this question).

Read on for that as well as plenty more information on how the two work, and even a bonus snippet on another regular expression function.
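
As a rough illustration of how the two functions differ, the sketch below mimics their return shapes with Python's `re` module. The helper names mirror the PostgreSQL functions, but the `global_flag` parameter is a hypothetical stand-in for the `'g'` flag, and Python's regex dialect is not identical to PostgreSQL's, so treat this as an analogy and not a reference.

```python
import re


def _row(m):
    # With capture groups, return the groups; otherwise the whole match.
    return list(m.groups()) if m.groups() else [m.group(0)]


def regexp_match(text, pattern):
    """Analogue of regexp_match(): the first match only, or None."""
    m = re.search(pattern, text)
    return _row(m) if m else None


def regexp_matches(text, pattern, global_flag=False):
    """Analogue of regexp_matches(): a list of rows, one per match.

    Without the global flag only the first match is returned, and no
    match at all yields an empty list (zero rows) instead of None.
    """
    rows = []
    for m in re.finditer(pattern, text):
        rows.append(_row(m))
        if not global_flag:
            break
    return rows


if __name__ == "__main__":
    text = "foobarbequebazilbarfbonk"
    pattern = r"(b[^b]+)(b[^b]+)"
    print(regexp_match(text, pattern))
    print(regexp_matches(text, pattern, global_flag=True))
```

The practical difference to keep in mind is the no-match case: the single-match function gives back a null-like value, while the set-returning one gives back zero rows, which can silently drop rows when it is used in a select list.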

Foreign Key Relationships in Microsoft Fabric Data Warehouses

Jared Westover looks at key constraints:

In late 2024, I noticed a comment on the Microsoft Learn site stating that foreign keys could improve query performance on tables in a Fabric warehouse. That claim immediately caught my attention. I wanted to answer a simple question: Do relationships help, hurt, or have no effect when added to tables in a Fabric warehouse?

Let’s get more specific—do foreign keys improve query performance when reading data (not loading)? In other words, do they make queries run faster?

Sadly, the answer is not as promising as with SQL Server. But this also makes sense considering the distributed nature of Fabric data warehouses.

Asynchronous SQL Statement Execution in Snowflake

Koen Verbeeck doesn’t want to wait for an answer:

It’s been a while since I blogged about Snowflake, but a recent LinkedIn post caught my attention: the ability to add asynchronous execution of SQL statements in a stored procedure. In other words: parallel execution of SQL statements. This got me excited, because in my opinion this is something that has been missing in T-SQL since forever. Every time you want to do something in parallel, you need to use external tools to accomplish this in SQL Server (or Azure SQL DB, or Fabric Warehouse, or Fabric SQL DB, or … you get the point). You needed to use SQL Server Agent Jobs, or SSIS packages, or Azure Data Factory and so on.

Snowflake introduces the ASYNC and AWAIT keywords, which can be used to trigger asynchronous execution. 

Read on for a very simple example and some thoughts from Koen. Aside from possibly making data modifications faster (assuming there are no constraint checks), I’m not quite sure what the major benefit to this is. I’d generally use asynchronous calls to support UI operations, letting a calling application respond to user input while some background thread processes data. But I’m not positive what you get from pushing async/await logic into the database itself.
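
For readers who have not used the pattern, here is a minimal Python `asyncio` sketch of the general idea: start several independent pieces of work, then wait for all of them. It is an analogy only and is not Snowflake syntax; `run_statement` and the statement names are hypothetical, and the sleep stands in for the time a SQL statement would take.

```python
import asyncio
import time


async def run_statement(name, seconds):
    # Stand-in for a SQL statement; sleeping simulates the work.
    await asyncio.sleep(seconds)
    return f"{name} done"


async def run_all():
    # Start every statement at once, then wait until all have finished.
    # Results come back in the order the calls were listed.
    return await asyncio.gather(
        run_statement("load_orders", 0.2),
        run_statement("load_customers", 0.2),
        run_statement("load_products", 0.2),
    )


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(run_all())
    elapsed = time.perf_counter() - start
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

Because the three simulated statements overlap, the total elapsed time should be close to the longest single statement and not the sum of all three, which is the benefit Koen is after for independent loads.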

Iceberg Data Support in OneLake

Matthew Hicks isn’t replicating data anymore:

Microsoft OneLake is the single, unified, logical data lake that allows your entire organization to store, manage, and analyze data in one place. It provides seamless integration with various data sources and engines, making it easier to derive insights and drive innovation.

At the most recent Microsoft Build conference, we announced the integration effort between Snowflake and OneLake, which aims to allow users of both Snowflake and Microsoft Fabric to work on the same Iceberg data in OneLake, with no data duplication/movement needed. More recently, we released the preview of OneLake’s Iceberg table format support, which included the ability for Snowflake to write Iceberg tables directly to OneLake.

Click through for more information about the current status of this feature, as well as what’s coming soon.

Using the Excel Solver with Power Pivot

Chris Webb remembers a few functions:

After years of meaning to learn how to use Excel Solver, this week I’ve finally made a start: I want to use it to work out the optimal way of distributing workspaces across capacities in a Fabric tenant from the point of view of CU usage. I’m a long way from knowing how to do this properly (I’ll blog about it when I’m ready) but one of the first things I found is that while there are lots of resources on the internet showing how to use Solver, there are no examples of how to use Solver when your source data is stored in the Excel Data Model, aka Power Pivot. Getting that data onto the worksheet is fairly straightforward, but what if you also need Solver to change how that data is sliced and diced? It turns out that it’s not hard to do if you know how to use cube functions.

Read on to see what Chris means and how you can send data from the Excel Data Model into the Solver.
