Curated SQL Posts

Lakehouse Table Partitioning in Microsoft Fabric

Gilbert Quevauvilliers performs a split:

When loading data, it is always important to do so with performance and scalability in mind.

For lakehouse tables to return queries quickly and to scale, it is essential to load your lakehouse tables with partitions.

What I am going to show you in my blog post today is how to load data into a Lakehouse table where the table will be automatically partitioned by Year/Month/Day.

Click through for the example.
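
As a rough sketch of the pattern (not Gilbert's exact code), a partitioned load in a Fabric notebook looks something like this; the file path and column names below are assumptions:

```python
# Hypothetical sketch: load raw data, derive Year/Month/Day columns, and let
# Delta Lake create the folder partitions on write. Assumes a Fabric notebook
# where `spark` is predefined; the path and column names are made up.
from pyspark.sql.functions import col, to_date, year, month, dayofmonth

df = spark.read.option("header", True).csv("Files/landing/sales.csv")

df = (
    df.withColumn("OrderDate", to_date(col("OrderDate")))
      .withColumn("Year", year(col("OrderDate")))
      .withColumn("Month", month(col("OrderDate")))
      .withColumn("Day", dayofmonth(col("OrderDate")))
)

# Each distinct Year/Month/Day combination becomes its own folder, so
# queries that filter on those columns can skip unrelated partitions.
(
    df.write.mode("overwrite")
      .partitionBy("Year", "Month", "Day")
      .saveAsTable("sales_partitioned")
)
```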

Comments closed

Suspend and Resume Microsoft Fabric Capacity

Olivier Van Steenlandt saves some cash:

With only a limited budget for exploring and testing new tools, I had to figure out how to use my budget efficiently. Therefore, before making any decisions, I looked at the Microsoft Fabric pricing and possibilities.

If you want to take a look at the Microsoft Fabric pricing models, you can find an overview via the following link: Microsoft Fabric – Pricing | Microsoft Azure

To avoid any surprises and to be as cost-effective as possible, I created an easy Python script that I can use to pause and start my Microsoft Fabric capacity, or better said resume and suspend.

I highly recommend this for any organization that does not need 24/7 uptime for Fabric capacity. If you run your system 12 hours a day instead of 24, it takes your F64 capacity from $8k a month to $4k.
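
Olivier's post has the script; purely as a hedged illustration of the idea, suspend and resume are plain Azure Resource Manager calls against the capacity resource, so a minimal Python version might look like this (all names and the api-version are placeholders to verify against the docs):

```python
# Hedged sketch of suspending/resuming a Fabric capacity through the Azure
# Resource Manager REST API. Subscription, resource group, capacity name,
# and api-version are placeholders -- confirm them before use.
import requests
from azure.identity import DefaultAzureCredential

SUBSCRIPTION = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
CAPACITY = "<capacity-name>"
API_VERSION = "2023-11-01"  # assumed; check the current version

def set_capacity_state(action: str) -> None:
    """action is 'suspend' or 'resume'."""
    token = DefaultAzureCredential().get_token(
        "https://management.azure.com/.default"
    ).token
    url = (
        f"https://management.azure.com/subscriptions/{SUBSCRIPTION}"
        f"/resourceGroups/{RESOURCE_GROUP}"
        f"/providers/Microsoft.Fabric/capacities/{CAPACITY}"
        f"/{action}?api-version={API_VERSION}"
    )
    resp = requests.post(url, headers={"Authorization": f"Bearer {token}"})
    resp.raise_for_status()

set_capacity_state("suspend")  # e.g. at the end of the workday
```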

Comments closed

Bit Column Order and Data Length

Brent Ozar performs an experiment:

At the PGConf.dev, where Postgres developers get together and strategize the work they wanna do for the next version, I attended a session where Matthias van de Meent talked about changing the way Postgres stores columns. As of right now (Postgres 17), columns are aligned in 8-bit intervals, so if you create a table with alternating columns:

Read on to see an example, and then Brent performs a test to see how SQL Server handles this scenario. The comments also mention that at least older versions of Oracle behaved like Postgres.
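
If you'd like to poke at the alignment behavior yourself, here is a hypothetical harness (not Brent's test) that compares the on-disk size of alternating versus grouped columns; the table shapes and row counts are arbitrary:

```python
# Hypothetical harness: compare the on-disk size of a table with boolean
# and bigint columns alternating against one with them grouped. bigint
# requires 8-byte alignment in Postgres, so each boolean squeezed between
# two bigints drags padding bytes along with it.
import psycopg2

conn = psycopg2.connect("dbname=postgres")  # assumed local database
conn.autocommit = True
cur = conn.cursor()

cur.execute("DROP TABLE IF EXISTS alternating, grouped")
cur.execute("""CREATE TABLE alternating (
    b1 boolean, i1 bigint, b2 boolean, i2 bigint, b3 boolean, i3 bigint)""")
cur.execute("""CREATE TABLE grouped (
    b1 boolean, b2 boolean, b3 boolean, i1 bigint, i2 bigint, i3 bigint)""")

for t in ("alternating", "grouped"):
    cur.execute(
        f"INSERT INTO {t} (b1, b2, b3, i1, i2, i3) "
        "SELECT true, true, true, g, g, g FROM generate_series(1, 100000) g"
    )
    cur.execute("SELECT pg_total_relation_size(%s)", (t,))
    print(t, cur.fetchone()[0], "bytes")  # alternating comes out larger
```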

Comments closed

Postgres Tuning Settings

Semab Tariq shares a few tips:

PostgreSQL is a widely used database known for its robust performance and reliability. To get the most out of PostgreSQL, tuning its parameters is crucial.

In this blog, we will explore the various PostgreSQL performance-related parameters and how to tune them effectively. By measuring Transactions Per Second (TPS) before and after tuning, and analyzing the results, we will demonstrate the significant impact of tuning on PostgreSQL performance.

Click through for some of the sorts of settings you might want to review. In Semab’s case, a simple server achieved nearly 30% better throughput after making these changes, so that’s not bad for the level of effort.
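
As a hedged sketch of the measure-tune-measure loop (the parameters shown are common ones; Semab's list and values will differ), pgbench makes the before-and-after comparison straightforward:

```python
# Hedged sketch of the before/after measurement idea; parameter values are
# illustrative only. Assumes pgbench is on PATH and a database named
# "bench" has already been initialized with `pgbench -i bench`.
import subprocess
import psycopg2

def run_pgbench() -> str:
    # 10 clients for a 60-second run; TPS is reported on stdout
    out = subprocess.run(
        ["pgbench", "-c", "10", "-T", "60", "bench"],
        capture_output=True, text=True, check=True,
    )
    return out.stdout

print("Before tuning:\n", run_pgbench())

conn = psycopg2.connect("dbname=bench")
conn.autocommit = True
cur = conn.cursor()
# Reloadable settings take effect after pg_reload_conf(); others,
# such as shared_buffers, need a server restart instead.
cur.execute("ALTER SYSTEM SET work_mem = '64MB'")
cur.execute("ALTER SYSTEM SET effective_cache_size = '12GB'")
cur.execute("ALTER SYSTEM SET checkpoint_completion_target = 0.9")
cur.execute("SELECT pg_reload_conf()")

print("After tuning:\n", run_pgbench())
```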

Comments closed

Controlling Azure Function Spend via Consumption Plan

Andy Brownsword saves some cash:

A consumption-based App Service Plan in Azure provides us with a pay-as-you-go model for Function usage. This can help reduce spend from Premium plans where those plans exceed the requirements of the function, for example low-volume or intermittent work.

Unfortunately you can’t move a Premium plan to Consumption based via the portal. Instead we’ll demonstrate how to use PowerShell to achieve this.

Read on for the code, as well as a bit more information on the Consumption tier.
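
Andy's walkthrough uses PowerShell; as a rough Python analogue (all resource names are placeholders, and the move is still subject to the usual plan-compatibility constraints), the change amounts to patching the function app's serverFarmId:

```python
# Rough Python analogue of the PowerShell approach (all names are
# placeholders): point the function app at an existing Consumption plan by
# patching its serverFarmId through the azure-mgmt-web SDK.
from azure.identity import DefaultAzureCredential
from azure.mgmt.web import WebSiteManagementClient
from azure.mgmt.web.models import SitePatchResource

SUBSCRIPTION = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
FUNCTION_APP = "<function-app-name>"
CONSUMPTION_PLAN_ID = (
    f"/subscriptions/{SUBSCRIPTION}/resourceGroups/{RESOURCE_GROUP}"
    "/providers/Microsoft.Web/serverfarms/<consumption-plan-name>"
)

client = WebSiteManagementClient(DefaultAzureCredential(), SUBSCRIPTION)

# Patch only the plan reference; the app itself is otherwise untouched.
client.web_apps.update(
    RESOURCE_GROUP,
    FUNCTION_APP,
    SitePatchResource(server_farm_id=CONSUMPTION_PLAN_ID),
)
```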

Comments closed

Creating a Dragon Curve in R

Tomaz Kastrun adds dragons to the edge of the map:

The algorithm is a fractal curve of Hausdorff dimension 2. One starts with one segment. In each iteration, the number of segments is doubled by taking each segment as the diagonal of a square and replacing it with half of that square (a 90-degree bend). The replacements alternate between the left and right directions, which produces the shape.

Click through for the sample code and how the plot looks.
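
Tomaz's implementation is in R; for a feel of the doubling rule described above, here is a direct Python transliteration (not his code):

```python
# Each iteration appends a left turn plus the reversed, complemented turn
# sequence so far; we then walk the turns on a grid and plot the curve.
import matplotlib.pyplot as plt

def dragon_turns(iterations: int) -> list[int]:
    turns: list[int] = []  # +1 = turn left, -1 = turn right
    for _ in range(iterations):
        turns = turns + [1] + [-t for t in reversed(turns)]
    return turns

x, y, dx, dy = 0, 0, 1, 0  # start at the origin, heading right
xs, ys = [x], [y]
for turn in dragon_turns(12):
    x, y = x + dx, y + dy          # draw the current segment
    xs.append(x); ys.append(y)
    dx, dy = (-dy, dx) if turn == 1 else (dy, -dx)  # rotate 90 degrees
xs.append(x + dx); ys.append(y + dy)  # finish the last segment

plt.plot(xs, ys, linewidth=0.5)
plt.axis("equal")
plt.show()
```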

Comments closed

Microsoft Fabric Warehouse Access Control

Koen Verbeeck talks permissions:

We are starting a new analytics project in Microsoft Fabric, and our data will land in a warehouse. This is the first time we’re using Fabric, and we are wondering about the different options for sharing access to a warehouse we developed in a workspace.

Click through for more information on providing and limiting access to data in a Microsoft Fabric warehouse.

Comments closed

Microsoft Fabric Lakehouse Ingesting CSV vs SQL

Reitse Eskens performs a comparison:

This blog will be a quite short one compared to the other blogs as it’s more of an overview to show you the capacity of Fabric ingesting CSV files in their native format into a Lakehouse and ingesting SQL data into a table structure inside the Lakehouse. Simple, straightforward stuff without any form of modification. You could call it bronze, raw, ingestion, temp or whatever your preferred naming convention is.

Why is this important? Well, we still have source systems that can only output to files. Just as we still have customers running on SQL Server 2000, legacy or even antique systems are still running. And it’s important to know how much capacity you use when just ingesting data without any modification.

Read on for the two scenarios, giving you an idea of which one is faster. I’d be interested in a third option, which is reading from Parquet files. My initial expectation would be that it would be even faster and more efficient, depending on the structure of the data.
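
For reference, the raw ingestion paths being compared reduce to something like the following in a Fabric notebook, with the Parquet variant mentioned above included; paths, connection details, and table names are invented:

```python
# Invented names throughout -- a minimal sketch of the raw ingestion paths:
# CSV and Parquet from Files, and SQL Server over JDBC, each landed as a
# Lakehouse table with no transformation. Assumes a Fabric notebook where
# `spark` is predefined.
df_csv = spark.read.option("header", True).csv("Files/landing/orders.csv")
df_csv.write.mode("overwrite").saveAsTable("raw_orders_csv")

df_sql = (
    spark.read.format("jdbc")
    .option("url", "jdbc:sqlserver://<server>;databaseName=<db>")
    .option("dbtable", "dbo.Orders")
    .option("user", "<user>").option("password", "<password>")
    .load()
)
df_sql.write.mode("overwrite").saveAsTable("raw_orders_sql")

# The hypothetical third option: Parquet carries its own schema and is
# columnar, so reading it tends to be cheaper than parsing CSV.
df_parquet = spark.read.parquet("Files/landing/orders.parquet")
df_parquet.write.mode("overwrite").saveAsTable("raw_orders_parquet")
```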

Comments closed

Atomic Design for Report Development

Kurt Buhler has an interesting approach:

Developing a good semantic model or report takes a lot of time and effort. One way to reduce this cost is by re-using parts of an existing solution for a new model or project. This modular approach is particularly valuable when a developer faces common or recurring challenges and processes. Despite this, many developers commonly repeat efforts when they start new projects, models, and reports. For example, developers will often manually recreate measures, date tables, and patterns in a new model, or spend precious hours formatting visuals in a new report, while they have already created the same or similar things in the past. One reason for this is that it is difficult to identify candidate elements to re-use, or how you can re-use them in a convenient and scalable manner.

In this article, we want to introduce a conceptual framework from UI/UX called the atomic design methodology from Brad Frost. This framework can help developers to approach Power BI models and reports in a modular way to improve productivity and consistency of a developer’s work. The purpose of this article is to introduce the concept as well as some approaches that exist to re-use parts of your model and report. In future articles and videos, we will elaborate on these and other methods in additional detail.

I like the idea a lot, but Kurt does describe some of the challenges you’ll likely need to work through to adopt it.

Comments closed

Deploying a Power BI Project File via Azure DevOps

Angela Henry deploys to prod:

When the Power BI Project file format (.pbip) was announced, there was a collective cheer from Power BI source control advocates heard ’round the world. Since its preview release, Microsoft has also added Git integration with Fabric workspaces. This makes it so easy to incorporate source control for all (or almost all) of your Fabric artifacts, including Power BI.

But what happens when your organization already has a mature CI/CD process in place using Azure DevOps? Do you really want to break from that pattern and have it controlled somewhere else? That’s what this post is about, using Azure DevOps CI/CD pipelines to deploy your Power BI Project files (.pbip).

I’m going to share my experience in hopes that it will save you some time if this is the route you need to take.

Read on for Angela’s experience. Note that this applies both to Microsoft Fabric as well as a Fabric-less Power BI.

Comments closed