Press "Enter" to skip to content

Day: May 26, 2026

Adding Patterns to ggplot2 Plots

Zhenguo Zhang adds some patterns:

Adding patterns to plots is a great way to improve accessibility (making plots colorblind-friendly) and to add an extra dimension of information. The ggpattern package provides a rich set of tools to achieve this in ggplot2.

I’m personally not the biggest fan of patterns. I see them as a point of necessity when dealing with grayscale circumstances, such as printing out a chart in an academic journal. But it’s very easy to overdo patterns and end up making a mess of the visual.

But one side note about color vision deficiency and plots: make sure that your plots are monochrome-friendly because somebody probably will try to print out your chart or view it on a grayscale-only device. Or might actually be monochromatic.

Leave a Comment

The Disappointment of Parameter-Sensitive Plan Optimization

Hugo Kornelis is back with another video:

Bad hair day? Try having an almost-no-hair month!

Jokes aside. It has been almost six weeks since my last video blog. Not really the schedule I had planned. But I believe I have good reasons.

Anyway, I do have a new video ready now. As promised in my last video, I now cover Parameter Sensitive Plan Optimization (PSPO), a new feature, introduced in SQL Server 2022, that is supposed to alleviate the pain of bad parameter sniffing.

Read on for the promise and letdown of PSPO.

Leave a Comment

An Overview of Power BI File Formats

Reza Rad covers a multitude of formats:

Power BI files come in multiple formats. If you have been working with Power BI Desktop for a while, you probably know the PBIX format well. But there are other formats too: PBIT, PBIP, and PBIR. The last two are more recent additions, and there is also a language called TMDL (Tabular Model Definition Language) that comes with its own file type.

In this blog post, I am going to explain the differences between all of these formats, why each one exists, and which one you should use for your scenarios. I will also show you, through a practical demo, what PBIP looks like and how you can use it to make your development process significantly better.

Click through for the video and article.

Leave a Comment

Foreign Tables and Materialized Views in PostgreSQL

Richard Yen provides a write-back system for analysts:

I recently wrote a post about WAL log shipping and how a standby built on log shipping is a great way to give data analysts production data without putting the primary at risk. Having access to the production data in this way is great, but it’s read-only. How can we create views of this data for better analytics work? I want to make the case today that Foreign Data Wrappers and Materialized Views can make a great solution – not only in accessing production Postgres data, but also working with other data sources.

Click through for an architectural discussion of how they can work together.

Leave a Comment

An FAQ-Based Introduction to Data Factories in Azure

Koen Verbeeck answers some questions:

Is Microsoft Fabric replacing Azure Data Factory?

Officially, no. Or maybe not yet. At the time of writing, ADF still remains a separate product but it’s noticeable that more new features are added to Fabric than to ADF. There are still many customers using ADF, so Microsoft might keep the service around for a while. There’s also still a bit of a feature gap between the two services, but this becomes more narrower each month. Microsoft is offering migration scenarios from ADF to Fabric.

I picked this question because of how much the concept annoys me. There are three separate Data Factory code bases in Azure with overlapping but not matching functionality (which is how you can tell it’s multiple code bases and not just one code base reskinned). This can lead to a scenario where Person A says, “Oh, do this thing in Data Factory.” Person B then says, “But I can’t do that in Data Factory.” Person A’s response: “Oh, that’s weird, because I can do it in Data Factory.” This leads to necessary but somewhat absurd clarifications around how you need to use Microsoft Fabric Data Factory, not Azure Data Factory because, even though Microsoft Fabric Data Factory is hosted in Azure, it’s a different product.

And don’t get me started on the wide variety of KQL platforms, all of which are subtly different.

Leave a Comment

Concurrent Evaluation with Microsoft Fabric Dataflows Gen2

Chris Webb runs multiple jobs at once:

Did you know that if your Fabric Dataflows Gen2 contains several queries then you can control how many of them are evaluated in parallel when your dataflow refreshes? In this series I’ll look at how how you can do this and how it may result in better performance – at least in some cases.

Let’s start with the basics. I created a Dataflow Gen2 with ten queries which each returned a table of one row and one column after one minute. I used the #table function to generate the table without connecting to a data source, code from this post to add the delay and the trick in this post to make sure the delay was only applied when the dataflow refreshed. The output of each query was loaded to a Fabric Warehouse.

Click through for a demonstration.

Leave a Comment