Press "Enter" to skip to content

Author: Kevin Feasel

The 1600 Column Limit in PostgreSQL

Andreas Scherbaum covers a limitation:

A recent blog posting by Frédéric Delacourt (Did you know? Tables in PostgreSQL are limited to 1,600 columns) reminded me once again that in the analytics world customers sometimes ask for more than 1600 columns.

Read on for the technical limits on how many columns could conceivably fit in a PostgreSQL table. But I will rely here on Swart’s 10% Rule: if you have more than 160 columns on a single table, that’s a sign you should step back and ask why this is the case. Even the widest of data warehousing dimensions is likely to be considerably smaller than this, and the smart move is to rethink that data model before agitating for additional columns.
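
If you want to see how close your own tables get to that threshold, a quick catalog query will do it. This is a minimal sketch against PostgreSQL’s information_schema; the 160-column cutoff is simply Swart’s 10% Rule applied to the 1,600-column limit.

-- Find user tables whose column counts exceed Swart's 10% Rule (160 columns)
SELECT table_schema,
       table_name,
       COUNT(*) AS column_count
FROM information_schema.columns
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
GROUP BY table_schema, table_name
HAVING COUNT(*) > 160
ORDER BY column_count DESC;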

Comparing JSON Data Types in MySQL and Postgres

Aisha Bukar draws some comparisons:

MySQL and PostgreSQL are two of the most popular relational database systems in the world. Both are open-source, widely used in web and enterprise applications, and support structured data in tables.

Modern applications, however, often work with semi-structured data that doesn’t always neatly fit into tables with rows and columns. This type of data gets its name because it still has some organization but doesn’t follow a strict format.

JSON (JavaScript Object Notation) is a popular way to store and share this kind of data. It’s a text format easy for both people and computers to understand.

Read on to see what’s supported in each of these platforms, as well as strengths and limitations of using JSON in each.
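
As a rough illustration of the difference in surface area, here is a minimal sketch (the table and column names are made up) of storing and querying a JSON document on each platform: PostgreSQL’s jsonb type with the @> containment operator, and MySQL’s JSON type with JSON path extraction.

-- PostgreSQL: jsonb stores a parsed binary representation and supports containment and indexing
CREATE TABLE events (
    id      bigserial PRIMARY KEY,
    payload jsonb NOT NULL
);

SELECT payload->>'user_id' AS user_id
FROM events
WHERE payload @> '{"status": "active"}';

-- MySQL: the JSON type validates documents on write and is queried with JSON path expressions
CREATE TABLE events (
    id      BIGINT AUTO_INCREMENT PRIMARY KEY,
    payload JSON NOT NULL
);

SELECT payload->>'$.user_id' AS user_id
FROM events
WHERE payload->>'$.status' = 'active';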

Exposing Materialized Views in Microsoft Fabric Lakehouses

Ed Lima makes some data available to other tools:

In today’s data-driven world, the ability to quickly expose data through modern APIs is crucial. Microsoft Fabric’s API for GraphQL combined with Materialized Lake Views offers a powerful solution that bridges the gap between your Fabric LakeHouse data and application developers who need fast, flexible access to your data.

In this guide, we’ll walk you through how to create a materialized view in a Lakehouse and expose it through a GraphQL API—all within the Microsoft Fabric ecosystem. This approach gives you the best of both worlds: the performance optimization of materialized views and the developer-friendly querying capabilities of GraphQL.

I’d say one interesting reason why you might want to do this is to feed data to products like Teams, Power Automate, or Copilot Studio. In those cases, having the data accessible via GraphQL is easier than working with finicky connectors that may or may not exist.
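
For reference, the Lakehouse side of this is a Spark SQL statement along these lines. The schema and object names below are hypothetical, and the exact materialized lake view syntax is best confirmed against Ed’s post or the Fabric documentation.

-- Define a materialized lake view in the Lakehouse (hypothetical schema and table names)
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS gold.customer_orders_mlv
AS
SELECT c.customer_id,
       c.customer_name,
       COUNT(o.order_id)  AS order_count,
       SUM(o.order_total) AS lifetime_value
FROM silver.customers c
JOIN silver.orders o
    ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.customer_name;

-- Once the view exists, it can be added to a Fabric API for GraphQL item and queried like any other exposed object.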

Join Planning in PostgreSQL 19

Robins Tharakan notes an upcoming performance boost:

The hidden cost of knowing too much. That’s one way to describe what happens when your data is skewed, Postgres statistics targets are set high, and the planner tries to estimate a join.

For over 20 years, Postgres used a simple O(N^2) loop to compare (equi-join) Most Common Values (MCVs) during join estimation. It worked fine when statistics targets were small (default_statistics_target defaults to 100). But in the modern era, we often see Postgres best practices recommend cranking that up. Customers are known to be using higher values (1,000 and sometimes even higher) to handle complex data distributions; throw a 10-join query into the mix, and this “dumb loop” can easily become a silent performance killer during planning.

That changes in Postgres 19.

Read on for an example of the problem and what is coming in Postgres 19 to mitigate it.
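
To see the moving parts involved, here is a minimal sketch (the table and column names are hypothetical): raising the statistics target grows the MCV list that, prior to Postgres 19, the planner walked in that O(N^2) loop during equi-join estimation.

-- Raise the per-column statistics target well above the default of 100
ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
ANALYZE orders;

-- Inspect how large the MCV list has become; the planner compares these lists
-- pairwise when estimating an equi-join on this column
SELECT attname,
       array_length(most_common_vals::text::text[], 1) AS mcv_entries
FROM pg_stats
WHERE tablename = 'orders'
  AND attname = 'customer_id';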

Default Constraints and User-Defined Functions

Erik Darling has a new video. In it, Erik shows how SQL Server handles default constraints that use user-defined functions and how this behaves under a variety of circumstances. There’s also a dive into parallelism and constraints. We also learn about Erik’s ability to perform fractional math and how he differentiates “scalar” from “scaler,” proving once again, via his use of extraneous vowel sounds, that he is not midwestern.
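
For a sense of the pattern under discussion, here is a minimal T-SQL sketch (the function and table are hypothetical, not Erik’s example) of a default constraint that calls a scalar user-defined function; scalar UDF references like this are also a classic way to end up with serial plans.

-- A scalar UDF; GETDATE() is allowed inside functions, though side-effecting
-- built-ins such as NEWID() are not
CREATE FUNCTION dbo.CurrentFiscalYear()
RETURNS int
AS
BEGIN
    -- Hypothetical fiscal year beginning in July
    RETURN YEAR(DATEADD(MONTH, 6, GETDATE()));
END;
GO

CREATE TABLE dbo.Invoices
(
    InvoiceID  int IDENTITY(1,1) PRIMARY KEY,
    FiscalYear int NOT NULL
        CONSTRAINT DF_Invoices_FiscalYear DEFAULT (dbo.CurrentFiscalYear())
);
GO

-- Rows that rely on the default invoke the UDF at insert time
INSERT dbo.Invoices DEFAULT VALUES;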

Backing up a Microsoft Fabric Workspace

Gilbert Quevauvilliers finds a gap and fills it:

In the high-stakes world of data architecture, where downtime can cascade into real business disruptions, I’ve learned that even the most robust platforms have their blind spots. Just last month, while collaborating with a client’s Architecture team on their disaster recovery strategy, we uncovered a subtle but critical gap in Microsoft Fabric: while OneLake thoughtfully mirrors data across multiple regions by default, other workspace items—like notebooks, semantic models, and pipelines—aren’t directly accessible in a failover scenario without extra steps. For the nitty-gritty on Fabric’s built-in reliability features, check out this Microsoft Learn guide.

That’s the spark that led me down this rabbit hole, and in this post, I’ll walk you through a practical solution: a Python Notebook that automates backing up your entire Fabric workspace to OneLake and an Azure Storage Account for that extra layer of redundancy. Whether you’re prepping for the worst or just embracing the “better safe than sorry” mindset, this approach gives you portable, versioned copies you can restore quickly.

Click through for the notebook, as well as instructions on how to use it.

Running SQL Server on KubeVirt

Andrew Pruski builds a virtual machine:

With all the changes that have happened with VMware since the Broadcom acquisition I have been asked more and more about alternatives for running SQL Server.

One of the options that has repeatedly cropped up is KubeVirt.

KubeVirt provides the ability to run virtual machines in Kubernetes…so essentially could provide an option to “lift and shift” VMs from VMware to a Kubernetes cluster.

Read on to learn a bit more about KubeVirt, including how to set up a Windows-based virtual machine with it. Andrew does document some performance woes, so working out the why behind those would be a big concern before going down this path.

Reverse Engineering a Physical Model Diagram with Redgate Data Modeler

Steve Jones gives the new Redgate acquisition a try:

I recently wrote about a logical diagram with Redgate Data Modeler. That was interesting, but creating all the objects is a pain. I decided to try creating a physical diagram from an existing database. This post looks at the experience.

Click through for Steve’s thoughts. I appreciate how he’s willing to call out the pain points that exist in the product today.

Choosing RANK() over RANKX() in DAX

Marco Russo and Alberto Ferrari make a decision:

In this article, we are not going to discuss the syntax of the RANK and RANKX functions. If you need more information, we suggest you consult DAX Guide for syntax, as well as the following articles, which introduce both functions: Introducing the RANK window function in DAX and Introducing RANKX in DAX.

RANKX is the classic method of ranking in DAX; RANK is a newer window function that works faster, better, and in a more flexible way. RANK is used in both visual calculations and measures. Which function should you use in which scenario? The answer depends on your requirements: each solution has pros and cons.

Read on for the comparison criteria and when you should choose each.

Access S3 Buckets in VPCs in Fabric via Entra Integration

Premal Shah announces new functionality in preview:

When we first introduced Amazon S3 shortcut integration with Microsoft Entra ID, customers gained a powerful new way to connect S3 data to Microsoft Fabric — without storing or rotating AWS access keys. Using OpenID Connect (OIDC), Fabric authenticates directly with AWS Identity and Access Management (IAM), enabling secure, identity-based access to cloud storage.

However, many enterprises keep their S3 buckets locked down inside Virtual Private Clouds (VPCs) or behind corporate firewalls. In these environments, Entra OIDC can authenticate identities, but it cannot provide network access — so Fabric still cannot reach the S3 endpoint. That changes today.

Read on to see what has changed, how you can enable this functionality, and what the current limitations are.
