Architecture – Curated SQL

High Availability Architecture for PostgreSQL

Published 2025-07-03 by Kevin Feasel

Most teams building production applications understand that “uptime” matters. I am writing this blog to demonstrate how much difference an extra 0.09% makes.

At 99.9% availability, your system can be down for over 43 minutes every month. At 99.99%, that window drops to just over 4 minutes. If your product is critical to business operations, customer workflows, or revenue generation, those 39 extra minutes of downtime each month can be the difference between trust and churn.
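
The figures above can be checked with a little arithmetic. The sketch below assumes a 30-day month, and the helper name `downtime_minutes` is purely illustrative rather than anything from the linked post:

```python
def downtime_minutes(availability_pct: float, days: int = 30) -> float:
    """Maximum downtime in minutes over `days` at the given availability percentage."""
    total_minutes = days * 24 * 60  # 43,200 minutes in an assumed 30-day month
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    for pct in (99.9, 99.99):
        print(f"{pct}% availability allows {downtime_minutes(pct):.2f} minutes of downtime")
    # Gap between the two availability targets, in minutes per month
    print(f"difference: {downtime_minutes(99.9) - downtime_minutes(99.99):.2f} minutes")
```

Under that 30-day assumption, the two budgets come to roughly 43.2 and 4.3 minutes, which is where the 39-minute gap comes from.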

Click through for some of the tools and practices that can help get you there in PostgreSQL.

The Small Data Showdown in Microsoft Fabric

Published 2025-07-01 by Kevin Feasel

Miles Cole does a bit of testing:

First, let’s revisit the purpose of the benchmark: The objective is to explore data engineering engines available in Fabric to understand whether Spark with vectorized execution (the Native Execution Engine) should be considered in small data architectures.

Beyond refreshing the benchmark to see if any core findings have changed, I do want to expand in a few areas where I got great feedback from the community:

I really appreciate the approach behind this, both in sticking to more realistic data sizes for many operations and in rerunning the test given all of the recent improvements in each engine.

Building Entity-Relationship Diagrams with DBeaver

Published 2025-06-26 by Kevin Feasel

Dave Stokes builds a diagram:

Even the most experienced database professionals are known to feel a little anxious when peering into an unfamiliar database. Hopefully, they inspect to see how the data is normalized and how the various tables are combined to answer complex queries. Entity Relationship Maps (ERM) provide a visual overview of how tables are related and can document the structure of the data.

Read on to see how you can do this with the DBeaver database access client.

Building an ML-Friendly Data Lake with Apache Iceberg

Published 2025-05-23 by Kevin Feasel

Anant Kumar designs a data lake:

As companies collect massive amounts of data to fuel their artificial intelligence and machine learning initiatives, finding the right data architecture for storing, managing, and accessing such data is crucial. Traditional data storage practices are likely to fall short of the scale, variety, and velocity required by modern AI/ML workflows. Apache Iceberg steps in as a strong open-source table format to build solid and efficient data lakes for AI and ML.

Click through for a primer on Iceberg, how to set up a fairly simple data lake, and some functionality that can help in model training.

Comparing Microsoft Fabric Engines

Published 2025-05-13 by Kevin Feasel

Nikola Ilic performs a comparison:

Before we proceed, an important disclaimer: the guidance I’m providing here is based on both my experience with implementing Microsoft Fabric in real-world scenarios, and the recommended practices provided by Microsoft.

Please keep in mind that the guidance relies on general recommended practices (I intentionally avoid using the phrase best practices, because the best is very hard to determine and agree on). The word general means that the practice I recommend should be used in most of the respective scenarios, but there will always be edge cases when the recommended practice is simply not the best solution. Therefore, you should always evaluate whether the general recommended practice makes sense in your specific use case.

Click through for a comparison between three engines: the lakehouse, the warehouse, and the eventhouse. It would really simplify things if the lakehouse and warehouse combined into one coherent whole.

Organizing a Microsoft Fabric Data Platform with Domains

Published 2025-05-07 by Kevin Feasel

Jon Vöge does a bit of organization:

A topic which seems more relevant than ever is the question of how to organize the contents of your Microsoft Fabric Platform.

Through the contents of a few blogs, I will give you an overview of things to consider, as well as suggestions that you can choose from when designing your platform.

This first week, we’ll take a look at Domains in Microsoft Fabric.

Read on to understand why domains can be valuable and a solid way to structure them.

Choosing a Warehousing Data Architecture

Published 2025-05-07 by Kevin Feasel

James Serra compares and contrasts OLAP architectures:

As discussed in my blog and book “Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh”, organizations are often challenged with choosing the right data architecture to meet their business goals, especially as AI and data-driven decision-making take center stage. To help clarify, here’s a quick review of the four core architectures, followed by guidance on when to use each. Each architecture includes five stages of data movement: ingest, store, transform, model, and visualize.

Click through for James’s take on how each of them works and when you might choose one over the other.

Designing a Microsoft Fabric Workspace

Published 2025-04-22 by Kevin Feasel

Ron L’Esteve lays it out:

When planning a Data Platform for your organization on Microsoft Fabric, you need to consider workspaces during your design process. Proper workspace design is critical for the organization and consumption of Fabric items. Understanding how to effectively manage Microsoft Fabric workspaces can streamline your processes.

If your workspaces are not set up in a way that aligns with functional business verticals or environments (such as Dev/UAT/Prod), you will end up spending a significant amount of time and effort refactoring this technical debt to meet the desired organizational structures. While it seems trivial to simply move a report, pipeline, or other workload from one workspace to another, the built-in dependencies can often be complex. With efficient planning and design efforts, these problems can be avoided.

Click through for Ron’s advice.

An Explanation of PostgreSQL’s Citus Extension

Published 2025-03-19 by Kevin Feasel

Craig Kerstiens covers a misunderstood extension:

Citus is in a small class of the most advanced Postgres extensions that exist. While there are many Postgres extensions out there, few have as many hooks into Postgres or change the storage and query behavior in such a dramatic way. Most people who come to Citus arrive with mistaken assumptions about it. Citus turns Postgres into a sharded, distributed, horizontally scalable database (that’s a mouthful), but it does so for very specific purposes.

Read on to learn when Citus can work well, when it isn’t a good fit, and a few architecture and design recommendations around using the extension.

Understanding Availability Zones in Azure

Published 2025-03-19 by Kevin Feasel

Mika Sutinen explains some of the nuance around Azure availability zones:

Azure Availability Zones help provide resiliency to your database services within an Azure Region. I simply love how simple Microsoft has made building geographically dispersed database services. If you’ve ever designed and deployed multi-site, highly available database services on-premises, you know what I am talking about.

However, with the Availability Zones in Azure, there are a couple of things to know. I’ve learned my lessons the hard way, so in this post I am providing some tools and guidance on how to avoid some pitfalls when building multi-zone database services.

Click through for that guidance.
