Press "Enter" to skip to content

Category: Architecture

Multi-Tenant Data Isolation Strategies

Rahul Miglani comes up with a list:

As organizations embrace cloud computing, multi-tenancy has become a popular architectural choice, enabling multiple customers (tenants) to share a single cloud environment. However, one of the biggest challenges in multi-tenancy is data isolation—ensuring that each tenant’s data remains private, secure, and accessible only to authorized users.

Microsoft Azure provides several data isolation strategies that allow businesses to securely manage and scale multi-tenant applications while ensuring compliance with regulatory standards like GDPR, HIPAA, and SOC 2.

In this blog, we will explore key data isolation strategies in multi-tenancy Azure architecture, their advantages, and best practices for implementation.

Reading through the list, the same set of options are available on-premises, though the calculus can be a bit different.

Comments closed

Thoughts on Scaling Elasticsearch

Vivek Kumar can’t stop at one:

With the evolution of modern applications serving increasing needs for real-time data processing and retrieval, scalability does, too. One such open-source, distributed search and analytics engine is Elasticsearch, which is very efficient at handling data in large sets and high-velocity queries. However, the process for effectively scaling Elasticsearch can be nuanced, since one needs a proper understanding of the architecture behind it and of performance tradeoffs.

Click through for those considerations and the trade-offs you might see.

Comments closed

Vertical Partitioning Rarely Works

Brent Ozar lays out an argument:

You’re looking at a wide table with 100-200 columns.

Years ago, it started as a “normal” table with maybe 10-20, but over the years, people kept gradually adding one column after another. Now, this behemoth is causing you problems because:

  • The table’s size on disk is huge
  • Queries are slow, especially when they do table scans
  • People are still asking for more columns, and you feel guilty saying yes

You’ve started to think about vertical partitioning: splitting the table up into one table with the commonly used columns, and another table with the rarely used columns. You figure you’ll only join to the rarely-used table when you need data from it.

Read on to understand why this is rarely a good idea and what you can do instead.

I will say that I’ve had success with vertical partitioning in very specific circumstances:

  1. There are large columns, like blobs of JSON, binary data, or very large text strings.
  2. There exists a subset of columns the application (or caller) rarely needs.
  3. Those large columns are in the subset of columns the caller rarely needs, or can access via point lookup.

For a concrete example, my team at a prior company worked on a product that performed demand forecasting on approximately 10 million products across the customer base. For each product, we had the choice between using a common model (if the sales fit a common pattern) or generating a unique model. Because we were using SQL Server Machine Learning Services, we needed to store those custom models in the database. But each model, even when compressed, could run in the kilobytes to megabytes in size. We only needed to retrieve the model during training or inference, not reporting, but we did have reports that tracked what kind of model we were using, whether it was a standard model or custom (and if custom, what algorithm we used). Thus, we had the model binary in its own table, separate from the remaining model data.

Comments closed

Failover Groups in Azure SQL Database

Mika Sutinen looks at some interesting functionality:

One of the interesting features in Azure SQL Database is the Failover Groups. It allows you to manage replication of an Azure SQL database, or group of databases, to another logical server. The reason I’ve bolded the manage replication is, that the replication itself is handled by active geo-replication, which is also a feature of Azure SQL Database.

Read on to see how these are different and why you might want to use failover groups.

Comments closed

RedGate’s State of the Database 2025

Louis Davidson summarizes a report:

Less than a week ago, Redgate released their annual State of the Database Landscape report. You can read about the methodology and see some of the results on this page. If you want the entire report, you can download that from the same page.

What I really love about this survey is that it is very low on opinions and sticks largely to the facts. As a person working with data, most of these facts will make you feel that you are not alone. Sometimes, it is easy to start thinking that your company is the only one in a given situation. 

Click through for some of Louis’s findings.

Comments closed

Brownfield Data Modeling

Jared Westover discusses a common trade-off:

Some decisions in life are easy, like whether to drink that second cup of coffee. But when it comes to databases, things get complicated fast. Developers often seek my input on adding tables and columns. A common question arises: Should they create a new table or expand an existing one by adding columns? This decision can be tricky because it depends on several factors, including query performance, future growth, and the complexity of implementing either solution. While adding one or two columns to an existing table may seem the easiest option, is it the best long-term solution? In this article, we look at whether it is better to add new columns versus a new table in SQL Server.

As an architectural pro-tip, when you’re looking to add a new column to an existing table, ask yourself if the new attribute you want to add actually relates to the natural key of the existing table. In Jared’s example, the natural key for video game tracker is presumably video game ID (which itself ties back to, presumably, the video game title, developer, console, and release date) and start date. Does a book actually relate to a video game and start date? No, it does not. Therefore, this book attribute does not belong on the video game tracker table.

When you dig deeper into Boyce-Codd Normal Form, you figure out that “relates to” in the prior paragraph translates to “has a functional dependency upon,” but using non-technical language for people not familiar with normalization, you can still get to the same conclusion, because ultimately, 95% of database normalization is common sense that we strenuously apply to a business domain.

And most of the time, the developer knows that this feels weird, but doesn’t want to spend the extra time doing it the best way and instead tries to do it the expedient way. This is where the role of the architect as politician comes in, and we gently guide people to the right conclusion. Or just tell them to put on their big boy britches and do it right. Either way.

Comments closed

Guidance on Multi-Platform Database Design

Kellyn Gorman hits on some of the point points around designing database solutions on multiple platforms:

When building products that interact with multiple database platforms, the complexity can be both a challenge and an opportunity.  For Subject Matter Experts (SMEs), observing design decisions made without sufficient knowledge of underlying database architecture can be particularly frustrating. These moments highlight the critical need for architectural foresight and platform-specific expertise to avoid pitfalls that compromise scalability, performance, and maintainability.

Read on for a list of common problems as well as the entry point to some solutions.

Comments closed

Constraints on Polymorphic Associations in SQL Server

Jared Westover wants a foreign key, but one referencing multiple tables:

Do you like a challenge? If you answered yes, you’re my kind of person. Recently, a developer presented me with a problem: they needed a foreign key reference in one table to associate with multiple other tables. Over the years, I’ve often been asked how to make this situation work. However, achieving this relationship with foreign keys is technically impossible with SQL Server and most mainstream relational database platforms. Since SQL Server restricts foreign keys to referencing a single table, how can we solve this problem?

My immediate answer was “triggers,” which happens to be the solution Jared intentionally omits.

I’d rather go with the multiple association tables approach over multiple indicator types, as the latter requires (n-1) NULLs, where n is the number of indicator types (review types in Jared’s example) and I hate NULL because NULL is the void lying about being a value, sort of like how skim milk is water lying about being milk.

Comments closed

Concurrency Control in Oracle vs PostgreSQL

Umair Shahid continues a series on migrating from Oracle to PostgreSQL:

Transitioning from Oracle to PostgreSQL can be a transformative experience for database administrators because of the subtle differences between the two technologies. Understanding how the two handle concurrency differently is critical to managing highly concurrent workloads. 

Concurrency control is essential for maintaining data consistency when multiple users access the database simultaneously. Oracle and PostgreSQL take different approaches to concurrency control: Oracle primarily relies on locking and consistent snapshots, while PostgreSQL utilizes a Multi-Version Concurrency Control (MVCC) system.

This article provides an in-depth look at concurrency control in PostgreSQL from an Oracle perspective.

Read on for that comparison.

Comments closed

SCD Types in Microsoft Fabric

Kenneth Omorodion reminds us that the Kimball model is still quite valuable:

In modern data warehousing, how we handle updates to dimension tables is crucial. There are several approaches; but the decision often comes down to two primary strategies: Slowly Changing Dimensions (SCD) Type 2 and overwriting tables. Each has its own benefits, use cases, and trade-offs. This tip will explore the two methods and why SCD Type 2 is often a better option in many data warehouse scenarios.

Read on for this overview of the benefits of type-2 slowly changing dimensions, as well as a little bit of coverage of several other types of slowly changing dimensions.

Comments closed