
Category: Warehousing

Common Star Schema Mistakes

Ben Richardson gets back to basics:

Sometimes the culprit isn’t actually your DAX, it’s your data model.

Star schema mistakes are incredibly common in Power BI, and really hard to track down.

When your data model isn’t a clean star schema, you end up with broken filters, confusing relationships and slow visuals.

It’s important to get it right from the start! So we broke down the top 10 most common mistakes people make, how to identify them and how to fix them!

This is where reviewing (or reading) Ralph Kimball’s The Data Warehouse Toolkit can save you a lot of time and stress. The Microsoft data analytics world is predicated so heavily on Kimball-style dimensional modeling that the choices tend to be building a proper star schema up-front or spending processing and developer time trying to fix it in post-production using DAX or trickery.


Tips for Building a Data Warehouse

James Serra gets back to foundations:

I had a great question asked of me the other day and thought I would turn the answer into a blog post. The question is “I’m an experienced DBA in SQL Server/SQL DB, and my company is looking to build their first data warehouse using Microsoft Fabric. What are the best resources to learn how to do your first data warehouse project?”. So, below are my favorite books, videos, blogs, and learning modules to help answer that question:

Click through for James’s recommendations. I strongly agree with his advice to start with Ralph Kimball’s The Data Warehouse Toolkit, and frankly, I think a lot of James’s advice here is sound. The person asking focuses on Fabric, and there are plenty of Fabric-specific things to learn, but at the end of the day, modern data warehouses are still data warehouses.


View Creation via Visual Queries in Microsoft Fabric

Jon Vöge creates a view:

As companies adopt Microsoft Fabric, the distance between backend artifact and Semantic Model is smaller than ever, and it feels more obvious than ever to push some of those local transformations to your Fabric Storage item of choice.

The question is: how do you do that? There are many options:

Read on for those options. Jon focuses on one for people with less database experience.


Microsoft Fabric Warehouse Snapshots now GA

Twinkle Cyril makes an announcement:

Managing data consistency during ETL has always been a challenge for our customers. Dashboards break, KPIs fluctuate, and compliance audits become painful when reporting hits ‘half-loaded’ data. With Warehouse Snapshots, Microsoft Fabric solves this by giving you a stable, read-only view of your warehouse at a specific point in time and now, this capability is Generally Available! Think of this as a true time travel database, an industry-first capability that sets us apart.

I wonder how much they differ from the database snapshots available in SQL Server.


Star Schemas and Keys

Chris Barber provides a primer on the types of keys that are critical for a star schema:

Keys are a core component of star schema modelling; relationships between tables are built using the keys. This article covers:

  1. The main key types
  2. Star Schema diagrams
  3. Best practices when using Keys

An understanding of keys becomes increasingly important with more complex solutions. Not only do you need to understand them from a modelling perspective, but a common vernacular is required to communicate with team members.

It’s easier to think of the keys Chris describes in two separate classes rather than four unique items. Surrogate and natural keys are descriptors of a primary key (or any other unique/alternate key), after all.
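To make the surrogate-versus-natural distinction concrete, here is a minimal sketch in Python. It is not taken from Chris's article: the table and column names (`customer_code`, `customer_key`, `dim_customer`, `fact_sales`) and the `build_dimension` helper are hypothetical, chosen only for illustration. The natural key comes from the source system, the surrogate key is generated by the warehouse, and the fact rows carry the surrogate key as their foreign key.

```python
from itertools import count


def build_dimension(rows, natural_key):
    """Assign a warehouse-generated surrogate key to each distinct natural key.

    Hypothetical helper for illustration; real loads would typically use an
    identity column or a key-generation step in the ETL tool.
    """
    next_key = count(start=1)
    dimension = {}
    for row in rows:
        nk = row[natural_key]
        if nk not in dimension:
            # The surrogate key is the dimension's primary key; the natural
            # key is kept as an alternate key for lookups during loading.
            dimension[nk] = {"customer_key": next(next_key), **row}
    return dimension


# Source rows identified by a natural (business) key.
source_customers = [
    {"customer_code": "C-100", "name": "Contoso"},
    {"customer_code": "C-200", "name": "Fabrikam"},
]
source_sales = [
    {"customer_code": "C-200", "amount": 50.0},
    {"customer_code": "C-100", "amount": 20.0},
    {"customer_code": "C-200", "amount": 5.0},
]

dim_customer = build_dimension(source_customers, "customer_code")

# Fact rows swap the natural key for the dimension's surrogate key,
# which acts as the foreign key in the star schema.
fact_sales = [
    {
        "customer_key": dim_customer[s["customer_code"]]["customer_key"],
        "amount": s["amount"],
    }
    for s in source_sales
]
```

In this sketch "surrogate" and "natural" describe where a key's values come from, while "primary" and "foreign" describe the role a key plays in a relationship, which is the two-class view described above.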


Resolving Write Conflicts in Microsoft Fabric Data Warehouse

Twinkle Cyril has a conflict:

Fabric Data Warehouse (DW) supports ACID-compliant transactions using standard T-SQL (BEGIN TRANSACTION, COMMIT, ROLLBACK) and uses Snapshot Isolation (SI) as its exclusive concurrency control model. All operations within a transaction are treated atomically—either all succeed or all fail. This ensures that each transaction operates on a consistent snapshot of the data as it existed at the start of the transaction, which means.

Read on to see what this means, as well as what happens when multiple writers interfere with one another and how to avoid these sorts of issues. My Kimball-coded brain says that, if you have a data warehouse, you should have one data loading process. In that case, it’s not easy for the single data loading process to get tripped up on its own.
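For readers who want a feel for how two writers can trip over each other under snapshot isolation, the following is a toy in-memory model in Python. It is a sketch under stated assumptions, not Fabric's implementation: `ToyWarehouse`, `ToyTransaction`, and `WriteConflict` are hypothetical names, the "first committer wins" rule is the generic snapshot-isolation behavior, and detecting conflicts per table is an assumption made to keep the example small.

```python
class WriteConflict(Exception):
    """Raised when a table changed after the transaction took its snapshot."""


class ToyWarehouse:
    def __init__(self):
        self.tables = {}    # table name -> {row key: value}, committed state
        self.versions = {}  # table name -> number of commits that touched it

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    def __init__(self, warehouse):
        self.warehouse = warehouse
        # Snapshot: a private copy of committed data as of transaction start.
        self.snapshot = {t: dict(rows) for t, rows in warehouse.tables.items()}
        self.start_versions = dict(warehouse.versions)
        self.pending = {}   # table name -> {row key: new value}

    def read(self, table, key):
        # Reads see the snapshot plus this transaction's own pending writes.
        if key in self.pending.get(table, {}):
            return self.pending[table][key]
        return self.snapshot.get(table, {}).get(key)

    def update(self, table, key, value):
        self.pending.setdefault(table, {})[key] = value

    def commit(self):
        wh = self.warehouse
        # Validate every written table first so the commit is all-or-nothing.
        for table in self.pending:
            if wh.versions.get(table, 0) != self.start_versions.get(table, 0):
                raise WriteConflict(table)
        for table, rows in self.pending.items():
            wh.tables.setdefault(table, {}).update(rows)
            wh.versions[table] = wh.versions.get(table, 0) + 1


wh = ToyWarehouse()
setup = wh.begin()
setup.update("dim_customer", "C-100", "Contoso")
setup.commit()

# Two loaders start from the same snapshot and write to the same table.
loader_a = wh.begin()
loader_b = wh.begin()
loader_a.update("dim_customer", "C-100", "Contoso Ltd")
loader_b.update("dim_customer", "C-100", "Contoso Inc")

loader_a.commit()  # first committer wins
try:
    loader_b.commit()
    outcome = "committed"
except WriteConflict:
    outcome = "rolled back"  # second writer must retry on a fresh snapshot
```

With a single loading process, as suggested above, there is never a second transaction holding a stale snapshot of the same table, so this failure path does not come up in the model.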


Locks in Microsoft Fabric Data Warehouse

Twinkle Cyril enumerates the lock types in Fabric Data Warehouse:

Fabric DW supports ACID-compliant transactions using standard T-SQL (BEGIN TRANSACTION, COMMIT, ROLLBACK) and enforces snapshot isolation across all operations. Locks in Fabric Data Warehouse are used to manage concurrent access to metadata and data, especially during DDL operations. Here’s how locking works:

Click through for a chart. The locking policy is a lot simpler than what we see in SQL Server and you can see a description of the pros and cons of that simpler approach.


Tag-Based Masking in Snowflake

Kevin Wilkie gets tagging:

If you’ve followed our site for a while, you would have seen in a previous post how powerful tag-based masking policies are in Snowflake. They let you enforce consistent data masking rules across columns without constantly rewriting logic. But Snowflake hasn’t stopped there—recent enhancements now make it even easier to classify, tag, and mask data at scale. In this post, we’ll recap the essentials of tag-based masking, highlight the new functionality, and share some practical tips for rolling it out in your environment.

Kevin has a new blog theme and everything.


Contrasting Microsoft Fabric, Databricks, and Snowflake

Ron L’Esteve builds a comparison chart:

Databricks and Microsoft Fabric are two of the most innovative Unified Data and Analytics intelligence platforms available on the market today. While similar, each brings its own advantages and limitations. Snowflake joins these two powerhouses when data warehouse decisioning comes into play. Sometimes it is challenging to decide which one to pick for your organization’s needs. This tip will help with uncovering when to choose Databricks vs Fabric vs Snowflake.

When it comes to Spark performance, Databricks is always going to win—they keep most of their optimizations to themselves, so anyone starting from open-source Spark is at a disadvantage. Otherwise, it’s a bit of a slugfest between Fabric and Databricks. At the end, Ron also brings in Snowflake, focusing on the data warehousing side of things for that three-way comparison. I don’t think there’s a clear winner among the three, and on net, that’s probably a good thing, as it forces the groups to continue competing.
