Press "Enter" to skip to content

Author: Kevin Feasel

Type 2 SCDs in Microsoft Fabric

Reza Rad has changes to make:

In the previous article, I explained SCD (Slowly Changing Dimension) and its different types. In this article, I’ll show you how to implement SCD Type 2 (one of the most common types) using Microsoft Fabric and Power BI. This article includes using Lakehouse, Dataflow, Warehouse, Data Pipeline, SQL Stored Procedures, Power BI Semantic model, and report in Microsoft Fabric.

Click through to learn more about the structure of a type-2 slowly changing dimension, as well as how you can store and load this information to track changes over time.

Comments closed

Dealing with AG Secondaries Falling Behind

David Fowler troubleshoots a wait type:

Are you struggling with a laggy redo and a build up in the redo queue on your readonly secondaries? Are you suffering with high PARALLEL_REDO_TRAN_TURN waits? Then this magic remedy could cure your ailments.

Read on to learn more about the parallel redo process and one thing you can do if it’s falling too far behind. I wouldn’t permanently switch back to serial redo, but maybe make that switch if there’s some consistent pattern to when the process falls behind.

Comments closed

Viewing the File System of an ADO Pipeline Agent

Justin Bird goes spelunking:

There are several predefined variables that you can use within an Azure DevOps pipeline, some of which allow you to reference locations on the pipeline agent file system. Examples of when to use them might be to pick up a file from the downloaded repository or to describe where to output a build. For example the variable $(Build.ArtifactStagingDirectory) will be replaced during the defining of the pipeline job with the respective filepath.

I find it difficult at times to work abstractly with the file system, and so I use this template to visualise the file system for debugging purposes, to observe the outcome of my code, and to make sure I’m picking the right variable.

Click through for some more information about pre-defined variables and a code snippet Justin uses to view filesystem details.

Comments closed

Fixing an Error in DBCC CloneDatabase

Steve Jones runs into an issue:

This was an interesting error, and I was able to duplicate it, so I decided to write a post on how to find the problem and fix it. The error after running DBCC CLONEDATABASE is:

NO_STATISTICS and NO_QUERYSTORE options turned ON as part of VERIFY_CLONE.
Database cloning for 'atest' has started with target as 'aSmallTest'.
Msg 2601, Level 14, State 1, Line 11
Cannot insert duplicate key row in object 'sys.sysschobjs' with unique index 'clst'. The duplicate key value is (885578193).

The final key value (885578193) for you might be different, but the error is the same.

Read on to learn more about the issue, as well as how Steve was able to fix the problem.

Comments closed

Choosing between Azure SQL DB Ledger and Azure Confidential Ledger

Pieter Vanhove reminds me that ledger tables exist:

Ledger technology is a way of storing data that ensures its integrity, immutability, and verifiability. It can be used for scenarios where trust and transparency are essential, such as financial transactions, supply chain tracking, or regulatory compliance. Azure offers two services that leverage ledger technology to provide tamper-proof data storage: ledger in Azure SQL Database and Azure Confidential Ledger. In this blog post, we will compare these two services and help you decide which one is best suited for your needs.

Read on for the comparison. I think the answer for most cases is “neither” but there are specific times when ledger tables could make a good amount of sense.

Comments closed

Extracting the End of a String in R

Steven Sanderson just wants the conclusion:

Hey useR’s! Today, we’re going to discuss a neat trick: extracting substrings starting from the end of a string. We’ll cover how to achieve this using base R, stringr, and stringi. By the end of this post, you’ll have several tools in your R toolbox for string manipulation. Let’s get started!

Read on to see how you can do it in three separate libraries.

Comments closed

An Overview of Slowly Changing Dimensions

Reza Rad talks slowly changing dimensions:

If you want to use Power BI, Microsoft Fabric, or any other data analytics tools, one of the key concepts to understand when working with a data warehouse system is the SCD (Slowly Changing Dimension). I will do this in a series of at least two articles. The first one (this one) will be on the concept of what SCD is, its meaning, and its different types. Then, the next one will discuss how to implement SCD types (such as Type 2) using Microsoft Fabric and Power BI.

Reza focuses on SCD types 0-4 but does briefly touch on types 5-7 (of which, I’d never heard of SCD type 7).

Comments closed

Finding Query History in Snowflake

Kevin Wilkie digs into the history:

If you’re running an audit process in SQL Server, you can do this fairly easily. But how many of us work in a place that requires this or has enough space to do this? Sadly, very few of us…

If you have Query Store running (and why wouldn’t you?) you can find a specific query you were running. But you have to have a specific something you can search for…

But can you get just a list of what you specifically ran in order? Not that I can find….

My recollection is that, if you’re using Azure SQL Database, that auditing is on by default (or you can turn it on with a toggle switch or radio buttion). For Azure Synapse Analytics dedicated SQL pools, there are a few DMVs that cover query operations. But yeah, in general, it’s something you’d need to enable first.

Comments closed

Troubleshooting Transactional Replication Latency in SQL Server

Bobirmirzo Arslanov has a two-parter around troubleshooting transactional replication latency issues. Part 1 looks at architecture and some of the distribution tables:

Before you dive into solving any issue, you need to fully understand the type of environment you have as there might have been changes you are unaware of. An easy way to do that is to run script  SQLServer/Script Replication Topology at master · sqlserver-parikh/SQLServer (github.com) which gives output like below.

Part 2 digs in deeper:

If wait time is high compared to CPU time, check wait type and troubleshoot accordingly. For example, on the above example we faced MEMORY_ALLOCATION_EXT wait_type but duration is 0. So, we are not waiting.

If CPU time is higher, this means log thread is running but latency is being observed because you have high load.

Check out both posts. I’d really like to see a similar article for merge replication, as it’s a lot harder to troubleshoot.

Comments closed