Day: January 26, 2024

Being Smart about Missing Index Requests

Erik Darling doesn’t trust the system:

SQL Server’s missing index requests (and, by extension, automatic index management) are about 70/30 when it comes to being useful, and useful is the low number.

The number of times I’ve seen missing indexes implemented to little or no effect, or worse, disastrous effect… is about 70% of all the missing index requests I’ve seen implemented.

This has been pretty close to my experience as well. Click through for a much better approach to the task.

Data Vault 2.0 Models in Microsoft Fabric

Michael Olschimke and Dmytro Polishchuk continue a series:

The last article in this blog series discussed the basic entity types in Data Vault 2.0: hubs, links and satellites. While it would be theoretically possible to limit a model to just these three basic entity types, the resulting Data Vault model would be inefficient: it would most likely consume too much storage, be less efficient due to the many joins, and require a number of grain shifts during information delivery. This is due to certain characteristics in the data that require special treatment.

For these characteristics, Data Vault 2.0 provides special entity types that deal with the specialities. This article focuses on two of them: the non-historized link, which is used to capture transactions and events, and the multi-active satellite, which is used to model multiple active descriptions for the same parent hub or link in the same load.

Read on for an example of how to implement this in a Microsoft Fabric warehouse.
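As a rough illustration of the multi-active satellite idea described in the excerpt, here is a minimal Python sketch. It is not the authors' Fabric implementation: the `hash_key` helper, the `sub_sequence` column, and the phone-number attribute are all assumptions made for the example. The point it shows is that several descriptions can be active for the same parent in the same load, so the row key needs something beyond the parent hash key and the load date.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime


def hash_key(*business_keys: str) -> str:
    """Hypothetical helper: hash the normalized, delimited business keys."""
    normalized = "||".join(k.strip().upper() for k in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class MultiActiveSatelliteRow:
    hub_hash_key: str    # hash key of the parent hub (or link)
    load_date: datetime  # timestamp of the load batch
    sub_sequence: int    # assumed column separating active rows in one load
    phone_number: str    # illustrative descriptive attribute


def load_phone_numbers(customer_id: str, phones: list[str],
                       load_date: datetime) -> list[MultiActiveSatelliteRow]:
    """Build one satellite row per phone number delivered in a single load."""
    parent = hash_key(customer_id)
    return [
        MultiActiveSatelliteRow(parent, load_date, seq, phone)
        for seq, phone in enumerate(phones, start=1)
    ]


rows = load_phone_numbers("c-42", ["555-0100", "555-0101"], datetime(2024, 1, 26))
```

Both rows share the same parent hash key and load date, and only the assumed `sub_sequence` column keeps them distinct. A real model might use a business attribute, such as a phone type, as the multi-active key instead.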

Using the Spark Connect GRPC API

Ed Elliott digs into API details:

In the first two posts, we looked at how to run some Spark code, firstly against a local Spark Connect server and then against a Databricks cluster. In this post, we will look more at the actual gRPC API itself, namely ExecutePlan, Config, and AddArtifacts/ArtifactsStatus.

Click through to see how it all works, with plenty of C# code to guide you along the way.

Loading Data from Statistics Denmark into Power BI

Erik Svensen goes over an oldie:

It turns out that the blogpost I wrote 10 years ago about getting data from Statistics Denmark into Power BI with Power Query still is being used – link.

But as the API has changed a bit since then I was asked to do an update of the blogpost – so here is how you can get the population of Denmark imported into Power BI with Power Query.

Read on to see the right way to do it today.

Embracing the Boring Part of Data Governance

Nikki Kelly shares some thoughts on data governance:

Data Governance – you have heard the term a million times and not once has it driven excitement into your heart. I’d like to spend the next few minutes changing that.

Data Governance is formally defined as “… a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods.”

Boring.

Nikki makes a great point: the process may feel boring, but the net results are critical.