Press "Enter" to skip to content

Month: July 2023

Statistics and Ascending Keys in SQL Server

Matthew McGiffen talks stats:

The Ascending Key Problem relates to the most recently inserted data in your table which is therefore also the data that may not have been sampled and included in the statistics histograms. This sort of issue is one of the reasons it can be critical to update your statistics more regularly than the built-in automatic thresholds.

We’ll look at the problem itself, but also some of the mitigations that you can take to deal with it within SQL Server.

Read on to see how this behaved prior to SQL Server 2014’s new cardinality estimator and what has changed since then.

Comments closed

sp_HumanEventsBlockViewer Updates

Erik Darling has another update:

In this post, I’m going to talk about a couple cool changes to sp_HumanEventsBlockViewer, a procedure I wrote to analyze the blocked process report via Extended Events, and wish I had given a snazzier name to.

You see, when I wrote it, I pictured it as a utility script for sp_HumanEvents, which will set up the blocked process report and an extended event. But it turns out I use it a lot more on its own.

Read on for Erik’s update, including a neat trick around using an aggregate within a window function to generate ordering.

Comments closed

Tracking High-Level Power BI Import Throughput Stats

Chris Webb collects some measurements:

In the first post in this series I described the events in Log Analytics that can be used to understand throughput – the speed that Power BI can read from your dataset when importing data from it – during refresh. While the individual events are easy to understand when you look at a simple example they don’t make it easy to analyse the data in the real world, so here’s a KQL query that takes all the data from all these events and gives you one row per partition per refresh:

Click through for the KQL script, as well as what it all means and what you can get out of it.

Comments closed

The Basics of Fact-Dimensional Modeling

Nikola Ilic gives us a primer on Kimball-style fact and dimensional modeling:

Before we come up to explain why dimensional modelling is named like that – dimensional, let’s first take a brief tour through some history lessons. In 1996, a man called Ralph Kimball published a book “The Data Warehouse Toolkit”, which is still considered a dimensional modelling “Bible”. In his book, Kimball introduced a completely new approach to modelling data for analytical workloads, the so-called “bottom-up” approach. The focus is on identifying key business processes within the organization and modelling these first, before introducing additional business processes.

This is a really good overview of the topic, though I’m saddened that “dimensional bus matrix” didn’t make the cut of things to discuss. Mostly because I like the name “dimensional bus matrix.”

Comments closed

Updates to sp_QuickieStore and sp_PressureDetector

Erik Darling has been busy. FIrst, sp_QuickieStore:

The first thing on the list that I want to talk about is the ability to cycle through all databases that have Query Store enabled.

If you have a lot of databases with it turned on, it can be a real hassle to go through them all looking for doodads to diddle.

Now you can just do this:

Next, sp_PressureDetector:

I added  high-level disk metrics similar to what’s available in other popular scripts to mine. Why? Sometimes it’s worth looking at, to prove you should add more memory to a server so you’re less reliant on disk.

Especially in the cloud, where everything is an absolute hellscape of garbage performance that’s really expensive.

Click through for both sets of updates and thank Erik for his willingness to give so much to the community.

Comments closed

Power BI: Git Integration vs Deployment Pipelines

Marc Lelijveld notes a chemical reaction:

Together with the introduction of Microsoft Fabric a long-awaited feature for Power BI also became available. Git integration now allows us to connect your workspace to a git repository to sync the meta data from your Power BI dataset and report between the workspace and the repository.

At the same time, Power BI deployment pipelines are around. The pipelines help you to move content between workspaces from Development to Test and finally to Production. The question that arises, when to use what? And how do they complement each other, or maybe even clash with each other?

Read on for Marc’s thoughts across these two topics.

Comments closed

Connecting a SQL Server Instance to Azure Arc

Deepthi Goguri has a guide:

When you install SQL Server 2022 through the GUI, you will see an option in the features “SQL Server Extention for Azure”

This is more of a “how” than a “why.” Azure Arc-enabled SQL Server instances let you use Azure’s control plane (their graphs and some configuration options) to manage SQL Server instances, regardless of whether they’re actually in Azure or on-premises. That way, a DBA with one foot in both camps can have a consistent administrative experience for things like inventorying SQL Server instances.

Comments closed

Subsetting List Objects in R

Steven Sanderson makes a sub-list and checks it twice:

If you’re an aspiring data scientist or R programmer, you must be familiar with the powerful data structure called “lists.” Lists in R are collections of elements that can contain various data types such as vectors, matrices, data frames, or even other lists. They offer great flexibility and are widely used in many real-world scenarios.

In this blog post, we will explore one of the essential skills in working with lists: subsetting. Subsetting allows you to extract specific elements or portions of a list, helping you access and manipulate data efficiently. So, let’s dive into the world of list subsetting and learn some useful techniques along the way!

Read on for multiple ways of subsetting lists in base R.

Comments closed