
Month: May 2024

A Deep Dive into Covering and Non-Covering Indexes

Etienne Lopes goes deeper than I thought:

A few days ago a client brought me a question: he had two tables with the same data, one in each database (same SQL version and compatibility level in different environments with similar infrastructure and configurations). Both tables had an extremely selective column and both had indexes in that column. The thing is that when running a particular query on one database, let’s call it DB1 here, it was really fast, whereas running the same query on the other database (DB2) was a lot slower. The subset of rows that matched the WHERE clause was the same and it was really small in both cases (since the tables had the same data).

“Ok, let’s have a look at that”, I said, and when doing so, the query running on DB1 was doing a Clustered Index Seek whereas the same query in DB2 was doing a Table Scan. Didn’t I say that both tables had indexes in that column?

Read on for the post. Etienne warns that it’s a big boy of a post, and it covers a lot, including “seek” operations that are scans in disguise, index levels, and more.
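To make the distinction concrete before you dive in, here is a minimal sketch (table and column names hypothetical, not from Etienne's post). A non-covering index can seek on the predicate but must do key lookups for columns outside the index; a covering index carries everything the query needs at the leaf level:

```sql
-- Hypothetical table; OrderID is the clustered index key
CREATE TABLE dbo.Orders
(
    OrderID    INT IDENTITY PRIMARY KEY,
    OrderDate  DATE           NOT NULL,
    CustomerID INT            NOT NULL,
    Amount     DECIMAL(10, 2) NOT NULL
);

-- Non-covering: seeks on OrderDate, then performs a key lookup
-- per matching row to fetch Amount from the clustered index
CREATE INDEX IX_Orders_OrderDate ON dbo.Orders (OrderDate);

-- Covering for the query below: Amount is stored at the leaf level,
-- so no key lookup is needed
CREATE INDEX IX_Orders_OrderDate_Covering
    ON dbo.Orders (OrderDate) INCLUDE (Amount);

SELECT OrderDate, Amount
FROM dbo.Orders
WHERE OrderDate = '2024-05-01';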


The NOT Keyword in SQL Server

Kevin Wilkie gets tied up in NOTs:

Today, I want to talk about the keyword NOT in SQL Server. It can be your friend or your worst enemy depending upon how you use it. Let’s delve into some examples of what I’m talking about and how it’s easy for people to mess it up.

Look, it’s not that I’m not saying you shouldn’t avoid not using NOT here. It’s just that it’s really hard to get the coveted quintuple-negative in natural speech.
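One classic way NOT turns into your worst enemy, as a hedged illustration (table names hypothetical; Kevin's post has his own examples), is `NOT IN` against a subquery that can return NULL:

```sql
-- NOT IN returns zero rows if the subquery yields any NULL,
-- because "X <> NULL" evaluates to UNKNOWN for every X
SELECT c.CustomerID
FROM dbo.Customers AS c
WHERE c.CustomerID NOT IN
      (SELECT o.CustomerID FROM dbo.Orders AS o);

-- NOT EXISTS handles NULLs the way most people expect
SELECT c.CustomerID
FROM dbo.Customers AS c
WHERE NOT EXISTS
      (SELECT 1
       FROM dbo.Orders AS o
       WHERE o.CustomerID = c.CustomerID);
```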


SQL ConstantCare Population Report for Spring 2024

Brent Ozar has a new chart for us:

The short story: SQL Server 2019 appears poised to swallow the SQL Server market altogether, hahaha.

The long story: ever wonder how fast people are adopting new versions of SQL Server, or what’s “normal” out there for SQL Server adoption rates? Let’s find out in the spring 2024 version of our SQL ConstantCare® population report.

Out of the thousands of monitored SQL Servers, SQL Server 2019 is now at 48% of the market! That’s the highest percentage we’ve seen for any version in the 3 years that we’ve been doing this analysis, up from last quarter’s 44% market share. Here’s how adoption is trending over time, with the most recent data at the right:

Click through for the graph and a bonus excursion on database size and associated hardware. And as always, this is an interesting view of one sample of SQL Server instances, but that sample may not (and probably does not) represent SQL Server usage as a whole.


Object Dependencies and Referencing Entities

Andy Brownsword takes another look:

Last week we looked at identifying object dependencies by performing a wildcard search on objects such as procedures. I noted another option could be to use views such as sys.sql_expression_dependencies, and Kevin also commented about using the supporting functions such as sys.dm_sql_referencing_entities.

I wanted to briefly look at both of these options and see how their results can differ.

Linkception graf achieved.
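For reference, a minimal sketch of the two approaches (the object name `dbo.MyTable` is hypothetical):

```sql
-- View: dependencies recorded when referencing modules are created or altered
SELECT referencing_id, referenced_entity_name
FROM sys.sql_expression_dependencies
WHERE referenced_entity_name = N'MyTable';

-- Function: entities referencing a given object, resolved on demand
SELECT referencing_schema_name, referencing_entity_name
FROM sys.dm_sql_referencing_entities(N'dbo.MyTable', N'OBJECT');
```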


The Benefits of Separating Data from Content in Microsoft Fabric

Tom Martens drives a wedge:

This article is solely about one question: what has to be done if a content creator needs to create and publish reports but the content creator is not allowed to see all the data?

This seems to be a simple requirement: develop content (finally publish the report), but with Row Level Security (RLS) applied.

To answer the question, I think it’s necessary to understand the following core principle, at least to some extent:

  • Workspace roles

Read on for more information about how workspace roles work in this domain.


Model Selection with AIC

Steven Sanderson talks about the Akaike Information Criterion:

In the world of data analysis and statistics, one of the key challenges is selecting the best model to describe and analyze your data. This decision is crucial because it impacts the accuracy and reliability of your results. Among the many tools available, the Akaike Information Criterion (AIC) stands out as a powerful method for comparing different models and choosing the most suitable one.

Today we will go through an example of model selection using the AIC, specifically focusing on its application to various statistical distributions available in the TidyDensity package. TidyDensity, a part of the healthyverse ecosystem, offers a comprehensive suite of tools for data analysis in R, including functions to compute AIC scores for different probability distributions.

Read on for a quick primer on the AIC itself and how you can use it in TidyDensity.
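For reference, the quantity being compared: for a model with k estimated parameters and maximized log-likelihood ln(L̂),

```
AIC = 2k - 2 ln(L̂)
```

The model with the lowest AIC is preferred; the 2k term penalizes extra parameters, guarding against overfitting.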


Dealing with Page Latch Waits

Jared Poche talks latches:

Reducing waits is a great way to improve the performance of your SQL Servers. Minimizing the PAGELATCH_EX and PAGELATCH_SH wait types is more involved than most. There are generally two causes: one which is largely solved in recent versions, and one which requires real thought and planning to resolve.

Read on for both of those causes, as well as several tips you can use to minimize the risk of waiting.
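To see whether these wait types even register on your instance, a standard check against the wait-stats DMV (this is a generic query, not one of Jared's scripts):

```sql
-- Cumulative PAGELATCH waits since the last service restart
SELECT wait_type,
       waiting_tasks_count,
       wait_time_ms,
       max_wait_time_ms
FROM sys.dm_os_wait_stats
WHERE wait_type IN (N'PAGELATCH_EX', N'PAGELATCH_SH')
ORDER BY wait_time_ms DESC;
```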


Using KEEPFILTERS in DAX

Marco Russo and Alberto Ferrari share some advice:

In the article Using KEEPFILTERS in DAX, we described how to use KEEPFILTERS, which is a function that preserves the existing filter on columns affected by a new filter applied by CALCULATE or CALCULATETABLE. We suggest reading that article if you are not familiar with KEEPFILTERS. However, we wanted to clarify a rule of thumb you can apply to quickly decide when to use KEEPFILTERS or not in most cases. This will allow you to only invest more time when you are dealing with specific requirements.

Read on for common use cases of KEEPFILTERS, as well as recommendations on when and how you should use the function.
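The distinction in play, sketched with hypothetical measure names: a plain filter argument in CALCULATE overwrites any existing filter on the column, while wrapping it in KEEPFILTERS intersects the new filter with the existing one.

```dax
-- Overwrites any existing filter on Product[Color]
Red Sales := CALCULATE ( [Sales Amount], Product[Color] = "Red" )

-- Intersects with any existing filter on Product[Color]
Red Sales KF := CALCULATE ( [Sales Amount], KEEPFILTERS ( Product[Color] = "Red" ) )
```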


Building Metadata-Driven Pipelines in Microsoft Fabric

Koen Verbeeck lays out a process:

The goal of metadata-driven code is that you build something only once. You need to extract from relational databases? You build one pipeline that can connect to a relational source, and you parameterize everything (server name, database name, source schema, source table, destination server name, destination table, et cetera). Once this parameterized piece of code is ready, all you must do is enter metadata about the sources you want to extract. If at a later point an additional relational source needs to be extracted, you don’t need to create a brand-new pipeline. All you need to do is enter a new line of data in your metadata repository.

Aside from speeding up development – after you’ve made the initial effort of creating your metadata-driven pipeline – another benefit is that everything is consistent. You always tackle a certain pattern in the same way. If there’s a bug, you need to fix it in one single location.

Read on to see how this works. The idea is certainly not new, as Koen mentions, but there are some specific factors that come into play for Microsoft Fabric pipelines.
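A minimal sketch of what such a metadata repository might hold (all names hypothetical; Koen's post shows his own design):

```sql
-- One row per source to extract; the single parameterized pipeline
-- iterates over this table instead of being copied per source
CREATE TABLE meta.ExtractSource
(
    SourceID         INT IDENTITY PRIMARY KEY,
    SourceServer     NVARCHAR(128) NOT NULL,
    SourceDatabase   NVARCHAR(128) NOT NULL,
    SourceSchema     NVARCHAR(128) NOT NULL,
    SourceTable      NVARCHAR(128) NOT NULL,
    DestinationTable NVARCHAR(128) NOT NULL,
    IsEnabled        BIT           NOT NULL DEFAULT 1
);

-- Onboarding a new source is just a new row, not a new pipeline
INSERT INTO meta.ExtractSource
    (SourceServer, SourceDatabase, SourceSchema, SourceTable, DestinationTable)
VALUES
    (N'srv01', N'SalesDb', N'dbo', N'Orders', N'stg.Orders');
```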


Missing Log Information after a VM Rollback

Jonathan Kehayias (via Paul Randal) advises caution:

Recently I received an email with a very vague description of a problem that went something along the lines of:

“After some issues with Windows patching and rolling back from snapshots, our cluster resources won’t come online. In the process of troubleshooting the issue and validating the fix, we rolled back a few times. We can’t find any details about the issue in the cluster logs, Windows event logs, or SQL Server error log to investigate further.”

Read on for more information about the immediate problem, the root cause, and the actual issue the customer ran into before compounding the problem.
