Press "Enter" to skip to content

Curated SQL Posts

Dealing with Page Latch Weights

Jared Poche talks latches:

Reducing waits is a great way to improve the performance of your SQL Servers. Minimizing PAGELATCH_EX and PAGELATCH_SH wait types are more involved than most. There are generally two causes; one of which is largely solved in recent versions, and one which requires real thought and planning to resolve.

Read on for both of those causes, as well as several tips you can use to minimize the risk of waiting.

Comments closed

Using KEEPFILTERS in DAX

Marco Russo and Alberto Ferrari share some advice:

In the article Using KEEPFILTERS in DAX, we described how to use KEEPFILTERS, which is a function that preserves the existing filter on columns affected by a new filter applied by CALCULATE or CALCULATETABLE. We suggest reading that article if you are not familiar with KEEPFILTERS. However, we wanted to clarify a rule of thumb you can apply to quickly decide when to use KEEPFILTERS or not in most cases. This will allow you to only invest more time when you are dealing with specific requirements.

Read on for common use cases of KEEPFILTERS, as well as recommendations on when and how you should use the function.

Comments closed

Building Metadata-Driven Pipelines in Microsoft Fabric

Koen Verbeeck lays out a process:

The goal of metadata driven code is that you build something only once. You need to extract from relational databases? You build one pipeline that can connect to a relational source, and you parameterize everything (server name, database name, source schema, source table, destination server name, destination table et cetera). Once this parameterized piece of code is ready, all you must do is enter metadata about the sources you want to extract. If at a later point an additional relational source needs to be extracted, you don’t need to create a brand-new pipeline. All you need to do is enter a new line of data in your metadata repository.

Aside from speeding up development – after you’ve made the initial effort of creating your metadata driven pipeline – is that everything is consistent. You tackle a certain pattern always in the same way. If there’s a bug, you need to fix it in one single location.

Read on to see how this works. The idea is certainly not new, as Koen mentions, but there are some specific factors that come into play for Microsoft Fabric pipelines.

Comments closed

Missing Log Information after a VM Rollback

Jonathan Kehayias (via Paul Randal) advises caution:

Recently I received an email with a very vague description of a problem that went something along the lines of:

“After some issues with Windows patching and rolling back from snapshots, our cluster resources won’t come online. In the process of troubleshooting the issue and validating the fix, we rolled back a few times. We can’t find any details about the issue in the cluster logs, Windows event logs, or SQL Server error log to investigate further.”

Read on for more information about the immediate problem, the root cause, and the actual issue the customer ran into before compounding the problem.

Comments closed

Security and the Weakest Link

Brian Kelley explains:

When I look at a system and think about its security model, the first thing I start poking around at is where I think security is weakest. For instance, if my target is a Microsoft SQL Server box, I don’t generally look for a weakness in SQL Server itself. I start looking at the operating system, I look at accounts that may have access, and since I’m really worried about the data being taken, I look to see how backups are handled and where they are written to.

Read on for more information, as well as a link to Brian’s 2019 video on attacking SQL Server.

Comments closed

SQL Server Drivers and Access Tokens for Connection Pooling

David Engel gives us a deep dive:

We often get support escalations related to failures around expired access tokens when using Microsoft Entra ID (formerly Azure Active Directory) authentication. There is a lot of nuance in the various drivers about how and when access tokens are used and when they get renewed after expiring. I’m going to do a deep-dive and try to explain some of the details here.

This post is mainly targeted at people who have to troubleshoot issues in this area or those who are building large services and need to understand the impact of their authentication choices.

Read on to learn more about how the ODBC, JDBC, and .NET drivers differ.

Comments closed

MCMC Sampling with TidyDensity

Steven Sanderson performs some sampling:

In the area of statistical modeling and Bayesian inference, Markov Chain Monte Carlo (MCMC) methods are indispensable tools for tackling complex problems. The new tidy_mcmc_sampling() function in the TidyDensity R package simplifies MCMC sampling and visualization, making it accessible to a broader audience of data enthusiasts and analysts.

Read on for a brief primer on MCMC and an example of how the tidy_mcmc_sampling() function works.

Comments closed

Digging into Cursors

Hugo Kornelis gives us a primer on cursors:

And yes, I know the mantra. Do not use cursors. They are slow. There is (almost) always a faster set-based alternative available. So why would I even waste blog space on cursors, when the only smart thing to do is to rip them out and replace them with a set-based alternative?

Well, there are, in fact, many reasons. The “almost” above suggests that there are still cases where row by row processing is in fact the most efficient method. There are cases where we use a cursor, even though set-based is slower, because we need to call a stored procedure for each row returned, and that stored procedure is too complex to be changed to process an entire set at once. Or, perhaps, you just inherited existing code that uses a cursor, and you need to fix the immediate performance issues now, so you cannot afford the time investment to rewrite the row by row logic to set-based logic.

Another example of this was calculating running totals prior to SQL Server 2012’s support of aggregate window functions. The cursor approach was considerably faster than the self-join approach because the self-join approach required joining on an inequality statement. Granted, there was the “quirky update” technique, but that depended on an accident of SQL Server internals that Microsoft never officially supported and could have broken at any time.

Comments closed

Migrating DATETIME Data to DATETIMEOFFSET

William Assaf adds some time zones:

I recently reviewed, worked on, and added a similar example to the DATETIMEOFFSET Microsoft Learn Docs article at the recommendation of my colleague Randolph West, who guessed (accurately) I would enjoy such a task. It was a nice pre-Build diversion. 

This topic is one that I have co-presented on in the past and hounded project capstone review presentations about. If you’re not storing time zone offset in your date/time data, you’re setting yourself up for future pain. That future pain is not what this blog post is about.

My preference is not to store time zone offset but instead store everything in UTC and perform any time zone switcharoos in the UI. But if you are storing local dates and times, I completely agree that you should keep track of the time zone. I worked for an east coast US company that bought a west coast US company, and both stored local dates and times in their SQL Server databases, making data consolidation a real challenge.

Comments closed

Listen and Notify in Postgres

Brandur Leach shows how to use PostgreSQL’s listen/notify capabilities:

Listen/notify in Postgres is an incredible feature that makes itself useful in all kinds of situations. I’ve been using it a long time, started taking it for granted long ago, and was somewhat shocked recently looking into MySQL and SQLite to learn that even in 2024, no equivalent exists.

In a basic sense, listen/notify is such a simple concept that it needs little explanation. Clients subscribe on topics and other clients can send on topics, passing a message to each subscribed client. The idea takes only three seconds to demonstrate using nothing more than a psql shell:

Read on to learn more about the notifier pattern. What’s interesting is that the notifier patter, which adds a fair bit of structure to this very simple process, makes it work a good bit like SQL Server’s Service Broker.

Comments closed