Press "Enter" to skip to content

Author: Kevin Feasel

When All AG Nodes are Secondaries

Randy Knight demands quorum:

If you’ve encountered a situation where none of your SQL Server Always On Availability Group (AG) replicas become PRIMARY after a cluster failure — you’re not alone. We recently had a customer with this exact scenario (AG won’t become primary after force quorum), and it is both uncommon and difficult to troubleshoot, so I thought it would be worth posting about.

Click through for the scenario, what’s happening, and how to resolve this.
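
For reference, the documented last-resort path in this kind of situation is to force cluster quorum and then force the availability group online on the node you want to be primary. A minimal T-SQL sketch, assuming an availability group named [YourAG] (a placeholder) and run on the replica you intend to promote:

    -- Check the current role and health; after a forced quorum the local replica
    -- often sits in RESOLVING rather than PRIMARY.
    SELECT ar.replica_server_name,
           ars.role_desc,
           ars.recovery_health_desc,
           ars.synchronization_health_desc
    FROM sys.dm_hadr_availability_replica_states AS ars
        JOIN sys.availability_replicas AS ar
            ON ar.replica_id = ars.replica_id;

    -- Last resort: force the failover to this replica, accepting possible data loss.
    ALTER AVAILABILITY GROUP [YourAG] FORCE_FAILOVER_ALLOW_DATA_LOSS;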


High Water Mark and PostgreSQL Vacuum Operations

Shane Borden troubleshoots an issue:

The first thing we came to understand is that the pattern of work on the primary is a somewhat frequent large DELETE statement followed by a data refresh accomplished by a COPY from STDIN command against a partitioned table with 16 hash partitions.

The problem being observed was that periodically the SELECTs occurring on the read replica would time out and not meet the SLA. Upon investigation, we found that the “startup” process on the read replica would periodically request an “exclusive lock” on some random partition. This exclusive lock would block the SELECT (which is partition unaware) and then cause the timeout. But what is causing the timeout?

Read on for the answer and tips on how to determine if you have problems with High Water Mark growth in PostgreSQL.
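
For context, the usual source of a startup-process exclusive lock on a replica is VACUUM on the primary truncating empty pages from the end of a relation, which replays on the standby as an AccessExclusiveLock. A hedged sketch of how you might confirm and mitigate that, assuming PostgreSQL 12 or later and a hypothetical partition named orders_p03; note that turning truncation off is exactly the trade-off that lets the high water mark keep growing:

    -- On the replica: queries cancelled due to lock-related recovery conflicts.
    SELECT datname, confl_lock
    FROM pg_stat_database_conflicts;

    -- On the primary: stop VACUUM from truncating the empty tail of this partition,
    -- which is what ships the exclusive lock to the replica (PostgreSQL 12+).
    ALTER TABLE orders_p03 SET (vacuum_truncate = off);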


Tuning SSIS Data Flow Buffers

Andy Brownsword speeds things up:

When using data flows in SSIS packages we want the flow to be fast and fluid. Improving performance of the flows will vary in different packages, but one element can consistently help – tuning memory utilisation.

In this post we’ll look at tuning the memory utilisation by altering the number of rows passing through the flow concurrently. Specifically we’re looking at the following properties:

  • DefaultBufferSize
  • DefaultBufferMaxRows

This is a pretty big deal, finding the right size that allows you to keep data flowing smoothly without having to wait for buffers to fill. The defaults are aggressively low in most cases. And good on Andy for providing a series of tests to give practical numbers.
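
As a rough illustration of how the two properties interact, here is the arithmetic with the shipping defaults and a hypothetical 1,000-byte row:

    estimated row size     = 1,000 bytes           (hypothetical)
    DefaultBufferSize      = 10,485,760 bytes      (default, 10 MB)
    rows that fit by size  = 10,485,760 / 1,000    ≈ 10,485
    DefaultBufferMaxRows   = 10,000                (default)
    rows per buffer        ≈ min(10,485, 10,000)   = 10,000

Whichever limit is reached first caps the buffer, so raising only one of the two properties can change nothing at all.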


A Primer on Loss Functions

Kanwal Mehreen compares loss functions:

I must say, with the ongoing hype around machine learning, a lot of people jump straight to the application side without really understanding how things work behind the scenes. What’s our objective with any machine learning model, anyway? You might say, “To make accurate predictions.” Fair enough.

But how do you actually tell your model, “You’re close” or “You’re way off”? How does it know it made a mistake — and by how much?

That’s where loss functions come in.

Read on to learn what loss functions are, how they work, and when you might want to choose each.
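
For a quick reference, the workhorse loss functions reduce to simple formulas, with y the actual value and ŷ the prediction over n observations:

    Mean squared error (regression):    (1/n) * Σ (y_i - ŷ_i)²
    Mean absolute error (regression):   (1/n) * Σ |y_i - ŷ_i|
    Log loss (binary classification):  -(1/n) * Σ [ y_i·ln(ŷ_i) + (1 - y_i)·ln(1 - ŷ_i) ]

Squaring the error punishes big misses much harder than taking the absolute value does, which is a big part of why the choice of loss function matters in the first place.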


Listing Items in a Fabric Workspace using PowerShell

Rob Sewell continues a series on working with Microsoft Fabric assets via PowerShell:

Having created a workspace, a lakehouse, a warehouse, and a Fabric SQL database previously, it’s time to move on to something a little more interesting. I was getting bored writing the same post over and over again, so I thought I would show you how to list the items in a Fabric Workspace using PowerShell. This command was super useful for me today to answer the question “What do they have in this workspace?” and I hope it will be useful for you too.

Click through to see how you can accomplish this.


The Challenges of Benchmarking

Tomas Vondra digs into some metrics:

I do a fair number of benchmarks, not only to validate patches, but also to find interesting (suspicious) stuff to improve. It’s an important part of my development workflow. And it’s fun 😉 But we’re dealing with complex systems (hardware, OS, DB, application), and that brings challenges. Every now and then I run into something that I don’t quite understand.

Consider a read-only pgbench, the simplest workload there is, with a single SELECT doing lookup by PK. If you do this with a small data set on any machine, the expectation is near linear scaling up to the number of cores. It’s not perfect, CPUs have frequency scaling and power management, but it should be close.

Click through for a quick look at some odd behavior, followed by a lot of interesting digging into the weeds trying to find answers.
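
For context, the built-in select-only script that pgbench -S runs is essentially just this per transaction:

    \set aid random(1, 100000 * :scale)
    SELECT abalance FROM pgbench_accounts WHERE aid = :aid;

With a data set that fits in memory, that is about as close to a pure CPU workload as a database gets, which is why anything far from linear scaling deserves the kind of digging Tomas does here.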


Setting the Optimal logical_decoding_work_mem in PostgreSQL

Ashutosh Bapat shares a tip with us:

Logical replication is a versatile feature offered in PostgreSQL. I have discussed the theoretical background of this feature in detail in my POSETTE talk. At the end of the talk, I emphasize the need for monitoring a logical replication setup. If you are using logical replication and have set up monitoring, you will be familiar with pg_stat_replication_slots. In some cases this view shows a high amount of spill_txns, spill_count and spill_bytes, which indicates that the WAL sender corresponding to that replication slot is using a high amount of disk space. This increases load on the IO subsystem, affecting performance. It also means that there is less disk space available for user data and regular transactions to operate. This is an indication that logical_decoding_work_mem has been configured too low. That’s the subject of this blog: how to decide the right configuration value for logical_decoding_work_mem. Let’s first discuss the purpose of this GUC. That blog post might serve as a good background before reading further.

Read on to learn a bit more about how this value works and what you can do to set it correctly.
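
As a minimal sketch of the monitor-then-tune loop, assuming PostgreSQL 14 or later (where pg_stat_replication_slots exists) and a 256MB target picked purely for illustration:

    -- Spill counters growing over time mean the WAL sender is overflowing
    -- logical_decoding_work_mem and writing reordered transactions to disk.
    SELECT slot_name, spill_txns, spill_count, spill_bytes
    FROM pg_stat_replication_slots;

    -- Raise the limit (the default is 64MB) and reload, then watch the counters again.
    ALTER SYSTEM SET logical_decoding_work_mem = '256MB';
    SELECT pg_reload_conf();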


Purview Data Loss Prevention in Microsoft Fabric

Yael Biss doesn’t want people walking off with the data:

As data volume and complexity soar, protecting sensitive information has become non-negotiable. With the latest enhancements to Purview Data Loss Prevention (DLP) Policies in Microsoft Fabric, organizations now have the power to proactively secure their data in OneLake.

Whether you’re just getting started or looking to take your data governance to the next level, following proven best practices will maximize your security, compliance, and productivity.

Click through for several tips on how to use Microsoft Purview DLP in Fabric. One of those tips ought to be “Get a side hustle so you can afford both Purview and Fabric.”


Estimating Query Percentiles in PostgreSQL

Michael Christofides makes an assertion:

I recently saw a feature request for pg_stat_statements to be able to track percentile performance of queries, for example the p95 (95th percentile) or p99 (99th percentile).

That would be fantastic, but isn’t yet possible. In the meantime, there is a statistically-dodgy-but-practically-useful (my speciality) way to approximate them using the mean and standard deviation columns in pg_stat_statements.

Click through for the code. Michael even covers the immediate objection I have (that the data isn’t normally distributed, so you shouldn’t use the same Z score estimators that exist for the normal). That said, if you’re interested in “p99…ish” then this is a clever approach to take.
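
A sketch of the idea against pg_stat_statements, using PostgreSQL 13+ column names and the usual z-scores of 1.645 (p95) and 2.326 (p99), with all of Michael's caveats attached:

    SELECT query,
           calls,
           mean_exec_time,
           mean_exec_time + 1.645 * stddev_exec_time AS p95_ish_ms,
           mean_exec_time + 2.326 * stddev_exec_time AS p99_ish_ms
    FROM pg_stat_statements
    ORDER BY p99_ish_ms DESC
    LIMIT 20;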


Pain Points around tempdb

Kevin Hill has a list:

TempDB is the SQL Server equivalent of a junk drawer – everyone uses it, nobody monitors it, and eventually it becomes a bottleneck you can’t ignore.

Whether it’s poorly configured from the start or getting hammered by bad execution plans, TempDB often becomes the silent killer of performance. The good news? A few targeted changes can make a big impact.

Read on for some of tempdb’s greatest hits. Kevin also has a few quick tips for tempdb.
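
If you want a quick look at your own junk drawer first, here is a hedged T-SQL starting point that surfaces two of the usual suspects: file configuration and what is actually consuming space right now:

    USE tempdb;

    -- File layout: look for a single data file, or data files with
    -- mismatched sizes and growth settings.
    SELECT name, type_desc, size * 8 / 1024 AS size_mb, growth, is_percent_growth
    FROM sys.database_files;

    -- Current consumers: user objects, internal objects (sorts, spools,
    -- worktables), and the version store.
    SELECT SUM(user_object_reserved_page_count)     * 8 / 1024 AS user_objects_mb,
           SUM(internal_object_reserved_page_count) * 8 / 1024 AS internal_objects_mb,
           SUM(version_store_reserved_page_count)   * 8 / 1024 AS version_store_mb
    FROM sys.dm_db_file_space_usage;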
