Category: Error Handling

Troubleshooting Weird Issues

Chad Callihan says sometimes, the best answer is not to play the game:

After some database infrastructure changes related to phasing out the use of linked servers, I encountered issues with a setup tool used to build out new databases and other related features. One section of the tool was failing, and the errors indicated that there were still stored procedures utilizing linked servers, which was causing the problem. I asked myself a few questions on how best to proceed. Does the setup tool need to be updated? Do the related database procedures using linked servers need to be updated? Do the linked server changes made need to be rolled back altogether?

Read on for a proper Gordian Knot solution.

When All AG Nodes are Secondaries

Randy Knight demands quorum:

If you’ve encountered a situation where none of your SQL Server Always On Availability Group (AG) replicas become PRIMARY after a cluster failure — you’re not alone.  We recently had a customer with this exact scenario (AG won’t become primary after force quorum), and it is both uncommon and difficult to troubleshoot so I thought it would be worth posting about.

Click through for the scenario, what’s happening, and how to resolve this.

Split-Brain Scenarios in PostgreSQL Clusters

Semab Tariq knows that an application cannot serve two masters:

In this blog post, we will try to explore a critical failure condition known as a split-brain scenario that can occur in PostgreSQL HA clusters. We will first see what split-brain means, and then how it can impact PostgreSQL clusters, and finally discuss how to prevent it through architectural choices and tools available in the PostgreSQL ecosystem.

Click through for an explanation of split-brain and what can cause this problem. Additionally, Semab includes several tips on how to limit the likelihood of a split-brain scenario occurring.

sqlcmd in SQL Server 2025 and Certificate Chain Not Trusted

Vlad Drumea points out a new thing to keep an eye on:

SQL Server 2025 provides ODBC sqlcmd version 17 which enforces an encrypted connection.

If you’re trying to use it to connect to instances that don’t have a CA-signed certificate or where TLS encryption was never properly configured, sqlcmd will throw the famous “certificate chain not trusted” error message:

Sqlcmd: Error: Microsoft ODBC Driver 18 for SQL Server : SSL Provider: The certificate chain was issued by an authority that is not trusted.
Sqlcmd: Error: Microsoft ODBC Driver 18 for SQL Server : Client unable to establish connection.

The proper answer to this is to get trusted certificates. The workaround is what Vlad describes, so click through for that.
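
To make the distinction between encrypting a connection and trusting the server's certificate concrete, here is a minimal Python sketch that builds an ODBC Driver 18 connection string. The function name and its parameters are hypothetical and exist only for illustration; `Encrypt` and `TrustServerCertificate` are standard connection string keywords for the Microsoft ODBC driver. This is not necessarily the workaround Vlad describes, and turning on `TrustServerCertificate` skips certificate chain validation, so it is a stopgap rather than a fix.

```python
def odbc18_connection_string(
    server: str,
    database: str,
    trust_server_certificate: bool = False,
) -> str:
    """Build a connection string for ODBC Driver 18 (hypothetical helper).

    Encrypt=yes matches the driver's default behavior. With
    TrustServerCertificate=no, the client validates the certificate chain,
    which is where the "not trusted" error comes from when the server
    presents a self-signed certificate.
    """
    parts = [
        "Driver={ODBC Driver 18 for SQL Server}",
        f"Server={server}",
        f"Database={database}",
        "Trusted_Connection=yes",  # assumption: Windows authentication
        "Encrypt=yes",
        f"TrustServerCertificate={'yes' if trust_server_certificate else 'no'}",
    ]
    return ";".join(parts) + ";"


# Strict: requires a certificate chain the client trusts.
strict = odbc18_connection_string("sql01", "master")

# Stopgap: still encrypted, but the chain is not validated.
relaxed = odbc18_connection_string("sql01", "master", trust_server_certificate=True)
```

The server and database names above are placeholders. The traffic is encrypted in both cases; only the strict version gives you any assurance about which server you are actually talking to.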

“Can’t Determine Relationships between the Fields” in Power BI

Marco Russo and Alberto Ferrari explain an error:

When you create a Power BI matrix, you drag and drop columns in the matrix, then add some measures, and Power BI figures out on its own which combinations of values to show. The process is so intuitive that we mostly ignore the details. However, Power BI sometimes cannot figure out how to populate the matrix, thus producing the error: “can’t determine relationship between the fields”. Adding a measure fixes the problem, but why? In some other scenarios, Power BI shows many empty rows, eliminating many of them only when you add a measure. Power BI shows a subset of the values in other scenarios, even when no measure is involved.

Read on for the explanation.

The Unreliability of Microsoft Fabric

Brent Ozar points out some major issues:

The link https://aka.ms/fabricsupport takes you to a localized status page that almost always shows all green checkmarks – even when the service is on fire. During last month’s 12+hour overnight outage, people were screaming on Reddit overnight that things were down, but the status dashboard was showing all green. When Microsoft employees woke up, they asked if people were still having problems – and then eventually got around to updating the status page to reflect the outage when it was clear that things were really borked.

Redditors have resorted to relying on reporting Fabric outages to Statusgator, who then tracks the time gap between a burst of user outage reports, to the time Microsoft actually updates their status page – and it ain’t pretty:

Click through for Brent’s take and an embarrassingly bad post-mortem. Given that Microsoft Fabric is a software-as-a-service product, there’s an inherent level of trust necessary in using it: you’re relying upon the platform team to ensure things are running smoothly and that you get what you’re paying for. Incidents like this erode that trust. Outages themselves are bad, but they do happen. The real problem is in not embracing the outage: be clear with customers about current status and cause, and ensure people can easily see the history of events.

SQL Agent “Success” on Failure

Todd Kleinhans does not believe that green is good:

Far too many times, I have seen DBA(s) and others have this false sense that if the Agent run status shows green, then everything must be ok.

Click through for a funny story about a gas station robbery and examples of how a SQL Agent job can report success but actually fail. You also see this a lot with replication or tasks that are asynchronous in nature: the task is reporting that we successfully started whatever operation, but that doesn’t mean the operation itself succeeded.
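
The gap between "started" and "succeeded" can be sketched in a few lines of Python. Everything here is hypothetical (the `StepResult` type, `run_step`, and the two status functions are illustrative names, not part of SQL Agent or any real scheduler); the point is only that a status check which looks at whether the work was handed off will stay green even when the work itself blows up.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    started: bool    # the scheduler handed the work off
    succeeded: bool  # the work itself finished without error


def run_step(work: Callable[[], None]) -> StepResult:
    """Run a unit of work, recording 'started' and 'succeeded' separately."""
    try:
        work()
    except Exception:
        # The error is swallowed here, much like a job step that logs a
        # failure and carries on.
        return StepResult(started=True, succeeded=False)
    return StepResult(started=True, succeeded=True)


def naive_status(result: StepResult) -> str:
    """Green as long as the step was kicked off."""
    return "green" if result.started else "red"


def strict_status(result: StepResult) -> str:
    """Green only if the step started and actually completed."""
    return "green" if result.started and result.succeeded else "red"


def failing_backup() -> None:
    raise RuntimeError("backup target unreachable")


result = run_step(failing_backup)
print(naive_status(result), strict_status(result))
```

In practice, the strict check usually means validating the outcome itself (the backup file exists, the row counts match) rather than trusting the exit status alone.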

PARSE_SYNTAX_ERROR in Microsoft Fabric Notebooks

Olivier Van Steenlandt runs into an error:

As mentioned earlier, I have been playing around with Microsoft Fabric intensively in the past few months. During this period, I ran into a specific issue with one of my notebooks. What happened? Well, I was starting on a new notebook in the evening and life happened… So I stopped playing around to do something else.

A few days later, I wanted to continue my work and remembered that I was required to change something in my data load from a csv file.

Read on for the cause of this error. It’s something that can affect anyone at any time. Even you. Well, probably not you, but the person next to you? Yeah, even that person.

Review those Logs

Kevin Hill has a public service announcement:

Most SQL Server crashes don’t come out of nowhere.
They leave breadcrumbs – red flags that something’s not right. The problem? If you don’t know where to look, you miss the signs…until it’s 2am and your CEO’s calling.

Let’s talk about how to listen for those whispers before they turn into full-blown alarms.

Click through for some advice on the topic. I’ll also note that you can automatically retrieve and centralize everything Kevin mentions in a monitoring system, and once you have more than a couple of SQL Server instances, I’d recommend doing so.

Fixing OPTIMIZATION_REPLAY_FAILED Errors in SQL Server

Kendra Little fixes a problem:

Forcing plans with Query Store can be a powerful tool—until it mysteriously fails. In real production systems, plan forcing sometimes just… doesn’t work. One common culprit is the cryptic OPTIMIZATION_REPLAY_FAILED error.

If you’re hitting OPTIMIZATION_REPLAY_FAILED, try re-forcing the plan using @disable_optimized_plan_forcing=1.

Click through for a summary of the problem and additional helpful information on the issue.
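
As a rough illustration of the re-forcing step Kendra mentions, the Python sketch below generates the T-SQL involved. The helper name and the `FIND_FAILED_FORCED_PLANS` constant are hypothetical; the T-SQL itself relies on `sys.sp_query_store_force_plan` and the `sys.query_store_plan` catalog view, with the `@disable_optimized_plan_forcing` parameter assumed to be available (it arrived alongside optimized plan forcing in SQL Server 2022). Treat this as a sketch rather than Kendra's exact procedure.

```python
# Query to list forced plans whose last forcing attempt failed with
# OPTIMIZATION_REPLAY_FAILED.
FIND_FAILED_FORCED_PLANS = """\
SELECT query_id, plan_id, force_failure_count, last_force_failure_reason_desc
FROM sys.query_store_plan
WHERE is_forced_plan = 1
  AND last_force_failure_reason_desc = N'OPTIMIZATION_REPLAY_FAILED';"""


def build_reforce_plan_sql(query_id: int, plan_id: int) -> str:
    """Return T-SQL that forces a plan with optimized plan forcing disabled."""
    for name, value in (("query_id", query_id), ("plan_id", plan_id)):
        # Accept only positive integers so nothing unexpected is
        # interpolated into the statement.
        if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
            raise ValueError(f"{name} must be a positive integer")
    return (
        "EXEC sys.sp_query_store_force_plan "
        f"@query_id = {query_id}, @plan_id = {plan_id}, "
        "@disable_optimized_plan_forcing = 1;"
    )


print(FIND_FAILED_FORCED_PLANS)
print(build_reforce_plan_sql(42, 7))
```

The IDs 42 and 7 are placeholders; you would take the real values from the first query's output and run the generated statement in the database where the plan is forced.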