Press "Enter" to skip to content

Category: Error Handling

Ranger and Jersey Clients

Jon Morisi troubleshoots an irksome issue:

Just a quick blog here about an issue I had with HDP-3.1.4.0.  I recently was setting up a new user with specific rights in Ranger for Hive access.  After creating the new policy and attempting to validate it, I received an error message stating that the hive user does not have use privilege.  This error was produced even though I had just created the policy specifically granting those privilege’s.

Upon further review I noticed that the plugin was downloading the policy, but not applying it.  

Read on to learn what the problem was and how Jon corrected it.

Comments closed

Enumerating Breaking Changes to Power BI Reports

Brett Powell gives us a list of things which might cause breaking changes in Power BI reports:

A breaking change, which we can define as any change to a dataset which causes either reports to render errors or the dataset to fail to refresh, can severely impact business workflows and reflect poorly on those responsible for the solution. Given significant investments in other areas of the organization’s data estate such as Azure Synapse Analytics, a simple, easily avoidable oversight in a Power BI deployment may not be tolerated.

Read on for the list.

Comments closed

Error Handling Patterns in Kafka

Gerardo Villeda gives a few options for handling errors in an Apache Kafka topic:

Apache Kafka® applications run in a distributed manner across multiple containers or machines. And in the world of distributed systems, what can go wrong often goes wrong. This blog post covers different ways to handle errors and retries in your event streaming applications. The nature of your process determines the patterns, and more importantly, your business requirements.

This blog provides a quick guide on some of those patterns and expands on a common and specific use case where events need to be retried following their original order. This blog post illustrates a scenario of an application that consumes events from one topic, transforms those events, and produces an output to a target topic, covering different approaches as they gradually increase in complexity.

Click through for the list. Each explanation is pretty short, but opens the door for further analysis.

Comments closed

Data Pipeline Error Handling with Apache NiFi

Pieter Humphrey gives us a few techniques for handling data pipeline errors when running Apache NiFi:

The more complex the model, the more possible sources of problems exist. Forecasting every single potential problem is, of course, impossible. Identifying the most important ones and providing self-solving solutions can greatly reduce the operational uncertainty of our NiFi pipeline and improve its robustness.

To see how to do this analysis, we will consider four possible strategies: one external and three internal. They certainly do not cover all potential error scenarios, they are just examples that we can extrapolate from, and inform how to handle other potential failure domains.

Click through for an overview of the topic as well as those four techniques.

Comments closed

One…Million IO Requests

Sean Gallardy wins the jackpot:

If, somehow, you’ve managed to see this error in your errorlog then congratulations, you’ve won an instance of SQL Server that probably won’t be doing much.

I found out about this message a few months ago, but it has been in the product for years and I went this long without ever even knowing it existed (congrats me!) until I was asked about it and coincidentally ended up finding it in an errorlog the same week. Clearly, I have too much fun packed into my weeks. I asked around, only one other person had ever found this in an errorlog before… that’s either impressive, depressing, or some perfect quantity of both – mellow it out to a smooth melancholy.

Click through to see more information about the 1000000 IO error message and when you might find it.

Comments closed

Errors and Return Codes in SQL Agent Powershell Job Steps

Ron the Polymath has a framework:

PowerShell job steps offer a lot of advantages, but when things don’t work as expected, it can frustrating to understand why. Things like when a non-zero exit code reports the step as successful. Some important points I found with PowerShell steps (especially the first item):

Read on for those interesting points, for a block of Powershell code you can use to track errors, and a SQL Agent job template to boot.

Comments closed

VS_NEEDSNEWMETADATA in SSIS

Hadi Fadlallah discusses what was the bane of my existence for about 3 months in 2010:

In this article, we will briefly explain the VS_NEEDSNEWMETADATA SSIS exception, one of the most popular exceptions that an ETL developer may face while using SSIS. Then, we will run an experiment that reproduces this error. Then, we will show how we can fix it.

This was really annoying prior to SQL Server 2008 (at least, that’s my early-morning recollection of when the SSIS engine started trying to auto-fix this) and has been mildly annoying since. I had far too many conversations which I could summarize as “Yes, I understand that this Excel spreadsheet is basically the same, but it’s different in that the casing on one header column has changed slightly and that breaks the entire system.

Comments closed

Cannot Recover the Master Database

Erik Cobb takes us through an undesirable experience:

Recently, during patching for a 2019 SQL Server, the SQL services refused to start after the patching. It was throwing the following heart attack inducing error:

Cannot recover the master database. SQL Server is unable to run. Restore master from a full backup, repair it, or rebuild it.

This is not the first time I have seen this error. Actually, this seems to be the default error that Microsoft throws any time there is a problem with the patching process. The good news is, most likely this is a false error and the master database is perfectly fine.

Read on to learn more.

Comments closed

Fun with MERGE and Deadlocks

Daniel Hutmacher walks us through another reason to avoid using the MERGE operator:

I recently ran into a curious deadlock issue. I have a process that performs a lot of updates in a “state” table using multiple, concurrent connections. The business logic in the application guarantees that two connections won’t try to update the same item, so we shouldn’t ever run into any locking issues. And yet, we keep getting deadlocks.

What’s going on here? Hint: it has to do with isolation levels and range locks.

Read on for the problem-causing query and a few ways to resolve the problem.

Comments closed

Surviving a Kafka Outage

Jakub Korab walks us through availability features in Kafka as well as what to expect if your brokers are unavailable:

In the case of an outage, you have to ensure that these messages can be processed eventually. Keeping unsent messages around and retrying indefinitely in the hopes that the outage will rectify may eventually result in your application running out of memory. This is a crucial consideration in high-throughput applications.

If business functions are performed by systems downstream of Kafka, and the sending application only acts as an ingestion point, the situation is slightly more relaxed. If Kafka is unavailable to send messages to, then no external activity has taken place. For these systems, a Kafka outage might mean that you do not accept new transactions. In such a case, it may be reasonable to return an error message and allow the external third party to retry later. Retail applications typically fall into this category.

Read the whole thing.

Comments closed