Author: Kevin Feasel

Fixing OPTIMIZATION_REPLAY_FAILED Errors in SQL Server

Kendra Little fixes a problem:

Forcing plans with Query Store can be a powerful tool—until it mysteriously fails. In real production systems, plan forcing sometimes just… doesn’t work. One common culprit is the cryptic OPTIMIZATION_REPLAY_FAILED error.

If you’re hitting OPTIMIZATION_REPLAY_FAILED, try re-forcing the plan using @disable_optimized_plan_forcing=1.

Click through for a summary of the problem and additional helpful information on the issue.
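
For a sense of what that re-forcing call looks like, here is a minimal sketch in Python that only builds the T-SQL text for `sys.sp_query_store_force_plan`. The helper name `build_force_plan_sql` and the `query_id` and `plan_id` values are placeholders of mine, not something from Kendra's post; you would look up the real IDs in Query Store and run the statement through whatever client you normally use.

```python
def build_force_plan_sql(query_id: int, plan_id: int) -> str:
    """Render a T-SQL call that forces a Query Store plan with
    optimized plan forcing disabled (hypothetical helper)."""
    # Cast to int so only numeric identifiers ever reach the SQL text.
    query_id, plan_id = int(query_id), int(plan_id)
    return (
        "EXEC sys.sp_query_store_force_plan "
        f"@query_id = {query_id}, "
        f"@plan_id = {plan_id}, "
        "@disable_optimized_plan_forcing = 1;"
    )


# Placeholder IDs; real values come from the Query Store catalog views.
print(build_force_plan_sql(query_id=42, plan_id=7))
```

The sketch stops at generating the statement, so it makes no assumption about the driver or connection you use to execute it.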

Permissions to Execute Fabric Data Factory REST API Calls

Andy Leonard doesn’t need to ask for permission:

The problem we are trying to solve is: Grant an Azure Data Factory permission to execute the ReST API method calls against artifacts in a Fabric workspace.

I begin visiting the Azure Portal. If you don’t have an Azure account, you can create one. You can do a lot of things in Azure for little money, but you need an account and that account needs to be secured by a working source of funds in case you leave a virtual machine running for days. Hypothetically. BE CAREFUL. DO NOT LEAVE A VM RUNNING FOR DAYS.

Good en passant advice. I once blew through a $250 Azure credit by leaving an HDInsight cluster on for, uh, a few hours.

Goodbye, Azure ML SDK v1

I have a new video:

In this video, I cover some news from Microsoft around the deprecation of the Azure Machine Learning SDK v1. We’ll take a look at the upgrade guide and see what it will take to perform this upgrade.

Microsoft will still support the SDK v1 until September of 2026, so we have a year to get code sorted out. The CLI v1, however, will go away sooner, so be sure you’re keeping up on that.

End of Month Testing in PowerShell

Andy Levy checks if this is the end of the month:

This is one of those blog posts you write so that 2 years later, you can look it up to remind yourself how to do something.

I found myself needing to figure out if “today” was the end of the month in PowerShell. In T-SQL, this is easy, as we have the EOMONTH() function. But PowerShell (the .NET System.DateTime struct) doesn’t have the same thing.

Read on for the solution Andy came up with.
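
To illustrate the general idea, here is a minimal sketch in Python rather than PowerShell, so it is an analogue and not Andy's actual solution. It relies on one assumption of mine about how to frame the check: a date is the last day of its month exactly when the following day falls in a different month. The function name `is_end_of_month` is hypothetical.

```python
from datetime import date, timedelta


def is_end_of_month(day: date) -> bool:
    """Return True when `day` is the last day of its month."""
    # Only the last day of a month has a successor in a different month.
    # Note: date.max (9999-12-31) would overflow on the addition.
    return (day + timedelta(days=1)).month != day.month


print(is_end_of_month(date(2024, 2, 29)))  # leap-year February
print(is_end_of_month(date(2025, 7, 30)))  # July has 31 days
```

Adding a day and comparing months sidesteps any table of month lengths or leap-year rules, since the date arithmetic handles both.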

Reading Delta Tables via SQL Code in a Microsoft Fabric Python Notebook

Gilbert Quevauvilliers writes a SQL statement:

I come from a TSQL background, so using SQL makes it easy for me to work with data.

There are multiple ways to use SQL in a PySpark notebook, and when I started using a Python notebook it was not so straightforward.

In this blog post I will show you how I use SQL Code.

As mentioned previously I am by no means an expert, I typically find a way that works, is fast and doesn’t consume a lot of capacity. If that works consistently for me then that is how I go about it.

Click through for the solution, which uses DuckDB. As such, the SQL syntax isn’t T-SQL; it’s closer to the PostgreSQL dialect. But it does do a great job of interacting with Parquet files and Delta tables.
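
As a rough sketch of what querying a Delta table through DuckDB can look like in Python, the snippet below uses the `delta_scan` table function from DuckDB's `delta` extension. This is my own illustration and not necessarily how Gilbert structures his notebook: the function names and the table path are hypothetical, and `preview` assumes the third-party `duckdb` package is installed and able to load the extension.

```python
def delta_query(table_path: str, limit: int = 10) -> str:
    """Build a DuckDB SQL statement that reads a Delta table by path."""
    # Double any single quotes so the path stays a valid SQL string literal.
    safe_path = table_path.replace("'", "''")
    return f"SELECT * FROM delta_scan('{safe_path}') LIMIT {int(limit)}"


def preview(table_path: str, limit: int = 10):
    """Run the query and return the rows (requires the duckdb package)."""
    import duckdb  # imported here so delta_query has no dependencies

    con = duckdb.connect()
    con.execute("INSTALL delta")
    con.execute("LOAD delta")
    return con.execute(delta_query(table_path, limit)).fetchall()


# Hypothetical path to a table in a notebook's attached lakehouse.
print(delta_query("/lakehouse/default/Tables/sales"))
```

Calling `preview` with a real table path would return the first few rows as a list of tuples.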

Tips for Using Database Snapshots

Stephen Planck offers some advice:

Database snapshots are one of those features that’s been around forever, but still solves real-world problems with very little setup. In a single statement you can capture a point-in-time, read-only copy of any user database, use it for reporting or off-load testing, and—if disaster strikes—revert the source back to that snapshot in minutes. This guide explains how snapshots work under the hood, walks through day-to-day tasks (including creating the original database), and highlights the pitfalls you should plan for before using them in production.

One additional tip that I’d offer: limit yourself to one database snapshot per database. There’s a performance cliff you can hit when you get fancy and try to have multiple database snapshots going at the same time off of the same database.

Queue-Style Batch Deletion Woes in PostgreSQL

Shayon Mukherjee runs into an interesting issue:

I recently discovered an unexpected behavior in PostgreSQL involving a pattern of using a Common Table Expression (CTE) with DELETE ... RETURNING and LIMIT to process a batch of items from a queue-like table. What seemed straightforward turned out to have a surprising interaction with the query planner.

Click through to see what happened, as well as Shayon’s recommendation on how to fix it.

Kafka Data Exploration with Tableflow

Robin Moffatt does some exploratory data analysis:

One of the challenges that I’d always had when it came to building streaming data pipelines is that once data is in a Kafka topic, it becomes trickier to query. Whether limited by the available tools to do this or the speed of access, querying Kafka is just not a smooth experience.

This blog post will show you a really nice way of exploring and validating data in Apache Kafka®. We’ll use Tableflow to expose the Kafka topics as Apache Iceberg™️ tables and then query them using standard SQL tools.

Click through for the demonstration using a real dataset.

Creating Error Bars in ggplot2

Zhenguo Zhang draws a chart:

Sometimes you may want to create a plot with the following features:

  • a point to indicate the mean of a group
  • error bars to indicate the standard deviation of the group
  • and each group may have subgroups, which are represented by different colors.

In this post, I will show you how to create such a plot using the ggplot2 package in R.

Read on for the demonstration, as well as a fix for the common problem of overlapping data points. H/T R-Bloggers.