Press "Enter" to skip to content

Author: Kevin Feasel

Online Upgrade of Logical and Physical Replication Nodes in Postgres

Amit Kapila notes an improvement in replication capabilities:

In my last blog post, Evolution of Logical Replication, I mentioned the future development of a feature to allow “upgrades of logical replication nodes.” The upcoming release of PostgreSQL 17 includes this feature. Previously, after major version upgrades (via pg_upgrade), users couldn’t immediately connect and write data to logical replication nodes. This was because the slots were lost during upgrades, preventing replication from continuing. As a result, new writes wouldn’t get replicated, causing data on both nodes to become out of sync. As explained in this blog post, users had to block applications from writing until the replication setup was re-enabled after the upgrade.

Read on to see how PostgreSQL 17 changes things.
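
As a quick sanity check after the upgrade (my own sketch, not something from Amit’s post), you can query pg_replication_slots on the upgraded publisher to confirm the logical slots made the trip. The connection string is a placeholder:

```python
# Sketch: verify logical replication slots survived pg_upgrade on PostgreSQL 17.
# The connection string is a placeholder.
import psycopg2

conn = psycopg2.connect("host=localhost port=5432 dbname=postgres user=postgres")
with conn, conn.cursor() as cur:
    cur.execute("""
        SELECT slot_name, plugin, slot_type, active
        FROM pg_replication_slots
        WHERE slot_type = 'logical';
    """)
    for slot_name, plugin, slot_type, active in cur.fetchall():
        print(f"{slot_name}: plugin={plugin}, type={slot_type}, active={active}")
conn.close()
```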


Azure Database for PostgreSQL Backup and Restoration

Josephine Bush makes and restores a backup:

I wanted to test a migration of Azure PostgreSQL from simple to flexible. I wanted a simple db to restore onto my single server. I may not have needed to do this, but I was then curious how one would go about dumping and restoring on Azure Postgres in general. I plan to use the Azure migration service, but since I started this dump/restore exercise, I thought I would document it here.

Read on for the process.
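
For reference, the dump-and-restore round trip looks roughly like the sketch below, driven here from Python’s subprocess module. The server names, database names, and users are placeholders rather than details from Josephine’s post; passwords come from PGPASSWORD or ~/.pgpass.

```python
# Rough sketch of a pg_dump/pg_restore round trip between two Azure
# PostgreSQL servers. All names are placeholders.
import subprocess

# Dump in custom format (-Fc) so pg_restore can be used on the target side.
subprocess.run([
    "pg_dump",
    "-h", "mysingleserver.postgres.database.azure.com",  # placeholder source
    "-U", "demoadmin@mysingleserver",                    # single server user format
    "-d", "demodb",
    "-Fc", "-f", "demodb.dump",
], check=True)

# Restore into the target server; --no-owner sidesteps role mismatches
# between the two servers.
subprocess.run([
    "pg_restore",
    "-h", "myflexserver.postgres.database.azure.com",    # placeholder target
    "-U", "demoadmin",
    "-d", "demodb",
    "--no-owner", "demodb.dump",
], check=True)
```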


Debugging a DAX Measure with DAX Query View

Dennes Torres sorts out an issue:

DAX Query View was created some time ago, and it is a great tool for running DAX inside the Power BI environment.

However, testing a measure in DAX Query View may not be so obvious. While a measure in a report obeys the filter context and usually results in a single value, the filter context doesn’t exist in Query View, and the execution needs to return a table.

Because of this, a simple copy/paste of the measure into DAX Query View will not run. The measure needs to be adapted before it can be tested.

Let’s analyze a sample case to see how this happens.

Click through for the answer.


Tracking Python Packages in Snowflake

Kevin Wilkie takes a peek:

When working with any of the many modern languages that use libraries, one of the things to be aware of as a developer is which versions of those libraries are available for your use.

Since there are multiple languages in Snowflake that use libraries, let’s go over how to check out the versions that come installed and how to install one yourself.

Read on for those answers. Well, one answer and one conundrum.
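
For the curious, the version check runs against Snowflake’s INFORMATION_SCHEMA.PACKAGES view. Here is a rough sketch using Snowpark; the connection parameters are placeholders, and this is my illustration rather than the exact query from the post:

```python
# Sketch: list the versions of a Python package available in Snowflake,
# via the INFORMATION_SCHEMA.PACKAGES view. Connection details are placeholders.
from snowflake.snowpark import Session

session = Session.builder.configs({
    "account": "myaccount",    # placeholder
    "user": "myuser",          # placeholder
    "password": "mypassword",  # placeholder
    "warehouse": "compute_wh",
    "database": "mydb",        # PACKAGES lives in each database's INFORMATION_SCHEMA
}).create()

rows = session.sql("""
    SELECT package_name, version
    FROM information_schema.packages
    WHERE language = 'python'
      AND package_name = 'pandas'
    ORDER BY version
""").collect()

for row in rows:
    print(row["PACKAGE_NAME"], row["VERSION"])

session.close()
```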


Simple Outlier Detection and Removal in R

Steven Sanderson looks for oddities:

Outliers can significantly skew your data analysis results, leading to inaccurate conclusions. For R programmers, effectively identifying and removing outliers is crucial for maintaining data integrity. This guide will walk you through various methods to handle outliers in R, focusing on multiple columns, using a synthetic dataset for demonstration.

The techniques Steven uses are perfectly reasonable (though I like to use MAD from the median rather than standard deviations from the mean because MAD from the median doesn’t suffer from the sorts of endogeneity problems standard deviation does in a dynamic process). My primary warning would be to keep outliers in a dataset unless you know why you’re removing them. If you know the values were impossible or wrong—for example, a person who works 500 hours a week—that’s one thing. But sometimes, you get exceptional values out of an ordinary process, and those values are just as real as any other. I might have had a sequence in which I flipped a fair coin and it landed on heads 10 times in a row. It’s statistically very uncommon, but that doesn’t mean you can ignore it as a possibility or pretend it didn’t happen.
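
If you want to see that technique in action, here’s a minimal sketch of MAD-from-the-median flagging, in Python with numpy rather than Steven’s R:

```python
# Minimal sketch: flag outliers by scaled MAD from the median.
# The 1.4826 constant makes the MAD comparable to a standard deviation
# when the underlying data are normal.
import numpy as np

rng = np.random.default_rng(42)
x = np.concatenate([rng.normal(50, 5, 200), [120.0, -40.0]])  # two planted outliers

median = np.median(x)
mad = 1.4826 * np.median(np.abs(x - median))

is_outlier = np.abs(x - median) > 3 * mad  # 3 scaled MADs is a common cutoff
print(x[is_outlier])  # surfaces the planted 120.0 and -40.0
```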


Data Ingestion with Microsoft Fabric Copy Jobs

Reitse Eskens spends a bunch of time at the copier:

The copy job is essentially an abstraction of a pipeline reading data from the source system and writing the data into either a Lakehouse or a Warehouse. It really is ingesting data and nothing else. In my opinion, that’s what copy data flows are meant to do, and they are very good at it, too.

The big challenge we all keep facing is how to create incremental loads. We have to build some sort of metadata database where we keep the latest ID, date, or other column we use to discern the increment. In our flow, we need to get that value, compare it against the source system, and get the differences. The biggest task is finding out whether records have been deleted.

With the Copy Job, a large part of this task is taken out of your hands. The Copy Job has a configuration GUI (or wizard) that helps you out quite quickly. So let’s not waste any more characters and dig in!

Read on to see how it works, along with its capabilities and limitations. The key question, as always, is whether your workload fits into its wheelhouse. If so, this sounds really useful. If not, it’s a proper struggle.
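
To make the watermark bookkeeping concrete, here is a generic sketch of the manual pattern the Copy Job automates. The metadata store, source, and target interfaces are all hypothetical:

```python
# Generic sketch of the manual high-watermark pattern a Copy Job automates.
# meta, source, and target are hypothetical interfaces, not Fabric APIs.
def load_increment(meta, source, target, table, watermark_col="ModifiedDate"):
    # 1. Fetch the last watermark recorded for this table.
    last = meta.get_watermark(table)

    # 2. Pull only the rows changed since that watermark.
    rows = source.query(
        f"SELECT * FROM {table} WHERE {watermark_col} > ?", [last]
    )

    # 3. Write the delta to the Lakehouse or Warehouse target.
    target.upsert(table, rows)

    # 4. Advance the watermark to the highest value just loaded.
    if rows:
        meta.set_watermark(table, max(row[watermark_col] for row in rows))

    # Note: hard deletes in the source never show up in this query.
    # Detecting them takes full comparisons or CDC, which is the pain
    # point called out above.
```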


TLS 1.2 (or Later) in Azure SQL

Sakshi Gupta provides a public service announcement:

From November 1st, any Azure SQL server left with the “Select an option” or “NONE” setting (where “NONE” means no enforced minimum TLS version) will only allow connections using TLS 1.2 and TLS 1.3. Connections using TLS 1.0 or TLS 1.1 will be rejected. It is critical for all customers to configure their servers correctly and ensure that their client applications can operate with TLS 1.2 or higher.

Pretty much any SQL Server client or driver that Microsoft released from 2016 onward will support TLS 1.2, so for most organizations, this should be as simple as enabling the option in development and ensuring applications connect as expected.
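
On the client side, a quick readiness check is to connect with a modern driver and encryption enforced, along the lines of this sketch (server and credentials are placeholders):

```python
# Quick client-side readiness check. ODBC Driver 18 negotiates TLS 1.2+,
# so a successful encrypted connection means the app side is ready.
# Server name and credentials are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=myserver.database.windows.net;"
    "DATABASE=mydb;UID=myuser;PWD=mypassword;"
    "Encrypt=yes;TrustServerCertificate=no;"
)
print(conn.execute("SELECT @@VERSION;").fetchval())
conn.close()
```

On the server side, the enforced minimum can be set in the portal or with the Azure CLI via az sql server update --minimal-tls-version.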


Tips for Navigating the Support Ticket Process

Kendra Little shares several tips:

Asking Microsoft for support for SQL Server or Azure SQL is a lousy experience these days. This is true whether you are using a cheaper service tier or the more expensive support tier formerly known as “Premier Support.” Either way, I’ve found the same issues: as the person requesting support, I must know a whole lot about the root cause of my problem and how to solve it, or my request will be dismissed with misinformation. I need to have data and metrics that back up my claims in order to get the ticket escalated to someone who can help, and I will need to provide those receipts three or four times. Once something is escalated to the Product Group, I may get a helpful response, but it will generally take a while. If I’m not engaged directly with the Product Group and the answer is being relayed through a lower support tier, it often won’t make much sense.

These issues don’t happen due to bad work ethics or personal failings of support workers. These are good humans who are trying their best! The problem is worse because it’s systemic.

Kendra’s specific advice is around Microsoft and the Azure SQL family of products (SQL Server, Azure SQL DB, Azure SQL Managed Instance), but the advice is sound for much more than that. This advice will help you out when dealing with the support organization for pretty much any large company.
