Kevin Feasel – Page 118

In my last blog post, Evolution of Logical Replication, I mentioned the future development of a feature to allow “upgrades of logical replication nodes.” The upcoming release of PostgreSQL 17 includes this feature. Previously, after major version upgrades (via pg_upgrade), users couldn’t immediately connect and write data to logical replication nodes. This was because the slots were lost during upgrades, preventing replication from continuing. As a result, new writes wouldn’t get replicated, causing data on both nodes to become out of sync. As explained in this blog post, users had to block applications from writing until the replication setup was re-enabled after the upgrade.

Read on to see how PostgreSQL 17 changes things.

Comments closed

Tracking Microsoft Fabric Notebook Progress

Published 2024-09-26 by Kevin Feasel

Gilbert Quevauvilliers asks are we there yet? are we there yet?

How to view or track the progress of Notebook while it is running in Microsoft Fabric

I was recently working with a Notebook in Microsoft Fabric that was started via a Data Pipeline.

The challenge I had was that I had no idea how far the notebook had gone (as there were quite a lot of cells in this particular notebook).

In this blog post I am going to show you how I can use Microsoft Fabric to identify exactly which cell my notebook is currently on.

Click through for the answer. And so help me, if you ask that question one more time, I’m turning this thing around and we’re going back home.

Comments closed

Debugging a DAX Measure with DAX Query View

Published 2024-09-26 by Kevin Feasel

Dennes Torres sorts out an issue:

DAX Query View was created some time ago and it was a great tool to run DAX inside Power BI environment.

However, testing a measure in DAX Query View may not be something so obvious. While the measure in a report obeys to the filter context and most of times results in a single value, the filter context doesn’t exist in Query View and the execution needs to return a table.

In this way, a simple copy/past of the measure in DAX Query View will not run. The measure needs to be adapted to be tested.

Let’s analyze one sample case to discover how this happens.

Click through for the answer.

Comments closed

Azure Database for PostgreSQL Backup and Restoration

Published 2024-09-26 by Kevin Feasel

Josephine Bush makes and restores a backup:

I wanted to test a migration of Azure PostgreSQL from simple to flexible. I wanted a simple db to restore onto my single server. I may not have needed to do this, but I was then curious how one would go about dumping and restoring on Azure Postgres in general. I plan to use the Azure migration service, but since I started this dump/restore exercise, I thought I would document it here.

Read on for the process.

Comments closed

Tracking Python Packages in Snowflake

Published 2024-09-25 by Kevin Feasel

Kevin Wilkie takes a peek:

When working with one of the many modern computer languages that use libraries, one of the many things to be aware of – as a developer – is the version of the libraries available for your usage.

Since there are multiple languages in Snowflake that use libraries, let’s go over how to check out the versions that come installed and how to install one yourself.

Read on for those answers. Well, one answer and one conundrum.

Comments closed

Simple Outlier Detection and Removal in R

Published 2024-09-25 by Kevin Feasel

Steven Sanderson looks for oddities:

Outliers can significantly skew your data analysis results, leading to inaccurate conclusions. For R programmers, effectively identifying and removing outliers is crucial for maintaining data integrity. This guide will walk you through various methods to handle outliers in R, focusing on multiple columns, using a synthetic dataset for demonstration.

The techniques Steven uses are perfectly reasonable (though I like to use MAD from the median rather than standard deviations from the mean because MAD from the median doesn’t suffer from the sorts of endogeneity problems standard deviation does in a dynamic process). My primary warning would be to keep outliers in a dataset unless you know why you’re removing them. If you know the values were impossible or wrong—for example, a person who works 500 hours a week—that’s one thing. But sometimes, you get exceptional values out of an ordinary process, and those values are just as real as any other. I might have had a sequence in which I flipped a fair coin and it landed on heads 10 times in a row. It’s statistically very uncommon, but that doesn’t mean you can ignore it as a possibility or pretend it didn’t happen.

Comments closed

Data Ingestion with Microsoft Fabric Copy Jobs

Published 2024-09-25 by Kevin Feasel

Reitse Eskens spends a bunch of time at the copier:

The copy job is essentially an abstraction of a pipeline reading data from the source system and writing the data into either a Lakehouse or a Warehouse. It really is ingesting data and nothing else. In my opinion that what copy data flows are meant to do and are very good at too.

The big challenge we all keep facing is how to create incremental loads. We have to build some sort of metadata database where we keep the latest ID, data or other column we use to discern the increment on. In our flow, we need to get that value, compare it against the source system and get the differences. The biggest task is to find out if records are deleted.

With the Copy Job, a large part of this task is taken out of your hands. The Copy Job has a configuration GUI (or wizard) that helps you out quite quickly. So let’s not waste anymore characters and dig in!

Read on to see how it works and its capabilities and limitations. The key question, as always, is whether your workload fits into the wheelhouse. If so, this sounds really useful. If not, it’s a proper struggle.

Comments closed

The Limitations of TRY-CATCH in SQL Server

Published 2024-09-25 by Kevin Feasel

Brent Ozar tries to catch but lets it slip through his fingers:

If you’re using TRY/CATCH to do exception handling in T-SQL, you need to be aware that there are a lot of things it doesn’t catch. Here’s a quick example.

Let’s set up two tables – bookmarks, and a process log to track whether our stored proc is working or not:

Read on for the example.

Comments closed

TLS 1.2 (or Later) in Azure SQL

Published 2024-09-25 by Kevin Feasel

Sakshi Gupta provides a public service announcement:

From November 1st, any Azure SQL server left with the “Select an option” or “NONE” setting (where “NONE” means no enforced minimum TLS version) will only allow connections using TLS 1.2 and TLS 1.3. Connections using TLS 1.0 or TLS 1.1 will be rejected. It is critical for all customers to configure their servers correctly and ensure that their client applications can operate with TLS 1.2 or higher.

Pretty much any SQL Server client or driver that Microsoft released from 2016 forwards will support TLS 1.2, so for most organizations, this should be as simple as enabling the option in development and ensuring applications connect as expected.

Comments closed

Comparing Pandas to Koalas in Microsoft Fabric

Published 2024-09-25 by Kevin Feasel

Tomaz Kastrun does some testing:

Data engineering and even simple data wrangling functions in Fabric can make several tasks faster, when you know know, which package (language) to choose. By comparing Python Pandas with PySpark Pandas (Koalas), we will see that there are huge performance gains, when using correct language.

Click through for the demo.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Author: Kevin Feasel

Online Upgrade of Logical and Physical Replication Notes in Postgres

Tracking Microsoft Fabric Notebook Progress

Debugging a DAX Measure with DAX Query View

Azure Database for PostgreSQL Backup and Restoration

Tracking Python Packages in Snowflake

Simple Outlier Detection and Removal in R

Data Ingestion with Microsoft Fabric Copy Jobs

The Limitations of TRY-CATCH in SQL Server

TLS 1.2 (or Later) in Azure SQL

Comparing Pandas to Koalas in Microsoft Fabric