Press "Enter" to skip to content

Day: December 13, 2024

Churn Analysis using Logistic Regression in Python

Daniel Calbimonte takes us through a churn analysis scenario:

This article explains how to analyze the data using Python and perform customer churn analysis to determine why customers stop using a service.

Read on for the article. If you want to dig deeper into churn analysis, I can recommend a book entitled Fighting Churn with Data. Its focus is more on categorical and numerical analysis rather than using statistical classification techniques like logistic regression to identify churn factors. That makes it easier to digest for non-statisticians, especially because most of the code is SQL.

Leave a Comment

The Showdown: Spark vs DuckDB vs Polars in Microsoft Fabric

Miles Cole puts together a benchmark:

There’s been a lot of excitement lately about single-machine compute engines like DuckDB and Polars. With the recent release of pure Python Notebooks in Microsoft Fabric, the excitement about these lightweight native engines has risen to a new high. Out with Spark and in with the new and cool animal-themed engines— is it time to finally migrate your small and medium workloads off of Spark?

Before writing this blog post, honestly, I couldn’t have answered with anything besides a gut feeling largely based on having a confirmation bias towards Spark. With recent folks in the community posting their own benchmarks highlighting the power of these lightweight engines, I felt it was finally time to pull up my sleeves and explore whether or not I should abandon everything I know and become a DuckDB and/or Polars convert.

Read on for the method and results from several thoughtful tests.

Leave a Comment

Using GitHub in SSMS 21

Brent Ozar ties into a Git repo:

SQL Server Management Studio v21 added native Git support, making it easier to use source control natively inside SSMS.

This feature is for developers who are already used to working with Github to manage specific object changes as they work, and who are already accustomed to Git terminologies like branches, commits, pushes, and pulls. This feature is not for DBAs who think it’s going to automatically keep your SQL Server’s object definitions inside source control for you. I totally understand why you’d want that, for sure, and I do as well, but our princess is in another castle.

Read on to see how it works. As I went through the article, my thought was that this is almost exactly the same experience as what you get with Visual Studio. And on the whole, I’d consider that a good thing, as this isn’t some implementation of 10% of the total functionality.

Leave a Comment

Hiding Power BI Report Pages in Workspace and Org Apps

Jon Vöge hides a link:

Do you want to share Power BI Reports with End Users through an app, but hide the Page Navigation of the report?

This is especially useful if you have built-in navigation using Buttons, Sidebars or other menus in your report.

Luckily, there is a quick solution. Well, two solutions actually, depending on whether you are using Workspace Apps, or the new Organizational App item in Fabric.

Read on to see how it all works.

Leave a Comment

Waiting for Locks in Postgres

Hubert Lubaczewski wants to make a change:

I once wrote about this problem, but given that we now have DO blocks and procedures I can make nicer, easy to use example.

Over the years there have been many improvements to how long ALTER TABLE can take. You can now (in some cases) change datatype without rewrite or add default value.

Regardless how fast the thing works, it still needs extremely heavy (though shortlived) lock: Access Exclusive.

Read on to see how you can write a SQL operation that waits for a lock and, if it does not get this lock, retries with backoff.

Leave a Comment

Ways to Land Data into Microsoft Fabric OneLake

James Serra puts on a cape and takes on an iconic laugh:

Microsoft Fabric is rapidly gaining popularity as a unified data platform, leveraging OneLake as its central data storage hub for all Fabric-integrated products. A variety of tools and methods are available for copying data into OneLake, catering to diverse data ingestion needs. Below is an overview of what I believe are the key options:

Read on for a baker’s dozen methods.

Leave a Comment