Press "Enter" to skip to content

Author: Kevin Feasel

Time Range Generation in Data Diluvium

Adron Hall extends Data Diluvium:

Following up on my previous posts about adding humidity and temperature data generation to Data Diluvium, I’m now adding a Time Range generator. I decided this would be a nice addition to give any graphing of the data a good look. This will complete the trio of generators I needed for my TimeScale DB setup. While humidity and temperature provide the environmental data, the Time Range generator ensures we have properly spaced time points for our time-series analysis.

Click through to see how it works.

Leave a Comment

Checking Who Created a Table in SQL Server

Burt King reviews the logs:

I have noticed new SQL Server database objects have been created and want to know how we can track down who created these objects. What options are available in SQL Server? In this article, learn how in SQL Server to check who created a table or other objects.

Burt shows a couple of techniques for this, though I’d lean heavily on using Extended Events over a server-side trace or Profiler for the task.

Leave a Comment

Reactable Tables with Sparklines in Shiny Apps

Osheen MacOscar continues a series:

This is the third blog in a series about the {sparkline} R package for inline data visualisations. You can read the first one about getting started with the package here and the second one about embedding them in HTML tables with the {reactable} package here.

In this blog I am taking it a step further and demonstrating how to use our sparkline reactable table in a Shiny app. Thankfully {reactable} has some helpful functions that make this super easy! I will also look at using a dynamic traffic light image in a reactable table at the end.

Click through to see how it all works.

Leave a Comment

Converting a CSV to Parquet with DuckDB and Polars in R

Michael Mayer makes a swap:

In this recent post, we have used Polars and DuckDB to convert a large CSV file to Parquet in steaming mode – and Python.

Different people have contacted me and asked: “and in R?”

Simple answer: We have DuckDB, and we have different Polars bindings. Here, we are using {polars} which is currently being overhauled into {neopandas}.

Click through for the comparison.

Leave a Comment

Calculating a Matrix Inversion in SQL Server

Sebastiao Pereira performs matrix math in-database:

There are numerous applications to obtain a Matrix inverse for a given Matrix. Is it possible to do it using only SQL Server? Read on to learn how to build a matrix inverse calculator using a set of SQL Server custom functions.

I expect this to be extremely slow in comparison to GPU-based methods using a language like C, but this approach maximizes style points.

Leave a Comment

Avoid Exposing PostgreSQL Port 5432 to the Internet

Christophe Pettus shares some good advice:

Sometimes, we run into a client who has port 5432 exposed to the public Internet, usually as a convenience measure to allow remote applications to access the database without having to go through an intermediate server appllication.

Do not do this.

This is the equivalent of exposing port 1433 on a SQL Server instance to the broader internet, and is a bad idea for many of the same reasons.

Leave a Comment

B-Tree Indexes in PostgreSQL vs SQL Server

Lukas Fittl compares and contrasts:

When it comes to optimizing query performance, indexing is one of the most powerful tools available to database engineers. Both PostgreSQL and Microsoft SQL Server (or Azure SQL) use B-Tree indexes as their default indexing structure, but the way each system implements, maintains, and uses those indexes varies in subtle but important ways.

In this blog post, we explore key areas where PostgreSQL and SQL Server diverge: how their B-Tree indexes implementations behave under the hood and how they store and access data on disk. We’ll also benchmark the impact of deduplication of values on index size in each database system.

I love this kind of post because you hear that SQL Server has indexes and PostgreSQL has indexes (or Oracle has indexes or whatever), and thus, all of your index building knowledge in one applies to the other…right?

One thing that changes the article a bit is that the author doesn’t use page-level compression on indexes in SQL Server. I’d expect the results to change a fair amount, even if the SQL Server non-clustered indexes still ended up larger in the end than PostgreSQL indexes.

Leave a Comment

Understanding Multi-Version Concurrency Control in PostgreSQL

Grant Fritchey explains a mechanism:

Let me start by giving you the short version of what MVCC is, and then the rest of the article explains more details. Basically, PostgreSQL is focused on ensuring, as much as possible, that reads don’t block writes and writes don’t block reads. This is done by always, only, inserting rows (tuples). No updates to an existing row. No actual deletes or updates. Instead, it uses a logical delete mechanism, which we’ll get into. This means that data in motion doesn’t interfere with data at rest, meaning a write doesn’t interfere with a read, therefore, less contention & blocking. There’s a lot to how all that works, so let’s get into it.

Click through for the dive. The pattern for MVCC is interesting, though quite different from pretty much any other implementation of concurrency management in a relational database system.

Leave a Comment

Azure SQL Mirroring in Microsoft Fabric Updates

Idris Motiwala announces a set of changes:

Attention data engineers, database developers, and data analysts! We’re pumped to reveal exciting upgrades to Mirroring for Azure SQL Database in Fabric today at the Fabric Conference in Las Vegas 2025. Since it became Generally Available, Mirroring for Azure SQL Database has been a game-changer, letting you replicate data seamlessly and integrate it within the Fabric environment. We’re talking near real-time data accessibility and some serious analytics power!

Click through to see what’s new, as well as what’s upcoming. This is specifically for Azure SQL DB and Azure SQL Managed Instance, versus SQL Server running as an Azure VM or on-premises (or some other cloud), so calibrate expectations accordingly.

Leave a Comment