
Day: November 10, 2025

Pulling Random Values from a Gaussian Distribution in T-SQL

Sebastiao Pereira has another way of populating a random variable:

Generating random numbers from a normal distribution is essential for accuracy and realistic modeling. It is used for simulation, inference, cryptography, and algorithm design across scientific, engineering, statistical, and AI domains. Is it possible to generate Gaussian random numbers in SQL Server using the Ziggurat algorithm without external tools?

I was not familiar with this technique, so it’s neat to see it in action.
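For a feel of what drawing from a Gaussian distribution involves, here is a minimal Python sketch. To be clear, this is not the Ziggurat algorithm from the linked post and it is not T-SQL: it uses the simpler Box-Muller transform, which turns two uniform random values into two standard-normal values. The function names `box_muller` and `gaussian_samples` and the default seed are my own, chosen purely for illustration.

```python
import math
import random


def box_muller(rng: random.Random) -> tuple[float, float]:
    """Return two independent standard-normal draws built from two uniform draws."""
    # 1 - random() lies in (0, 1], so the logarithm below is always defined.
    u1 = 1.0 - rng.random()
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def gaussian_samples(n: int, mean: float = 0.0, stddev: float = 1.0,
                     seed: int = 42) -> list[float]:
    """Return n draws from a normal distribution with the given mean and stddev."""
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatable runs
    samples: list[float] = []
    while len(samples) < n:
        z0, z1 = box_muller(rng)
        # Shift and scale the standard-normal values to the requested distribution.
        samples.extend((mean + stddev * z0, mean + stddev * z1))
    return samples[:n]
```

Ziggurat trades this kind of per-sample logarithm and trigonometry for precomputed lookup tables, which is a large part of its appeal when speed matters.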


Pandas vs Polars for DataFrame Modification

Russ Hyde compares Pandas and Polars:

In Data Science we are often working with rectangular data structures – databases, spreadsheets, data-frames. Within Python alone, there are multiple ways to work with this type of data, and your choice is constrained by data volume, storage, fluency and so on. For datasets that could readily be held in memory on a single computer, the standard Python tool for rectangling is Pandas, which became an open-source project in 2009. Many other tools now exist though. In particular, the Polars library has become extremely popular in Python over recent years. But when Pandas works, is well-supported, and is the standard tool in your team or your domain, and if you are primarily working with in-memory datasets, is there any value in learning a new data-wrangling tool? Of course there is.

Read on for a demonstration of fairly basic data operations and how they differ in Pandas vs Polars.


An Overview of Azure Managed Cassandra’s Architecture

Amy Abel describes an architecture:

I’ve been learning about Azure Managed Cassandra recently, and it’s very different from the usual relational SQL Server database. The documentation and tutorials can seem confusing at first, but once I broke things down it was easier to understand basic concepts.

Read on for a warning about different flavors of Cassandra, as well as how Microsoft has organized things in their implementation of Cassandra.


Business Continuity Options in Azure

Aleksey Vitsko enumerates available options:

You may be familiar with high availability (HA) and disaster recovery (DR) features that are available in SQL Server and have experience configuring and managing them. But have you ever heard of or tried Azure high availability or Azure disaster recovery features? How can I learn more about what Azure brings in terms of HA and DR for Azure SQL offerings – including SQL VMs?

Read on for a variety of options depending upon whether you’re using SQL Server on a VM, Azure SQL Database, or Azure SQL Managed Instance.


An Overview of PostgreSQL Internals

Elizabeth Christensen shows some of the ways to view internal information in PostgreSQL:

Postgres has an awesome amount of data collected in its own internal tables. Postgres hackers know all about this – but software developers and folks working with day to day Postgres tasks often miss out on the good stuff.

The Postgres catalog is how Postgres keeps track of itself. Of course, Postgres would do this in a relational database with its own schema. Throughout the years several nice features have been added to the internal tables like psql tools and views that make navigating Postgres’ internal tables even easier.

Today I want to walk through some of the most important Postgres internal data catalog details: what they are, what is in them, and how they might help you understand more about what is happening inside your database.

Click through for an overview of catalog tables and catalog views (similar to SQL Server’s system tables and Dynamic Management Views, respectively).
