Press "Enter" to skip to content

Day: November 6, 2023

Fitting Distributions to Datasets in R

Steven Sanderson tests a distribution fit:

There are two main ways to fit a gamma distribution to a dataset in R:

  1. Maximum likelihood estimation (MLE): This method estimates the parameters of the gamma distribution that are most likely to have produced the observed data.
  2. Method of moments: This method estimates the parameters of the gamma distribution by equating the sample mean and variance to the theoretical mean and variance of the gamma distribution.

Click through to see which technique Steven uses and an example of how it all works.

Comments closed

From Probabilities to Odds

Bryan Shalloway explains how odds and probabilities intertwine:

However human understanding of odds predates our formal understanding of probability. You can find references to odds dating back to Shakespeare:

Knew that we ventured on such dangerous seas
That if we wrought out life ’twas ten to one;
– Shakespeare’s Henry IV, Part II, 1597

Yet, in most common settings, modern society has largely supplanted odds for probabilities. You can imagine if Shakespeare were writing today the line might end “’twas ten out of eleven.

Read the whole thing.

Comments closed

Creating and Connecting to an Azure Postgres Cluster

Louis Davidson shares some notes:

As I have dealt with other platforms, PostgreSQL has stood out to me as the platform I am most interested in because it feels like the one that is most competitive with SQL Server’s platform (Oracle is out there too, as is MySQL, and many others, but PostgreSQL feels like the balance of affordability and features that it has a similar feel enough to get started.)

There are a few high-level differences that can be confusing. A cluster is really just a server (or in SQL Server, an instance). Second, the way you execute a batch of code is very different, and sometimes this is based on the tool you are using. As you dig into how PostgreSQL works, some things will feel really normal, and some stuff will be very different from the other servers you have used.

Read on for the first post in the series, covering setup and connection.

Comments closed

Building Connect 4 with T-SQL

Tomaz Kastrun figures out a useful way to play a game while looking like you’re doing important work:

Connect 4 is a classical board game, consisting of a board of 6 rows and 7 columns and 42 tokens (of two colours, each player with 21). The first player to get four tokens in a row, column or diagonally, wins!

Simple game rules:
1. only one token can be added at the time
2. each player have their own colour (sign) of tokens
3. the game is ended, when the first player connects four tokens in a row, column or diagonally

Actually, I take that back: Tomaz has figured out a useful way for two people to play a game while looking like they’re doing important work. That’s twice as good.

Comments closed

Notes for Trying out Postgres

Ryan Booz offers some guidance:

PostgreSQL continues to be all the rage in 2023, whether in “vanilla” form of the fully open-source distribution or a variant like Amazon RDS, Neon, Yugabyte, and others. If you’re interested in trying PostgreSQL but only have experience with another database like SQL Server, it can feel a bit daunting to get started.

In this small series, we’ll walk you through the process of connecting to a version of PostgreSQL locally or in the cloud. Next, we’ll discuss some options for sample databases and show you how to restore them. And finally, we’ll finish by demonstrating a few ways to connect to and query the database.

Click through for the article.

Comments closed

Run-Length Encoding in Power BI

Chris Webb performs some experiments:

Recently I was involved in an interesting discussion on Twitter X about how partitioning a table in a Power BI dataset semantic model can affect compression and therefore its size and query performance. You can read the thread here. It got me thinking: is there a way to get more detail on how well compression, and in particular run-length encoding (RLE from hereon), is working for a column when you’re using Import mode or Direct Lake? After a bit of research I found out there is, so let’s see some examples that illustrate what I learned.

The results are interesting and Chris explains why we see what we do.

Comments closed

Deploying SQL Server with Availability Groups via HPE Serviceguard for LInux and Ansible

Amit Khandelwal packs a lot into a post:

It’s time for a new blog on how Ansible can simplify SQL Server deployment, configuration, and availability. If you’ve read my previous blogs on Ansible for SQL Server installation and configuration, and the pacemaker-based Always On availability group, you know how powerful Ansible can be. Now, let’s talk about HPE Serviceguard for Linux (SGLX), a high availability/disaster recovery solution that provides business continuity for critical applications like SQL Server.

Deploying SQL Server Always On availability groups on HPE SGLX is a fully supported solution for production workloads.

Today, let’s look at how you can configure Always On availability group based on HPE SGLX via Ansible. We have collaborated with our friends in HPE to enable the Ansible based deployment for HPE SGLX with SQL Server. This feature is now available for HPE SGLX 15.10. For this demonstration, you can download the evaluation bits from the ‘My HPE Software Center‘. The Ansible bits with the scripts are available on GitHub

Read on for instructions and what you need to make it all work.

Comments closed