Press "Enter" to skip to content

Day: October 25, 2023

Scree Plots in R

Steven Sanderson builds a scree plot:

A scree plot is a line plot that shows the eigenvalues or variance explained by each principal component (PC) in a Principal Component Analysis (PCA). It is a useful tool for determining the number of PCs to retain in a PCA model.

In this blog post, we will show you how to create a scree plot in base R. We will use the iris dataset as an example.

Read on to learn more about the plot, as well as examples of how to create scree plots.

Comments closed

Embrace the Power of 5th Normal Form

I have a new video up:

In this video, we drill into the other most important normal form, learning what Fifth Normal Form (5NF) is, why Boyce-Codd Normal Form is not enough, and examples of why 5NF can be such a challenge to implement.

Until I read CJ Date’s Database Design and Relational Theory (2nd edition), my level of appreciation for 5th Normal Form was somewhat limited, but that’s mostly because I didn’t understand it well at all. I liked the connection trap example in this article, but Date’s book was the first really good explanation of 5NF and just how powerful it is. My hope is that I was successfully able to convey that power to audiences.

Comments closed

Deadlock Analysis with SQL Sentry Plan Explorer

Rod Edwards looks at deadlocks:

I’ve also included a REALLY useful bit of functionality in a common free tool that you may not expect. Simply scroll to the to the end for it if you’re up to speed with Deadlocks already, or don’t want to read my inane rambles.

So we know what a deadlock is, multiple queries all contending for resources, one query holds some resources and is waiting on resources that another query has. That second query isn’t giving the resources up and is waiting for resources that the first query has. Both are stubborn, neither are releasing what they have until they completed…so we have a standoff.

Read on to learn a bit more about the mechanics behind deadlocks, as well as how Plan Explorer makes life easier.

Comments closed

Cross-Environment Schema Comparison for ADF

Teo Lachev has a script for us:

So, I got on a quest to find an ADF schema comparison tool for a quick and dirty way to identify ADF code discrepancies between two environments and promote changes manually. Ideally, the tool would be something like SQL Server Database projects in Visual Studio.

Read on to see what Teo was able to find, as well as a few code changes around it.

Comments closed

Comparing Service Endpoints and Private Endpoints in Azure

Khushbu Gandhi clarifies a choice:

For a long time, if you were using the multi-tenant, PaaS version on many Azure services, then you had to access them over the internet with no way to restrict access just to your resources. This restriction was primarily down to the complexity of doing this sort of restrictions with a multi-tenant service. At that time, the only way to get this sort of restriction was to look at using single-tenant solutions like App Service Environment or running service yourself in a VM instead of using PaaS.

This public access was a concern for many, and so Microsoft implemented new services that allow you to limit access to these multi-tenant services. Today, we have two solutions that on the face of it look quite similar, Service Endpoints and Private link/Endpoints. These two services are both designed to allow you to restrict who connects to your service, and how they do it. Because of this, it can be confusing to know which service to use and what the benefits are. In this article, we will look at these services and try to make your decision clearer.

Read on to see what the differences are between the two, as well as a comparison table and recommendations on which to choose in what circumstances.

Comments closed

Building a Multi-Tenant Database

Adron Hall looks at multi-tenancy within Postgres:

Music has always been a significant part of my life. From the melodies that accompany my daily routines to the anthems of my most memorable moments, it’s been a constant. As my collection grew, I realized I needed a better way to organize it. That’s when I stumbled upon the concept of multi-tenancy databases and decided to give it a shot with PostgreSQL. Here’s my experience.

Multi-tenancy is one case in which I’m much more relaxed about including the tenant ID on tables where it is not absolutely necessary in order to prevent a series of joins to get the appropriate tenant ID. We can quibble about whether that’s reasonable denormalization or appropriate use of a superkey—especially because, in SQL Server, tenant ID ends up being part of the clustered index and likely part of the primary key anyhow—but it’s extremely useful nonetheless.

Comments closed

CAST() and CONVERT() for Dates

Chad Callihan converts a date:

CAST and CONVERT can both be used to switch a value to a new data type. They are similar, but certainly not identical. While CAST is considered ANSI SQL and will get you across the finish line, CONVERT can give you more flexibility when it comes to formatting date values. Let’s look at an example comparing the usage of CAST and CONVERT with dates.

Most of the time, I’ll use CAST() over CONVERT(), not so much because the former is ANSI compliant, but rather because I think it’s more intuitive to remember. Date formatting is one of the few occasions in which I usually prefer CONVERT() and that’s precisely because of the format options. Of course, if you want more custom formatting options, you can use FORMAT(), though that function uses .NET in the background and is remarkably slow. It’s fine if you’re formatting a few dates, but if you’re outputting millions of rows, you will certainly see a marked difference.

Comments closed