Press "Enter" to skip to content

Curated SQL Posts

Landing Zone Layouts for Modern Data Warehouses

Paul Hernandez builds out a landing zone for a warehouse:

In this article I want to discuss some different layout options for a landing zone in a modern cloud data warehouse architecture. With landing zone, I mean a storage account where raw data lands directly from its source system (not to be confused with a landing zone to move a system or application into the cloud).

One of the things I appreciate a lot about this post is that it covers the history, showing us how we got to where we are. Paul’s well-versed in each step along the way and lays things out clearly.

Comments closed

Troubleshooting HADR_SYNC_COMMIT Waits

Jose Manuel Jurado Diaz reviews a wait stat:

Today, I worked on a service request that our customer faced a performance issue in their business critical service tier database. Their main suspect was that the syncronization with the secondary replica has a delay. 

During their troubleshooting process they found that they got many request with the wait type is HADR_SYNC_COMMIT.

Read on to learn more about this wait, including what can cause it. There’s also a bit of information about it in the SQLskills wait state compendium.

Comments closed

Working with Powershell Objects

Max Trinidad familiarizes us with objects in Powershell:

Working with PSObjects is essential to anyone using PowerShell. Especially, when there’s a need to extend the object to provide useful information.

Remember, the best way to get familiar with the PowerShell object(‘s)… (AKA PSOBject, .NET Object) is to use the ‘Get-Member’ Cmdlet.

Click through for several tips on how to add custom properties and learn more about what’s available on objects along the way.

Comments closed

When Power Query Might Fold

Chris Webb explains an ambiguity:

This query connects to the TripPin OData feed (which is public, so you’ll be able to try this yourself) and filters the People table so it only gets the rows where the FirstName column equals “Angel”. If you paste this query into the Advanced Editor in Power Query Online to create a dataflow, you’ll see that the filter shows the “Might Fold” step indicator:

Read on to see what that indicator looks like, as well as an explanation of how Power Query doesn’t always know whether or not something is foldable.

Comments closed

Testing Message Ordering in Kafka

Francesco Tisiot puts a claim to the test:

One of Apache Kafka®’s most known mantras is “it preserves the message ordering per topic-partition”, but is it always true? In this blog post we’ll analyze a few real scenarios where accepting the dogma without questioning it could result in unexpected, and erroneous, sequences of messages.

There’s a lot more to this than I realized, and Francesco does a great job of explaining it.

Comments closed

Alt Text in R

Nicola Rennie looks at different ways to incorporate alt text in R-based images:

Alt text (short for alternative text) is text that describes the appearance and purpose of an image. Alt text has multiple purposes, the main one being that it aids visually impaired users to better understand your content when the alt text is read aloud by screen readers. Alt text is also used in place of an image if it fails to load, which means that users with poor internet connection are more likely to be able to engage with your content.

The ggplot2 example was an interesting one, as I hadn’t ever added alt text to an image there.

Comments closed

Model Deployment using Azure Functions

Alexander Billington needs to get that new model out:

Deploying machine learning (ML) models into production can be challenging, as it requires careful consideration of various factors such as scalability, reliability, and maintainability. While developing an ML model is an exciting process, deploying it into production can be a daunting task. The challenges faced in productionising data science projects can range from infrastructure to version control, model monitoring to integration with other systems. This blog will take a look at how Azure Functions can simplify the deployment process, getting models into production quickly and robustly to maximise their value.

I like this approach and find it interesting, as most of the time, the MLOps model Microsoft recommends has you scheduling Azure DevOps pipelines / GitHub Actions periodically or when new training data hits a specific folder. If you have some non-standard trigger for an action, this is a good way to get you going.

Comments closed

Enabling Powershell Constrained Mode

Patrick Gruenauer slips on the straightjacket:

In this blog post I am going to show you how to enable the PowerShell Constrained Mode. What is the Constrained Mode? Microsoft explains this as follows:

Click through for that definition, as well as a demonstration and bit more explanation. It seems that there’s a specific use case for constrained mode, and it’s not one most of us are likely to work with.

Comments closed

Collaborating with External Individuals in Power BI

Marc Lelijveld talks to the outside world:

Let’s imagine you’re running a (fictive) company, and you’re short on data & analytics experts. Therefore, you decide to in hire expertise, and they will help you build your Power BI reports. As an employee of this organization, you rather have them starting sooner than later. But… if you need to request accounts for them at your IT organization, this might take weeks, if not a month to properly setup and run through this process. But what alternatives do you have?

In this blog I will further elaborate on the important things you should think about when working with Externals in Power BI. This blog is based on the session I’ve presented at the Dutch Power BI community day and at SQL Bits 2023 on the same topic together with my colleague Odeta Jankaitienė.

Click through for some of the important decisions you’ll need to make along the way.

Comments closed

Updating SQL Server Containers on Kubernetes

Amit Khandelwal rolls out some updates:

I’m sure you’ve thought about how to update SQL Server containers running on a Kubernetes cluster at some point. So, this blog attempts to answer the question. According to the Kubernetes documentation, there are two Update strategies for statefulset workloads. For your convenience, I’m quoting the summary below:

  1.  OnDelete update : When a StatefulSet’s .spec.updateStrategy.type is set to OnDelete, the StatefulSet controller will not automatically update the Pods in a StatefulSet. Users must manually delete Pods to cause the controller to create new Pods that reflect modifications made to a StatefulSet’s .spec.template.
  2. Rolling update : When a statefuleset’s .spec.updateStrategy.type is set to RollingUpdate, the StatefulSet controller will delete and recreate each Pod in the StatefulSet. It will proceed in the same order as Pod termination (from the largest ordinal to the smallest), updating each Pod one at a time. This is the default update strategy.

Read the whole thing to learn how these two strategies of updating containers work.

Comments closed