Press "Enter" to skip to content

Curated SQL Posts

Configuring and Troubleshooting SQL Server on Linux

I have a new video:

In this video, we will run through detailed configuration recommendations for SQL Server on Linux. We will also find out where you can find your error logs.

This is a mix of show and tell, as I couldn’t show some of the recommendations on my particular hardware. Still, there are a whole boatload of links to additional resources if you want to learn more about why Microsoft chose specific things to show in their SQL Server on Linux performance tuning recommendations.

Comments closed

Managing Azure Data Factory IP Ranges for Azure Firewalls

Meagan Longoria has a script for us:

While a private endpoint and vNets are preferred, sometimes we need to configure Azure SQL Database or Azure Storage to allow use of public endpoints. In that case, an IP-based firewall is used to prevent traffic from unauthorized locations. But Azure Data Factory’s Azure Integration Runtimes do not have a single static IP. So how do we keep the firewall updated so that ADF can access these resources?

One option is to run everything through a self-hosted integration runtime, which can have a static public IP. But that also means you have to install (and keep updated) a Java SDK if you are converting data to Parquet or ORC files.

Another option is to obtain the IP range list published by Microsoft and update the firewall. That is what I did for a recent project.

Read on to learn more about the latter option, including a Powershell script to do the work.

Comments closed

Advent of Code Day 9

Kevin Wilkie got to day 9:

Today, our elves are working hard on having us come up with the next number in a sequence. Thankfully, they do give us a few good examples to walk through on how they would like us to do it. And with that, it’s off to the races!

Click through for Kevin’s answer.

Comments closed

Reading and Writing JSON Files in Microsoft Fabric

Tom Martens performs two of the three R’s:

JSON is a straightforward, text-based data-interchange format fully independent of any programming language. This simplicity is why JSON documents have become so widespread.

Often the result of querying a REST API is returned as a JSON document, but this is not the only use case why I consider it important to get familiar with reading from and writing to a JSON file. I think a JSON document is ideal for ingesting values into a Python script or any code, values that are helpful to control the logic and behavior of the script. But of course, it’s also easy to store the value of a variable inside the JSON document, where this value will be used when the script is running again.

Click through for some Python code to perform these operations. There are about a dozen other methods you can use as well, but this is one of the best.

Comments closed

Building a Terraform Module for Azure SQL Database

Josephine Bush automates a deployment:

A well-structured Terraform module for Azure SQL DB typically consists of the following elements:

  • Main Configuration Files: main.tfvariables.tfoutputs.tf
  • Helper Files: (if necessary) locals.tfproviders.tf, etc.

If you want to learn more about the basics of Terraform, you can visit my previous blog post.

Click through to see how Josephine has put together the Azure SQL Database deployment module.

Comments closed

Parameter Sniffing in Stored Procedures

Erik Darling goes back to the well on a popular and important topic:

One of the most fun parts of my job is telling people that parameter sniffing is a good thing. They’re often shocked, because the phrase has such ominous connotations to developers.

The parameters! They get sniffed! Performance! Something! Stuff!

Parameter sniffing does not mean the sky is falling. Like I’ve said before, if it were such a bad idea, it would have been abandoned.

Read on for more thoughts on the topic.

Comments closed

Notes on Linear Markov Chains

John Mount has some thoughts for us:

I want to collect some “great things to know about linear Markov chains.”

For this note we are working with a Markov chain on states that are the integers 0 through k (k > 0). A Markov chain is an iterative random process with time tracked as an increasing integer t, and the next state of the chain depending only on the current (soon to be previous) state. For our linear Markov chain the only possible next states from state i are: i (called a “self loop” when present), i+1 (called up or right), and i-1 (called down or left). In no case does the chain progress below 0 or above k.

Click through for notes on two variants of this sort of linear Markov chain, as well as a pair of appendices containing derivation notes and Python code.

Comments closed

Reversion to the Mean

Holger von Jouanne-Diedrich explains an important statistical concept we all too often forget:

In the realm of business and leadership, one statistical phenomenon often goes unrecognized yet significantly influences our understanding of performance and success. This is the concept of reversion to the mean (also called regression to the mean). This seemingly simple statistical occurrence can profoundly impact how we perceive management strategies, leadership effectiveness, and even the fate of those gracing the covers of prominent magazines. To understand what is going on, read on!

Read on for a video in German and an article in English, with some bonus R code to sell the story.

Comments closed

Fabric and Databricks

A rare two-part compare and contrast!

First, Chen Hirsh directly contrasts Microsoft Fabric and Databricks:

Microsoft recently announced the general availability of Microsoft Fabric, which contains all (or most) cloud Data analytics services from Microsoft. This is a good opportunity to compare it with another popular data platform, which is also available in Azure (and other cloud services) – Databricks.

Before we start, I should note that Fabric is quite new, and it’s still hard to evaluate its performance and stability. Also, both products have many features, and I only try to discuss the main differences.

Then, Eugene Meidinger keys us in on the similarities:

One of the things that helps to understand Fabric is that it’s heavily influenced by Databricks. It’s built on delta lake, which is created and open sourced by Databricks 2019. You are encouraged to use a medallion architecture, which as far as I can tell, comes from Databricks.

You will be a lot less frustrated if you realize that much of what’s going on with Fabric is a blend of open source formats and protocols, but also is a combination of the idiosyncrasies of Databricks and then those of Microsoft. David Gomes has good post about data lake file formats, and it’s interesting to imagine the parallel universe where Fabric is built on Iceberg (which is also based on Parquet files) instead of delta lake. (Note, I found this post from this week’s issue of Brent Ozar’s Newsletter)

Comments closed

Joining Tables without Relationships in DAX

Marco Russo and Alberto Ferrari have ended a relationship:

In a previous article, we saw several examples of using the NATURALLEFTOUTERJOIN and NATURALINNERJOIN functions in DAX. In that article, we saw how to join tables in the data model using existing relationships. However, if we want to join tables in the model without using the relationships in the model, we must eliminate the data lineage of the columns to use in the join condition. This article describes how to achieve this.

I suppose my question here is, when would you want to do this? I’m sure there’s a good reason to, but I’m just as sure that I don’t know it.

Comments closed