Curated SQL Posts

Uniform Random Number Generation in R

Steven Sanderson digs into the uniform distribution:

Randomness is an essential part of many statistical and machine learning tasks. In R, there are a number of functions that can be used to generate random numbers, but the runif() function is the most commonly used.

Something mildly embarrassing for me is that it took me a while to figure out why they call the command runif(). That’s because, at first, I didn’t pronounce it r unif but rather run if.

In reality, the unif in *unif() means “uniform distribution” and the r prefix stands for “random number.” There are several other functions based on the uniform distribution, and Steven looks at those as well in this post.
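To make the naming concrete, here is a minimal sketch of the same idea in Python rather than R. The function name `runif` and its parameters are borrowed for illustration only; this is not Steven's code, and it relies on nothing beyond the standard library's `random.Random.uniform`.

```python
import random


def runif(n, min_value=0.0, max_value=1.0, seed=None):
    """Return n draws from a uniform distribution between min_value and max_value.

    Hypothetical Python analogue of R's runif(); the name is reused for illustration.
    """
    # A private generator keeps seeding from touching the global random state.
    rng = random.Random(seed)
    return [rng.uniform(min_value, max_value) for _ in range(n)]


# Five draws between 10 and 20; the seed makes the run repeatable.
draws = runif(5, min_value=10, max_value=20, seed=42)
print(draws)
```

Passing the same seed twice should give the same draws, which is handy when you want a reproducible example.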

An Overview of 4th Normal Form

I continue a series on database normalization:

In this video, [I] explain what Fourth Normal Form (4NF) is and why I consider 5NF to be significantly more important. Even so, 4NF does make it easy to explain a certain common class of problem, allowing it to provide some measure of utility.

4th Normal Form is a special case of the much more exciting 5th Normal Form, but I do have a bit of a soft spot for it.
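As a rough illustration of the class of problem 4NF describes, the sketch below uses a made-up relation in which an employee's skills and languages are independent of each other. The table, names, and values are all hypothetical and are not taken from the video. Splitting the relation into two projections and joining them back should reproduce the original rows, which is what makes the decomposition lossless.

```python
# Toy relation: each row is (employee, skill, language).
# Skills and languages are independent facts about an employee, so every
# skill must be paired with every language to keep the table consistent.
rows = {
    ("Ann", "SQL", "English"),
    ("Ann", "SQL", "French"),
    ("Ann", "R", "English"),
    ("Ann", "R", "French"),
    ("Bob", "DAX", "English"),
}

# Decompose into two narrower relations, one per independent fact.
employee_skill = {(employee, skill) for employee, skill, _ in rows}
employee_language = {(employee, language) for employee, _, language in rows}

# Join the projections back together on employee.
rejoined = {
    (employee, skill, language)
    for employee, skill in employee_skill
    for other_employee, language in employee_language
    if employee == other_employee
}

# The join gives back exactly the original rows, so nothing was lost.
print(rejoined == rows)
```

The two smaller relations hold each fact once, so adding a new language for Ann is a single insert instead of one row per skill.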

Formatting DAX Expressions with Python

Sandeep Pawar makes the code a bit more readable:

There is an old Italian saying: “If it’s not formatted, it is not DAX.”

When you get the list of measures from SemPy, it’s not formatted and is hard to read and understand. Thankfully, the SQLBI team has made the DAX parser and the formatter available via an API. I wrote a quick function to return the formatted DAX expression of a measure. You can either pass a DAX expression or the FabricDataFrame returned by fabric.list_measures().

Click through for the process, including the Python code to do the work.

Pulling XMLA-Modified Power BI Datasets into Source Control

Marc Lelijveld has a fix:

Have you ever found yourself stuck with a modified Power BI dataset, thanks to those well-intentioned but troublesome changes you made through the XMLA endpoint? Does that sound familiar to you? What seemed like a convenient solution quickly turned into a frustrating challenge when you encountered the error message in the Power BI Service.

You wanted to seamlessly continue your development journey in Power BI Desktop, avoiding the need for a full data refresh or just quickly making that one small change, but hit a roadblock when trying to download the PBIX file. The error message declared that your data model had been modified with the XMLA endpoint. But now, with Git integration, you can overcome this challenge!

Read on to see how.

An Overview of the Current State of Microsoft Fabric

Paul Andrew pulls no punches:

Despite playing with different parts of the Fabric ecosystem for a long time, nothing ever prepares you for the challenges and “quirks” faced when building a solution for real. In this post I’ll call out some of the pain points we’ve faced and features of the product still requiring improvement, excluding some of the obvious gaps in the product, like security, that we know to be coming.

Read on for Paul’s analysis of what Fabric is currently missing, but as you read it, keep in mind that Fabric is still in public preview and that Microsoft will continue developing it even after it goes GA.

Log-Log Plots in R

Steven Sanderson thinks in percentages:

A log-log plot is a type of graph where both the x-axis and y-axis are in logarithmic scales. This is particularly useful when dealing with data that spans several orders of magnitude. By taking the logarithm of the data, we can compress large values and reveal patterns that might be hidden on a linear scale.

Let’s start with a simple example using base R.

Read on to see how you can create these plots and what you can do to customize them.
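To see why logarithmic axes help, here is a small sketch in plain Python rather than R, using made-up data that follows y = 3x². The data and the helper function are assumptions for illustration and do not come from Steven's post. After taking base-10 logs of both variables, the points should fall on a straight line whose slope is the exponent.

```python
import math

# Made-up data spanning several orders of magnitude: y = 3 * x^2.
xs = [1, 10, 100, 1_000, 10_000]
ys = [3 * x ** 2 for x in xs]

# Taking logs of both axes is what a log-log plot does visually.
log_x = [math.log10(x) for x in xs]
log_y = [math.log10(y) for y in ys]


def fit_line(x_values, y_values):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x_values)
    mean_x = sum(x_values) / n
    mean_y = sum(y_values) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_values, y_values))
    variance = sum((x - mean_x) ** 2 for x in x_values)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(log_x, log_y)
print(round(slope, 6), round(10 ** intercept, 6))
```

The slope recovers the exponent and 10 raised to the intercept recovers the multiplier, which is why a power law looks like a straight line on a log-log plot.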

An Analysis of Goal Line Runs out of Shotgun

I decided to test a common narrative:

A common theme among Buffalo Bills fans is the idea that the Bills run too many plays out of shotgun near the opposing team’s goal line, and this is hampering their ability to score points. Instead, these fans argue, they should run from under center, either a direct handoff or a quarterback sneak. If you were to press fans on this, I believe you’d also hear that the Bills are unique, or at least uniquely bad, at running such plays.

I’m going to use the nflfastR package to analyze play-by-play data and see just how well this bit of fan wisdom holds up.

Spoiler alert: it doesn’t.

Adding a Foreign Key while Creating a Table

Steve Jones points out one of the changes to T-SQL I really like:

This assumes I’ve added a table called dbo.Order with a PK of OrderID.

However, I can do this in the CREATE TABLE statement, as shown below. I add a new section after a column with the CONSTRAINT keyword. Then I name the constraint, which is always a good practice. I can then add the FK keyword, the column, and the reference that connects this child column to the parent column.

This came about in SQL Server 2014, along with In-Memory OLTP and the ability to create indexes inline with the table create script. It’s a minor quality of life thing but I do enjoy it.
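For a runnable feel of the pattern, here is a sketch that uses Python's built-in sqlite3 module instead of SQL Server. That swap is an assumption made purely so the example is self-contained: SQLite's syntax is close to T-SQL here but not identical, and it expects the named constraint after the column definitions. The OrderDetail table and the constraint name are hypothetical; only the "Order" table with an OrderID key comes from the excerpt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Parent table, standing in for dbo.Order with a primary key of OrderID.
conn.execute('CREATE TABLE "Order" (OrderID INTEGER PRIMARY KEY)')

# Child table (hypothetical) with a named foreign key declared in CREATE TABLE.
conn.execute(
    """
    CREATE TABLE OrderDetail (
        OrderDetailID INTEGER PRIMARY KEY,
        OrderID INTEGER NOT NULL,
        CONSTRAINT FK_OrderDetail_Order
            FOREIGN KEY (OrderID) REFERENCES "Order" (OrderID)
    )
    """
)

conn.execute('INSERT INTO "Order" (OrderID) VALUES (1)')
conn.execute("INSERT INTO OrderDetail (OrderDetailID, OrderID) VALUES (1, 1)")

# A child row pointing at a missing parent should be rejected.
try:
    conn.execute("INSERT INTO OrderDetail (OrderDetailID, OrderID) VALUES (2, 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

Naming the constraint, as Steve recommends, means the key has a predictable name if it ever needs to be dropped or referenced later.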

Transaction Isolation Level Changes in Azure SQL MI

Emre Gokoglu goes through a customer issue:

In this technical article, we will delve into an interesting case where a customer encountered problems related to isolation levels in Azure SQL Managed Instance. Isolation levels play a crucial role in managing the concurrency of database transactions and ensuring data consistency. We will start by explaining isolation levels and providing examples of their usage. Then, we will summarize and describe the customer’s problem in detail. Finally, we will go through the analysis of the issue.

This post describes an interesting difference between on-premises SQL Server and Azure SQL Managed Instance in terms of how they handle wrapping multiple connections in a single transaction scope. It’s also the type of thing I would not have thought of when testing a cloud solution like Azure SQL MI or Azure SQL DB versus on-premises SQL Server.
