Press "Enter" to skip to content

Curated SQL Posts

SQL Agent History on Azure SQL Managed Instances

Kenneth Fisher goes back in time:

The defaults for saving SQL Agent Job history are ok (at best), so you should probably check and update them if needed. Sadly, if you are using a Managed Instance this isn’t an option.

SQL Managed Instance currently doesn’t allow you to change any SQL Agent properties because they are stored in the underlying registry values.

That’s a real kick in the pants. Still, Kenneth shows us (via Jovan Popovic) a workaround to store the job history someplace else.

Comments closed

Prev and Next in KQL

Robert Cain lags and leads:

In this post we’ll continue our series on Kusto’s Windowing Functions by covering prev and next. If you’ve not read the introductory post, Fun With KQL Windowing Functions – Serialize and Row_Number, you should do so now as it introduced several important concepts needed to understand how these functions are used.

So what do prev and next do? They allow you to retrieve a value in a column from a previous row, or the next (or upcoming) row. This can be very useful in many situations. For instance, calculating the time between two rows based on a datetime column, or the change in a value from one row to the next.

As always, read on for a series of examples from Robert.
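
If you're more comfortable in pandas than KQL, here's a rough sketch of the same idea, with shift() standing in for prev and next; the DataFrame and its columns are made up purely for illustration.

    import pandas as pd

    # Toy data standing in for serialized query results; column names are invented.
    df = pd.DataFrame({
        "Timestamp": pd.to_datetime([
            "2023-07-01 08:00", "2023-07-01 08:05", "2023-07-01 08:17"
        ]),
        "Value": [100, 140, 90],
    })

    # shift(1) reads the previous row (KQL prev); shift(-1) reads the next row (KQL next).
    df["TimeSincePrev"] = df["Timestamp"] - df["Timestamp"].shift(1)
    df["ChangeFromPrev"] = df["Value"] - df["Value"].shift(1)
    df["NextValue"] = df["Value"].shift(-1)

    print(df)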

Comments closed

Percentage by Group in R

Steven Sanderson performs a breakdown:

Calculating percentages by group is a common task in data analysis. It allows you to understand the distribution of data within different categories. In this blog post, we’ll walk you through the process of calculating percentages by group using three popular R packages: Base R, dplyr, and data.table. To keep things simple, we will use the well-known Iris dataset.

The Iris dataset contains information about different species of iris flowers and their measurements, including sepal length, sepal width, petal length, and petal width. We will focus on the ‘Species’ column and calculate the percentage of each species in the dataset.

Read on for the three approaches. I think the Tidyverse approach is the easiest to understand in this case, though all three get you to the answer.
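
If you happen to be working in Python rather than R, the same breakdown falls out of pandas in a couple of lines. This is a rough equivalent for comparison, not one of the three approaches Steven covers, and it pulls the iris data from scikit-learn.

    import pandas as pd
    from sklearn.datasets import load_iris

    iris = load_iris(as_frame=True)
    df = iris.frame
    df["Species"] = iris.target_names[iris.target]

    # Percentage of rows per species, mirroring the grouped-percentage idea from the post.
    pct = df["Species"].value_counts(normalize=True).mul(100).round(2)
    print(pct)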

Comments closed

Power BI Dataset Data Source Credential Options

Angela Henry puts together a list:

If we are using enterprise-wide datasets, we don’t want Power BI datasets owned by individuals; we want them to be owned by a Service Principal so they aren’t relying on specific individuals when things go sideways (and because we all want to go on vacation at some point). However, it’s not always clear what credentials will actually be used for our data sources in our datasets when using a Service Principal. In a previous post, I talked about how to set up a service principal to take over a dataset when using data gateways, but one of the prerequisites I listed was that your data sources needed to be configured with appropriate credentials. That’s where this post comes in.

You essentially have three options for data source credentials, depending on your data source type.

Click through for that list of three options, as well as the trade-offs inherent in each.

Comments closed

Orchestrating Azure Data Explorer Queries via Apache Airflow

Michael Spector does some automation:

Apache Airflow is a widely used task orchestration framework, which gained its popularity thanks to its Python-based programmatic interface – Python being the first-choice language of data engineers and data ops. The framework allows defining complex pipelines that move data between different parts of a system, potentially implemented using different technologies.

The following article shows how to set up a managed instance of Apache Airflow and define a very simple DAG (directed acyclic graph) of tasks that does the following:

  • Uses an Azure registered application to authenticate with the ADX cluster.
  • Schedules daily execution of a simple KQL query that calculates HTTP error statistics based on web log records for the last day.

Click through for the process.
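
To give a rough idea of the shape such a DAG takes, here's a minimal sketch using Airflow's TaskFlow API and the azure-kusto-data client. The cluster URI, app registration values, database, table, and KQL query below are placeholders, not the article's actual setup.

    from datetime import datetime

    from airflow.decorators import dag, task
    from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

    # Placeholder values -- substitute your own cluster, app registration, and database.
    CLUSTER = "https://<your-cluster>.<region>.kusto.windows.net"
    APP_ID = "<app-id>"
    APP_KEY = "<app-secret>"
    TENANT_ID = "<tenant-id>"
    DATABASE = "<database>"

    @dag(schedule="@daily", start_date=datetime(2023, 7, 1), catchup=False)
    def adx_daily_http_errors():
        @task
        def run_kql():
            # Authenticate with the ADX cluster via the registered application.
            kcsb = KustoConnectionStringBuilder.with_aad_application_key_authentication(
                CLUSTER, APP_ID, APP_KEY, TENANT_ID
            )
            client = KustoClient(kcsb)
            # Hypothetical query: summarize yesterday's HTTP errors from a web log table.
            query = """
            WebLogs
            | where Timestamp > ago(1d) and StatusCode >= 400
            | summarize Errors = count() by StatusCode
            """
            result = client.execute(DATABASE, query)
            return [row.to_dict() for row in result.primary_results[0]]

        run_kql()

    adx_daily_http_errors()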

Comments closed

New Script: sp_HealthParser

Erik Darling has a new script for us:

I recently found myself in the midst of a client issue that led me to need a bunch of corroborating data from the system health extended event session.

There’s hardly any good documentation on it, and even fewer well-written queries or resources for parsing data out of it.

So now I’m making it easy for you, because I care about you more than Microsoft does.

Click through for a link to the script and to get an idea of what it provides.

Comments closed

Improving an Extra-Wide Column or Bar Chart

Mike Cisneros goes bar-hopping:

Let’s take a closer look at that extra-tall bar chart. I came across a graph like this one when working with a recent client. They created a visual to compare the quarterly sales revenues for each of 25 different sales associates against their individual sales targets.

Mike then gives us two routes to improving the quality of this visual, one which maintains the bar chart style and one which replaces it with a Cleveland dot plot.
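
If you haven't built a Cleveland dot plot before, here's a quick matplotlib sketch of the general idea, using made-up associates and numbers rather than the client data from the post: one dot for actual sales and one for target per associate, sharing a horizontal axis.

    import matplotlib.pyplot as plt
    import numpy as np

    # Invented figures standing in for 25 sales associates and their targets.
    rng = np.random.default_rng(42)
    associates = np.array([f"Associate {i:02d}" for i in range(1, 26)])
    targets = rng.uniform(80, 120, size=25)
    actuals = targets + rng.normal(0, 15, size=25)

    order = np.argsort(actuals)   # sort so the chart reads cleanly top to bottom
    y = np.arange(len(associates))

    fig, ax = plt.subplots(figsize=(6, 8))
    ax.hlines(y, np.minimum(actuals, targets)[order],
              np.maximum(actuals, targets)[order], color="lightgray")
    ax.plot(targets[order], y, "o", color="gray", label="Target")
    ax.plot(actuals[order], y, "o", color="steelblue", label="Actual")
    ax.set_yticks(y)
    ax.set_yticklabels(associates[order])
    ax.set_xlabel("Quarterly sales revenue")
    ax.legend()
    plt.tight_layout()
    plt.show()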

Comments closed

An End-to-End Microsoft Fabric Implementation

Jordan Witcombe takes us through an example:

This blog will walk you through the entire data lifecycle of ingesting data from SharePoint and Azure Blob Storage, through the lakehouse pattern, and finally to the reporting stage – all using Microsoft Fabric.

As we’re well into summer now, festivals can be a great way of spending time with friends and family. But all too often, we each want to see different acts and activities. How can we make this easier? The dataset I created is a simple Excel / CSV file with acts, stages, who wants to see them, and a personal rating. We will run this through the lakehouse pattern and finish with a curated/gold layer that can be reported on directly. For more information on lake layers, you can see Ed’s blog Medallion Architecture: What is it?. So you can highlight those clashes and keep everyone happy during your festival fun!

Read on for the showcase.
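
For a taste of what the first hop of that lifecycle can look like, here's a minimal PySpark sketch of landing a festival CSV into a bronze Delta table from a Fabric notebook. The path, column handling, and table name are illustrative only; Jordan's post walks through the actual SharePoint and Blob Storage ingestion.

    # Bronze-layer ingest sketch for a Fabric notebook; the notebook runtime provides
    # the spark session. Path and table name are hypothetical.
    raw_path = "Files/raw/festival_lineup.csv"

    df = (
        spark.read
        .option("header", "true")
        .option("inferSchema", "true")
        .csv(raw_path)
    )

    # Light cleanup of column names before writing to the bronze Delta table.
    bronze = df.toDF(*[c.strip().lower().replace(" ", "_") for c in df.columns])

    bronze.write.mode("overwrite").format("delta").saveAsTable("bronze_festival_lineup")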

Comments closed

Statistics and Ascending Keys in SQL Server

Matthew McGiffen talks stats:

The Ascending Key Problem relates to the most recently inserted data in your table, which is therefore also the data that may not have been sampled and included in the statistics histograms. This sort of issue is one of the reasons it can be critical to update your statistics more regularly than the built-in automatic thresholds would.

We’ll look at the problem itself, but also some of the mitigations that you can take to deal with it within SQL Server.

Read on to see how this behaved prior to SQL Server 2014’s new cardinality estimator and what has changed since then.

Comments closed

sp_HumanEventsBlockViewer Updates

Erik Darling has another update:

In this post, I’m going to talk about a couple cool changes to sp_HumanEventsBlockViewer, a procedure I wrote to analyze the blocked process report via Extended Events, and wish I had given a snazzier name to.

You see, when I wrote it, I pictured it as a utility script for sp_HumanEvents, which will set up the blocked process report and an extended event. But it turns out I use it a lot more on its own.

Read on for Erik’s update, including a neat trick around using an aggregate within a window function to generate ordering.

Comments closed