Press "Enter" to skip to content

Author: Kevin Feasel

Sending E-mails on Database Blocking

Thomas Williams combines a few tools:

If your SQL Servers are under pressure, you want to know. Blocking and blocked processes impact end-users, and if not addressed can slow or even stop a database. In this post, I’ll outline a method I use to get timely notifications of blocking processes that you can use too.

I adapted my approach from Tom Collins’s excellent – and still relevant – 2017 article “How to monitor Blocked Processes with SQL Alert and email sp_whoisActive report” at https://www.sqlserver-dba.com/2017/01/how-to-monitor-blocked-processes-with-sql-alert-and-email-sp_whoisactive-report.html. You could implement exactly what Tom covers in his post and come out on top. I’ve gone one small step further to send a formatted HTML e-mail with a table of blocking & blocked processes; like Tom, I generate the table using Adam Machanic’s fantastic sp_WhoIsActive stored procedure, which I’ve assumed is present in the master system database. The complete solution is the sp_WhoIsActive stored procedure (which you’ll need to download and create, see http://whoisactive.com/downloads/), a SQL Agent job with a job step that runs sp_WhoIsActive and sends the e-mail, and an alert that calls the SQL Agent job when there’s blocked processes.

Read on for the script, as well as some important notes.

Comments closed

Creating Calculation Groups in Power BI Desktop via Powershell

Phil Seamark helps those who are stuck lacking tools:

recently shared on Twitter a 7-module learning path on MS Learn that teaches all you need to know about calculation groups in Power BI. This learning path is an excellent course, and I highly recommend it. However, as part of this exchange, I received a reply from someone lamenting that calculation groups are unusable in organisations that will not allow non-Microsoft applications. Power BI Desktop does not currently have UX enabling you to create/manage calculation groups in a Power BI Model, so the most common method today is to use 3rd party tools such as Tabular Editor.

This exchange is not the first time I have heard this feedback, so I decided to share a technique showing how you can use Microsoft tools. The approach used in this article uses PowerShell but can quickly get translated to VS Code or other scripting environments.

Click through if you’re in that unfortunate situation.

Comments closed

Building a Rank in Power Query

Reza Rad ranks things:

Although, calculating rank dynamically is often useful, It is not rare to need to do ranking on a pre-calculated basis. Let’s say, for example, you want to create an aggregated table by customers and year. and you want to rank customers based on their sales amount in that table. the period and other factors are remaining static. This pre-calculated aggregated table can speed up the performance of your report significantly in the future. In a case like this, the rank calculation can be done as a pre-calculation. And when you do this as a pre-calculation, Power Query is a useful option. Because then you can do that even in a Dataflow, and the result can be used in multiple Power BI datasets.

Click through to see how you can do it, as well as what Power Query calls its equivalents to ROW_NUMBER(), RANK(), and DENSE_RANK().

Comments closed

Why that Plan Didn’t Go Parallel

Erik Darling looks at another update in SQL Server 2022:

The thing is, the reason always seemed to be “Could Not Generate Valid Parallel Plan” for most of them, even though more explicit reasons were available.

They started cropping up, as things do, in Azure SQL DB, and have finally made it to the box product that we all know and mostly love.

Let’s explore some of them! Because that’s what we do.

Also check out Rob Volk’s comment, as he lists out all the ones he could find, noting that most of these do exist in SQL Server 2019.

Comments closed

Automating Parallelism Decisions in Flink Batch Jobs

Lijie Wang and Zhu Zhu describe Apache Flink’s batch scheduler:

Deciding proper parallelisms of operators is not an easy work for many users. For batch jobs, a small parallelism may result in long execution time and big failover regression. While an unnecessary large parallelism may result in resource waste and more overhead cost in task deployment and network shuffling.

To decide a proper parallelism, one needs to know how much data each operator needs to process. However, It can be hard to predict data volume to be processed by a job because it can be different everyday. And it can be harder or even impossible (due to complex operators or UDFs) to predict data volume to be processed by each operator.

To solve this problem, we introduced the adaptive batch scheduler in Flink 1.15. The adaptive batch scheduler can automatically decide parallelism of an operator according to the size of its consumed datasets. 

Read on to see some of the benefits of using the adaptive batch scheduler, as well as some of the decision points it uses along the way.

Comments closed

Query Store Hints in SQL Server 2022

Erik Darling has thoughts:

When you’re dealing with untouchable vendor code full of mistakes, ORM queries that God has turned away from, and other queries that for some reason can’t be tinkered with, we used to not have a lot of options.

In SQL Server 2022, Query Store gains a new super power: you can add hints to queries without intercepting the code in some other manner.

There are a couple of useful hints which won’t be available but Erik seems mostly upbeat about what is there.

Comments closed

Multidimensional Bloom Filters

The Instaclustr team talks bloom filters:

Bloom filters are space-efficient probabilistic data structures that can yield false positives but not false negatives. They were initially described by Burton Bloom in his 1970 paper  “Space/Time Trade-offs in Hash Coding with Allowable Errors“. They are used in many modern systems including the internals of the Apache® projects Cassandra®, Spark™, Hadoop®, Accumulo®, ORC™, and  Kudu™.

Multidimensional Bloom filters are data structures to search collections of Bloom filters for matches. The simplest implementation of a Multidimensional Bloom filter is a simple list that is iterated over when searching for matches. For small collections (n < 1000) this is the most efficient solution. However, when working with collections at scale other solutions can be more efficient. 

Read on to learn more, including some discussion about an implementation in Cassandra.

Comments closed

Refreshing SQL Managed Instances which Use TDE

Bradley Ball keeps the dev environment up to date:

Hello Dear Reader!  I was working with some friends lately and we needed to set up a process to refresh their Development Environment databases from Production.  Additionally, the databases are encrypted using Transparent Data Encryption, The SQL MI instances are in different regions, and the SQL MI Instances are in different subscriptions.  To duplicate the environment, in order to match our friends, we did the following setup.

Click through for a high-level overview, step-by-step guidance, and a whole lot of detail.

Comments closed

Extracting Data from DAX Measures into CSV

Gilbert Quevauvilliers builds a process:

In this blog post I am going to demonstrate how to use the new Power Automate Flow to extract data from a DAX measure into a SharePoint CSV file.

I got this idea after reading the blog post from the Microsoft Power BI Team: Unlocking new self-service BI scenarios with ExecuteQueries support in Power Automate | Microsoft Power BI Blog | Microsoft Power BI

The great news is that this works on Power BI Pro, Premium Per User and Premium.

Read on to see how.

Comments closed