Press "Enter" to skip to content

Author: Kevin Feasel

Currency Conversion with priceR

Bryan Shalloway needs to make change for a trillion Zimbabwe dollars (prior to revaluation):

In this post I’ll walk through an example of how to convert between currencies. A challenge is that the conversion rate is constantly changing. If you have historical data you’ll want the conversion to be based on what the exchange rate was at the time. Hence the fields you need when doing currency conversion are:

1. Date of transaction

2. Start currency (what you’ll be converting from)

3. End currency (what you’ll be converting to)

4. Price (in units of starting currency)

Bryan also makes the smart move by memoizing the data first, as those API calls can get expensive otherwise.

Comments closed

Power BI Field Parameters and Type 2 SCDs with Bonus Fields

Koen Verbeeck extends the type 2 slowly changing dimension:

Power BI field parameters are a new feature in Power BI Desktop, and it’s one of the best of the past months. In short, Power BI field parameters allow you to easily switch between dimensions attributes or measures in a filter. Previously, you had to do all sorts of DAX wizardry to make this happen, but now it’s just a couple of clicks.

The goal of this blog post is not to tell you exactly how they work, but rather showcase an interesting use case. You can find more info about Power BI field parameters in the official blog post, but also herehere and here. The use case I’m talking about is slowly changing dimensions of Type 2, you know, the one where we insert a record for every change. Often, I also include an extra column for each column of which we’re tracking history: the “current value column”. For example, if we keep history of the department for an employee, I have a column “CurrentDepartment”. If a type 2 change occurs, the values of this columns are updated to the last known value for this dimension member. This allows to answer different types of questions, because sometimes users are interested in the historical values, but sometimes they just want to know the current value.

Read on for the use case as well as how you might combine field parameters with the idea of current values on type-2 slowly changing dimensions.

Comments closed

Sending E-mails on Database Blocking

Thomas Williams combines a few tools:

If your SQL Servers are under pressure, you want to know. Blocking and blocked processes impact end-users, and if not addressed can slow or even stop a database. In this post, I’ll outline a method I use to get timely notifications of blocking processes that you can use too.

I adapted my approach from Tom Collins’s excellent – and still relevant – 2017 article “How to monitor Blocked Processes with SQL Alert and email sp_whoisActive report” at https://www.sqlserver-dba.com/2017/01/how-to-monitor-blocked-processes-with-sql-alert-and-email-sp_whoisactive-report.html. You could implement exactly what Tom covers in his post and come out on top. I’ve gone one small step further to send a formatted HTML e-mail with a table of blocking & blocked processes; like Tom, I generate the table using Adam Machanic’s fantastic sp_WhoIsActive stored procedure, which I’ve assumed is present in the master system database. The complete solution is the sp_WhoIsActive stored procedure (which you’ll need to download and create, see http://whoisactive.com/downloads/), a SQL Agent job with a job step that runs sp_WhoIsActive and sends the e-mail, and an alert that calls the SQL Agent job when there’s blocked processes.

Read on for the script, as well as some important notes.

Comments closed

Building a Rank in Power Query

Reza Rad ranks things:

Although, calculating rank dynamically is often useful, It is not rare to need to do ranking on a pre-calculated basis. Let’s say, for example, you want to create an aggregated table by customers and year. and you want to rank customers based on their sales amount in that table. the period and other factors are remaining static. This pre-calculated aggregated table can speed up the performance of your report significantly in the future. In a case like this, the rank calculation can be done as a pre-calculation. And when you do this as a pre-calculation, Power Query is a useful option. Because then you can do that even in a Dataflow, and the result can be used in multiple Power BI datasets.

Click through to see how you can do it, as well as what Power Query calls its equivalents to ROW_NUMBER(), RANK(), and DENSE_RANK().

Comments closed

Creating Calculation Groups in Power BI Desktop via Powershell

Phil Seamark helps those who are stuck lacking tools:

recently shared on Twitter a 7-module learning path on MS Learn that teaches all you need to know about calculation groups in Power BI. This learning path is an excellent course, and I highly recommend it. However, as part of this exchange, I received a reply from someone lamenting that calculation groups are unusable in organisations that will not allow non-Microsoft applications. Power BI Desktop does not currently have UX enabling you to create/manage calculation groups in a Power BI Model, so the most common method today is to use 3rd party tools such as Tabular Editor.

This exchange is not the first time I have heard this feedback, so I decided to share a technique showing how you can use Microsoft tools. The approach used in this article uses PowerShell but can quickly get translated to VS Code or other scripting environments.

Click through if you’re in that unfortunate situation.

Comments closed

Why that Plan Didn’t Go Parallel

Erik Darling looks at another update in SQL Server 2022:

The thing is, the reason always seemed to be “Could Not Generate Valid Parallel Plan” for most of them, even though more explicit reasons were available.

They started cropping up, as things do, in Azure SQL DB, and have finally made it to the box product that we all know and mostly love.

Let’s explore some of them! Because that’s what we do.

Also check out Rob Volk’s comment, as he lists out all the ones he could find, noting that most of these do exist in SQL Server 2019.

Comments closed

Automating Parallelism Decisions in Flink Batch Jobs

Lijie Wang and Zhu Zhu describe Apache Flink’s batch scheduler:

Deciding proper parallelisms of operators is not an easy work for many users. For batch jobs, a small parallelism may result in long execution time and big failover regression. While an unnecessary large parallelism may result in resource waste and more overhead cost in task deployment and network shuffling.

To decide a proper parallelism, one needs to know how much data each operator needs to process. However, It can be hard to predict data volume to be processed by a job because it can be different everyday. And it can be harder or even impossible (due to complex operators or UDFs) to predict data volume to be processed by each operator.

To solve this problem, we introduced the adaptive batch scheduler in Flink 1.15. The adaptive batch scheduler can automatically decide parallelism of an operator according to the size of its consumed datasets. 

Read on to see some of the benefits of using the adaptive batch scheduler, as well as some of the decision points it uses along the way.

Comments closed

Multidimensional Bloom Filters

The Instaclustr team talks bloom filters:

Bloom filters are space-efficient probabilistic data structures that can yield false positives but not false negatives. They were initially described by Burton Bloom in his 1970 paper  “Space/Time Trade-offs in Hash Coding with Allowable Errors“. They are used in many modern systems including the internals of the Apache® projects Cassandra®, Spark™, Hadoop®, Accumulo®, ORC™, and  Kudu™.

Multidimensional Bloom filters are data structures to search collections of Bloom filters for matches. The simplest implementation of a Multidimensional Bloom filter is a simple list that is iterated over when searching for matches. For small collections (n < 1000) this is the most efficient solution. However, when working with collections at scale other solutions can be more efficient. 

Read on to learn more, including some discussion about an implementation in Cassandra.

Comments closed

Query Store Hints in SQL Server 2022

Erik Darling has thoughts:

When you’re dealing with untouchable vendor code full of mistakes, ORM queries that God has turned away from, and other queries that for some reason can’t be tinkered with, we used to not have a lot of options.

In SQL Server 2022, Query Store gains a new super power: you can add hints to queries without intercepting the code in some other manner.

There are a couple of useful hints which won’t be available but Erik seems mostly upbeat about what is there.

Comments closed