Press "Enter" to skip to content

Curated SQL Posts

Worker Migration in SQL Server 2019

Dong Cao explains an interesting improvement to SQL Server 2019’s internals:

Worker migration (AKA “worker stealing”) allows an idle SOS scheduler to migrate a worker from the runnable queue of another scheduler on the same NUMA node and immediately resume the task of the migrated worker. This enhancement provides more balanced CPU usage and reduces the amount of time long-running tasks spend in the runnable queue.
A long-running task that is enabled for worker migration is no longer bound to a fixed scheduler. Instead, it will frequently move across schedulers within the same NUMA node, which naturally results in less loaded schedulers. Together with the existing load factor mechanism, worker migration provides SQL Server with an enriched solution for balanced CPU usage.

Click through to understand where this is particularly useful and what the performance implications are.


Seeing Filter Results with DAX Measures

Gilbert Quevauvilliers wants to see the specific values involved in a Power BI filter:

I was working with a customer trying to get them to better understand DAX, and one of the things that is difficult to understand is how the filtering works in DAX.

The challenge I have found is that when using filters in a DAX measure I cannot visually see what is happening within the DAX Filter.

Below I will show you how I can see the values in the FILTER.

Click through to see how.


rBokeh Tips for Missing Arguments

Matthias Nistler walks through troubleshooting rBokeh missing argument errors:

This approach is my go-to solution to change an rBokeh plot for which there is an argument missing in rBokeh that is available in Python.
– Create the plot.
– Inspect the structure (str(plot)) of the rBokeh object.
– Search for the Python argument name.
– Overwrite the value with the desired option as derived from Python’s bokeh.

Given how nice the bokeh package looks, I really want rBokeh to work well. Hopefully this experience improves over time.


JDBC Resource Pools and Kerberos

Guy Shilo has a tip for us around JDBC connectivity when your Hadoop cluster is configured for Kerberos:

This is a quick tip about connecting to Hive or Impala via JDBC.

Accessing Hive or Impala using their JDBC driver is very convenient. Client programs like beeline or JetBrains DataGrip use it as the main way of accessing Hive/Impala, and many people also use it in programs they write themselves.

Things get a little trickier when the cluster is kerberized. In this case you should add a few extra parameters to the JDBC connect string.

Read on to see what to do.
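
For a sense of what those extra Kerberos parameters look like, here is a minimal Python sketch (not taken from Guy's post) using the jaydebeapi bridge to HiveServer2. The host name, realm, and jar path are placeholders, it assumes you have already run kinit for a valid ticket, and you may need additional Hadoop client jars on the classpath.

import jaydebeapi

# On a kerberized cluster the connect string needs the Hive service principal;
# everything after the database name is the Kerberos-specific part.
url = ("jdbc:hive2://hiveserver.example.com:10000/default;"
       "principal=hive/_HOST@EXAMPLE.COM")

conn = jaydebeapi.connect(
    "org.apache.hive.jdbc.HiveDriver",          # Hive JDBC driver class
    url,
    jars="/path/to/hive-jdbc-standalone.jar",   # placeholder path to the driver jar
)
cursor = conn.cursor()
cursor.execute("SHOW DATABASES")
print(cursor.fetchall())
cursor.close()
conn.close()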


Moving Dataflows Between Workspaces

Marc Lelijveld (along with Ton Swart) shows how to move dataflows from one workspace to another:

Now, here it starts to become interesting. Let’s try to automate the manual steps to an automated process. To automate tasks in the Power BI service, we have to “talk” to the Power BI REST API. The Power BI REST API provides programmatic access to the report server catalog. For example, basic CRUD (Create, Read, Update and Delete) operations can be done on folders, reports, KPIs, data sources, datasets, refresh plans, subscriptions, etc.

Click through to see how to do it by hand and then how to automate it.
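
For a flavour of the REST calls involved, here is a minimal Python sketch (not Marc and Ton's script) that lists the dataflows in a source workspace and downloads each dataflow's definition (the model.json you would re-import into the target workspace). It assumes you already have an Azure AD access token with Power BI API permissions; the workspace ID is a placeholder and the field names follow the documented Dataflow object.

import json
import requests

API = "https://api.powerbi.com/v1.0/myorg"
token = "<Azure AD access token with Power BI API permissions>"
headers = {"Authorization": f"Bearer {token}"}

source_workspace = "<source workspace (group) id>"   # placeholder GUID

# List the dataflows in the source workspace.
dataflows = requests.get(
    f"{API}/groups/{source_workspace}/dataflows", headers=headers
).json()["value"]

for df in dataflows:
    # Download the dataflow definition (model.json); importing this file into
    # the target workspace is the "move" step.
    definition = requests.get(
        f"{API}/groups/{source_workspace}/dataflows/{df['objectId']}",
        headers=headers,
    ).json()
    with open(f"{df['name']}.json", "w", encoding="utf-8") as out:
        json.dump(definition, out)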


Azure DevOps and Data Factory

Helge Rege Gardsvoll has a three-part series for us on using Azure DevOps to deploy Data Factories. Part 1 is all about environment setup:

Shared Data Factory
The shared Data Factory is there for one use: self-hosted integration runtimes. This is the component you will use to connect to on-premises or other sources that have restrictions on access, such as IP restrictions or other firewall rules. Migrating a Self-Hosted Integration Runtime is not supported, but you can share the same Integration Runtime across different Data Factories. You can find a description for how to do this in this article.

Part 2 covers Git branching, linked services, and development:

Create datasets and pipeline
For this demo I create two datasets, one for source and one for target, and a simple pipeline that copies the data. Datasets have names that point to the data lake, like ADLS_datahelgeadls2_Brreg_MainUnits, but do not include environment information.

Part 3 covers the release process:

The release process will have these steps:
1. Stop any active triggers. We do not want any pipelines to start as we are changing things (and you should wait until running pipelines finish before publishing).
2. Release from development to the target environment.
3. Clean up the target environment by removing objects that are not present in dev. Also start the triggers again.

This is a great series of posts and also includes a bonus tidbit if you’re using Databricks.
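
If you want to script the trigger handling in steps 1 and 3 yourself, here is a rough Python sketch using the azure-mgmt-datafactory SDK rather than the tooling from Helge's series. The resource group and factory names are placeholders, and the begin_stop/begin_start method names come from recent SDK versions and may differ in older ones.

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

subscription_id = "<subscription id>"
resource_group = "<target resource group>"   # placeholder
factory_name = "<target data factory>"       # placeholder

client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# Step 1: stop every trigger that is currently started before deploying.
triggers = list(client.triggers.list_by_factory(resource_group, factory_name))
started = [t.name for t in triggers if t.properties.runtime_state == "Started"]
for name in started:
    client.triggers.begin_stop(resource_group, factory_name, name).result()

# ... deploy the ARM template and clean up removed objects here ...

# Step 3 (last part): start the same triggers again once the release has finished.
for name in started:
    client.triggers.begin_start(resource_group, factory_name, name).result()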


Finding Columns and Measures Used in SSAS Tabular

Kasper de Jonge uses Power BI to figure out which columns are being used in an SSAS Tabular model:

I got an interesting question about being able to figure out which measures and columns are being used for all tabular models on a server, either by known reports or by self-service users using Power BI. To solve this I decided to use Power BI :).

To get the required information I decided to capture the queries that are being sent to AAS over a period of time and parse the results in Power BI Desktop. In this post I describe how to do it.

Check it out, and also read the comments for an additional tip from Bill Anton.
