
Author: Kevin Feasel

Saving An ADF Pipeline As A Template

Rayis Imayev shares with us how you can save an Azure Data Factory pipeline as a template:

Azure Data Factory (ADF) provides you with a framework for creating data transformation solutions in the Microsoft cloud environment. The recently introduced Template Gallery for ADF pipelines can speed up this development process and provide you with helpful information to create additional activity tasks in your pipelines.

We naturally want to see whether something standard can be adjusted further. That custom design is like ordering a regular pizza and then hitting the “customize” button in order to add a few toppings of our choice. It would be very impressive then to save this customized “creation” for future ordering. And Azure Data Factory has a similar option to save your custom data transformation solutions (pipelines) as templates and reuse them later.

Click through to see how you can do just that.


No Type Equivalence In M

Imke Feldmann notes an oddity in types in Power Query:

But this function will not return any matches. I also tried out a (potentially) slower version using Table.SelectRows(Types, each [Value] = x[Types]) – but still no match.

What I found particularly frustrating here was that in some cases, these lookups or filters on type-columns worked.

That behavior seems odd to me. Imke shares a link from Microsoft which explains that the behavior occurs, but the why behind it eludes me.


Saving To Excel From Azure Data Studio

Bob Pusateri shows us how you can export to Excel from Azure Data Studio:

In SQL Server Management Studio, there’s no single-step way to save a result set to Excel. Most commonly I will just copy/paste a result set into a spreadsheet, but depending on the size of the result set and the types of data involved, that doesn’t always play nicely.

But Azure Data Studio does it WAY better, trust me. If you want that result set in a spreadsheet, just save it as one and poof – you have an Excel file!

Considering that Excel is the most popular BI tool, it makes sense to support it.


Hiding Work: The Nested Loop Operator

Erik Darling explains that the nested loop operator is like a duck: there’s more going on beneath the surface than it lets on:

I’m going to talk about my favorite example, because it can cause a lot of confusion, and can hide a lot of the work it’s doing behind what appears to be a friendly little operator.

Something to keep in mind is that I’m looking at the actual plans. If you’re looking at estimated/cached plans, the information you get back may be inaccurate, or may only be accurate for the cached version of the plan. A query plan reused with parameters that require a different amount of work may have very different numbers.

I like nested loop joins a lot, but there’s a big difference between a loop running a few dozen times and a loop running a couple hundred thousand times, even if the operator doesn’t show you that immediately.
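
To make the point about hidden work concrete, here is a minimal Python sketch of a naive nested loop join that counts how often the inner side runs. This is only a toy model, not how SQL Server implements the operator (a real plan would typically seek an index on the inner side); the function name and the counters are hypothetical.

```python
def nested_loop_join(outer_rows, inner_rows, key):
    """Toy nested loop join that also reports how much work it did.

    Assumption: rows are dicts and the inner side is scanned in full
    for every outer row. Real engines usually do something cheaper.
    """
    inner_executions = 0  # how many times the inner side was started
    rows_examined = 0     # total inner rows touched across all executions
    matches = []
    for outer in outer_rows:
        inner_executions += 1  # the inner side runs once per outer row
        for inner in inner_rows:
            rows_examined += 1
            if outer[key] == inner[key]:
                matches.append((outer, inner))
    return matches, inner_executions, rows_examined


if __name__ == "__main__":
    customers = [{"id": n} for n in range(3)]
    orders = [{"id": n % 3} for n in range(4)]
    _, executions, examined = nested_loop_join(customers, orders, "id")
    print(executions, examined)
```

The operator is the same whether the outer input has three rows or a couple hundred thousand; only the counters change, which is the part a quick glance at the plan can miss.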


Solving The Monty Hall Problem With R

Miroslav Rajter builds a Monty Hall problem simulator using R:

The original and simplest scenario of the Monty Hall problem is this: You are in a prize contest and in front of you there are three doors (A, B and C). Behind one of the doors is a prize (Car), while behind the others is a loss (Goat). You first choose a door (let’s say door A). The contest host then opens another door behind which is a goat (let’s say door B), and then he asks whether you will stay with your original choice or switch doors. The question behind this is: what is the better strategy?

This is something that puzzled me for a very long time. This is fundamentally a Bayesian problem built around processing new information, and once I understood that, the answer was a lot clearer. H/T R-Bloggers.
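
For anyone who wants to see the numbers before clicking through, here is a minimal Python sketch of the three-door scenario described above. It is not Miroslav's R code; the function names, trial count, and seed are my own assumptions.

```python
import random


def play(switch, rng):
    """Play one round of the three-door game; return True if the car is won."""
    doors = ["A", "B", "C"]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host knows where the car is and opens a goat door
    # that is not the contestant's current pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=42):
    """Estimate the probability of winning with a fixed strategy."""
    rng = random.Random(seed)
    wins = sum(play(switch, rng) for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print("stay:  ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

Staying should land near 1/3 and switching near 2/3, because the host's choice of door carries information about where the car is.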


Control Table Keys In cdata

John Mount announces a new feature in the cdata package:

In our cdata R package and training materials we emphasize the record-oriented thinking and how to design a transform control table. We now have an additional exciting new feature: control table keys.
The user can now control which columns of a cdata control table are the keys, including now using composite keys (that is, keys that are spread across more than one column). This is easiest to demonstrate with an example.

Read on for an example of how you can use this.


Using Calendar Tables

I have a post up on using calendar tables:

There’s one problem with picking a SQL Saturday in April: Easter and Passover tend to run right around that time, and nobody wants a SQL Saturday on Passover or the day before Easter. Unfortunately, our calendar table doesn’t include holiday information. So let’s add it!

Working with holidays and working with fiscal years versus calendar years are just two of the uses of calendar tables. But they’re the only two that I show.
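
The post itself works in T-SQL against a calendar table, but the idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: it uses the well-known anonymous Gregorian algorithm for Western Easter, it ignores Passover entirely (that needs Hebrew calendar logic), and the function names are hypothetical.

```python
from datetime import date, timedelta


def easter_sunday(year):
    """Western (Gregorian) Easter via the anonymous Gregorian algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


def april_saturdays(year):
    """Return (date, is_day_before_easter) for every Saturday in April."""
    easter = easter_sunday(year)
    rows = []
    day = date(year, 4, 1)
    while day.month == 4:
        if day.weekday() == 5:  # Monday is 0, so Saturday is 5
            rows.append((day, day + timedelta(days=1) == easter))
        day += timedelta(days=1)
    return rows


if __name__ == "__main__":
    for saturday, blocked in april_saturdays(2019):
        print(saturday, "day before Easter" if blocked else "available")
```

In a real calendar table you would store the holiday flag as a column once rather than recompute it, which is the whole appeal of having the table.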


Finding The Last Non-Null Value With Snowflake

Koen Verbeeck shows how two words make solving a problem with Snowflake a lot easier than with SQL Server:

Sometimes you need to find the previous value in a column. Easy enough, the LAG window function makes this a breeze (available since SQL Server 2012). But what if the previous value cannot be null? You can pass a default, but we actually need the previous value that was not null, even if it is a few rows back. This makes it a bit harder. T-SQL guru Itzik Ben-Gan has written about the solution to this problem: The Last non NULL Puzzle. It’s a bit of a tricky solution.

Click through for the magic words and if you’re on the SQL Server side, upvote this issue to get that functionality in SQL Server too.
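
If the requirement itself is unclear, this small Python sketch shows the intended result for an ordered column: each row gets the most recent earlier value that was not null. It illustrates the logic only, not Koen's Snowflake query or Itzik's T-SQL solution, and the function name is hypothetical.

```python
def previous_non_null(values):
    """For each position, return the closest earlier value that is not None.

    Assumption: `values` is already in the order the window would use.
    Positions with no earlier non-null value get None.
    """
    result = []
    last_seen = None
    for value in values:
        result.append(last_seen)  # look only at rows before this one
        if value is not None:
            last_seen = value
    return result


if __name__ == "__main__":
    print(previous_non_null([10, None, None, 20, None]))
    # → [None, 10, 10, 10, 20]
```

A plain LAG would return None for the third and fourth rows here, which is exactly the gap the puzzle is about.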


Syncing Slicers In Power BI

Prathy Kamasani takes us through a recently added feature in Power BI:

As per Microsoft docs:
“This feature lets you create a custom group of slicers to keep synchronized. A default name is provided, but you can use any name you prefer.
The group name provides additional flexibility with slicers. You can create separate groups to sync slicers that use the same field, or put slicers that use different fields into the same group.”

First, let’s look at creating groups to sync slicers that use the same field. The use case is syncing within a page; we can easily use the group functionality to do this.

Click through for a few demos of increasing complexity.
