Press "Enter" to skip to content

Category: Power BI

Refreshing Power BI Datasets from SSIS

Teo Lachev solves a problem:

Scenario: You use SSIS to load data for an on-prem BI solution. As a last step of the ETL pipeline, you want to refresh a Power BI dataset. There’s quite a bit of misinformation on the Internet about how to do this, hence this blog.

Read on to see how you can do this, supposing that you’re hosting in a Premium-Per-User or Premium workspace.
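Teo covers the SSIS-specific plumbing in the post. As general background, the usual programmatic building block for this is the dataset refresh endpoint of the Power BI REST API. Below is a minimal sketch of that raw call in Python (not necessarily Teo's method), assuming an Azure AD access token obtained elsewhere and placeholder workspace and dataset IDs:

```python
import requests

# Minimal sketch of the REST call that kicks off a dataset refresh.
# Assumptions: an Azure AD token with Dataset.ReadWrite.All scope acquired
# elsewhere (e.g. via a service principal), plus placeholder GUIDs.
ACCESS_TOKEN = "<azure-ad-access-token>"
WORKSPACE_ID = "<workspace-guid>"   # the Premium or PPU workspace
DATASET_ID = "<dataset-guid>"

url = (
    "https://api.powerbi.com/v1.0/myorg/"
    f"groups/{WORKSPACE_ID}/datasets/{DATASET_ID}/refreshes"
)

# The POST starts an asynchronous refresh; the service replies 202 Accepted.
# NoNotification is the safe choice for unattended / service-principal auth.
response = requests.post(
    url,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json={"notifyOption": "NoNotification"},
)
response.raise_for_status()
print("Refresh request accepted:", response.status_code)
```

From SSIS, a call like this would typically be wrapped in something like a Script Task or an Execute Process Task; see Teo's post for the approach he recommends.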


Getting a Power BI Dataset’s Refresh History

Chris Webb wants to know the lore:

In the last post in this series I showed how to get the status of an individual Power BI dataset refresh. In this post I’ll show how to get the history of a dataset’s refreshes and save it to a file in OneDrive for Business – useful if you want to do some analysis on refresh performance or the causes of failures.

This is one of the most straightforward things to do with the Power BI Enhanced Refresh API, especially once you’ve already got a custom connector created in the way I’ve shown in the previous posts in this series. You just need to add a new Action to your custom connector, fill in the General section:

Click through to see what to do.
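For reference, the underlying REST resource here is (as far as I can tell) the dataset's refreshes collection. A rough Python sketch of calling it directly and saving the history to a local JSON file rather than OneDrive for Business, with placeholder token and IDs:

```python
import json
import requests

# Rough sketch of the raw refresh-history call, independent of the custom
# connector Chris builds. Token and IDs are placeholders, not values from the post.
ACCESS_TOKEN = "<azure-ad-access-token>"
WORKSPACE_ID = "<workspace-guid>"
DATASET_ID = "<dataset-guid>"

url = (
    "https://api.powerbi.com/v1.0/myorg/"
    f"groups/{WORKSPACE_ID}/datasets/{DATASET_ID}/refreshes?$top=50"
)

response = requests.get(url, headers={"Authorization": f"Bearer {ACCESS_TOKEN}"})
response.raise_for_status()

# Each entry includes refreshType, startTime, endTime, and status, which is
# enough to analyse refresh durations and failure causes.
history = response.json()["value"]
with open("refresh_history.json", "w", encoding="utf-8") as f:
    json.dump(history, f, indent=2)

print(f"Saved {len(history)} refresh records.")
```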


Power BI: Merge Joins and Nullable Columns

Chris Bailiss dives into some join problems in Power BI against Snowflake and SQL Server:

Power BI generates SQL that performs very poorly against Snowflake (and possibly other database platforms) when joining nullable columns, e.g. in the “Merge Queries” action in the Power BI query editor.

Specifically, the SQL generated prevents the use of the “hash join” physical join algorithm that has been one of the main algorithms used for decades by database engines when executing joins between large tables. This forces database engines to fall back to using very inefficient physical joins (e.g. loop/cartesian join) that can perform orders of magnitude slower.

The reason why Power BI generates different SQL for non-nullable versus nullable columns in a join is explained in the “Problem Part 2” section below.

Read the whole thing; there’s a lot of good information in there.


The Importance of Data Shaping

Paul Turley shapes the youth of data:

Power BI is a new tool and dimensional modeling is an old idea. One of the challenges is that, like other modern self-service analytics products on the market, Power BI doesn’t force self-service data jockeys to transform their data before reporting with it. If you want to import a big, wide spreadsheet full of numbers and create charts in a Power BI report, knock yourself out. But, the solution won’t scale and you will inevitably run into walls when you try to make future enhancements. Similar problems arise from importing many tables from different sources and transactional systems. Several tables all chained together with creative mashups and relationships present their own set of problems. The first iteration of such an effort is usually a valuable discovery method and learning experience. Great… treat it as such; take notes, make note of the good parts and then throw it away and start over! In Frederick Brooks’ “The Mythical Man-Month”, he cites that for most engineering projects, the first six attempts should be abandoned before the team will be prepared to start over and complete the work successfully. He was a chemical engineer before working for IBM; and hopefully, our methods in the data engineering business are more effective than his 6-to-1 rule. But, this makes the case that prototypes and proof-of-concept projects are a critical part of the learning path.

The tools don’t make the rules.

Unless you’re talking about the lambda architecture, in which case that’s kind of accurate. But we’re not talking about that here.


Extracting Multiple Pages from a Website in Power Query

Matt Allington has a new project:

Every now and then when I have a Power BI project of interest to me, I like to create a video of the end-to-end process of building a new report. This allows me to share some “warts and all” real-world examples of how to go about building a Power BI report. It gives me a chance to show some concepts (such as creating functions and extracting multiple pages from websites) but also to show that these things are seldom smooth and error-free.

Click through for a video demonstration of website data extraction and combination in a Power BI report.
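Matt builds this in Power Query: a function that handles one page, invoked over a list of page numbers, with the results appended. The same pattern, sketched here in Python with pandas rather than M (the URL template and page range are made-up placeholders, and pandas.read_html needs an HTML parser such as lxml installed):

```python
import pandas as pd

# A function that grabs one page, applied over a list of page numbers, with the
# results combined. This mirrors "invoke custom function" plus "append queries"
# in Power Query; it is not Matt's code.
BASE_URL = "https://example.com/stats?page={page}"   # hypothetical paged site

def get_one_page(page: int) -> pd.DataFrame:
    """Fetch a single page and return its first HTML table, tagged with the page number."""
    tables = pd.read_html(BASE_URL.format(page=page))
    df = tables[0]
    df["source_page"] = page
    return df

# Apply the single-page function across all pages and append the results.
all_pages = pd.concat((get_one_page(p) for p in range(1, 6)), ignore_index=True)
print(all_pages.shape)
```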


Checking Power BI Licensing Costs

Gilbert Quevauvilliers doesn’t want to waste money:

I recently was assisting a customer with their Power BI licensing and what I found is that in some instances they had licenses for both Power BI Pro and Power BI Premium Per User.

By going through their licenses and assigning the correct license I was able to save the customer approximately 20% on their Power BI licensing costs per month. And over a year this adds up to quite a bit!

This does look to be more confusing than it really ought to be. I’m not sure of any reason why you would want to have Pro + Premium at the same time, so that state should be unrepresentable.
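The waste from double-assignment is easy to put a number on. A back-of-the-envelope sketch, assuming the per-user list prices at the time (roughly $10/month for Pro and $20/month for Premium Per User) and a made-up head count:

```python
# Back-of-the-envelope cost of double-assigning licenses.
# Prices are assumptions based on published per-user list prices at the time:
# roughly $10/user/month for Pro and $20/user/month for Premium Per User (PPU).
PRO_PER_MONTH = 10.0
PPU_PER_MONTH = 20.0

users_with_both = 50   # hypothetical head count holding Pro *and* PPU

# PPU already covers Pro capabilities, so the Pro license on those users is pure waste.
monthly_waste = users_with_both * PRO_PER_MONTH
annual_waste = monthly_waste * 12
total_monthly_spend = users_with_both * (PRO_PER_MONTH + PPU_PER_MONTH)

print(f"Wasted per month: ${monthly_waste:,.0f} "
      f"({monthly_waste / total_monthly_spend:.0%} of spend on these users)")
print(f"Wasted per year:  ${annual_waste:,.0f}")
```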


Power Automate and Dataset-Driven Power BI Subscriptions

Dan English follows up on a prior topic:

In the last post I went over using Power Automate to perform a data-driven report subscription using a Paginated report referencing an AAS database. The flow referenced an Excel file with the information to make the process data-driven and generate 2000 PDF files that could then be emailed to users. In the flow the PDF files were simply placed in a OneDrive folder for testing purposes to validate the flow would run as expected and to review the metrics after the fact to evaluate the impact of running the process.

For the follow-up, there were two items that I wanted to compare against the original flow:

1. Moving the AAS database being referenced to a Power BI dataset hosted in the same capacity as the Paginated report

2. Using a Power BI report instead of a Paginated report

In this post I will cover the first comparison. 

Check out what changes and what stays the same between using Azure Analysis Services and Power BI-hosted datasets.
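Dan drives the exports from Power Automate, but the paginated-report export action sits on top of what is, in effect, the Power BI export-to-file REST API. For orientation, a rough Python sketch of one such export (start the job, poll, download), with placeholder token and IDs; the Excel-driven parameter handling from Dan's flow is omitted:

```python
import time
import requests

# Rough sketch of the export-to-file flow for a report hosted in a workspace:
# start an export job, poll until it finishes, then download the PDF.
# Token and GUIDs are placeholders, not values from Dan's post.
ACCESS_TOKEN = "<azure-ad-access-token>"
WORKSPACE_ID = "<workspace-guid>"
REPORT_ID = "<paginated-report-guid>"
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}
BASE = f"https://api.powerbi.com/v1.0/myorg/groups/{WORKSPACE_ID}/reports/{REPORT_ID}"

# 1. Start the export job (per-recipient report parameters would also go in this body).
job = requests.post(f"{BASE}/ExportTo", headers=HEADERS, json={"format": "PDF"})
job.raise_for_status()
export_id = job.json()["id"]

# 2. Poll until the job reports a terminal status.
while True:
    status = requests.get(f"{BASE}/exports/{export_id}", headers=HEADERS).json()
    if status["status"] in ("Succeeded", "Failed"):
        break
    time.sleep(5)

# 3. Download the finished file.
if status["status"] == "Succeeded":
    pdf = requests.get(f"{BASE}/exports/{export_id}/file", headers=HEADERS)
    with open("report.pdf", "wb") as f:
        f.write(pdf.content)
```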


RELATED() and RELATEDTABLE() in DAX

Alberto Ferrari and Marco Russo add some context:

RELATED is one of the most commonly used DAX functions. You use RELATED when you are scanning a table, and within that row context you want to access rows in related tables. RELATEDTABLE is the companion of RELATED, and it is used to traverse relationships in the opposite direction. When learning DAX, it is easy to get confused and use RELATED when it is not necessary, or to forget about RELATEDTABLE. In this article, we describe the most common uses of the two functions, along with common misperceptions.

Click through to learn more about these two functions.


DAX EVALUATEANDLOG() Function Outputs

Jeffrey Wang continues a series on EVALUATEANDLOG():

Last week, we learned how to interpret the output of the EvaluateAndLog function of scalar expressions. Albeit interesting, the new function would be much less useful if it could only return scalar values. After all, Power BI users can always define a scalar expression as a measure and then study its values in various contexts. The real power of EvaluateAndLog is that it can also wrap around any table expressions, the output of which was much harder to visualize until now.

This function exposes a lot of information, as you can see in the post.


Type 1 SCDs in Power BI

Soheil Bakhshi grabs some Excel data from SharePoint:

We have a retail company selling products. The company releases the list of products in Excel format, including list price and dealer price, every year. The product list is released on the first day of July when the financial year starts. We have to implement a Power BI solution that keeps the latest product data to analyse the sales transactions. The following image shows the Product list for 2013:

So each year, we receive an Excel file similar to the one shown above. The files are stored on a SharePoint Online site.

Read on to see how it works. Of course, the data source itself doesn’t affect how you implement slowly-changing dimensions, so the technique Soheil shows applies to a broad range of use cases.
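As a refresher on the pattern itself (separate from Soheil's Power Query implementation), a Type 1 slowly changing dimension simply overwrites attribute values in place so only the latest version of each row survives. A small illustrative sketch in pandas with made-up product data:

```python
import pandas as pd

# Illustrative Type 1 SCD merge: incoming rows overwrite existing rows with the
# same business key, and brand-new products are appended. This mirrors the
# "keep only the latest values" behaviour; it is not Soheil's Power Query logic.
existing = pd.DataFrame({
    "ProductKey": [1, 2],
    "Product":    ["Road Bike", "Helmet"],
    "ListPrice":  [1200.00, 35.00],
})

incoming = pd.DataFrame({          # this year's file
    "ProductKey": [2, 3],
    "Product":    ["Helmet", "Gloves"],
    "ListPrice":  [39.00, 15.00],  # Helmet price is updated in place
})

# Type 1 logic: prefer the incoming row wherever the key matches, keep the rest.
dimension = (
    pd.concat([existing, incoming])
      .drop_duplicates(subset="ProductKey", keep="last")
      .sort_values("ProductKey")
      .reset_index(drop=True)
)
print(dimension)
```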
