Press "Enter" to skip to content

Category: Power BI

Power BI August Release And SSAS Performance Improvements

Chris Webb points out something new in the Power BI August 2018 release:

While I was playing around with the new release (August 2018) of Power BI Desktop I noticed there was an undocumented change: similar to the OData improvements I blogged about here, there is a new option in the AnalysisServices.Database() and AnalysisServices.Databases() M functions that turns on a newer version of the MDX generation layer used by the Power Query engine. Like the OData improvements it is an option called Implementation=”2.0”, used like this:

AnalysisServices.Databases(
	"localhost", 
	[
		TypedMeasureColumns=true, 
		Implementation="2.0"
	]
)

…and also, as with the OData improvements, you will need to manually edit any existing M queries to take advantage of this.

Read on for Chris’s test and analysis of the resulting MDX output.

Comments closed

Dataflows In Power BI

James Serra gives us a preview of Power BI Dataflows:

In short, Dataflows integrates data lake and ETL technology directly into Power BI, so anyone with Power Query skills (yes – Power Query is now part of Power BI service and not just Power BI Desktop and is called Power Query online) can create, customize and manage data within their Power BI experience (think of it as self-service data prep).  Dataflows include a standard schema, called the Common Data Model (CDM), that contains the most common business entities across the major functions such as marketing, sales, service, finance, along with connectors that ingest data from the most common sources into these schemas.  This greatly simplifies modeling and integration challenges (it prevents multiple metadata/definition on the same data).  You can also extend the CDM by creating custom entities.  Lastly – Microsoft and their partners will be shipping out-of-the-box applications that run on Power BI that populate data in the Common Data Model and deliver insights through Power BI.

A dataflow is not just the data itself, but also logic on how the data is manipulated.  Dataflows belong to the Data Warehouse/Mart/Lake family.  Its main job is to aggregate, cleanse, transform, integrate and harmonize data from a large and growing set of supported on-premises and cloud-based data sources including Dynamics 365, Salesforce, Azure SQL Database, Excel, SharePoint.  Dataflows hold a collection of data-lake stored entities (i.e. tables) which are stored in internal Power BI Common Data Model compliant folders in Azure Data Lake Storage Gen2.

Also check out the comments for some clarification on why you’d want to use Dataflows rather than doing the work directly in the data lake.

Comments closed

Dual Storage Mode In Power BI

Teo Lachev takes us through the Dual storage mode now available in Power BI:

As the name implies, the dual storage mode is a hybrid between Import and DirectQuery. Like importing data, the dual storage mode caches the data in the table. However, it leaves it up to Power BI to determine the best way to query the table depending on the query context. Consider the following schema where all tables come from the same data source. Let’s assume the following configuration:

  • FactInternetSales is imported

  • FactResellerSales is DirectQuery

  • DimDate and DimProduct are Dual

Read on for more.

Comments closed

Nested Calculations In Power Query

Michael Humpherys shows how to use nested calculations in Power Query to make financial calculations easier:

A central problem in finance is answering the simple question: How much is this contract worth? For example, Bob might say he’ll give me $102 in a year, and I want to know how much I should pay him for that guaranteed money. If I figure out that the value of the contract is a $100, then I’m saying that the guaranteed $102 in a year is worth $100 today. This means I get a 2% interest on my $100 investment. This is called the one-year spot rate, and there are similar rates for all sort of different time frames. Taking 1/ (1+.02) gives me the discount rate and multiplying this by the $102 payment gets me to the $100 value of the contract.

The next step is that I may want to know how much $102 two years from now is worth next year. So instead of figuring out what it is worth today, I want to know what it will be worth in a year. To figure this out, I need something called the forward rate, which tells me the annual interest rate one year in the future, in this example.

With the forward rates, I can take a complex series of future payments and find the value of all of those payments today, but also the value at different points in the future. The complexity is that depending on when I want to value them to and the timing of the payment I need to use different sets of forward rates and that’s the application I’ll walk through below.

That is a novel use of the “table in a cell” technique.

Comments closed

Performance Test: Loading CSV Versus Loading Excel In Power Query

Chris Webb lays out a performance test which shows how quickly Power Query can read data from a CSV versus from an Excel spreadsheet:

The black line in the graph above is the amount of data read (actually the offset values showing where in the file the data is read from, which is the same thing as a running total when Power Query is reading all the data) from the Excel file; the green line is the amount of data read from the CSV file (the same data shown in the first graph above). A few things to mention:

  • Running Process Monitor while this second query was refreshing had a noticeable impact on its performance – in fact it was almost 20 seconds slower

  • The initial values of 80 million bytes seem to be where data is read from the end of the Excel file. Maybe this is Power Query reading some file metadata? Anyway, it seems as though it takes 5 seconds before it starts to read the data needed by the query.

  • There’s a plateau between the 10 and 20 second mark where not much is happening; this didn’t happen consistently and may have been connected to the fact that Process Monitor was running

The results were remarkable; check them out.

Comments closed

Filtering Measures In Power BI

Marco Russo has a great post on filtering measures on Power BI dashboards:

Also consider the case of customer 19081. Even though it is only displayed in March, their Revenues YTD value is larger than Revenues. This is because the Revenues YTD measure considers the sum of previous months, even though Revenues may be lower than the threshold of 9,999.

Because the filter granularity is Year-Month-Customer, only the filtered combinations are also considered in the year total. This explains another unexpected result. The Revenues YTD computed in December is different from the one computed for the entire year – yet another unexpected behavior for a year-to-date calculation. At the month level, only customers with Revenues higher than 9,999 in December are considered, including all the months in their Revenues YTD calculation. However at the year level, all customers with revenue higher than 9,999 in at least one month are considered; their revenues for the entire year are summed to compute Revenues YTD regardless of the monthly filter applied to the Revenues measure.

Marco goes into detail regarding the nuances of filtering and also provides some good answers to common problems.

Comments closed

Watching Power BI Grow Up

Paul Turley argues that Power BI is getting to be a mature product:

In the opening keynote and again in his sessions, Christian demonstrated Power BI reports on the taxi driver activity database with over a trillion rows of raw data.  The larger dataset was in a Spark cluster, accessed using DirectQuery.  Aggregated tables were stored in the in-memory model using the new composite model feature.  As the data was explored in report visuals, the Power BI engine would seamlessly switch from tabular in-memory aggregate tables to DirectQuery source data in order to return low-level details.  Composite models will allow mashing-up imported database and file-based data with an DirectQuery.

There are limits and complexities with these new features.  You cannot mashup imported tables in a Power BI model based in a direct connection to SSAS, but enterprise-scale features in Power BI arguably may not steer a solution architect to select SSAS over Power BI for serious data modeling.  With incremental data refresh, large model support, row-level security and many other “big kid” features, Power BI might be a preferable choice.  I’m not ready to rule-out Analysis Services as the better option for most enterprise solutions – at least not in the near future, but Power BI is definitely heading in that direction.

Click through for several other features which help convince Paul that Power BI is threatening Analysis Services for enterprise data analysis solutions.

Comments closed

What Those Power BI Processes In Task Manager Mean

Kellyn Pot’vin-Gorman explains what you see in the Task Manager when loooking at an instance of Power BI Desktop:

We can see that there are numerous threads, with a few taking considerable memory over others-  The CefSharp.BrowserSubprocess can be a bit misleading-  It’s Power BI using Chromium to render the visuals that are part of the Power BI Desktop that’s part of the current run.  Chromium (CefSharp.BrowserSubprocess) subprocesses will always come in pairs, one for rendering and one for messaging.

In the Task Manager Details, we can see each of the PIDs that correspond with the processes IDs listed in the logs.  By updating our viewable columns, (right click, choose “threads” and click OK) you can now view how many threads are associated with a given PID.

Read on for more.

Comments closed

Layout Images In Power BI

Meagan Longoria has some tips for using layout images in Power BI:

Using layout images in Power BI has become a popular design trend. When I say layout images, I’m referring to background images with shapes around areas where visuals are placed. This is different from the new wallpaper feature that became available in the July release, which can be used to format the grey area outside your report page and extend the main color of background images.

Layout images can help with spacing and alignment within a report and can help create consistency across reports. They can also help create affordances, using consistent layout and design to make it obvious how users should interact with our reports.

I use layout images in some of my reports, but I don’t think they are necessary on every report. There are a couple of things to consider when using layout images.

Read on for an example of a good layout image versus a bad layout image as well as tips and tricks on how to create good layout images.

Comments closed