If you aren’t able to open the R Script Editor, check out our previous post, Getting Started with R Scripts. While it’s possible to develop and test code using the built-in R Script Editor, it’s not great. Unfortunately, there doesn’t seem to be a way to develop this script using an external IDE like RStudio. So, we typically export files to csv for development in RStudio. This is obviously not optimal and should be done with caution when data is extremely large or sensitive in some way. Fortunately, the write.csv() function is pretty easy to use. You can read more about it here.
It’s not a perfect experience, but Brad does show us how to get it done.
As you can see this new type of relationship is different. It is dashed line, compared to the active, which was a solid line. This is an inactive relationship. You can only have one active relationship between two tables. Any other relationships will become inactive.
An inactive relationship doesn’t pass filtering. It doesn’t do anything by itself. I still see many people creating inactive relationships in their model thinking that just the inactive relationship by itself will do some filtering. It doesn’t. If I use the FullDateAlternateKey from the DimDate table to slice and dice the SalesAmount from the FactInternetSales table, which field I’m filtering based on? The field that is related through an Active relationship of course. Here is a result for that (which is apparently same as what you have seen in the previous example because the inactive relationship doesn’t do anything. It is just the active relationship that passes the filter);
Read the whole thing.
The first line is related to the week ending on February 2nd, so Sales Amount includes only 2 days (February 1st and 2nd) excluding the amount of other 5 days in the same week (January 27th to 31st). The same happens in the last week, which includes June 29th and 30th but does not include sales for the remaining 5 days in the same week (July 1st to 5th). This also explains why the report includes a week ending in July 2008 even though the Month slicer only includes dates up to June 2018.
We can create a measure that removes incomplete weeks from the calculation, as shown in the following code. A similar technique could be used for incomplete months and quarters.
There are some interesting techniques that Marco shows off, including hiding incomplete weeks.
One of the most common issues faced when calling web services in M is that the the easiest way of doing so – creating a function that calls the web service, then calling the function once per row in a table using the Invoke Custom Function button – is very inefficient. It’s a much better idea to batch up calls to a web service, if the web service supports this, but doing this forces you to write more complex M code. It’s a problem I wrestled with last year in my custom connector for the Cognitive Services API, and in that case I opted to create functions that can be passed lists instead (see here for more information on how functions with parameters of type list work); I’m sure the developers working on the new AI features in dataflows had to deal with the same problem. This problem is not limited to web services either: calculations such as running totals also need to be optimised in the same way if they are to perform well in M. The good news is that there is a new M function Function.ScalarVector() that allows you to create functions that combine the ease-of-use of row-by-row requests with the efficiency of batch requests.
As Chris notes in the post and in the comments, this is mostly useful when you can batch together individual calls to improve performance. For functions which operate serially (like opening Excel workbooks), you won’t see much (if any) gain.
Although this might not be what the inventors of Power BI had in mind, large lots of folks are trying to create classical financial statements in it. And putting aside the afford that might go into getting the numbers right, there is still a major drawback to swallow:
Click through for a depiction of the problem and then go vote for this on Power BI Ideas.
The ability to add columns and measures to display folders in the new Modelling view is one of my favourite features of the November 2018 release of Power BI Desktop: it makes complex models a lot more user-friendly. Being an old-school SSAS developer, I immediately wondered if they worked in the same way that they do in SSAS – and the answer is yes and no.
The good news is that display folders can be nested.
Click through for a demo as well as the issues Chris spotted.
An online service provided by Microsoft as part of Power BI (software as a service, or SaaS).
In effect dataflows are an online data collection and storage tool.
- Collection: It uses Power Query to connect to the data at the source and transform that data as needed.
- You will need to be able to access the data either through a cloud service (such as Dynamics 365) or to your PC/Network via a gateway.
- You can also use Power Query to write queries from scratch, such as my Power BI calendar table.
- Storage: Dataflows then stores that data in a table in the cloud so it can be used directly inside PowerBI.com, but more importantly (from my view) directly from Power BI Desktop.
Dataflows leverage the Power Query skills you have learnt (or are learning) using other tools (like Power BI Desktop, Power Query for Excel) allowing you to reuse those same skills in this online tool.
Tables that are created as a result of the dataflow are stored in an Azure Data Lake.
- If you don’t know what that is, don’t worry – I don’t understand it either. The point is it doesn’t matter because it is all done automatically for you by the tool.
Dataflows include the concept of the common data service (CDS) or common data model directly in the tool and you don’t have to know what it is, nor care.
If you don’t know what that is, don’t worry – it doesn’t matter now/yet.
This will become very important in the future as it will make the process of getting data out of complex databases (such as MS Dynamics 365) much easier in the future.
Click through for more detail as well as some good uses for Dataflows.
The Forecast measure in the demo model is quite an advanced piece of DAX code that would require a full article by itself. The curious reader will find more information on how to reallocate budget at different granularities in the video Budgeting with Power BI. In this article, we use the Forecast measure without detailed explanations; our goal is to explain how to compute the next measure: Remaining Forecast.
The Remaining Forecast measure must analyze the Sales table, finding the last day for which there are sales, and only then computing the forecasts.
Read the whole thing.
I loved Nar’s post on Automated Testing using DAX. I especially like the rule of always including controls so that business readers can share responsibility for data quality. For my part, I sometimes use hidden pages in Power BI reports to assure myself of data quality. I also set alerts on testing dashboards in the Power BI Service to notify me if something is not right. Sometimes, however, a more proactive approach is needed. So, we’ll be doing automated testing with Power Query.
If the query can’t connect to the data source, it will fail. When this happens, the report in Power BI Service is stale, but accurate. I’m fine with this. It can also happen that the query succeeds but is incomplete. In this case, the result is that the report is wrong. Why does this happen? It can happen because of an overtaxed transactional data source. The ERP or CRM or work order system just can’t deliver the amount of data. Maybe it’s linked SQL tables using ODBC. For whatever reason, the query succeeds, but data is missing. I’m NOT fine with this. The long-term solution is to move to a more reliable data source (data warehouse, anybody?). In the short run, refreshes must be stopped. Stale data is better than bad data.
Also check out the comments.
The problem with self-service BI is that you never quite know what your users are up to. For example, what data sources are they using? Are there hundreds of Excel files being used as data sources for reports that you don’t know about? If so, where are they? Could they and should they be replaced by a database or something else more robust? In this post I’ll show you how you can use Microsoft Flow and the Power BI REST API (see part 1 to find out how to create a Flow custom connector to call the Power BI API) to get the details of all the data sources used in all of the workspaces of your Power BI tenant.
I’ll admit that doing this turned out to be a bit trickier than I had expected. My plan was to use the GetDatasetsAsAdmin endpoint to get a list of all datasets, loop over each one and then call the (undocumented, but in the REST API’s Swagger file and therefore in my custom connector) GetDatsourcesAsAdmin endpoint to get the datasources in each dataset. Both these endpoints require administrative permissions to call, so I made sure my custom connector had the correct permissions (at least Tenant.Read.All – you can check this in the Azure Portal on the app you registered in Azure Active Directory) and I ran the Flow as a user with Power BI Admin permissions. But I kept getting 404 errors when requesting the data sources for certain datasets .
Chris explains why those 404s appear and what you can do about them.