Press "Enter" to skip to content

Author: Kevin Feasel

Power BI as an Enterprise Data Warehouse

James Serra follows Betteridge’s Law of Headlines:

With Power BI continuing to get many great new features, including the latest in Datamarts (see my blog Power BI Datamarts), I’m starting to hear customers ask, “Can I just build my entire enterprise data warehouse solution in Power BI?” In other words, can I just use Power BI and all its built-in features instead of using Azure Data Lake Gen2, Azure Data Factory (ADF), Azure Synapse, Databricks, etc.? The short answer is “No”.

Read on to understand why Power BI shouldn’t be your data warehouse.

Data Lakehouse Cleanrooms in Databricks

Matei Zaharia, et al, announce an interesting idea:

We are excited to announce data cleanrooms for the Lakehouse, allowing businesses to easily collaborate with their customers and partners on any cloud in a privacy-safe way. Participants in the data cleanrooms can share and join their existing data, and run complex workloads in any language – Python, R, SQL, Java, and Scala – on the data while maintaining data privacy.

With the demand for external data greater than ever, organizations are looking for ways to securely exchange their data and consume external data to foster data-driven innovations. Historically, organizations have leveraged data sharing solutions to share data with their partners and relied on mutual trust to preserve data privacy. But organizations relinquish control over the data once it is shared and have little to no visibility into how the data is consumed by their partners across various platforms, exposing them to potential data misuse and data privacy breaches. With stringent data privacy regulations, it is imperative for organizations to have control and visibility into how their sensitive data is consumed. As a result, organizations need a secure, controlled, and private way to collaborate on data, and this is where data cleanrooms come into the picture.

Read on to learn more about how this all works. It’s definitely a lot better than sending off a bunch of CSVs…

Finding Key Influencers with Power BI

Gauri Mahajan looks at the key influencers visual in Power BI:

Once the Key Influencers visual is added to the Power BI report, it will look as shown below. The visual is empty by default. The key areas required to make this visual work are the Analyze section and the Explain By section. The Analyze section points to the variables or attributes that we intend to analyze. The Explain By section points to the variables or attributes that may be influencing the attributes specified in the Analyze section.

I’ve found this visual to be pretty interesting if you have a good dataset.

Database Audit Specifications Creating Users

Kenneth Fisher asks, who audits the auditors?:

I love database audits. They are simple, easy to use, effective, not overly resource intensive, and can be turned on and off at need once created. That said, they do have a few gotchas. If you want to audit every user, put public as the principal. And if you don’t, and you put in an AD user, be aware that that user will be created (along with a matching schema) when you create the Database Audit Specification.

Read on for Kenneth’s experience and a way to clean up these potentially added users.
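
As a minimal sketch of the behavior (the audit, database, and principal names here are all hypothetical), the user-creation side effect happens at the CREATE DATABASE AUDIT SPECIFICATION step:

    -- Hypothetical names throughout; adjust paths and principals to taste.
    USE master;
    GO
    CREATE SERVER AUDIT AuditDemo
        TO FILE (FILEPATH = N'C:\Audits\');
    GO
    ALTER SERVER AUDIT AuditDemo WITH (STATE = ON);
    GO
    USE AuditDB;
    GO
    -- If CONTOSO\SomeUser is not yet a user in AuditDB, this statement
    -- creates the user (and a matching schema) as a side effect.
    CREATE DATABASE AUDIT SPECIFICATION AuditDemoSpec
    FOR SERVER AUDIT AuditDemo
        ADD (SELECT ON DATABASE::AuditDB BY [CONTOSO\SomeUser])
    WITH (STATE = ON);

Specifying public as the principal instead audits every user without creating any new principals.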

Power BI Desktop External Tools Not Opening

Gilbert Quevauvilliers ran into a problem:

I recently got a new laptop and I had to install all my programs again. Everything was going as expected, except when I went to use ALM Toolkit, the program would not open.

I would click on ALM Toolkit, see it open for a few seconds in Task Manager, and then it would disappear.

That led me down a few rabbit holes. I thought: could it be Windows Defender, could it be the anti-virus, or could it be installed incorrectly?

It turns out that none of those was the problem. Read on to learn what the issue was and how Gilbert corrected it.

Azure Synapse Analytics June 2022 Updates

Ryan Majidimehr has some updates for us:

Fuzzy matching with a sliding similarity score option has been added to the Join transformation in Mapping Data Flows. You can create inner and outer joins on data values that are similar rather than exact matches! Previously, you would have had to use an exact match. The sliding scale value goes from 60% to 100%, making it easy to adjust the similarity threshold of the match. 

Read on for the full list of updates.
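
The fuzzy join itself is configured in the Mapping Data Flow designer rather than written by hand, but here is a rough conceptual analogue in T-SQL, joining on a similarity score instead of an exact match. The table and column names are made up, and DIFFERENCE()’s 0-to-4 phonetic score stands in for the 60%-100% sliding similarity scale:

    -- Conceptual analogue only; dbo.Customers and dbo.Orders are hypothetical.
    -- DIFFERENCE() returns a phonetic similarity score from 0 (no match)
    -- to 4 (strong match), playing the role of the similarity threshold.
    SELECT c.CustomerName, o.ShipToName
    FROM dbo.Customers AS c
        INNER JOIN dbo.Orders AS o
            ON DIFFERENCE(c.CustomerName, o.ShipToName) >= 3;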

Unicode Character Generation in Power Query

Meagan Longoria needs more Unicode:

You may have used the UNICHAR() function in DAX to return Unicode characters in DAX measures. If you haven’t yet read Chris Webb’s blog post on the topic, I recommend you do. But did you know there is a Power Query function that can return Unicode characters? This can be useful in cases when you want to assign a Unicode character to a categorical value.

Click through to see how this works.
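
Meagan’s post covers the Power Query side; for comparison, T-SQL offers the same capability through NCHAR(), which converts a Unicode code point into a character you can attach to a categorical value (the sample data below is invented for illustration):

    -- NCHAR() returns the Unicode character for a given code point,
    -- much like UNICHAR() in DAX. The VALUES clause is just sample data.
    SELECT s.SalesDelta,
           Trend = CASE WHEN s.SalesDelta >= 0 THEN NCHAR(9650)  -- U+25B2, up triangle
                        ELSE NCHAR(9660)                         -- U+25BC, down triangle
                   END
    FROM (VALUES (12.5), (-3.2)) AS s(SalesDelta);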

Data Governance in Databricks with Unity Catalog

Paul Roome, et al, announce the upcoming GA for Databricks Unity Catalog:

Today we are excited to announce that Unity Catalog, a unified governance solution for all data assets on the Lakehouse, will be generally available on AWS and Azure in the upcoming weeks. Currently, you can apply for a public preview or reach out to a member of your Databricks account team.

In a previous blog, we set out our vision for a governed lakehouse and how Unity Catalog can help customers simplify governance at scale. This blog will explore the most recent updates to Unity Catalog and our growing partner ecosystem.

Click through for those updates and to sign up for the public preview if so inclined.

Customer Segmentation via Databricks Solution Accelerator

Gavita Regunath discovers customer segments in a dataset:

We will be using the German Credit dataset, a publicly available dataset provided by Dr. Hans Hofmann of the University of Hamburg. The German Credit dataset contains features describing 1000 loan applicants who have taken credit from the bank. Using this dataset, our aim will be to answer the following question: “How should the bank personalise its products for its customers?”

Click through to see an example of clustering to generate customer segments.
