HDInsight + Power BI + Spark

Reza Rad has a nice walkthrough on integrating several powerful technologies:

Power BI can connect to many data sources as you know, and Spark on Azure HDInsight is one of them. In area of working with Big Data applications you would probably hear names such as Hadoop, HDInsight, Spark, Storm, Data Lake and many other names. Spark and Hadoop are both frameworks to work with big data, they have some differences though. In this post I’ll show you how you can use Power BI (either Power BI Desktop or Power BI website) to connect to a sample of Spark that we built on an Azure HDInsight service. by completing this section you will be able to create simple spark on Azure HDInsight, and run few Python scripts from Jupyter on it to load a sample table into Spark, and finally use Power BI to connect to Spark server, load, and visualize the data.

If you’re totally unfamiliar with Spark but interested in data processing, now’s a good time to start digging into the topic.

Related Posts

Power BI Helper 3.0

Reza Rad has a new version of Power BI Helper out: It is a pleasure to announce the newest version of Power BI helper, version 3.0 July 2018 with the great feature of exporting model documentation. The documentation part of the insight from Power BI Helper has been always in our backlog, but haven’t had […]

Read More

Relationships In Power BI

Teo Lachev shows us the importance of defining relationships in Power BI: However, If there isn’t a direct relationship between ResellerSales and Employee, the moment you add an unsummarized field from the second table on the many side, such as Employee[FullName] after adding SalesTerritoryCountry and ResellerSales[SalesOrderNumber), you’ll get the error “Error: Can’t determine relationships between the […]

Read More

Categories

February 2016
MTWTFSS
« Jan Mar »
1234567
891011121314
15161718192021
22232425262728
29