Running Spark MLlib to Feed Power BI

Brad Llewellyn shows how you can take Spark MLlib results and feed them into Power BI:

MLlib is one of the primary extensions of Spark, along with Spark SQL, Spark Streaming and GraphX.  It is a machine learning framework built from the ground up to be massively scalable and operate within Spark.  This makes it an excellent choice for machine learning applications that need to crunch extremely large amounts of data.  You can read more about Spark MLlib here.

In order to leverage Spark MLlib, we obviously need a way to execute Spark code.  In our minds, there’s no better tool for this than Azure Databricks.  In the previous post, we covered the creation of an Azure Databricks environment.  We’re going to reuse that environment for this post as well.  We’ll also use the same dataset that we’ve been using, which contains information about individual customers.  This dataset was originally designed to predict Income based on a number of factors.  However, we left the income out of this dataset a few posts back for reasons that were important then.  So, we’re actually going to use this dataset to predict “Hours Per Week” instead.

Check it out. And Brad’s not joking when he says the resulting model is terrible. But that’s okay, because it was never about the model.

Related Posts

Reinforcement Learning with R

Holger von Jouanne-Diedrich takes us through concepts in reinforcement learning: At the core this can be stated as the problem a gambler has who wants to play a one-armed bandit: if there are several machines with different winning probabilities (a so-called multi-armed bandit problem) the question the gambler faces is: which machine to play? He could “exploit” one […]

Read More

Power Query FILTER()

Rob Collie takes us through a good use of FILTER() in DAX: The thing both of those formulas have in common is that they are using a measure in the filter argument of the CALCULATE function.  In both examples here, I’ve highlighted the offending measure in yellow. CALCULATE([Sightings per Year], [Avg Sighting Length in Mins]>6) CALCULATE([Sightings […]

Read More

Categories

April 2019
MTWTFSS
« Mar May »
1234567
891011121314
15161718192021
22232425262728
2930