Azure Data Factory Data Flows

Joost van Rossum takes a look at data flows in Azure Data Factory:

2) Create Databricks Service
Yes you are reading this correctly. Under the hood Data Factory is using Databricks to execute the Data flows, but don’t worry you don’t have to write code.
Create a Databricks Service and choose the right region. This should be the same as your storage region to prevent high data movement costs. As Pricing Tier you can use Standard for this introduction. Creating the service it self doesn’t cost anything.

Joost shows the work you have to do to build out a data flow. This has been a big hole in ADF—yeah, ADF seems more like an ELT tool than an ETL tool but even within that space, there are times when you need to do a bit more than pump-and-dump.

Related Posts

Calculated Columns and Memory Usage in Analysis Services

Teo Lachev troubleshoots a customer issue: Scenario: A client reports a memory spike during processing. They have a Tabular semantic model deployed to Azure Analysis Services. They fully process the model daily. The model normally takes less than 50 GB RAM but during processing, it spikes five times and Azure Analysis Services terminates the processing task […]

Read More

Performance Tuning Neural Network Training

Sean Owen takes us through a few techniques for speeding up neural network model training: Step #2: Use Early StoppingKeras (and other frameworks) have built-in support for stopping when further training appears to be making the model worse. In Keras, it’s the EarlyStopping callback. Using it means passing the validation data to the training process for evaluation […]

Read More

Categories

January 2019
MTWTFSS
« Dec Feb »
 123456
78910111213
14151617181920
21222324252627
28293031