Databricks Cluster-Scoped Init Scripts

Aayush Bhasin shares some background on a Databricks intern project, adding cluster-scoped initialization scripts to Databricks clusters:

One of the biggest pain points for customers used to be that init scripts for a cluster were not part of the cluster configuration and did not show up in the User Interface. Because of this, applying init scripts to a cluster was unintuitive, and editing or cloning a cluster would not preserve the init script configuration. Cluster-scoped init scripts addressed this issue by including an ‘Init Scripts’ panel in the UI of the cluster configuration page, and adding an ‘init_scripts’ field to the public API. This also allows init scripts to take advantage of cluster access control.

Read on to see how Aayush & co. solved this issue.

Related Posts

It’s All ETL (Or ELT) In The End

Robin Moffatt notes that ETL (and ELT) doesn’t go away in a streaming world: In the past we used ETL techniques purely within the data-warehousing and analytic space. But, if one considers why and what ETL is doing, it is actually a lot more applicable as a broader concept. Extract: Data is available from a source system Transform: We […]

Read More

Switching To Managed Disks In Azure

Chris Seferlis walks us through an easy method to convert unmanaged disks to managed disks in Azure: First off, why would you want a managed disk over an unmanaged one? Greater scalability due to much higher IOPs and storage limits. There’s no longer the need to add additional storage accounts when you’re adding disk space, […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Categories

September 2018
MTWTFSS
« Aug  
 12
3456789
10111213141516
17181920212223
24252627282930