Scheduling Jupyter Notebooks

Matthew Seal, et al, explain how they schedule runs of Jupyter notebooks:

On the surface, notebooks pose a lot of challenges: they’re frequently changed, their cell outputs need not match the code, they’re difficult to test, and there’s no easy way to dynamically configure their execution. Furthermore, you need a notebook server to run them, which creates architectural dependencies to facilitate execution. These issues caused some initial push-back internally at the idea. But that has changed as we’ve brought in new tools to our notebook ecosystem.

The biggest game-changer for us is Papermill. Papermill is an nteract library built for configurable and reliable execution of notebooks with production ecosystems in mind. What Papermill does is rather simple. It take a notebook path and some parameter inputs, then executes the requested notebook with the rendered input. As each cell executes, it saves the resulting artifact to an isolated output notebook.

Papermill does look quite interesting.

Related Posts

Embedding Notebooks on a Website

Eduardo Pivaral shows how to embed the results of a Jupyter notebook created in Azure Data Studio on a website: Notebooks are a functionality available in Azure Data Studio, that allows you to create and share documents that may contain text, code, images, and query results. These documents are helpful to be able to share database […]

Read More

Azure Data Studio May Release

Alan Yu announces the May release of Azure Data Studio: Since its release two months ago, the community continues to love SQL Notebooks. This month, we had a laser-eyed focus on quality of life bug fixes instead of new features. These improvements include: – Markdown rendering improvements, including better support for notes and tables– Usability […]

Read More

Categories

August 2018
MTWTFSS
« Jul Sep »
 12345
6789101112
13141516171819
20212223242526
2728293031