Notebook Practices

Jonathan Whitmore has good practices for managing Jupyter notebooks:

Here’s an example of how we use git and GitHub. One beautiful new feature of Github is that they now render Jupyter Notebooks automatically in repositories.

When we do our analysis, we do internal reviews of our code and our data science output. We do this with a traditional pull-request approach. When issuing pull-requests, however, looking at the differences between updated .ipynb files, the updates are not rendered in a helpful way. One solution people tend to recommend is to commit the conversion to .py instead. This is great for seeing the differences in the input code (while jettisoning the output), and is useful for seeing the changes. However, when reviewing data science work, it is also incredibly important to see the output itself.

So far, I’ve treated notebooks more as presentation media and used tools like R Studio for tinkering.  This shifts my priors a bit.

Related Posts

Sharing R Notebooks

Hanyu Cui and Hossein Falaki show how to share a notebook using RMarkdown: RMarkdown is the dynamic document format RStudio uses. It is normal Markdown plus embedded R (or any other language) code that can be executed to produce outputs, including tables and charts, within the document. Hence, after changing your R code, you can just rerun all […]

Read More

Introducing Azure Notebooks

Zach Stagers has an introductory post to Azure Notebooks: No installation, no maintenance As with any PaaS solution, Azure Notebooks makes it far quicker and easier to get up and running, as there’s no download or installation required. Microsoft handles all the maintenance for you too! I’m working on a fairly big project using Azure […]

Read More

Categories

September 2016
MTWTFSS
« Aug Oct »
 1234
567891011
12131415161718
19202122232425
2627282930