Data Science Notebooks

Dan Osipov discusses data science notebooks:

Even though they’ve become prominent in the past few years, they have a long history. First notebooks were available in packages like Mathematica andMatlab, used primarily in academia. More recently they’ve started getting traction in Python community with iPython Notebook. Today there are many notebooks to choose from: Jupyter (successor to the iPython Notebook), R Markdown, Apache Zeppelin,Spark Notebook, Databricks Cloud, and more. There are kernels/backends to multiple languages, such as Python, Julia, Scala, SQL, and others.

Traditionally, notebooks have been used to document research and make results reproducible, simply by rerunning the notebook on source data. But why would one want to choose to use a notebook instead of a favorite IDE or command line? There are many limitations in the current browser based notebook implementations that prevent them from offering a comfortable environment to develop code, but what they do offer is an environment for exploration, collaboration, and visualization.

Back In The Day, developers and infrastructure staff used runbooks to make sure they listed and hit all of the steps in an operation.  I don’t really know of one which integrates directly with SQL Server, but Jupyter is probably the best-known cross-platform notebook.

Related Posts


John Mount explains the vtreat package that he and Nina Zumel have put together: When attempting predictive modeling with real-world data you quicklyrun into difficulties beyond what is typically emphasized in machine learning coursework: Missing, invalid, or out of range values. Categorical variables with large sets of possible levels. Novel categorical levels discovered during test, cross-validation, or […]

Read More

Wrapping Up A Data Science Project

I have finished my series on launching a data science project.  First, I have a post on deploying models as microservices: The other big shift is a shift away from single, large services which try to solve all of the problems.  Instead, we’ve entered the era of the microservice:  a small service dedicated to providing […]

Read More


May 2016
« Apr Jun »