Building A Python Project Template

Henk Griffioen shows how to create a standardized project in Python, focusing on data science scenarios:

Project structures often organically grow to suit people’s needs, leading to different project structures within a team. You can consider yourself lucky if at some point in time you find, or someone in your team finds, an obscure blog post with a somewhat sane structure and enforces it in your team.

Many years ago I stumbled upon ProjectTemplate for R. Since then I’ve tried to get people to use a good project structure. More recently DrivenData (what’s in a name?) released their more generic Cookiecutter Data Science.

The main philosophies of those projects are:

  • A consistent and well-organized structure allows people to collaborate more easily.

  • Your analyses should be reproducible and your structure should enable that.

  • A project starts from raw data that should never be edited; consider raw data immutable and only edit derived sources.
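
As a rough illustration of these principles, here is a minimal Python sketch that creates a consistent directory skeleton and then marks the raw data files read-only. The directory names in `LAYOUT` and the helpers `create_project` and `protect_raw_data` are assumptions made for this example; they are not the actual layout or API of ProjectTemplate or Cookiecutter Data Science.

```python
import stat
from pathlib import Path

# Hypothetical layout: these directory names are illustrative assumptions,
# not the exact structure prescribed by either project mentioned above.
LAYOUT = ("data/raw", "data/processed", "notebooks", "src", "reports")


def create_project(root):
    """Create the skeleton under root and return the directories in LAYOUT order."""
    root = Path(root)
    created = []
    for relative in LAYOUT:
        directory = root / relative
        # exist_ok=True makes the call safe to repeat on an existing project.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created


def protect_raw_data(raw_dir):
    """Clear the write permission bits on every file under raw_dir."""
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    for path in Path(raw_dir).rglob("*"):
        if path.is_file():
            path.chmod(path.stat().st_mode & ~write_bits)
```

Calling `create_project("my_analysis")`, copying the source files into `my_analysis/data/raw`, and then calling `protect_raw_data("my_analysis/data/raw")` is one way to make accidental edits to raw data less likely. File permissions are only a guard rail, and they behave differently across operating systems, so the convention still has to be agreed on by the team.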

The post is a set of prescriptions and focuses on the phase before a project actually kicks off.

