Building A Python Project Template

Henk Griffioen shows how to create a standardized project in Python, focusing on data science scenarios:

Project structures often organically grow to suit people’s needs, leading to different project structures within a team. You can consider yourself lucky if at some point in time you find, or someone in your team finds, a obscure blog post with a somewhat sane structure and enforces it in your team.

Many years ago I stumbled upon ProjectTemplate for R. Since then I’ve tried to get people to use a good project structure. More recently DrivenData (what’s in a name?) released their more generic Cookiecutter Data Science.

The main philosophies of those projects are:

  • A consistent and well-organized structure allows people to collaborate more easily.

  • Your analyses should be reproducible and your structure should enable that.

  • A projects starts from raw data that should never be edited; consider raw data immutable and only edit derived sources.

This is a set of prescriptions and focuses on the phase before the project actually kicks off.

Related Posts

Matrix Transposition In T-SQL

Phil Factor has some fun transposing a matrix using T-SQL: What I’m doing is simply converting the table into its JSON form, and then using this to create a table using the multi-row VALUES  syntax which paradoxically allows expressions. The expression I’m using is JSON_Value, which allows me do effectively dictate the source within the table, via […]

Read More

Taking A Random Walk

Dan Goldstein describes the basics of Brownian motion: I was sitting in a bagel shop on Saturday with my 9 year old daughter. We had brought along hexagonal graph paper and a six sided die. We decided that we would choose a hexagon in the middle of the page and then roll the die to […]

Read More

Categories

March 2017
MTWTFSS
« Feb Apr »
 12345
6789101112
13141516171819
20212223242526
2728293031