CRISP-DM

Steph Locke explains the CRISP-DM model for data mining projects and applies it to data science projects:

Within a given project, we know that at the beginning of our first ever project we may not have a lot of domain knowledge, or there might be problems with the data or the model might not be valuable enough to put into production. These things happen, and the really nice thing about the CRISP-DM model is it allows for us to do that. It’s not a single linear path from project kick-off to deployment. It helps you remember not to beat yourself up over having to go back a step. It also equips you with something upfront to explain to managers that sometimes you will need to bounce between some phases, and that’s ok.

This is another place in which “iterate, iterate, iterate” ends up being the best answer available.

Related Posts

Using word2vec To Model User Behavior

Nishan Subedi walks us through an Etsy project to model user journeys via semantic embedding techniques: We initially started training the embeddings as a Skip-gram model with negative sampling (NEG as outlined in the original word2vec paper) method. The Skip-gram model performs better than the Continuous Bag Of Words (CBOW) model for larger vocabularies. It […]

Read More

Testing Spatial Equilibrium Concepts With tidycensus

Ignacio Sarmiento Barbieri walks us through the concept of spatial equilibrium and tests using data from the tidycensus package: Let’s take the model to the data and reproduce figures 2.1. and 2.2 of “Cities, Agglomeration, and Spatial Equilibrium”. The focus are two cities, Chicago and Boston. These cities are chosen because both differ in how easy […]

Read More

Categories

January 2017
MTWTFSS
« Dec Feb »
 1
2345678
9101112131415
16171819202122
23242526272829
3031