CRISP-DM

Steph Locke explains the CRISP-DM model for data mining projects and applies it to data science projects:

Within a given project, we know that at the beginning of our first ever project we may not have a lot of domain knowledge, or there might be problems with the data or the model might not be valuable enough to put into production. These things happen, and the really nice thing about the CRISP-DM model is it allows for us to do that. It’s not a single linear path from project kick-off to deployment. It helps you remember not to beat yourself up over having to go back a step. It also equips you with something upfront to explain to managers that sometimes you will need to bounce between some phases, and that’s ok.

This is another place in which “iterate, iterate, iterate” ends up being the best answer available.

Related Posts

Defining TF-IDF

Bruno Stecanella explains the concept behind TF-IDF: TF-IDF was invented for document search and information retrieval. It works by increasing proportionally to the number of times a word appears in a document, but is offset by the number of documents that contain the word. So, words that are common in every document, such as this, what, and if, rank […]

Read More

Sentiment Analysis with Python

Bruno Stecanella shows us how to use MonkeyLearn to perform sentiment analysis in Python: Sentiment analysis is a set of Natural Language Processing (NLP) techniques that takes a text (in more academic circles, a document) written in natural language and extracts the opinions present in the text. In a more practical sense, our objective here is to take a text […]

Read More

Categories

January 2017
MTWTFSS
« Dec Feb »
 1
2345678
9101112131415
16171819202122
23242526272829
3031