Machine Learning Data Preparation Tips

Jen Underwood has some good tips when preparing data for a machine learning operation:

Data preparation for machine learning requires business domain expertise, bias awareness and an experimental thought process. Before preparing your data, you’ll first define a business problem solve. During that exercise, you’ll select an outcome metric and brainstorm potential input variables that influence it from many varied perspectives. From there you will begin identifying, collecting, cleaning, shaping and sampling data to run through automated machine learning model processes.

Note that it is also not unusual for relevant machine learning input data to occur outside of existing transactional processes. If that is the case, you can still start creating a first-generation machine learning model with existing data and continue to build new model versions over time as supplementary data is acquired.

Click through for the ten tips.

Related Posts

Performance Tuning Neural Network Training

Sean Owen takes us through a few techniques for speeding up neural network model training: Step #2: Use Early StoppingKeras (and other frameworks) have built-in support for stopping when further training appears to be making the model worse. In Keras, it’s the EarlyStopping callback. Using it means passing the validation data to the training process for evaluation […]

Read More

Text Analysis from Google Sheets

Federico Pascual shows how you can use MonkeyLearn to perform text analysis (including sentiment analysis and categorization) from a Google Sheets spreadsheet: Carrying out a customer survey, for example, can be useful to obtain crucial insights into the overall customer experience of your clients. But the data obtained from these surveys can be incredibly difficult […]

Read More

Categories

November 2017
MTWTFSS
« Oct Dec »
 12345
6789101112
13141516171819
20212223242526
27282930