Machine Learning Data Preparation Tips

Jen Underwood has some good tips when preparing data for a machine learning operation:

Data preparation for machine learning requires business domain expertise, bias awareness and an experimental thought process. Before preparing your data, you’ll first define a business problem solve. During that exercise, you’ll select an outcome metric and brainstorm potential input variables that influence it from many varied perspectives. From there you will begin identifying, collecting, cleaning, shaping and sampling data to run through automated machine learning model processes.

Note that it is also not unusual for relevant machine learning input data to occur outside of existing transactional processes. If that is the case, you can still start creating a first-generation machine learning model with existing data and continue to build new model versions over time as supplementary data is acquired.

Click through for the ten tips.

Related Posts

Estimating Used Car Prices

Kevin Jacobs wants to estimate the value of his car and shows how to set up a machine learning job to do this: As you can see, I collected the brand (Peugeot 106), the type (1.0, 1.1, …), the color of the car (black, blue, …) the construction year of the car, the odometer of […]

Read More

Introduction To Neural Nets

Ben Gorman has a two-part series introducing neural networks.  First, the basics behind neural networks: We can solve both of the above issues by adding an extra layer to our perceptron model. We’ll construct a number of base models like the one above, but then we’ll feed the output of each base model as input into another perceptron. This […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

Categories

November 2017
MTWTFSS
« Oct Dec »
 12345
6789101112
13141516171819
20212223242526
27282930