Machine Learning Data Preparation Tips

Jen Underwood has some good tips when preparing data for a machine learning operation:

Data preparation for machine learning requires business domain expertise, bias awareness and an experimental thought process. Before preparing your data, you’ll first define a business problem solve. During that exercise, you’ll select an outcome metric and brainstorm potential input variables that influence it from many varied perspectives. From there you will begin identifying, collecting, cleaning, shaping and sampling data to run through automated machine learning model processes.

Note that it is also not unusual for relevant machine learning input data to occur outside of existing transactional processes. If that is the case, you can still start creating a first-generation machine learning model with existing data and continue to build new model versions over time as supplementary data is acquired.

Click through for the ten tips.

Related Posts

Building A Neural Network In R With Keras

Pablo Casas walks us through Keras on R: One of the key points in Deep Learning is to understand the dimensions of the vector, matrices and/or arrays that the model needs. I found that these are the types supported by Keras. In Python’s words, it is the shape of the array. To do a binary […]

Read More

Bayesian Neural Networks

Yoel Zeldes thinks about neural networks from a different perspective: The term logP(w), which represents our prior, acts as a regularization term. Choosing a Gaussian distribution with mean 0 as the prior, you’ll get the mathematical equivalence of L2 regularization. Now that we start thinking about neural networks as probabilistic creatures, we can let the fun […]

Read More

Categories

November 2017
MTWTFSS
« Oct Dec »
 12345
6789101112
13141516171819
20212223242526
27282930