Preparing Text Data For Natural Language Processing

Shirin Glander takes us through the process of preparing natural language data for machine learning using Keras:

As with any neural network, we need to convert our data into a numeric format; in Keras and TensorFlow we work with tensors. The IMDB example data from the keras package has been preprocessed to a list of integers, where every integer corresponds to a word arranged by descending word frequency.

So, how do we make it from raw text to such a list of integers? Luckily, Keras offers a few convenience functions that make our lives much easier.

This is a very nice tutorial if you’re new to the process.

Related Posts

Naive Bays in R

Zulaikha Lateef takes us through the Naive Bayes algorithm and implementations in R: Naive Bayes is a Supervised Machine Learning algorithm based on the Bayes Theorem that is used to solve classification problems by following a probabilistic approach. It is based on the idea that the predictor variables in a Machine Learning model are independent of […]

Read More

Exporting Data from Power Query with R

Leila Etaati shows how you can use R to export data from Power Query to disk or to SQL Server: There is always a discussion on how to store back the data from Power BI to local computer or SQL Server Databases, in this short blog, I will show how to do it by writing […]

Read More


January 2019
« Dec Feb »