Using NLP To Find Similar Facebook Posts

The folks at Knoyd put together a word embedding example by scraping a Python Facebook group:

We are going to represent the content of a Facebook post using word embeddings and compare the transformed posts using word mover’s distance. The combination of the two has shown lower k-nearest-neighbor document classification error rates than other state-of-the-art techniques.
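To make the quoted pipeline concrete, here is a minimal sketch of word mover's distance using gensim's `wmdistance`. The pretrained model name, the example posts, and the whitespace tokenizer are placeholder assumptions for illustration, not Knoyd's actual code; computing WMD also requires gensim's optimal-transport dependency (POT, or pyemd in older versions):

```python
import gensim.downloader as api

# Pretrained vectors as a stand-in for embeddings trained on the scraped
# Facebook posts; any gensim KeyedVectors object works the same way.
vectors = api.load("glove-wiki-gigaword-50")

def tokenize(post):
    # Placeholder preprocessing: lowercase, split on whitespace,
    # and drop tokens the embedding model has never seen.
    return [w for w in post.lower().split() if w in vectors]

post_a = tokenize("how do i speed up joining two pandas dataframes")
post_b = tokenize("what is the fastest way to merge dataframes in pandas")

# Word mover's distance: the minimum cumulative distance the embedded
# words of one post must travel to reach the other post's words.
# Smaller values mean more similar posts.
print(vectors.wmdistance(post_a, post_b))
```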

The advantage of word embeddings is that words with similar meanings, even when they share little or no surface form, still get similar vectors (i.e., sit close together) in the embedded space (e.g. lion and tiger).
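That claim is easy to check directly. A quick sketch with the same pretrained vectors (the model name again an assumption for illustration) shows that semantically related words score high on cosine similarity regardless of spelling:

```python
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-50")

# Cosine similarity in the embedding space: related meanings score high
# even though the strings look nothing alike; unrelated meanings score low.
print(vectors.similarity("lion", "tiger"))   # high
print(vectors.similarity("lion", "laptop"))  # low

# Nearest neighbours of "lion" in the embedded space.
print(vectors.most_similar("lion", topn=5))
```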

There’s a good high-level discussion of techniques in this post.
