Press "Enter" to skip to content

Naive Bayes Against Large Data Sets

Catherine Bernadorne walks us through using Naive Bayes for sentiment analysis:

The more data that is used to train the classifier, the more accurate it will become over time. So if we continue to train it with actual results in 2017, then what it predicts in 2018 will be more accurate. Also, when Bayes gives a prediction, it will attach a probability. So it may answer the above question as follows: “Based on past data, I predict with 60% confidence that it will rain today.”

So the classifier is either in training mode or predicting mode. It is in training mode when we are teaching it. In this case, we are feeding it the outcome (the category). It is in predicting mode when we are giving it the features, but asking it what the most likely outcome will be.

My contribution is a joke that I heard last night:  a Bayesian statistician hears hooves clomping the ground.  He turns around and sees a tiger.  Therefore, he decides that it must be a zebra.  First time I’d heard that joke, and as a Bayesian zebra-spotter, I enjoyed it.