Feature And Text Classification Using Naive Bayes In R

I wrap up my series on the Naive Bayes class of algorithms, finally writing some code along the way:

Now we’re going to look at movie reviews and predict whether a review is positive or negative based on its words. If you want to play along at home, grab the data set, which is under 3MB zipped and contains 2000 reviews in total.

Unlike last time, I’m going to break this out into sections with commentary in between. If you want the full script with notebook, check out the GitHub repo I put together for this talk.

Assuming I ever get a chance to do this talk again, I’m probably going to change the data sets in the example given how overplayed iris is.
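
To give a feel for what predicting a review's sentiment from its words looks like, here is a minimal sketch in base R. It is not the code from the linked post: the four training "reviews" are made up, the helper names (`tokenize`, `predict_review`) are hypothetical, and it assumes a multinomial Naive Bayes model with add-one (Laplace) smoothing rather than whatever package the original script uses.

```r
# Toy, made-up training data (an assumption for illustration only;
# this is NOT the movie review data set mentioned above).
train <- data.frame(
  text = c("great fun movie", "wonderful acting great story",
           "boring plot terrible acting", "terrible boring waste"),
  label = c("pos", "pos", "neg", "neg"),
  stringsAsFactors = FALSE
)

# Lowercase a review and split it into words, dropping empty tokens.
tokenize <- function(x) {
  tokens <- strsplit(tolower(x), "[^a-z]+")[[1]]
  tokens[nzchar(tokens)]
}

vocab <- unique(unlist(lapply(train$text, tokenize)))
classes <- unique(train$label)

# log P(class): the share of training reviews in each class.
log_prior <- sapply(classes, function(cl) log(mean(train$label == cl)))

# log P(word | class) with add-one smoothing, so a word never seen in
# a class does not force that class's probability to zero.
log_likelihood <- sapply(classes, function(cl) {
  words <- unlist(lapply(train$text[train$label == cl], tokenize))
  counts <- table(factor(words, levels = vocab))
  log((as.numeric(counts) + 1) / (length(words) + length(vocab)))
})
rownames(log_likelihood) <- vocab

# Score each class as log prior plus the summed log likelihoods of the
# review's known words, then return the highest-scoring class.
predict_review <- function(text) {
  tokens <- tokenize(text)
  tokens <- tokens[tokens %in% vocab]  # ignore words never seen in training
  scores <- log_prior + colSums(log_likelihood[tokens, , drop = FALSE])
  names(which.max(scores))
}

predict_review("a great and wonderful story")  # "pos"
predict_review("terrible boring plot")         # "neg"
```

Working in log space avoids numeric underflow once reviews run to hundreds of words, which is why the scores are sums of logs rather than products of probabilities.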
