Naive Bayes In Python

Kislay Keshari explains the Naive Bayes algorithm and shows an implementation in Python:

Naive Bayes in the Industry

Now that you have an idea of what exactly Naive Bayes is and how it works, let’s see where it is used in the industry.

RSS Feeds

Our first industrial use case is News Categorization, or we can use the term ‘text classification’ to broaden the spectrum of this algorithm. News on the web is rapidly growing where each news site has its own different layout and categorization for grouping news. Companies use a web crawler to extract useful text from HTML pages of news articles to construct a Full Text RSS. The contents of each news article is tokenized (categorized). In order to achieve better classification results, we remove the less significant words, i.e. stop, from the document. We apply the naive Bayes classifier for classification of news content based on news code.

It’s a good overview of the topic and a particular implementation in Python.  Naive Bayes is a technique which you want in the bag:  there are a lot of techniques which tend to be better in specific domains, but Naive Bayes is easy to implement and usually provides acceptable performance.

Related Posts

Comparing TensorFlow Versus PyTorch

Anirudh Rao compares PyTorch to TensorFlow: For small-scale server-side deployments both frameworks are easy to wrap in e.g. a Flask web server. For mobile and embedded deployments, TensorFlow works really well. This is more than what can be said of most other deep learning frameworks including PyTorch. Deploying to Android or iOS does require a non-trivial amount of work in TensorFlow. You don’t have to rewrite the entire inference portion of your model in Java or C++. […]

Read More

Data Science And Data Engineering In HDP 3.0

Saumitra Buragohain, et al, show off some of the things added to the Hortonworks Data Platform for data scientists and data engineers: We leverage the power of HDP 3.0 from efficient storage (erasure coding), GPU pooling to containerized TensorFlow and Zeppelin to enable this use case. We will the save the details for a different […]

Read More

Categories

August 2018
MTWTFSS
« Jul Sep »
 12345
6789101112
13141516171819
20212223242526
2728293031