Object Detection Methods In Convolutional Neural Networks

Xindian Long explains some of the techniques which convolutional neural networks use to discern objects in images:

Can we do object detection in a smart way by only looking at some of the windows? The answer is yes. There are two approaches to find this subset of windows, which lead to two different categories of object detection algorithms.

  1. The first algorithm category is to do region proposal first. This means regions highly likely to contain an object are selected either with traditional computer vision techniques (like selective search), or by using a deep learning-based region proposal network (RPN). Once you have gathered the small set of candidate windows, you can formulate a set number of regression models and classification models to solve the object detection problem. This category includes algorithms like Faster R-CNN[1], R-FCN[2] and FPN-FRCN[3]. Algorithms in this category are usually called two-stage methods. They are generally more accurate, but slower than the single-stage methods introduced below.
  2. The second algorithm category only looks for objects at fixed locations with fixed sizes. These locations and sizes are strategically selected so that most scenarios are covered. These algorithms usually separate the original images into fixed size grid regions. For each region, these algorithms try to predict a fixed number of objects of certain, pre-determined shapes and sizes. Algorithms belonging to this category are called single-stage methods. Examples of such methods include YOLO[4], SSD[5] and RetinaNet[6]. Algorithms in this category usually run faster but are less accurate. This type of algorithm is often utilized for applications requiring real-time detection.
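
To make the "subset of windows" idea concrete, here is a minimal sketch in plain Python. It compares how many windows an exhaustive sliding-window search would visit with how many candidate boxes a fixed grid produces. The image size (416×416), grid size (13×13), stride, and the three box shapes are illustrative assumptions rather than values from the quoted article, and the function names are hypothetical. Real single-stage detectors also predict class scores and box adjustments for each candidate, which this sketch leaves out.

```python
from itertools import product


def count_sliding_windows(img_w, img_h, win_sizes, stride):
    """Count the windows an exhaustive sliding-window search would visit."""
    total = 0
    for win_w, win_h in win_sizes:
        if win_w > img_w or win_h > img_h:
            continue  # this window shape does not fit in the image
        steps_x = (img_w - win_w) // stride + 1
        steps_y = (img_h - win_h) // stride + 1
        total += steps_x * steps_y
    return total


def grid_anchors(img_w, img_h, grid, shapes):
    """Return (cx, cy, w, h) boxes: one per grid cell per predefined shape."""
    cell_w = img_w / grid
    cell_h = img_h / grid
    boxes = []
    for row, col in product(range(grid), range(grid)):
        # Each candidate box is centered on its grid cell.
        cx = (col + 0.5) * cell_w
        cy = (row + 0.5) * cell_h
        for w, h in shapes:
            boxes.append((cx, cy, w, h))
    return boxes


# Illustrative values only (assumptions, not taken from the article).
shapes = [(64, 64), (128, 64), (64, 128)]
exhaustive = count_sliding_windows(416, 416, shapes, stride=4)
anchors = grid_anchors(416, 416, grid=13, shapes=shapes)
print("sliding windows:", exhaustive)
print("grid candidates:", len(anchors))
```

With these assumed numbers the fixed grid yields 13 × 13 × 3 = 507 candidates, a small fraction of what the exhaustive search visits. That gap is the saving both algorithm categories are after, whether the subset comes from a proposal step or from fixed locations.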

We’ll discuss two common object detection methods below in more detail.

The linked post is a high-level explanation with no code, but it does a good job of describing what is going on at that level.
