When Image Classifiers Look At Unknown Objects

Pete Warden explains that image classifiers aren’t magic:

As people, we’re used to being able to classify anything we see in the world around us, and we naturally expect machines to have the same ability. Most models are only trained to recognize a very limited set of objects though, such as the 1,000 categories of the original ImageNet competition. Crucially, the training process makes the assumption that every example the model sees is one of those objects, and the prediction must be within that set. There’s no option for the model to say “I don’t know”, and there’s no training data to help it learn that response. This is a simplification that makes sense within a research setting, but causes problems when we try to use the resulting models in the real world.

Back when I was at Jetpac, we had a lot of trouble convincing people that the ground-breaking AlexNet model was a big leap forward because every time we handed over a demo phone running the network, they would point it at their faces and it would predict something like “Oxygen mask” or “Seat belt”. This was because the ImageNet competition categories didn’t include any labels for people, but most of the photos with mask and seatbelt labels included faces along with the objects. Another embarrassing mistake came when they would point it at a plate and it would predict “Toilet seat”! This was because there were no plates in the original categories, and the closest white circular object in appearance was a toilet.

Read the whole thing.

Related Posts

Key Concepts of Convolutional Neural Networks

Srinija Sirobhushanam takes us through some of the key concepts around convolutional neural networks: How are convolution layer operations useful?CNN helps us look for specific localized image features like the edges in the image that we can use later in the network Initial layers to detect simple patterns, such as horizontal and vertical edges in […]

Read More

LSTM in Databricks

Vedant Jain shows us an example of solving a multivariate time series forecasting problem using LSTM networks: LSTM is a type of Recurrent Neural Network (RNN) that allows the network to retain long-term dependencies at a given time from many timesteps before. RNNs were designed to that effect using a simple feedback approach for neurons where the […]

Read More


July 2018
« Jun Aug »