Reducing Dimensionality

Antoine Guillot explains some of the basic concepts of variable reduction in data analysis:

Each of these people can be represented as points in a 3-dimensional space. As a gross approximation, each person fits in a 50*50*200 (cm) cube. If we use a resolution of 1 cm and three color channels, then each person can be represented by on the order of 1,000,000 variables.
On the other hand, the shadow is only in 2 dimensions and in black and white, so each shadow only needs 50*200=10,000 variables.
The number of variables was divided by roughly 100! And if your goal is to detect human vs cat, or even men vs women, the data from the shadow may be enough.
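
To make the arithmetic in the quote concrete, here is a minimal Python sketch of the shadow idea. The function names and the choice to project by dropping the depth coordinate are my own assumptions for illustration, not something from Antoine's post. Taking the cube dimensions literally gives 1,500,000 values for the full representation, so the exact ratio comes out to 150 rather than the quote's round figure of 100.

```python
# Cube dimensions in cm, taken from the quoted example; 1 cm resolution assumed.
WIDTH, DEPTH, HEIGHT = 50, 50, 200
CHANNELS = 3  # three color channels, e.g. red, green, blue


def full_variable_count(width=WIDTH, depth=DEPTH, height=HEIGHT, channels=CHANNELS):
    """Number of values in a color voxel grid: one per voxel per channel."""
    return width * depth * height * channels


def shadow_variable_count(width=WIDTH, height=HEIGHT):
    """Number of values in a black-and-white shadow: one per pixel."""
    return width * height


def project_shadow(occupied_voxels):
    """Project occupied (x, y, z) voxels onto the x-z plane by dropping depth y.

    Hypothetical helper: this is one simple way to model a shadow, assuming
    the light shines straight along the depth axis.
    """
    return {(x, z) for (x, y, z) in occupied_voxels}


full = full_variable_count()
shadow = shadow_variable_count()
print(full, shadow, full // shadow)
```

For example, `project_shadow({(1, 2, 3), (1, 5, 3), (2, 2, 3)})` collapses the two voxels that differ only in depth into a single shadow pixel, which is exactly the information the projection throws away.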

Read on for intuitive discussions of techniques like principal component analysis and linear discriminant analysis. H/T R-Bloggers

