Reducing Dimensionality

Antoine Guillot explains some of the basic concepts of variable reduction in data analysis:

Each of these people can be represented as a point in a 3-dimensional space. As a rough approximation, each person fits in a 50*50*200 (cm) box. If we use a resolution of 1 cm and three color channels, then each person can be represented by 50*50*200*3 = 1,500,000 variables.
On the other hand, the shadow is only two-dimensional and in black and white, so each shadow needs only 50*200 = 10,000 variables.
The number of variables has been divided by 150! And if your goal is to tell humans from cats, or even men from women, the data from the shadow may be enough.

Read on for intuitive discussions of techniques like principal component analysis and linear discriminant analysis.  H/T R-Bloggers
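For a concrete, minimal sketch of what principal component analysis does, the snippet below uses base R's prcomp() on the built-in iris data set; the data set, the scaling choices, and the decision to keep two components are assumptions for illustration, not anything taken from Guillot's post.

# A minimal PCA sketch (assumption: the built-in iris data stands in for any
# table of numeric measurements). Center and scale the four measurement
# columns, then compute the principal components.
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# How much variance each component captures.
summary(pca)

# Keep only the first two components: four variables become two,
# much like replacing a 3-D body with its 2-D shadow.
reduced <- pca$x[, 1:2]
head(reduced)

If most of the variance sits in those first components, downstream models can work with the reduced columns instead of the originals.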
