Identifying Distributions with kNN in R

Abhijit Telang has an interesting post on identifying arbitrary distributions with the k-nearest-neighbor algorithm in R:

You can easily see how arbitrary shapes can be almost magically discovered through the principle of nearest-neighbor search.

The magic happens because the methodical approach of meeting and greeting the neighbors discovers more and more neighbors inside the shape (so the visualization becomes denser and denser there), and fewer and fewer as the traversal approaches the contours of those very shapes (so it becomes sparser and sparser). The sparseness around the dense shapes provides the much-needed contrast to discover hidden shapes.
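
To make the density-contrast idea concrete, here is a minimal sketch in base R. It is not the code from the linked post: the toy data (a noisy ring hidden in uniform background noise), the choice of `k`, and the median cutoff are all assumptions made for illustration. It computes each point's distance to its k-th nearest neighbor by brute force and treats a small distance as a sign of a dense neighborhood.

```r
# Hypothetical toy data: a noisy ring hidden in uniform background noise.
set.seed(42)
n_ring  <- 300
n_noise <- 300
theta <- runif(n_ring, 0, 2 * pi)
ring  <- cbind(x = cos(theta), y = sin(theta)) +
  matrix(rnorm(2 * n_ring, sd = 0.05), ncol = 2)
noise <- cbind(x = runif(n_noise, -2, 2), y = runif(n_noise, -2, 2))
pts     <- rbind(ring, noise)
is_ring <- rep(c(TRUE, FALSE), c(n_ring, n_noise))

# Brute-force distance from each point to its k-th nearest neighbor.
# k = 10 is an arbitrary choice for this sketch.
k <- 10
d <- as.matrix(dist(pts))
# After sorting a row, position 1 is the point itself (distance 0),
# so the k-th neighbor sits at position k + 1.
kth_dist <- apply(d, 1, function(row) sort(row)[k + 1])

# A small k-th neighbor distance means a dense neighborhood.
# The median is used here only as a simple, assumed cutoff.
dense <- kth_dist < median(kth_dist)

# Dense points in black, sparse points in light grey.
plot(pts, col = ifelse(dense, "black", "grey80"), pch = 16, asp = 1,
     main = "Points with a small k-th neighbor distance")
```

In this setup the ring points should tend to have much smaller k-th neighbor distances than the background points, which is the contrast the quote describes. For larger data sets, a dedicated nearest-neighbor routine would be a better fit than the full distance matrix used here.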

Read on for a very interesting explanation.