Identifying Distributions with knn in R

Abhijit Telang has an interesting post on identifying arbitrary distributions with the k-nearest-neighbor algorithm in R:

You can easily see how arbitrary the shapes can be almost magically discovered, through the principle of the nearest neighbor search.

The magic happens because the methodical approach of meeting and greeting the neighbors discovers more and more neighbors (and hence the visualization becomes denser and denser) as per the formation of the shape, and on the other hand, sparser and sparser as the traversal approaches the contours of those very shapes. The sparseness around the dense shapes provides the much-needed contrast to discover hidden shapes.

Read on for the full explanation.
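
To make the dense-versus-sparse idea in the excerpt concrete, here is a minimal sketch in base R. It is not the code from Abhijit's post. The data set (a noisy ring hidden in uniform background noise), the choice of `k = 5`, and all variable names are assumptions made for illustration. It computes each point's distance to its k-th nearest neighbor and treats a small distance as a sign of a dense region.

```r
# Hypothetical example data: a noisy ring hidden in uniform background noise.
set.seed(42)
n_ring  <- 300
n_noise <- 100
theta <- runif(n_ring, 0, 2 * pi)
ring  <- cbind(x = cos(theta) + rnorm(n_ring, sd = 0.05),
               y = sin(theta) + rnorm(n_ring, sd = 0.05))
noise <- cbind(x = runif(n_noise, -2, 2),
               y = runif(n_noise, -2, 2))
pts     <- rbind(ring, noise)
is_ring <- rep(c(TRUE, FALSE), c(n_ring, n_noise))

# Distance from every point to its k-th nearest neighbor.
# k = 5 is an arbitrary choice for this sketch.
k <- 5
d <- as.matrix(dist(pts))  # full pairwise Euclidean distance matrix
# After sorting a row, position 1 is the point itself (distance 0),
# so the k-th neighbor sits at position k + 1.
knn_dist <- apply(d, 1, function(row) sort(row)[k + 1])

# Small k-th-neighbor distance means a dense neighborhood; large means sparse.
dense <- knn_dist < median(knn_dist)

# Dense points trace out the hidden shape; sparse points fade into the background.
plot(pts, col = ifelse(dense, "steelblue", "grey80"), pch = 19, asp = 1)
```

The full distance matrix keeps the sketch dependency-free but grows quadratically with the number of points, so a dedicated nearest-neighbor package would be the more practical choice for larger data sets.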
