During data analysis, one of the most crucial steps is to identify and account for outliers: observations that differ in nature from most of the others. Their presence can lead to untrustworthy conclusions. The hardest part of this task is defining what counts as an “outlier”. Once that is settled, identifying them in the given data is straightforward.
After reading this post you will know:
The most basic outlier detection techniques.
A way to implement them using
A way to combine their results to obtain a new outlier detection method.
A way to discover a notion of “diamond quality” without prior knowledge of the topic (a happy consequence of the previous point).
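To make the idea of combining basic techniques concrete, here is a minimal Python sketch. It is not the linked post's implementation. It assumes two common rules (a z-score rule and Tukey's interquartile-range fences) and assumes that "combining" means flagging a point when enough rules agree. The function names, the thresholds 3.0 and 1.5, and the voting scheme are all illustrative choices.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Flag values whose absolute z-score exceeds `threshold` (assumed default: 3)."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        # All values are identical, so nothing can be an outlier.
        return [False] * len(values)
    return [abs(v - mean) / sd > threshold for v in values]


def tukey_outliers(values, k=1.5):
    """Flag values outside Tukey's fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def combined_outliers(values, min_votes=2):
    """Hypothetical combination: flag a value when at least `min_votes` rules flag it."""
    flags = [zscore_outliers(values), tukey_outliers(values)]
    return [sum(votes) >= min_votes for votes in zip(*flags)]


if __name__ == "__main__":
    data = [9, 10, 11, 10] * 5 + [50]
    flagged = [v for v, is_out in zip(data, combined_outliers(data)) if is_out]
    print(flagged)
```

Requiring agreement between rules makes the combined method more conservative than either rule alone. Lowering `min_votes` to 1 would instead flag anything that any single rule considers unusual.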
Read the whole thing. H/T R-Bloggers