Bias Correction In Standard Deviation Estimates

John Mount explains how to perform bias correction and explains why it happens so rarely in practice:

The bias in question is falling off at a rate of 1/n (where n is our sample size). So the bias issue loses what little gravity it ever may have ever had when working with big data. Most sources of noise will be falling off at a slower rate of 1/sqrt(n), so it is unlikely this bias is going to be the worst feature of your sample.

But let’s pretend the sample size correction indeed is an important point for a while.

Under the “no bias allowed” rubric: if it is so vitally important to bias-correct the variance estimate, would it not be equally critical to correct the standard deviation estimate?

The practical answer seems to be: no. The straightforward standard deviation estimate itself is biased (it has to be, as a consequence of Jensen’s inequality). And pretty much nobody cares, corrects it, or teaches how to correct it, as it just isn’t worth the trouble.

This is a good explanation of the topic as well as the reason people make these corrections so rarely.

Related Posts

Variable Screening With vtreat

John Mount explains how you can use vtreat for determining variable importance: Part of the vtreat philosophy is to assume after the vtreat variable processing the next step is a sophisticated supervised machine learningmethod. Under this assumption we assume the machine learning methodology (be it regression, tree methods, random forests, boosting, or neural nets) will handle issues of redundant variables, joint […]

Read More

Reviewing Word Associations With R

Julia Silge does some exploratory analysis on the Small World of Words project: The Small World of Words project focuses on word associations. You can try it out for yourself to see how it works, but the general idea is that the participant is presented with a word (from “telephone” to “journalist” to “yoga”) and is then […]

Read More


November 2018
« Oct Dec »