Press "Enter" to skip to content

Why Does Empirical Variance Use n-1 Instead Of n?

Sebastian Sauer gives us a simulation showing why we use n-1 instead of n as the denominator when calculating the variance of a sample:

Our results show that the variance of the sample is smaller than the empirical variance; however even the empirical variance too is a little too small compared with the population variance (which is 1). Note that sample size was n=10 in each draw of the simulation. With sample size increasing, both should get closer to the “real” (population) sample size (although the bias is negligible for the empirical variance). Let’s check that.

This is an R-heavy post and does a great job of showing that it’s necessary, and ends with  recommended reading if you want to understand the why.