Standard Deviation Estimation

Kevin Feasel



Dan Goldstein gives a rule of thumb for estimating the standard deviation of a set of numbers and looks at how well it holds up across distributions:

Say you’ve got 30 numbers and a strong urge to estimate their standard deviation. But you’ve left your computer at home. Unless you’re really good at mentally squaring and summing, it’s pretty hard to compute a standard deviation in your head. But there’s a heuristic you can use:

Subtract the smallest number from the largest number and divide by four

Let’s call it the “range over four” heuristic. You could, and probably should, be skeptical. You could want to see how accurate the heuristic is. And you could want to see how the heuristic’s accuracy depends on the distribution of numbers you are dealing with.

Sometimes you just don’t have STDEV() available.
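
For anyone who wants to act on that skepticism, here is a minimal Python sketch of the "range over four" heuristic and one way to check it against the sample standard deviation. The function names, the choice of three distributions, the trial count, and the seed are all assumptions made for illustration; this is not Goldstein's own analysis.

```python
import random
import statistics


def range_over_four(values):
    """Estimate the standard deviation as (max - min) / 4."""
    return (max(values) - min(values)) / 4


def compare(sampler, n=30, trials=2000, seed=0):
    """Mean ratio of the heuristic estimate to the sample standard deviation.

    A ratio near 1 means the heuristic tracks the sample standard deviation
    closely for that distribution; above 1 it overestimates, below 1 it
    underestimates.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        sample = [sampler(rng) for _ in range(n)]
        ratios.append(range_over_four(sample) / statistics.stdev(sample))
    return statistics.mean(ratios)


# Illustrative distributions only; swap in whatever shape your data has.
samplers = {
    "normal": lambda rng: rng.gauss(0, 1),
    "uniform": lambda rng: rng.uniform(0, 1),
    "exponential": lambda rng: rng.expovariate(1.0),
}

for name, sampler in samplers.items():
    print(f"{name}: mean(heuristic / stdev) = {compare(sampler):.2f}")
```

Changing `n` shows how much the heuristic depends on sample size as well as on the distribution, since the range of a sample tends to grow as more numbers are drawn.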



June 2016