Avoiding Statistical Mistakes

Adrian Sampson explains some common mistakes in statistical analysis, particularly in computer science papers:

It’s tempting to think, when p \ge \alphapα, that you’ve found the opposite thing from the p < \alphap<αcase: that you get to conclude that there is no statistically significant difference between the two averages. Don’t do that!

Simple statistical tests like the tt-test only tell you when averages are different; they can’t tell you when they’re the same. When they fail to find a difference, there are two possible explanations: either there is no difference or you haven’t collected enough data yet. So when a test fails, it could be your fault: if you had run a slightly larger experiment with a slightly larger NN, the test might have successfully found the difference. It’s always wrong to conclude that the difference does not exist.

It’s an interesting read.  H/T Emmanuelle Rieuf.

Related Posts

Learning About Big Data Clusters

Kevin Chant shares resources for getting started with SQL Server Big Data Clusters: In a previous post I shared current SQL Server 2019 learning resources, which you can view in detail here. However, SQL Server 2019 Big Data Clusters are very involved. So, I thought I better dedicate a whole post to further learning resources for […]

Read More

What Makes for Good Coding Style

Brent Yorgey spends some time thinking about good coding style: What is good code style? You probably have some opinions about this. In fact, I’m willing to bet you might even have some very strong opinions about this; I know I do. Whether consciously or not, we tend to frame good coding practices as a moral issue. Following good coding […]

Read More


December 2016
« Nov Jan »