John Mount and Nina Zumel explain what p-values are and how people routinely misuse them:

The many things I happen to have issues with in common mis-use of `p`-values include:

- **`p`-hacking.** This includes censored data bias, repeated measurement bias, and even outright fraud.
- **“Statsmanship” (the deliberate use of statistical terminology for obscurity, not for clarity).** For example: saying `p` instead of saying what you are testing such as “significance of a null hypothesis”.
- **Logical fallacies.** This is the (false) claim that `p` being low implies that the probability that your model is good is high. At best a low-`p` eliminates a null hypothesis (or even a family of them). But saying such disproof “proves something” is just saying “the butler did it” because you find the cook innocent (a simple case of a fallacy of an excluded middle).
- **Confusion of population and individual statistics.** This is the use of *deviation of sample means* (which typically decreases as sample size goes up) when *deviation of individual differences* (which typically does not decrease as sample size goes up) is what is appropriate. This is one of the biggest scams in data science and marketing science: showing that you are good at predicting aggregate (say, the mean number of traffic deaths in the next week in a large city) and claiming this means your model is good at predicting per-individual risk. Some of this comes from the usual statistical word games: saying “standard error” (instead of “standard error of the mean or population”) and “standard deviation” (“instead of standard deviation of individual cases”); with some luck somebody won’t remember which is which and be too afraid to ask.
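The point about population versus individual statistics is easy to see in a quick simulation. The sketch below is not from the quoted article; it assumes a made-up population of individual outcomes drawn from a normal distribution with mean 10 and standard deviation 3, and the function name `spread_summary` is hypothetical. It compares the standard deviation of individual cases with the standard error of the mean at two sample sizes.

```python
import random
import statistics


def spread_summary(n, seed=0):
    """Return (sd of individual cases, standard error of the mean) for n draws."""
    rng = random.Random(seed)
    # Assumed population for illustration only: Normal(mean=10, sd=3).
    sample = [rng.gauss(10.0, 3.0) for _ in range(n)]
    sd = statistics.stdev(sample)  # spread of individual cases
    se = sd / n ** 0.5             # spread of the sample mean
    return sd, se


for n in (100, 10_000):
    sd, se = spread_summary(n)
    print(f"n={n:>6}  sd of individuals={sd:.2f}  standard error of mean={se:.3f}")
```

With more data the standard error of the mean shrinks roughly like one over the square root of the sample size, while the standard deviation of individuals stays near 3. A model can therefore look very precise about the aggregate and still say little about any one individual.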

Even if you know what p-values are, this is definitely worth reading, as it’s so easy to misuse p-values (even when I’m not on my Bayesian post hurling tomatoes at frequentists).
