Be Careful Of P-Hacking

Vincent Granville discusses the problem of p-hacking:

I read an article this morning, about a top Cornell food researcher having 13 studies retracted, see here. It prompted me to write this blog. It is about data science charlatans and unethical researchers in academia, destroying the value of p-values again, using a well-known trick called p-hacking, to get published in top journals and get grant money or tenure. The issue is widespread, not just in academic circles, and makes people question the validity of scientific methods. It fuels the fake “theories” of those who have lost faith in science.

The trick consists of repeating an experiment sufficiently many times, until the conclusions fit with your agenda. Or it involves cherry-picking the data you use, or even discarding observations deemed to have a negative impact on conclusions. Sometimes, causation and correlation are mixed up on purpose, or misleading charts are displayed. Sometimes, the author lacks statistical acumen.

Usually, these experiments are not reproducible. Even top journals sometimes accept these articles, due to

  • Poor peer-review process

  • Incentives to publish sensational material
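
To see why repeating an experiment until it "works" is so damaging, here is a minimal simulation sketch in Python. It is not from Granville's article; the function names, the sample size of 30, the 0.05 threshold, and the use of a simple z-test with a known standard deviation are all assumptions chosen for illustration. Every simulated study draws pure noise, so any "significant" result is a false positive.

```python
import math
import random


def z_test_p_value(sample, sigma=1.0):
    """Two-sided p-value for H0: mean == 0, assuming a known std dev sigma."""
    n = len(sample)
    z = (sum(sample) / n) / (sigma / math.sqrt(n))
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def run_experiment(rng, n=30):
    """One hypothetical study where the null is true: noise only, no effect."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def false_positive_rate(max_attempts, n_researchers=2000, alpha=0.05, seed=0):
    """Share of simulated researchers who see p < alpha within max_attempts tries."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_researchers):
        for _ in range(max_attempts):
            if z_test_p_value(run_experiment(rng)) < alpha:
                hits += 1  # stop at the first "publishable" result
                break
    return hits / n_researchers


if __name__ == "__main__":
    for attempts in (1, 5, 20):
        print(attempts, round(false_positive_rate(attempts), 3))
```

With independent attempts, the chance of at least one false positive is roughly 1 - 0.95^k for k tries, so the simulated rate should sit near 5% for a single attempt and climb well past 50% by twenty attempts, even though there is no effect to find.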

Wansink is a charlatan.  But beyond p-hacking is Andrew Gelman and Eric Loken’s Garden of Forking Paths.  Gelman’s blog, incidentally (example), is where I originally learned about Wansink’s shady behaviors.  Gelman also warns us not to focus on the procedural, but instead on a deeper problem.

