# The Bayesian Trap

2017-05-08

If you get a blood test to diagnose a rare disease, and the test (which is very accurate) comes back positive, what’s the chance you have the disease? Well if “rare” means only 1 in a thousand people have the disease, and “very accurate” means the test returns the correct result 99% of the time, the answer is … just 9%. There’s less than a 1 in 10 chance you actually have the disease (which is why doctor will likely have you tested a second time).

Now that result might seem surprising, but it makes sense if you apply Bayes Theorem. (A simple way to think of it is that in a population of 1000 people, 10 people will have a positive test result, plus the one who actually has the disease. One in eleven of the positive results, or 9%, actually detect a true disease.)

This goes to sensitivity/recall (in the medical field, they call it sensitivity; in the documents world and in the Microsoft ML space, they call it recall):  True positives / (True positives + False negatives).  Supposing a million people, 1000 will have the disease.  Of those 1000, we expect the test to find 990 (99%).  Of the 999,000 people who don’t have the disease, we expect the test to produce 9990 false negatives (1%).  990 / (990 + 9990) = 9%.

## Multi-Shot Games

2017-05-26

Dan Goldstein explains a counter-intuitive probability exercise: Peter Ayton is giving a talk today at the London Judgement and Decision Making Seminar Imagine being obliged to play Russian roulette – twice (if you are lucky enough to survive the first game). Each time you must spin the chambers of a six-chambered revolver before pulling the trigger. […]

## Visualizing Emergency Room Visits

2017-05-26

Eugene Joh has a great blog post showing how to parse ICD-9 codes using regular expressions and then visualize the results as a treemap: It looks like there is a header/title at [1], numeric grouping  at [2] “1.\tINFECTIOUS AND PARASITIC DISEASES”,  subgrouping by ICD-9 code ranges, at [3] “Intestinal infectious diseases (001-009)” and then 3-digit ICD-9 […]