It’s one of those neat ideas that one should know to be able to reason competently about the world we live in. Thanks to the Indian education system, I had not been exposed to any hint of that fascinating rule in my 18 years of attending school in India. Anyway, better late than never.
The late great John McCarthy insisted that “he who refuses to do arithmetic is doomed to talk nonsense.” I think he who is ignorant of Bayes’ rule cannot avoid talking nonsense about probabilistic events. Consider the question posed in the image above.
The hypothetical diagnostic test at hand is not perfect but it appears to be fairly accurate since the error rate is only 5% false positives. (There is no mention of false negatives in the formulation above. So let’s assume that it has zero false negatives.) The question then is “what is the chance that a person who tested positive actually has the disease?”
We are not told if the person has symptoms of the disease or not. We are told that the disease is quite uncommon — only 1 out of 1000 people in the population actually has the disease.
Given that the person tested positive, is it 95% probable that he has the disease? Many would say yes. Bayes would slap them silly and say no. The correct answer is around 2%. Even if the test comes out positive, the actual chances that the person has the disease is very small. If you are the betting kind, bet that he’s disease free.
Bayes’ rule appears a little intimidating when written out as a formula with probabilities. So I will not go into it. Look up the wiki page if you wish. Here I will give the logic of the result using simple arithmetic.
Imagine that the population is only 1000, and the test is administered to everyone. The test will come back positive for 50 people — since the test has a 5% false positive rate. But we know that only 1 out of those 1000 people actually have the disease. That means of the 50 people who tested positive, only 1 person has the disease and the 49 others who also tested positive don’t have the disease. Therefore there is only around a 2% chance that a person who tested positive actually has the disease.
Without the test, we’d estimate the chances of a person having the disease to be 0.01 percent. Doing the test bumps up that probability 20 times — to 2 percent. That is a huge increase but still not all that compelling because of the low prevalence of the disease in the population, combined with a pretty flawed test. A positive test in this case is only an indicator that more investigation is warranted.
It’s worth noting that a test with a 5% false positives would be quite good to have if the actual prevalence of the disease is very high, say, 60 percent. I leave it to you to do the arithmetic.
If you wish to learn Bayes’ theorem, Steven Pinker has a lecture on it (the source of the question I discussed) in his “Rationality” series of lectures that I mentioned in a previous post. Check it out for a link. I highly recommend the entire course. You may become more rational, and who knows, you may become a Bayesian.