Today I came across an interesting mathematical puzzle on probability and wanted to share it here. This could be a long read, but I hope you find it interesting 🙂
So here’s the case: A certain disease “A” is known to affect 1 out of 1000 people, and it is quite difficult to detect reliably. A pharmaceutical company has devised a new detection test “B” and claims that it is 99% accurate (i.e. if a person is tested, it will give the correct diagnosis in 99% of cases, whether or not the person actually has the disease). A person “C” is tested using test B, and the test indicates that he is suffering from disease A.
What is the probability that the person C is actually suffering from the disease A?
Understanding the crux:
The first reaction of someone encountering this statement for the first time (even doctors apparently!) is likely to be “Of course 99%… The test is after all 99% reliable…” Well, it isn’t quite as cut and dried as all that.
I’ve made a probability tree to simplify things a little bit and tried representing cases with +ve tests in red coloured boxes.
As you can see in the diagram, there are 2 cases: the person either does suffer from the disease or does not! The probability that the person has the disease is 0.001, since 1 in 1000 people are said to be affected, and the probability that he doesn’t have the disease is 0.999, since 999 out of 1000 people are not affected. Each of these cases then branches into 2 sub cases, because test B may throw either a positive reading or a negative reading for the person.
In the first sub case, if the person is actually suffering from disease A, the probability of the disease being rightly detected by test “B” is 0.99 (remember, the accuracy is 99% as claimed by the maker?), and there’s of course a 0.01 probability that the test throws up a -ve reading (since the test is wrong 1% of the time). The boxes in the last column represent the effective probability of each outcome, which is obtained by multiplying the probabilities along each branch. For example, the probability that the person has the disease and tests positive is 0.001 x 0.99 = 0.00099.
Likewise, the effective probabilities for the other sub cases have been listed in the last column (refer image). In particular, the probability that the person does not have the disease but still tests positive is 0.999 x 0.01 = 0.00999.
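For anyone who would rather check the tree with a few lines of code than by hand, here is a minimal Python sketch. It assumes, as the puzzle does, that “99% accurate” applies equally to people with and without the disease; the variable names are my own and purely illustrative.

```python
# Probabilities taken from the puzzle statement.
p_disease = 0.001   # 1 in 1000 people have disease A
p_correct = 0.99    # assumption: test B is right 99% of the time for everyone

# Each leaf of the tree is the product of the probabilities along its branch.
branches = {
    ("has disease", "tests positive"): p_disease * p_correct,
    ("has disease", "tests negative"): p_disease * (1 - p_correct),
    ("no disease", "tests positive"): (1 - p_disease) * (1 - p_correct),
    ("no disease", "tests negative"): (1 - p_disease) * p_correct,
}

for (status, result), probability in branches.items():
    print(f"{status:12s} {result:15s} {probability:.5f}")
```

The four values should add up to 1, which is a handy sanity check that no branch of the tree has been missed.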
Coming back to our original question, we can now find the probability that person C actually suffers from disease A, given that he tested positive. Consider both the red coloured boxes in the last column (the positive test sub cases):
Probability of having the disease and testing positive / Total probability of testing positive =
0.00099/(0.00099+0.00999) = 0.00099/0.01098 = 0.0902 ≈ 9%
Or in other words, there is less than a 10% chance that the person is actually suffering from the disease even when tested positive!
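The same calculation can be written as a small Python function, which is really just Bayes’ theorem in disguise. This is only a sketch: the function name and its two parameters are made up for illustration, and it assumes a single accuracy figure that holds for both sick and healthy people.

```python
def probability_sick_given_positive(prevalence, accuracy):
    """Chance that a person who tested positive really has the disease."""
    # Sick and correctly flagged by the test.
    true_positive = prevalence * accuracy
    # Healthy but wrongly flagged by the test.
    false_positive = (1 - prevalence) * (1 - accuracy)
    # Share of all positive results that are genuine.
    return true_positive / (true_positive + false_positive)


result = probability_sick_given_positive(prevalence=0.001, accuracy=0.99)
print(f"{result:.4f}")  # 0.0902
```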
So how did this happen? Although the test is 99% accurate, that accuracy applies to those who aren’t suffering from the disease as well, i.e. 1% of the people who do not have the disease will still test positive.
And since the number of folks not suffering from the disease is very high (999 out of every 1000 persons), even 1% of that is very significant, and the number of such “false positives” will outweigh the “true positives” in this case by a factor of about 10 (0.00999 against 0.00099).
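It can help to think in head counts rather than probabilities. Here is a rough Python sketch that does so for an imaginary town of 100,000 people; the population size is my own pick and not part of the puzzle, chosen only so that every count comes out as a whole number.

```python
population = 100_000  # hypothetical town size, chosen for round numbers

sick = population // 1000        # 1 in 1000 have disease A
healthy = population - sick

# Assumption: the test is right 99% of the time for sick and healthy alike.
true_positives = sick * 99 // 100      # sick people correctly flagged
false_positives = healthy * 1 // 100   # healthy people wrongly flagged

print(true_positives, false_positives)              # 99 999
print(round(false_positives / true_positives, 2))   # 10.09
```

So out of 1098 people who test positive in this town, only 99 are actually sick, which is the same 9% as before.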
I hope this post was helpful. Whew, let’s take a break now!