Applied statistics/Tutorials: Difference between revisions
imported>Nick Gardner |
Revision as of 14:24, 11 January 2010
Rules of chance
The addition rule
For two mutually exclusive events, A and B,
the probability that either A or B will occur is equal to the probability that A will occur plus the probability that B will occur,
- P(A or B) = P(A) + P(B).
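As a small illustration, the addition rule can be checked by counting outcomes. The sketch below assumes one roll of a fair six-sided die, an example that is not part of the tutorial; the helper name `prob` is made up for the purpose.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
outcomes = set(range(1, 7))

def prob(event):
    """Probability of an event, given as a set of die faces."""
    return Fraction(len(event & outcomes), len(outcomes))

a = {1}       # event A: rolling a 1
b = {2, 3}    # event B: rolling a 2 or a 3 (cannot occur together with A)

# Probability of "A or B", counted directly from the combined event.
p_a_or_b = prob(a | b)

# The addition rule: 1/2 == 1/6 + 1/3
assert p_a_or_b == prob(a) + prob(b)
```

The check relies on A and B having no outcome in common; for overlapping events the plain sum would count the shared outcomes twice.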
The multiplication rule
For two independent (unrelated) events, A and B,
the probability that A and B will both occur is equal to the probability that A will occur multiplied by the probability that B will occur,
- P(A and B) = P(A) x P(B)
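The multiplication rule can be checked in the same way by listing every combined outcome. This sketch assumes a fair coin tossed together with a fair six-sided die (an illustrative choice, not taken from the tutorial); `prob` is again a made-up helper.

```python
from fractions import Fraction
from itertools import product

# Assumed example: toss a fair coin and roll a fair six-sided die.
sample_space = list(product("HT", range(1, 7)))   # 12 equally likely pairs

def prob(predicate):
    """Probability that a (coin, die) outcome satisfies the predicate."""
    hits = [o for o in sample_space if predicate(o)]
    return Fraction(len(hits), len(sample_space))

p_a = prob(lambda o: o[0] == "H")                       # A: coin shows heads
p_b = prob(lambda o: o[1] == 6)                         # B: die shows a six
p_a_and_b = prob(lambda o: o[0] == "H" and o[1] == 6)   # both together

# The multiplication rule: 1/12 == 1/2 * 1/6
assert p_a_and_b == p_a * p_b
```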
Bayes' theorem
The probability that event A will occur, given that event B has occurred, is equal to the probability that B will occur, given that A has occurred, multiplied by the probability that A will occur and divided by the probability that B will occur,
- P(A/B) = P(B/A) x P(A)/P(B).
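A minimal sketch of the formula as a function may help. The function name `bayes` and the playing-card example are assumptions introduced here for illustration only.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A/B) = P(B/A) x P(A) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be greater than zero")
    return p_b_given_a * p_a / p_b

# Assumed example: one card drawn from a standard 52-card deck.
# A: the card is a king; B: the card is a face card (jack, queen or king).
p_king_given_face = bayes(
    p_b_given_a=Fraction(1),   # every king is a face card
    p_a=Fraction(4, 52),       # 4 kings in the deck
    p_b=Fraction(12, 52),      # 12 face cards in the deck
)
```

Here the result is 1/3, which agrees with counting directly: 4 of the 12 face cards are kings.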
Common fallacies
The double birthday fallacy
The fallacy
That it is very unlikely that 2 people in a group of 24 have the same birthday.
The truth
That there is a better than 50 percent probability that 2 people in any group of 23 or more will have the same birthday.
Proof
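One way to check the figure numerically is to work out the probability that all the birthdays in the group are different and subtract it from 1. The sketch below does this with the multiplication rule, on the assumptions (not stated in the tutorial) that there are 365 equally likely birthdays, that leap years are ignored and that birthdays are independent. The function name `p_shared_birthday` is made up for the example.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes `days` equally likely, independent birthdays.
    """
    p_all_different = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different

p_22 = p_shared_birthday(22)
p_23 = p_shared_birthday(23)

# Under these assumptions 23 is the smallest group for which
# the probability exceeds one half.
assert p_22 < 0.5 < p_23
```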
The false positive fallacy
The fallacy
Students at the Harvard Medical School estimated that if a test of a disease that has a prevalence rate of 1 in 1000 has a false positive rate of 5%, there is a 95 per cent probability that a person who has been given a positive result actually has the disease.
The truth
The true probability is about 2 per cent.
Proof
Let A denote the event of having the disease and B the event of having been tested positive (for the purpose of applying Bayes' theorem).
Then P(B/A), which is the probability of having been tested positive when having the disease, can be taken as equal to 1;
And P(A) is the probability of having the disease, which with a prevalence of 1 in 1000 must be equal to 1/1000;
And P(B) is the probability of being tested positive, which can be arrived at by 3 steps:
Step 1 is to observe that since the prevalence of the disease is 1 in 1000, 999 persons out of every 1000 are healthy.
Step 2 is to recall that for each healthy person the probability of being tested positive is 5% or 1 in 20.
Step 3 is to apply the multiplication rule and get the answer, neglecting the 1 person in 1000 who has the disease and tests positive:
- P(B) = 999/1000 multiplied by 1/20 or, near enough, 1/20.
So, applying Bayes' theorem, the probability of having the disease, given that you have been tested positive, is given by
- P(A/B) = P(B/A) x P(A)/P(B), or:
- = 1 x (1/1000)/(1/20), which is 0.02, or 2%.
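The arithmetic can be reproduced with exact fractions. The sketch below uses the figures from the example (prevalence 1 in 1000, false positive rate 1 in 20, P(B/A) taken as 1) and also computes P(B) without rounding, by adding the true positives to the false positives; the variable names are illustrative.

```python
from fractions import Fraction

prevalence = Fraction(1, 1000)          # P(A)
false_positive_rate = Fraction(1, 20)   # P(B/not A)
sensitivity = Fraction(1)               # P(B/A), taken as 1 in the example

# Rounded calculation: P(B) taken as 1/20.
p_b_rounded = Fraction(1, 20)
posterior_rounded = sensitivity * prevalence / p_b_rounded

# Unrounded calculation: true positives plus false positives.
p_b_exact = sensitivity * prevalence + (1 - prevalence) * false_positive_rate
posterior_exact = sensitivity * prevalence / p_b_exact

print(float(posterior_rounded))   # 0.02
print(float(posterior_exact))     # a little under 0.02
```

The unrounded value is 20/1019, so rounding P(B) to 1/20 changes the answer only in the third decimal place.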
The prosecutor's fallacy
The fallacy (an example)
The fact that the accused's DNA matched that of the sperm found on the victim in a test which has a one in a thousand chance of giving a false positive result means that there is only a one in a thousand chance of the accused's innocence.
The truth
In fact it means nothing of the sort. One in a thousand of the rest of the population would give the same result, so if the accused is one of a population of a million, about a thousand other people would also match, and the test on its own would indicate roughly a one in a thousand chance of guilt, not innocence. (This is not to argue that DNA evidence cannot be conclusive: it would be if it were also established that the crime must have been committed by, say, one out of ten suspects.)
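The two cases can be compared with a rough calculation. The sketch below assumes that exactly one member of the population is guilty, that the guilty person always matches, and that before the test every member is equally likely to be the culprit; these assumptions and the function name `p_guilty_given_match` are introduced here for illustration.

```python
def p_guilty_given_match(population, false_match_rate):
    """Probability that a person who matches is the culprit.

    Assumes one culprit who always matches and equal prior suspicion
    of everyone in the population.
    """
    # Expected number of innocent people who would also match.
    innocent_matches = (population - 1) * false_match_rate
    # The culprit is one of (1 + innocent_matches) expected matching people.
    return 1 / (1 + innocent_matches)

# Accused drawn from a population of a million: guilt is very unlikely.
p_million = p_guilty_given_match(1_000_000, 1 / 1000)

# Crime known to have been committed by one of ten suspects:
# the same match is now close to conclusive.
p_ten = p_guilty_given_match(10, 1 / 1000)

print(round(p_million, 4), round(p_ten, 3))
```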