Applied statistics/Tutorials: Difference between revisions

From Citizendium
imported>Nick Gardner
Revision as of 14:24, 11 January 2010

This article is developing and not approved.
Tutorials relating to the topic of Applied statistics.

Rules of chance

The addition rule

For two mutually exclusive events, A and B,
the probability that either A or B will occur is equal to the probability that A will occur plus the probability that B will occur,

P(A or B) = P(A) + P(B).
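
The rule can be checked by direct enumeration. The following is a minimal sketch, assuming a single fair six-sided die (an illustrative example, not one taken from the text), with A the event of rolling a 1 and B the event of rolling a 2; the helper name prob is hypothetical.

```python
from fractions import Fraction

# Assumption for illustration: one fair six-sided die, all faces equally likely.
outcomes = [1, 2, 3, 4, 5, 6]

def prob(event):
    """Probability of an event, given as a set of die faces."""
    return Fraction(len([o for o in outcomes if o in event]), len(outcomes))

A = {1}  # rolling a 1
B = {2}  # rolling a 2; A and B cannot both happen on one roll

p_a_or_b = prob(A | B)
print(p_a_or_b)                        # 1/3
print(p_a_or_b == prob(A) + prob(B))   # True
```

The equality holds only because A and B share no outcomes; for overlapping events the simple sum would count the shared outcomes twice.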

The multiplication rule

For two independent (unrelated) events, A and B,
the probability that A and B will both occur is equal to the probability that A will occur multiplied by the probability that B will occur,

P(A and B) = P(A) x P(B).
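
As a hedged illustration, the rule can be confirmed by counting outcomes. This sketch assumes two fair six-sided dice rolled independently (an example chosen here, not one from the text), with A the event that the first die shows 6 and B the event that the second die is even.

```python
from fractions import Fraction
from itertools import product

# Assumption for illustration: two fair six-sided dice rolled independently.
rolls = list(product(range(1, 7), repeat=2))  # 36 equally likely pairs

def prob(predicate):
    """Probability that a (first, second) roll satisfies the predicate."""
    return Fraction(sum(1 for r in rolls if predicate(r)), len(rolls))

p_a = prob(lambda r: r[0] == 6)        # A: first die shows 6
p_b = prob(lambda r: r[1] % 2 == 0)    # B: second die is even
p_a_and_b = prob(lambda r: r[0] == 6 and r[1] % 2 == 0)

print(p_a, p_b, p_a_and_b)      # 1/6 1/2 1/12
print(p_a_and_b == p_a * p_b)   # True
```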

Bayes' theorem

The probability that event A will occur, given that event B has occurred, is equal to the probability that B will occur, given that A has occurred, multiplied by the probability that A will occur and divided by the probability that B will occur,

P(A/B) = P(B/A) x P(A)/P(B).
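
The theorem can be checked numerically by computing the conditional probability in two ways. The following is a minimal sketch, assuming two fair six-sided dice (an illustrative example, not one from the text), with A the event that the dice sum to 8 and B the event that the first die is even; the helper prob and its given parameter are hypothetical names.

```python
from fractions import Fraction
from itertools import product

# Assumption for illustration: two fair six-sided dice, 36 equally likely rolls.
rolls = list(product(range(1, 7), repeat=2))

def prob(event, given=lambda r: True):
    """P(event / given), counted over the equally likely rolls."""
    pool = [r for r in rolls if given(r)]
    return Fraction(sum(1 for r in pool if event(r)), len(pool))

A = lambda r: r[0] + r[1] == 8   # the two dice sum to 8
B = lambda r: r[0] % 2 == 0      # the first die is even

direct = prob(A, given=B)                           # counted directly
via_bayes = prob(B, given=A) * prob(A) / prob(B)    # right-hand side of the theorem
print(direct, via_bayes)  # 1/6 1/6
```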

Common fallacies

The double birthday fallacy

The fallacy

That it is very unlikely that 2 people in a group of 24 have the same birthday.

The truth

That there is a better than 50 per cent probability that at least 2 people in any group of 23 or more will have the same birthday.

Proof
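
One standard argument, sketched here under the assumptions of 365 equally likely birthdays (no leap years) and unrelated individuals, is to use the multiplication rule to find the probability that all n birthdays differ, which is 365/365 x 364/365 x ... x (365-n+1)/365, and to subtract that from 1. The function name below is hypothetical.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday.

    Assumes `days` equally likely birthdays and independent individuals.
    """
    p_all_different = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different

print(round(p_shared_birthday(22), 3))  # 0.476
print(round(p_shared_birthday(23), 3))  # 0.507
```

Under these assumptions the probability first exceeds one half at a group size of 23.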

The false positive fallacy

The fallacy

Students at the Harvard Medical School estimated that if a test for a disease with a prevalence of 1 in 1000 has a false positive rate of 5 per cent, there is a 95 per cent probability that a person who has been given a positive result actually has the disease.

The truth

The true probability is about 2 per cent.

Proof

Let A denote the event of having the disease, and B the event of testing positive (for the purpose of applying Bayes' theorem).
Then P(B/A), the probability of testing positive when having the disease, can be taken as equal to 1 (that is, the test is assumed never to miss a true case);
and P(A), the probability of having the disease, must be equal to 1/1000, given a prevalence of 1 in 1000;
and P(B), the probability of testing positive, can be arrived at in 3 steps:
Step 1 is to observe that since the prevalence of the disease is 1 in 1000, 999 persons out of every 1000 are healthy.
Step 2 is to recall that for each healthy person the probability of testing positive is 5%, or 1 in 20.
Step 3 is to apply the multiplication rule to get the probability of being healthy and testing positive, and then the addition rule to include the 1 in 1000 who have the disease and test positive:

P(B) = (999/1000 x 1/20) + (1/1000 x 1), or near enough 1/20.

So, applying Bayes' theorem, the probability of having the disease, given that you have tested positive, is

P(A/B) = P(B/A) x P(A)/P(B)
       = 1 x (1/1000)/(1/20)
       = 0.02, or 2%.
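
The arithmetic can also be checked without rounding P(B) to 1/20. The following is a minimal sketch using the figures from the example; the variable names are illustrative, and P(B) is computed in full as the sum of the true positives and the false positives.

```python
# Figures from the example above.
prevalence = 1 / 1000        # P(A): probability of having the disease
sensitivity = 1.0            # P(B/A): taken as equal to 1 in the text
false_positive_rate = 0.05   # probability that a healthy person tests positive

# P(B): true positives plus false positives.
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)

# Bayes' theorem: P(A/B) = P(B/A) x P(A) / P(B).
p_disease_given_positive = sensitivity * prevalence / p_positive

print(round(p_positive, 5))                # 0.05095
print(round(p_disease_given_positive, 4))  # 0.0196
```

The unrounded result is slightly under 2 per cent, so the approximation of P(B) by 1/20 makes little difference to the conclusion.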

The prosecutor's fallacy

The fallacy (an example)

The fact that the accused's DNA matched that of the sperm found on the victim in a test which has a one in a thousand chance of giving a false positive result means that there is only a one in a thousand chance of the accused's innocence.

The truth

In fact it means nothing of the sort. One in a thousand of the rest of the population would give the same result, so if the accused is one of a population of a million, about a thousand other people would also match. Taken alone, the test would therefore have indicated a one in a thousand chance of guilt, not innocence. (This is not to argue that DNA evidence cannot be conclusive: it would be if it were also established that the crime must have been committed by, say, one out of ten suspects.)
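
The two cases can be compared with Bayes' theorem. This is a minimal sketch under assumptions that the text does not state explicitly: exactly one culprit in the pool, every member of the pool equally likely to be the culprit before testing, and a test that always matches the true culprit. The function name and its parameters are hypothetical.

```python
def p_guilty_given_match(pool_size, false_match_rate=1 / 1000):
    """P(guilt / match) for one member of a pool of possible culprits.

    Assumes one culprit, a prior of 1/pool_size for each member,
    and a test that always matches the true culprit.
    """
    prior = 1 / pool_size
    # P(match): the culprit always matches; an innocent matches at the false rate.
    p_match = prior * 1.0 + (1 - prior) * false_match_rate
    return prior / p_match

print(round(p_guilty_given_match(1_000_000), 4))  # 0.001
print(round(p_guilty_given_match(10), 3))         # 0.991
```

With a pool of a million the match leaves the chance of guilt near one in a thousand, whereas with only ten possible suspects the same match makes guilt about 99 per cent probable.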