Probability distribution: Difference between revisions

From Citizendium
imported>Aleksander Stos
m (cat. live)
imported>Ragnar Schroder
(Total rewrite after moving former content into new articles.)
Line 2: Line 2:




There are two main classes of probability distributions: Discrete and continuous. Discrete distributions describe variables that take on discrete values only (typically the positive integers), while continuous distributions describe variables that can take on arbitrary values in a continuum (typically the real numbers).
There are two main classes of probability distributions: Discrete and continuous. [[Discrete probability distribution|Discrete  distributions]] describe variables that take on discrete values only (typically the positive integers), while [[continuous probability distribution|continuous distributions]] describe variables that can take on arbitrary values in a continuum (typically the real numbers).


In more advanced studies, one also comes across hybrid distributions.
In more advanced studies, one also comes across hybrid distributions.




==A gentle introduction to the concept==
Faced with a set of mutually exclusive propositions or possible outcomes, people intuitively put "degrees of belief" on the different alternatives. 


== Introduction to discrete distributions ==


Faced with a list of mutually exclusive propositions or possible outcomes, people intuitively put "degrees of belief" on the different alternatives.  
===A simple example===
When you wake up in the morning, one of three things may happen that day:
*You will get hit by a meteor falling in from space.
*You will not get hit by a meteor falling in from space, but you'll be struck by lightning.
*Neither will happen.


For instance, consider the following 2 propositions:
Most people will usually intuit a small to zero belief in the first alternative (although it is possible,  and is known to actually have occurred),  a slightly larger belief in the second, and a rather strong belief in the third.


*During next week there will be rain in London.
In mathematics,  such intuitive ideas are captured,  formalized and made precise by the concept of a [[discrete probability distribution]].  
*During next week there will be no rain in London.


Based on available information about the record of past weather in England,  people tend intuitively to put more "belief" in the first possibility than the second.


For another, slightly more complex example, consider the following 6 propositions:
===A more complicated example===
Rather than a simple list of propositions or outcomes like the one above, one may have to deal with a continuum.


*During next week no automobiles will enter I-45 from Galveston Island.
For example, consider the next new person you'll get to know. How tall will he or she be? 
*During next week 1-1,000 automobiles will enter I-45 from Galveston Island.
*During next week 1,001-10,000 automobiles will enter I-45 from Galveston Island.
*During next week 10,001-100,000 automobiles will enter I-45 from Galveston Island.
*During next week 100,001-1,000,000 automobiles will enter I-45 from Galveston Island.
*During next week more than 1,000,000 automobiles will enter I-45 from Galveston Island.


Based on local information about past traffic patterns, people will intuitively distribute a "degree of belief" among the propositions.
This can be formulated as an [[uncountably infinite set]] of propositions, or equivalently as an uncountably infinite set of possible outcomes of a [[random experiment]].


If every "degree of belief" is a real number between 0 and 1 and their sum is exactly 1,  we have a [[discrete probability distribution]],  and each "degree of belief" is called a [[probability]].
Let's look at three of these propositions in detail:


A [[discrete probability distribution]] is thus nothing more than a mathematically precise version of a common intuitive phenomenon,  reflecting the human mind's ability to deduce and infer the physical propensities of external systems.
...
*The person is exactly 1.722222222... m tall.
...
*The person is exactly 2.3 m tall.
...
*The person is exactly 25.0 m tall.
...


As a simple illustration as to how the individual probabilities may be obtained in practice, consider the expected results for a coin toss experiment. While we don't know the results for any individual toss of the coin, we can expect the results to average out to be heads half the time and tails half the time (assuming a fair coin). 
Clearly, we don't believe the person will be 25.0 meters tall.  But neither do we believe any of the other propositions.  Why should any particular proposition turn out to be the ''exact'' correct one among an [[infinity]] of others?


But we still somehow feel that the first proposition listed is more "likely" than the second,  which in turn is more "likely" than the third.


=== Formal definition===
Also,  we feel that some "ranges" are more likely than others: for instance, a height ''between'' 1.5 and 1.8 meters feels "likely",  a height ''between'' 2.2 and 2.5 m seems possible but unlikely,  and a height larger than that seems safe to exclude.


Let S={s0, ... ,sn, ... } be a countable set of mutually exclusive propositions (or possible outcomes of an experiment).
In mathematics, such intuitive ideas are captured, formalized and made precise by the concept of a [[continuous probability distribution]].  
Let A=[0,1], a proper subset of the [[real number]]s '''R'''.
A discrete probability distribution is then a subset T={(s0,t0),...,(sn,tn), ...} of the cartesian product <math>S \times A</math>,  such that all the ti sum to exactly 1.  




=== Important examples ===
==A formal introduction to the concept==
===Discrete probability distribution===
Let S be an [[enumerable set]].  S={..., s_0,s_1, ...}.
Let f be a function from S to <math>\mathbb{R}</math> such that
*f(s) &isin; [0,1] for all s &isin; S
*The sum <math> \sum_{i=-\infty}^{\infty} f(s_i) </math> exists and evaluates to exactly 1.


[[Bernoulli distribution]] - Each experiment is either a 1 ("success") with probability p or a 0 ("failure") with probability 1-p.  An example would be tossing a coin.  If the coin is fair,  your probability for "success" will be exactly 50%.
Then f is a probability distribution over the set S.


An experiment where the outcome follows the Bernoulli distribution is called a Bernoulli trial.


[[Binomial distribution]] - Each experiment consists of a series of identical Bernoulli trials, for instance tossing a coin n times, and counting the number of successes.
===Continuous probability distribution===
Let S be an [[ordered set|ordered]] [[uncountably infinite sets|uncountably infinite]] [[set]].  


[[Uniform distribution]] - Each experiment has a certain finite number of possible outcomes,  each with the same probability.  Throwing a fair die, for instance, has six possible outcomes,  each with the same probability.  The Bernoulli distribution with p=0.5 is another example.
Let f be a function from S to <math>\mathbb{R}</math> such that
*f(s) &ge; 0 for all s &isin; S (unlike a probability, a density value may exceed 1)
*The [[Riemann integral]] <math> \int_{-\infty}^{\infty} f(s)\,ds</math> exists and evaluates to exactly 1.


[[Poisson distribution]] - Consider an experiment in which we wait for events to happen,  and the expected remaining waiting time is independent of how long we have already waited.  The number of events per unit time is then a Poisson-distributed variable.
Then f is a probability distribution over the set S.


[[Geometric distribution]] -


[[Negative Binomial distribution]] -




==References==
*[http://www.time.com/time/magazine/article/0,9171,821063,00.html Person actually hit by a meteorite.]


== Introduction to continuous distributions ==
We may have a set of mutually exclusive propositions or possible outcomes that is neither discrete nor finite.
For instance,  consider the following infinite set of propositions:
*The height of the next individual we'll meet is 0.0 meters.
*...
*The height of the next individual we'll meet is 1.0 meters.
*...
The height of an individual is a real number that in principle may take on any value. 
The difficulty here is that there are too many propositions: we would never expect any one picked in advance to turn out to be the right one.
The solution is essentially a trick:  one accepts that the probability of any single proposition is essentially zero (or [[infinitesimal]])  and instead concentrates on partitions of the set of propositions,  in practice intervals in R. The probability assigned to an interval around, for instance, 1.7 meters should thus be much larger than the one assigned to an interval of similar size around 2.5 meters.
=== Formal definition ===
A continuous probability distribution is a function that, when integrated over a set representing an event, gives the probability of that event. In other words,
<math>p(A) = \int_A f\,d\mu</math>.
In practice, the "event" A corresponds to a range of values, say <math>a \le x \le b</math>, so the above integral will take the more familiar form
<math>p(a \le x \le b) = \int_a^b f\,dx</math>
=== Important examples ===
[[Gaussian distribution]] - Also known as the normal distribution. 
[[Uniform continuous distribution]] -
[[Exponential distribution]] -  Given a sequence of events in which the expected remaining waiting time until the next event is independent of how long we have already waited,  the time between two consecutive events follows the exponential distribution.
[[Gamma distribution]] -
[[Rayleigh distribution]] -
[[Cauchy distribution]] -
[[Laplacian distribution]] -
==References==


==See also==
==See also==

Revision as of 20:12, 26 June 2007

A probability distribution is a mathematical approach to quantifying uncertainty.


There are two main classes of probability distributions: Discrete and continuous. Discrete distributions describe variables that take on discrete values only (typically the positive integers), while continuous distributions describe variables that can take on arbitrary values in a continuum (typically the real numbers).

In more advanced studies, one also comes across hybrid distributions.


A gentle introduction to the concept

Faced with a set of mutually exclusive propositions or possible outcomes, people intuitively put "degrees of belief" on the different alternatives.


A simple example

When you wake up in the morning, one of three things may happen that day:

  • You will get hit by a meteor falling in from space.
  • You will not get hit by a meteor falling in from space, but you'll be struck by lightning.
  • Neither will happen.

Most people will usually intuit a small to zero belief in the first alternative (although it is possible, and is known to actually have occurred), a slightly larger belief in the second, and a rather strong belief in the third.

In mathematics, such intuitive ideas are captured, formalized and made precise by the concept of a discrete probability distribution.


A more complicated example

Rather than a simple list of propositions or outcomes like the one above, one may have to deal with a continuum.

For example, consider the next new person you'll get to know. How tall will he or she be?

This can be formulated as an uncountably infinite set of propositions, or equivalently as an uncountably infinite set of possible outcomes of a random experiment.

Let's look at three of these propositions in detail:

...

  • The person is exactly 1.722222222... m tall.

...

  • The person is exactly 2.3 m tall.

...

  • The person is exactly 25.0 m tall.

...

Clearly, we don't believe the person will be 25.0 meters tall. But neither do we believe any of the other propositions. Why should any particular proposition turn out to be the exact correct one among an infinity of others?

But we still somehow feel that the first proposition listed is more "likely" than the second, which in turn is more "likely" than the third.

Also, we feel that some "ranges" are more likely than others: for instance, a height between 1.5 and 1.8 meters feels "likely", a height between 2.2 and 2.5 m seems possible but unlikely, and a height larger than that seems safe to exclude.

In mathematics, such intuitive ideas are captured, formalized and made precise by the concept of a continuous probability distribution.
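
The intuition about "ranges" can be sketched numerically. The following is only an illustration: it assumes, purely for the sake of the example, that heights follow a normal distribution with mean 1.70 m and standard deviation 0.10 m. Neither the choice of distribution nor these parameters come from the text above, and the function names are hypothetical.

```python
import math

def normal_cdf(x, mean, sd):
    """Probability that a normally distributed value is at most x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def interval_probability(a, b, mean=1.70, sd=0.10):
    """Probability of a height between a and b metres.

    The mean and standard deviation are assumed values for illustration.
    """
    return normal_cdf(b, mean, sd) - normal_cdf(a, mean, sd)

likely = interval_probability(1.5, 1.8)    # a range that feels "likely"
unlikely = interval_probability(2.2, 2.5)  # possible but very unlikely
exact = interval_probability(2.3, 2.3)     # a single exact height

print(likely, unlikely, exact)
```

Under these assumptions the first range receives most of the probability, the second a tiny but non-zero amount, and any single exact height receives probability zero, which matches the intuition described above.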


A formal introduction to the concept

Discrete probability distribution

Let S be an enumerable set, S = {..., s_0, s_1, ...}. Let f be a function from S to the real numbers R such that

  • f(s) ∈ [0,1] for all s ∈ S
  • The sum of f(s_i) over all s_i ∈ S exists and evaluates to exactly 1.

Then f is a probability distribution over the set S.
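
The two conditions can be checked mechanically when S is finite. The sketch below is a minimal illustration, not part of the definition: the function name, the tolerance, and the example probabilities (including the numbers given to the meteor and lightning alternatives) are all assumptions made up for the example.

```python
import math

def is_discrete_distribution(f, tol=1e-9):
    """Check the two conditions for a finite set S.

    f is a dict mapping each element of S to its probability.
    A small tolerance is used because floating-point sums are inexact.
    """
    values = list(f.values())
    in_range = all(0.0 <= p <= 1.0 for p in values)
    sums_to_one = math.isclose(math.fsum(values), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# A fair die: six outcomes with equal probability.
fair_die = {face: 1 / 6 for face in range(1, 7)}

# Hypothetical degrees of belief for the three morning alternatives.
beliefs = {"meteor": 1e-9, "lightning": 1e-6, "neither": 1.0 - 1e-9 - 1e-6}

# Each value lies in [0,1], but the sum is 1.4, so this is not a distribution.
not_a_distribution = {"a": 0.7, "b": 0.7}

print(is_discrete_distribution(fair_die))
print(is_discrete_distribution(beliefs))
print(is_discrete_distribution(not_a_distribution))
```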


Continuous probability distribution

Let S be an ordered uncountably infinite set.

Let f be a function from S to the real numbers R such that

  • f(s) ≥ 0 for all s ∈ S (unlike a probability, a density value may exceed 1)
  • The Riemann integral of f over S, from −∞ to ∞, exists and evaluates to exactly 1.

Then f is a probability distribution over the set S.
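
The integral condition can be checked approximately with a Riemann sum. The sketch below assumes, for illustration only, that S is the real line and that f is the exponential density f(x) = e^(-x) for x ≥ 0 and 0 elsewhere. The infinite range is truncated to [0, 50], where the neglected tail is negligible. The helper names are hypothetical.

```python
import math

def riemann_sum(f, a, b, n=100_000):
    """Approximate the integral of f over [a, b] with a midpoint Riemann sum."""
    h = (b - a) / n
    return h * math.fsum(f(a + (i + 0.5) * h) for i in range(n))

def exponential_density(x):
    """Density of the exponential distribution with rate 1 (assumed example)."""
    return math.exp(-x) if x >= 0.0 else 0.0

# The total area under the density should be (approximately) 1.
total = riemann_sum(exponential_density, 0.0, 50.0)

# The probability of an interval is the area under the density over it.
p_between_1_and_2 = riemann_sum(exponential_density, 1.0, 2.0)

print(total, p_between_1_and_2)
```

The same sum over a sub-interval gives the probability that the outcome falls within that interval, which is how a continuous distribution is used in practice.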



References

  • [1] Person actually hit by a meteorite.


See also

Related topics

External links