Hypergeometric distribution


Format: VoseHypergeo(n, D, M, U)

Hypergeometric equations

The Hypergeo(n, D, M) distribution models the number of items of a particular type in a sample of size n, where the sample is drawn without replacement from a population of size M, of which D are of that type. Examples of the Hypergeometric distribution are shown below:

[Figure: example Hypergeometric distributions]
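For reference, the probability mass function in this parameterisation takes the standard textbook form below. The symbol x (the number of items of the particular type found in the sample) is introduced here for illustration and is not part of the VoseHypergeo format.

```latex
P(X = x) = \frac{\binom{D}{x}\,\binom{M-D}{\,n-x\,}}{\binom{M}{n}},
\qquad \max(0,\; n + D - M) \;\le\; x \;\le\; \min(n,\; D)

\mathrm{E}[X] = \frac{nD}{M},
\qquad
\mathrm{Var}[X] = n\,\frac{D}{M}\left(1 - \frac{D}{M}\right)\frac{M-n}{M-1}
```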

Examples
Tiles

A company has a stock of 2000 tiles that is known to contain 70 tiles that were not fired properly and will probably crack when exposed to the weather. The tiles are all mixed together and the inferior ones unfortunately cannot be identified visually. A customer orders 800 tiles. The number of faulty tiles the customer will receive can be modeled as Hypergeo(800, 70, 2000).
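The tile example can be checked numerically. The following is a minimal sketch in plain Python, not a ModelRisk function: the helper hypergeo_pmf is a hypothetical name, and the threshold of 20 faulty tiles is an arbitrary value chosen only to show a cumulative probability.

```python
from math import comb

def hypergeo_pmf(x, n, D, M):
    """P(X = x): x items of the type in a sample of n, drawn without
    replacement from a population of M that contains D of that type."""
    if x < max(0, n + D - M) or x > min(n, D):
        return 0.0
    return comb(D, x) * comb(M - D, n - x) / comb(M, n)

# Tile example from the text: Hypergeo(800, 70, 2000)
n, D, M = 800, 70, 2000

mean = n * D / M  # expected number of faulty tiles delivered (28)
variance = n * (D / M) * (1 - D / M) * (M - n) / (M - 1)

# Full distribution over the possible numbers of faulty tiles
support = range(max(0, n + D - M), min(n, D) + 1)
pmf = [hypergeo_pmf(x, n, D, M) for x in support]

# Illustrative cumulative probability: at most 20 faulty tiles (threshold is an assumption)
p_at_most_20 = sum(p for x, p in zip(support, pmf) if x <= 20)

print(f"mean = {mean:.1f}, standard deviation = {variance ** 0.5:.3f}")
print(f"P(faulty tiles <= 20) = {p_at_most_20:.4f}")
```

Exact integer binomial coefficients are used so that the very large values of the coefficients for M = 2000 do not overflow before the final division.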

Capture-release-recapture experiment to estimate population size

An example of using the Hypergeometric distribution and Bayes' Theorem to estimate the number of tigers on an island is shown in the section uncertainty about a population size. Several animals are captured, tagged, and then released back into the wild. Some time later, another set of animals is captured. The proportion of these that have tags provides a means, via Bayes' Theorem, to estimate the total population, assuming the tagged animals have diffused completely into the population.
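The reasoning above can be sketched as a simple grid calculation. This is an illustration only and is not the worked tiger example referred to in the text: the counts (10 tagged, 8 recaptured, 3 of those tagged), the uniform prior, and its upper bound of 500 are all assumed values, and hypergeo_pmf is a hypothetical helper.

```python
from math import comb

def hypergeo_pmf(x, n, D, M):
    """P(X = x) for a sample of n from a population of M containing D tagged."""
    if x < max(0, n + D - M) or x > min(n, D):
        return 0.0
    return comb(D, x) * comb(M - D, n - x) / comb(M, n)

# Hypothetical data (assumptions for illustration only)
tagged = 10        # animals tagged and released in the first capture (D)
recaptured = 8     # size of the second capture (n)
tagged_seen = 3    # tagged animals found in the second capture

# The population must hold at least the tagged animals plus the
# untagged animals seen in the second capture.
M_min = tagged + (recaptured - tagged_seen)
M_max = 500        # assumed upper bound of a uniform prior on M

candidates = range(M_min, M_max + 1)

# Likelihood of the observed tag count for each candidate population size
likelihood = [hypergeo_pmf(tagged_seen, recaptured, tagged, M) for M in candidates]

# Uniform prior, so the posterior is the normalised likelihood (Bayes' Theorem)
total = sum(likelihood)
posterior = [value / total for value in likelihood]

posterior_mode = candidates[posterior.index(max(posterior))]
posterior_mean = sum(M * p for M, p in zip(candidates, posterior))

print(f"posterior mode = {posterior_mode}, posterior mean = {posterior_mean:.1f}")
```

With a uniform prior the posterior mode coincides with the maximum likelihood estimate, while the posterior mean is sensitive to the assumed upper bound because the likelihood has a long right tail.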

Comments

The mathematics behind the Hypergeometric distribution assumes sampling from the population without replacement, a restriction that becomes more significant the closer the sample size n gets to the population size M. Where n is small compared to M and D (the general guideline is that n < 0.1*M), Hypergeo(n, D, M) looks very similar to Binomial(n, p), where p = D/M. The Hypergeometric distribution is closely related to the Inverse Hypergeometric distribution.
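The quality of the Binomial approximation can be illustrated by measuring how far apart the two distributions are for a small and a large sample fraction. The sketch below reuses the tile population (M = 2000, D = 70); the sample sizes 50 and 800, the helper names, and the choice of total variation distance as the measure are assumptions made for illustration.

```python
from math import comb

def hypergeo_pmf(x, n, D, M):
    """Hypergeo(n, D, M) probability of exactly x items of the type."""
    if x < max(0, n + D - M) or x > min(n, D):
        return 0.0
    return comb(D, x) * comb(M - D, n - x) / comb(M, n)

def binomial_pmf(x, n, p):
    """Binomial(n, p) probability of exactly x successes."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

def total_variation(n, D, M):
    """Half the summed absolute difference between Hypergeo(n, D, M)
    and Binomial(n, D/M): 0 means identical, 1 means no overlap."""
    p = D / M
    return 0.5 * sum(
        abs(hypergeo_pmf(x, n, D, M) - binomial_pmf(x, n, p))
        for x in range(n + 1)
    )

M, D = 2000, 70
tv_small = total_variation(50, D, M)    # n/M = 0.025, inside the n < 0.1*M guideline
tv_large = total_variation(800, D, M)   # n/M = 0.4, well outside the guideline

print(f"n = 50:  total variation distance = {tv_small:.4f}")
print(f"n = 800: total variation distance = {tv_large:.4f}")
```

The distance is noticeably larger for n = 800, where sampling without replacement shrinks the variance by the factor (M - n)/(M - 1) relative to the Binomial.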

Zero-modified version

When modeling or analyzing counting data, it is often desirable to modify the probability of zero in the discrete distribution being used, in order to model the probability of "no event occurring" more accurately. Two types of modification can be made to the distribution for this purpose.

See also: Zero-modified counting distributions

VoseFunctions for this distribution

VoseHypergeo generates values from this distribution or calculates a percentile

VoseHypergeoObject constructs a distribution object for this distribution

VoseHypergeoProb returns the probability mass or cumulative distribution function for this distribution

VoseHypergeoProb10 returns the log10 of the probability mass or cumulative distribution function

VoseHypergeoFit generates values from this distribution fitted to data, or calculates a percentile from the fitted distribution

VoseHypergeoFitObject constructs a distribution object of this distribution fitted to data

VoseHypergeoFitP returns the parameters of this distribution fitted to data 

See Also