# Hypergeometric distribution

Format: Hypergeo(n, D, M)

The Hypergeometric distribution is based on the Hypergeometric stochastic process. A Hypergeo(n, D, M) distribution models the number of items of a particular type in a sample of size n, where the sample is drawn without replacement from a population of size M that contains D items of that type. Examples of its use follow.
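The definition above corresponds to the standard hypergeometric probability mass function. The following is a minimal sketch in plain Python, not ModelRisk code; the function name `hypergeo_pmf` is an illustrative assumption, and the argument order simply mirrors Hypergeo(n, D, M).

```python
from math import comb


def hypergeo_pmf(x, n, D, M):
    """P(X = x) for Hypergeo(n, D, M), using the parameter order of this page."""
    # x cannot be so small that the n - x remaining items exceed the M - D others,
    # nor larger than the sample size n or the number D of items of the type
    lowest = max(0, n - (M - D))
    highest = min(n, D)
    if x < lowest or x > highest:
        return 0.0
    # ways to pick x of the D items, times ways to pick the rest, over all samples
    return comb(D, x) * comb(M - D, n - x) / comb(M, n)


# Small illustrative case: 2 of the 4 items are of the type, sample of 2
probabilities = [hypergeo_pmf(x, n=2, D=2, M=4) for x in range(3)]
print(probabilities)
```

Integer binomial coefficients are used so that the ratio is evaluated exactly before conversion to a floating-point number.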

## Examples

### Tiles

A company has a stock of 2000 tiles which is known to contain 70 tiles that were not fired properly and will probably crack when exposed to the weather. The tiles are all mixed together and the inferior ones unfortunately cannot be visually identified. A customer orders 800 tiles. The number of faulty tiles the customer will receive can be modeled as Hypergeo(800, 70, 2000).
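As a rough numerical check of the tile example, the sketch below computes the mean and standard deviation from the closed-form moments of the hypergeometric distribution, plus one tail probability. It is plain Python rather than ModelRisk; the helper `hypergeo_pmf` and the threshold of 20 faulty tiles are assumptions chosen only for illustration.

```python
from math import comb, sqrt

n, D, M = 800, 70, 2000  # order size, faulty tiles in stock, total stock


def hypergeo_pmf(x, n, D, M):
    """P(X = x) for Hypergeo(n, D, M)."""
    if x < max(0, n - (M - D)) or x > min(n, D):
        return 0.0
    return comb(D, x) * comb(M - D, n - x) / comb(M, n)


# Closed-form moments of the hypergeometric distribution
mean = n * D / M
variance = n * (D / M) * (1 - D / M) * (M - n) / (M - 1)
std_dev = sqrt(variance)

# Probability of receiving at most 20 faulty tiles (20 is a hypothetical threshold)
p_at_most_20 = sum(hypergeo_pmf(x, n, D, M) for x in range(21))

print(f"mean = {mean:.1f}, std dev = {std_dev:.2f}, P(X <= 20) = {p_at_most_20:.4f}")
```

The mean is n*D/M = 800*70/2000 = 28 faulty tiles, which is simply the faulty fraction of the stock applied to the order size.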

### Capture-release-recapture experiment to estimate population size

An example of using the Hypergeometric distribution and Bayes' Theorem to estimate the number of tigers on an island is shown in the section on uncertainty about a population size. Several animals are captured and tagged, then released back to the wild. Some time later, another set of animals is captured. The proportion that have tags provides a means, via Bayes' Theorem, to estimate the total population, assuming complete diffusion of the tagged sample into the population.
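The capture-release-recapture idea can be sketched numerically as follows. This is only an illustration under stated assumptions: the counts are hypothetical (they are not the tiger data), the prior on the population size is taken to be flat, and the upper bound of 500 is arbitrary. The number of recaptured tagged animals is treated as Hypergeo(n, D, M) with M unknown.

```python
from math import comb


def hypergeo_pmf(x, n, D, M):
    """P(X = x) for Hypergeo(n, D, M)."""
    if x < max(0, n - (M - D)) or x > min(n, D):
        return 0.0
    return comb(D, x) * comb(M - D, n - x) / comb(M, n)


# Hypothetical data, chosen only for illustration
tagged = 20        # D: animals tagged and released after the first capture
second_catch = 30  # n: animals caught in the second capture
recaptured = 7     # x: tagged animals found in the second capture

# Assumed prior: every population size up to an arbitrary cap is equally likely
smallest = tagged + second_catch - recaptured  # distinct animals actually seen
largest = 500                                  # assumed upper bound
candidates = range(smallest, largest + 1)

# Bayes' Theorem with a flat prior: the posterior is the normalised likelihood
likelihood = [hypergeo_pmf(recaptured, second_catch, tagged, M) for M in candidates]
total = sum(likelihood)
posterior = {M: value / total for M, value in zip(candidates, likelihood)}

most_likely = max(posterior, key=posterior.get)
posterior_mean = sum(M * p for M, p in posterior.items())
print(most_likely, round(posterior_mean, 1))
```

A different prior or upper bound would change the posterior, particularly its upper tail, so both choices should be treated as modeling decisions rather than defaults.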

The mathematics behind the Hypergeometric distribution assumes sampling from the population without replacement, an assumption that matters more the closer the sample size n gets to the population size M. Where n is small compared to M and D (the general guideline is that n < 0.1*M), Hypergeo(n, D, M) looks very similar to Binomial(n, p) where p = D/M. The Hypergeometric distribution is closely related to the Inverse Hypergeometric distribution.
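The quality of the Binomial approximation can be checked directly by comparing the two probability mass functions. The sketch below uses a hypothetical population (M = 1000, D = 300) and two sample sizes, one inside and one well outside the n < 0.1*M guideline; the helper names are illustrative assumptions.

```python
from math import comb


def hypergeo_pmf(x, n, D, M):
    """P(X = x) for Hypergeo(n, D, M)."""
    if x < max(0, n - (M - D)) or x > min(n, D):
        return 0.0
    return comb(D, x) * comb(M - D, n - x) / comb(M, n)


def binomial_pmf(x, n, p):
    """P(X = x) for Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def max_pmf_gap(n, D, M):
    """Largest absolute difference between the two probability mass functions."""
    p = D / M
    return max(
        abs(hypergeo_pmf(x, n, D, M) - binomial_pmf(x, n, p)) for x in range(n + 1)
    )


# Hypothetical population: M = 1000 items, D = 300 of the type of interest
gap_small_sample = max_pmf_gap(n=20, D=300, M=1000)   # n = 0.02*M, within the guideline
gap_large_sample = max_pmf_gap(n=500, D=300, M=1000)  # n = 0.5*M, well outside it
print(gap_small_sample, gap_large_sample)
```

The gap grows with n/M because the Hypergeometric variance carries the finite-population factor (M - n)/(M - 1), which the Binomial does not.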

## Zero-modified version

When modeling or analyzing count data, it is often desirable to modify the probability of zero in the discrete distribution being used, to model the probability of "no event occurring" more accurately. Two types of modification can be made to the distribution:

• Zero-inflated model - we increase the probability of zero.

• Zero-truncated model - we entirely remove the probability of zero events occurring.
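The two modifications can be written down using their usual textbook forms, sketched below in plain Python. This is an assumption about the general technique: the exact parameterization used by the ModelRisk zero-inflated and zero-truncated functions is not stated here, and the name `extra_zero` for the added probability of zero is hypothetical.

```python
from math import comb


def hypergeo_pmf(x, n, D, M):
    """P(X = x) for Hypergeo(n, D, M)."""
    if x < max(0, n - (M - D)) or x > min(n, D):
        return 0.0
    return comb(D, x) * comb(M - D, n - x) / comb(M, n)


def zero_inflated_pmf(x, n, D, M, extra_zero):
    """Mixture: with probability extra_zero the count is forced to zero."""
    base = (1 - extra_zero) * hypergeo_pmf(x, n, D, M)
    return extra_zero + base if x == 0 else base


def zero_truncated_pmf(x, n, D, M):
    """Condition on at least one item of the type being in the sample."""
    if x == 0:
        return 0.0
    # Rescale the remaining probabilities so they sum to one again;
    # this is undefined if a zero count is certain (for example D = 0)
    return hypergeo_pmf(x, n, D, M) / (1 - hypergeo_pmf(0, n, D, M))


# Hypothetical parameters for illustration
n, D, M = 5, 4, 20
inflated = [zero_inflated_pmf(x, n, D, M, extra_zero=0.25) for x in range(n + 1)]
truncated = [zero_truncated_pmf(x, n, D, M) for x in range(n + 1)]
print(inflated[0], truncated[0])
```

In both cases the non-zero probabilities keep their relative proportions; only the mass at zero and the overall scaling change.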

## ModelRisk functions added to Microsoft Excel for the Hypergeometric distribution

VoseHypergeo generates random values from this distribution for Monte Carlo simulation, or calculates a percentile if used with a U parameter.

VoseHypergeoObject constructs a distribution object for this distribution.

VoseHypergeoProb returns the probability mass or cumulative distribution function for this distribution.

VoseHypergeoProb10 returns the log10 of the probability mass or cumulative distribution function.

VoseHypergeoFit generates values from this distribution fitted to data, or calculates a percentile from the fitted distribution.

VoseHypergeoFitObject constructs a distribution object of this distribution fitted to data.

VoseHypergeoFitP returns the parameters of this distribution fitted to data.

## ModelRisk functions added to Microsoft Excel for the Zero-Inflated Hypergeometric distribution

VoseZIHypergeo generates random values from this distribution for Monte Carlo simulation, or calculates a percentile if used with a U parameter.

VoseZIHypergeoObject constructs a distribution object for this distribution.

VoseZIHypergeoProb returns the probability mass or cumulative distribution function for this distribution.

VoseZIHypergeoProb10 returns the log10 of the probability mass or cumulative distribution function.

VoseZIHypergeoFit generates values from this distribution fitted to data, or calculates a percentile from the fitted distribution.

VoseZIHypergeoFitObject constructs a distribution object of this distribution fitted to data.

VoseZIHypergeoFitP returns the parameters of this distribution fitted to data.

## ModelRisk functions added to Microsoft Excel for the Zero-Truncated Hypergeometric distribution

VoseZTHypergeo generates random values from this distribution for Monte Carlo simulation, or calculates a percentile if used with a U parameter.

VoseZTHypergeoObject constructs a distribution object for this distribution.

VoseZTHypergeoProb returns the probability mass or cumulative distribution function for this distribution.

VoseZTHypergeoProb10 returns the log10 of the probability mass or cumulative distribution function.

VoseZTHypergeoFit generates values from this distribution fitted to data, or calculates a percentile from the fitted distribution.

VoseZTHypergeoFitObject constructs a distribution object of this distribution fitted to data.

VoseZTHypergeoFitP returns the parameters of this distribution fitted to data.