The Hypergeometric Process

Description

The hypergeometric process occurs when one samples randomly without replacement from a population (as opposed to sampling with replacement, as in the Binomial Process) and counts the number of individuals in that sample that have some particular characteristic. This is a very common type of scenario: population surveys, herd testing, and lotto are all hypergeometric processes.

In many situations, the population is very large compared with the sample, so that if a sampled individual were put back into the population, the probability that it would be picked again is very small. In that case, each draw has essentially the same probability of picking an individual with the characteristic; in other words, the process becomes a binomial process. When the population is not very large compared with the sample (a good rule of thumb is when the population is less than ten times the size of the sample), we cannot make a binomial approximation to the hypergeometric. This section discusses the distributions associated with the hypergeometric process.
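
As a rough illustration of the rule of thumb, the following minimal Python sketch compares the exact hypergeometric probabilities with their binomial approximation (using p = D/M) for a very large population and for a small one. The function names and the example numbers are illustrative assumptions, not part of ModelRisk.

```python
from math import comb

def hypergeo_pmf(s, n, D, M):
    """P(exactly s of the n sampled items come from the sub-population D)."""
    if s < max(0, n - (M - D)) or s > min(n, D):
        return 0.0
    return comb(D, s) * comb(M - D, n - s) / comb(M, n)

def binomial_pmf(s, n, p):
    """P(exactly s successes in n independent trials with probability p)."""
    return comb(n, s) * p**s * (1 - p) ** (n - s)

def max_abs_gap(n, D, M):
    """Largest pointwise difference between the two distributions."""
    p = D / M  # same proportion of the sub-population in both models
    return max(
        abs(hypergeo_pmf(s, n, D, M) - binomial_pmf(s, n, p))
        for s in range(n + 1)
    )

# Hypothetical numbers: same sample size and proportion, different population sizes.
large = max_abs_gap(n=10, D=2000, M=10000)  # population is 1000 times the sample
small = max_abs_gap(n=10, D=10, M=50)       # population is only 5 times the sample

print(f"max gap, large population: {large:.5f}")
print(f"max gap, small population: {small:.5f}")
```

The gap for the small population is much larger than for the large one, which is why the binomial approximation is only safe when the population is large relative to the sample.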

The figure above demonstrates the four parameters of the hypergeometric process: the population one is sampling from (M), the sub-population of interest (D), the number being randomly sampled from the population (n), and the number (s) in that sample that come from D. We recommend drawing a diagram like this whenever you are faced with a hypergeometric problem, to keep these quantities clear.
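
The roles of the four parameters can also be seen in a small simulation. The sketch below is a minimal Python illustration using only the standard library; the helper name draw_hypergeo and the example values are assumptions made for this example.

```python
import random

def draw_hypergeo(n, D, M, rng):
    """Sample n items without replacement from a population of M and
    count how many (s) fall in the sub-population of size D."""
    # Label the population 0..M-1; labels 0..D-1 are the sub-population.
    sample = rng.sample(range(M), n)
    return sum(1 for item in sample if item < D)

rng = random.Random(42)  # fixed seed so the run is repeatable
n, D, M = 20, 30, 100    # hypothetical example values

draws = [draw_hypergeo(n, D, M, rng) for _ in range(5000)]
mean_s = sum(draws) / len(draws)

# The long-run average of s should be close to n * D / M.
print(f"simulated mean of s: {mean_s:.3f}, theoretical mean: {n * D / M:.3f}")
```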

Summary of results for the hypergeometric process

*Quantity*	*Formula*	*Notes*
Number from the sub-population in the sample	s = VoseHypergeo(n,D,M)
Number of samples to observe s from the sub-population	n = s + VoseInvHypergeo(s,D,M)
Number of samples there were to have observed s from the sub-population	n = s + VoseInvHypergeo(s,D,M)	Where the last sample is known to have been from the sub-population
Number of samples n there were before having observed s from the sub-population		Where the last sample is not known to have been from the sub-population. This uncertainty distribution needs to be normalized.
Size of sub-population D	D = VoseHypergeoD(s,n,M)	This uncertainty distribution needs to be normalized.
Size of population M	M = VoseHypergeoM(s,n,D,max)	An upper limit (max) has to be placed on the possible range of values for M
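
Two of the rows above can be sketched in plain Python. This is a minimal illustration and not ModelRisk's implementation: it assumes the standard inverse (negative) hypergeometric form for the number of samples from outside the sub-population drawn before the s-th one from it, and a uniform prior over D when normalizing its uncertainty distribution. The function names are hypothetical.

```python
from math import comb

def hypergeo_pmf(s, n, D, M):
    """P(exactly s of the n sampled items come from the sub-population D)."""
    if s < max(0, n - (M - D)) or s > min(n, D):
        return 0.0
    return comb(D, s) * comb(M - D, n - s) / comb(M, n)

def inv_hypergeo_pmf(k, s, D, M):
    """P(k items from outside the sub-population are drawn before the
    s-th item from it), so the total number of samples is n = s + k."""
    if k < 0 or k > M - D or s < 1 or s > D:
        return 0.0
    n_before = s + k - 1  # draws made before the final, s-th success
    # Exactly s-1 of the first n_before draws come from the sub-population...
    p_before = comb(D, s - 1) * comb(M - D, k) / comb(M, n_before)
    # ...and the next draw is one of the remaining sub-population members.
    p_next = (D - (s - 1)) / (M - n_before)
    return p_before * p_next

def posterior_D(s, n, M):
    """Normalized uncertainty distribution for D after observing s in a
    sample of n, assuming every feasible D is equally likely a priori."""
    feasible = range(s, M - (n - s) + 1)
    weights = {D: hypergeo_pmf(s, n, D, M) for D in feasible}
    total = sum(weights.values())  # not 1 in general, hence the normalization
    return {D: w / total for D, w in weights.items()}

# Hypothetical example: 3 of 10 sampled items had the characteristic, M = 50.
post = posterior_D(s=3, n=10, M=50)
most_likely_D = max(post, key=post.get)
print("most likely D:", most_likely_D)
```

The likelihood values summed over all feasible D come to (M+1)/(n+1) rather than 1, which is why the table notes that this uncertainty distribution needs to be normalized.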

Read on: Mixture processes
