A discrete variable with a long tail distribution | Vose Software

# A discrete variable with a long tail distribution

We sometimes need to model a discrete variable that has a long-tailed distribution. This section describes several distributions one might use.

## Infinite-tailed discrete distributions

The eight discrete distributions offered by ModelRisk that have a tail extending to infinity are the Negative Binomial (of which the Geometric is a special case), the BetaNegBin (and the BetaGeometric), the Delaporte, the Logarithmic, the Polya and the Poisson. The mean and the variance of a Poisson distribution are both equal to λ. A NegBin(s,p) distribution, however, has a mean m and variance V as follows:

$$m = \frac{s(1-p)}{p}, \qquad V = \frac{s(1-p)}{p^{2}}$$

Thus, while a Poisson distribution has a ratio of variance to mean of one, the NegBin distribution has a ratio V/m = 1/p, which is always greater than one. Since a Negative Binomial distribution can be constructed as a Gamma mixture of Poisson distributions, it follows that a Negative Binomial distribution will always have a greater spread, and therefore a longer right tail, than a Poisson distribution with the same mean. So, the NegBin distribution is a natural contender for modelling a discrete variable with a long right tail.
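The comparison above can be checked numerically. The following Python sketch is illustrative only: it does not use ModelRisk, and the helper names and the values s = 2, p = 0.25 are assumptions chosen for the example. It assumes NegBin(s,p) counts the failures before the s-th success, computes the mean and variance of each distribution by direct summation of the probability mass function, and compares the probability of observing 20 or more.

```python
import math

def negbin_pmf(k, s, p):
    """P(X = k), where X is the number of failures before the s-th success."""
    log_pmf = (math.lgamma(k + s) - math.lgamma(s) - math.lgamma(k + 1)
               + s * math.log(p) + k * math.log(1 - p))
    return math.exp(log_pmf)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with mean lam."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def moments(pmf, k_max):
    """Mean and variance by summing over k = 0 .. k_max - 1."""
    mean = sum(k * pmf(k) for k in range(k_max))
    var = sum((k - mean) ** 2 * pmf(k) for k in range(k_max))
    return mean, var

# Illustrative parameters (assumed for this example only).
s, p = 2, 0.25
K_MAX = 500  # truncation point; the mass beyond it is negligible here

nb_mean, nb_var = moments(lambda k: negbin_pmf(k, s, p), K_MAX)

# Poisson distribution with the same mean as the NegBin.
lam = s * (1 - p) / p
po_mean, po_var = moments(lambda k: poisson_pmf(k, lam), K_MAX)

# Right-tail probabilities P(X >= 20), summed directly.
nb_tail = sum(negbin_pmf(k, s, p) for k in range(20, K_MAX))
po_tail = sum(poisson_pmf(k, lam) for k in range(20, K_MAX))

print("NegBin  mean, variance/mean:", nb_mean, nb_var / nb_mean)
print("Poisson mean, variance/mean:", po_mean, po_var / po_mean)
print("P(X >= 20)  NegBin:", nb_tail, " Poisson:", po_tail)
```

With matching means, the NegBin variance-to-mean ratio should come out at 1/p and the Poisson ratio at one, with the NegBin placing more probability in the right tail.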

## A discretised Pareto distribution

Any continuous distribution can be made discrete by rounding its generated values to whole numbers. For example, the formula =ROUND(VosePareto(2,3),0) generates values from a Pareto(2,3) distribution and rounds them to the nearest whole number. The Pareto distribution has a longer tail than the Negative Binomial distribution, and one of the longest tails of any continuous distribution, so this is a quick and easy way of obtaining a long-tailed discrete distribution.
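Outside a spreadsheet, the same rounding approach might be sketched in Python as follows. This is a minimal illustration, not ModelRisk code. It assumes that Pareto(2,3) means a shape parameter of 2 and a minimum value of 3, so that the cumulative distribution function is F(x) = 1 - (3/x)^2 for x ≥ 3; the function names are hypothetical. Rounding half up matches Excel's ROUND for positive values.

```python
import math
import random

def pareto_sample(theta, a, rng):
    """Inverse-transform draw from a Pareto with shape theta and minimum a."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    return a / u ** (1.0 / theta)

def round_half_up(x):
    """Round a positive value to the nearest integer, halves going up."""
    return math.floor(x + 0.5)

def discretised_pareto_pmf(k, theta, a):
    """P(rounded value = k), from the continuous CDF at k - 0.5 and k + 0.5."""
    def cdf(x):
        return 0.0 if x <= a else 1.0 - (a / x) ** theta
    return cdf(k + 0.5) - cdf(k - 0.5)

rng = random.Random(42)  # fixed seed so the run is repeatable
samples = [round_half_up(pareto_sample(2, 3, rng)) for _ in range(10000)]

# Compare the simulated share of the smallest value with its exact probability.
share_of_3 = samples.count(3) / len(samples)
print("simulated P(3):", share_of_3)
print("exact P(3):    ", discretised_pareto_pmf(3, 2, 3))
print("largest sampled value:", max(samples))
```

Computing the probabilities directly from the continuous CDF, as in `discretised_pareto_pmf`, avoids simulation noise when only the shape of the discretised distribution is needed.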

## A variable with a long left tail

It is a simple matter to use the above distributions to model a variable with a long tail extending towards negative values rather than a long right tail. The technique is to subtract a long right-tailed distribution from some constant. For example, the variable =1000-VoseNegBin(2,0.03) has the shape given in the figure below. Care needs to be taken to ensure that such constructed distributions remain within the plausible bounds of the variable. For example, the variable =1000-VoseNegBin(2,0.03) can potentially take negative values, although, as the plot below reveals, this is probably not likely enough to matter.
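The chance of the constructed variable falling below zero can also be estimated directly. The Python sketch below is an illustration under the assumption that NegBin(s,p) counts the failures before the s-th success; the function names are hypothetical. The variable 1000 - X is negative exactly when X exceeds 1000, so the sketch sums the upper tail of the NegBin(2,0.03) probability mass function and cross-checks it against the closed form available for s = 2.

```python
import math

def negbin_pmf(k, s, p):
    """P(X = k), where X is the number of failures before the s-th success."""
    log_pmf = (math.lgamma(k + s) - math.lgamma(s) - math.lgamma(k + 1)
               + s * math.log(p) + k * math.log(1 - p))
    return math.exp(log_pmf)

def negbin_upper_tail(threshold, s, p, rel_tol=1e-18):
    """P(X > threshold), adding pmf terms until they no longer matter."""
    total = 0.0
    k = threshold + 1
    while True:
        term = negbin_pmf(k, s, p)
        total += term
        if term == 0.0 or term < rel_tol * total:
            break
        k += 1
    return total

constant, s, p = 1000, 2, 0.03

# 1000 - X < 0 exactly when X > 1000.
prob_negative = negbin_upper_tail(constant, s, p)

# Cross-check for s = 2: X >= m happens exactly when the first m + 1 trials
# contain at most one success, so
# P(X >= m) = (1 - p)^(m + 1) + (m + 1) * p * (1 - p)^m, here with m = 1001.
m = constant + 1
q = 1 - p
closed_form = q ** (m + 1) + (m + 1) * p * q ** m

print("P(1000 - X < 0), summed:     ", prob_negative)
print("P(1000 - X < 0), closed form:", closed_form)
```

Summing the tail directly, rather than subtracting a cumulative probability from one, avoids losing such a small probability to floating-point rounding.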