The parametric Bootstrap | Vose Software

The parametric Bootstrap

The non-parametric Bootstrap makes no assumptions about the distributional form of the population or probability (parent) distribution. However, we often know which family of distributions the parent distribution belongs to. For example, the number of earthquakes each year and the number of Giardia cysts in liters of water drawn from a lake will logically both be approximately Poisson distributed; the time between phone calls to an exchange will be roughly exponentially distributed; and the number of males in randomly sampled groups of a certain size will be binomially distributed.

The parametric Bootstrap gives us a means to use the extra information we have about the population distribution. The procedure is the same as the non-parametric Bootstrap approach except for the distribution estimation stage:

1. Estimate the distribution from the data

For the parametric Bootstrap, we select the distribution type we believe the data to come from and then find the maximum likelihood estimates (MLEs) of the parameters for that distribution. That is, we find the parameter values that give the highest probability (or, for a continuous distribution, the highest probability density) of observing the data values we have.

Use VoseDistributionFitP to return MLE parameters of a data set, or directly construct the distribution object with MLE parameters using VoseDistributionFitObject.
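For some distribution families the MLE has a simple closed form, which can make the idea easier to see outside the spreadsheet. The Python sketch below is only an illustration and is not part of ModelRisk: it assumes exponentially distributed times between phone calls, for which the MLE of the rate is n divided by the sum of the observations. The function names and the data values are hypothetical.

```python
import math

def exponential_log_likelihood(rate, data):
    """Log-likelihood of an exponential(rate) model for the observed data."""
    return len(data) * math.log(rate) - rate * sum(data)

def exponential_mle_rate(data):
    """Closed-form MLE of the exponential rate: n / sum(x)."""
    return len(data) / sum(data)

# Hypothetical times (in minutes) between phone calls to an exchange
call_gaps = [0.8, 2.1, 0.3, 1.7, 0.9, 3.4, 0.5, 1.2]
rate_hat = exponential_mle_rate(call_gaps)

# Compare the log-likelihood at the MLE with that at nearby rates
for candidate in (0.8 * rate_hat, rate_hat, 1.2 * rate_hat):
    ll = exponential_log_likelihood(candidate, call_gaps)
    print(f"rate={candidate:.4f}  log-likelihood={ll:.4f}")
```

The log-likelihood printed for the middle candidate (the MLE) should be the largest of the three, which is what "highest probability of observing the data" means in practice.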

2. Simulate the data collection

Just as with the non-parametric Bootstrap, we now replace each observation with a sample taken at random from the fitted population distribution. Use VoseSimulate to simulate random values from the distribution created in step 1.
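As a minimal sketch of this step outside ModelRisk, the Python code below generates a single Bootstrap replicate. It assumes yearly earthquake counts that follow a Poisson distribution, whose MLE is the sample mean. The data are invented, and poisson_sample is a hypothetical helper (a standard multiplication-of-uniforms sampler, suitable for small means) because the Python standard library has no Poisson generator.

```python
import math
import random
from statistics import fmean

def poisson_sample(lam, rng):
    """Draw one Poisson(lam) value by multiplying uniforms (fine for small lam)."""
    threshold = math.exp(-lam)
    k, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return k
        k += 1

# Hypothetical earthquake counts per year
observed = [3, 5, 2, 4, 6, 3, 4, 1, 5, 3]
lam_hat = fmean(observed)          # Poisson MLE is the sample mean

rng = random.Random(42)            # seeded only to make the sketch repeatable
# One Bootstrap replicate: every observation is replaced by a fresh draw
replicate = [poisson_sample(lam_hat, rng) for _ in observed]
print(replicate)
```

The replicate has the same size n as the original data set, so a statistic calculated from it carries the same sampling variability as one calculated from n real observations.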

3. Calculate the sample statistic

We now run a large number of iterations, each one generating a new Bootstrap replicate, and for each Bootstrap replicate we calculate the sample estimate of the statistic in question.

In summary, the parametric Bootstrap proceeds as follows:

• Collect the data set of n samples {x1, ..., xn}

• Determine the parameter(s) of the distribution that best fits the data from the known distribution family using maximum likelihood estimators (MLEs)

• Generate B Bootstrap samples {x1*, ..., xn*} by randomly sampling from this fitted distribution

• For each Bootstrap sample {x1*, ..., xn*}, calculate the required statistic, which is an estimate of the parameter θ. The distribution of these B estimates of θ represents the Bootstrap estimate of uncertainty about the true value of θ.
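The four steps above can be sketched as one generic routine. The Python code below is a hedged illustration rather than the ModelRisk implementation: parametric_bootstrap, fit_normal and sample_normal are hypothetical names, the Normal family is chosen only because its MLEs are simple, and the 95% percentile interval at the end is one common way of summarizing the B estimates, not something the procedure requires.

```python
import random
from statistics import fmean, pstdev, quantiles

def parametric_bootstrap(data, fit, sample, statistic, B=2000, seed=0):
    """Return B Bootstrap estimates of `statistic` under a fitted parametric model."""
    rng = random.Random(seed)
    params = fit(data)             # MLE parameters of the assumed family
    n = len(data)
    estimates = []
    for _ in range(B):
        # One Bootstrap replicate: n fresh draws from the fitted distribution
        replicate = [sample(params, rng) for _ in range(n)]
        estimates.append(statistic(replicate))
    return estimates

def fit_normal(data):
    # Normal MLEs: sample mean and the divide-by-n standard deviation
    return fmean(data), pstdev(data)

def sample_normal(params, rng):
    mu, sigma = params
    return rng.gauss(mu, sigma)

data = [4.9, 5.6, 5.1, 4.4, 6.0, 5.3, 4.8, 5.5, 5.2, 4.7]  # hypothetical measurements
estimates = parametric_bootstrap(data, fit_normal, sample_normal, fmean)

# Summarize the uncertainty with a 95% percentile interval
cuts = quantiles(estimates, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
lower, upper = cuts[0], cuts[-1]
print(f"mean estimate {fmean(data):.3f}, 95% interval ({lower:.3f}, {upper:.3f})")
```

Passing the fit, sample and statistic steps as functions keeps the loop itself independent of the distribution family, so switching to another family only means supplying a different pair of fit and sample functions.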

Example

Suppose we want to model the uncertainty about the population mean using parametric bootstrapping. Suppose we have reason to believe the data (stored in an array of size n named Data) comes from a LogNormal distribution.

1. Construct the fitted distribution by writing =VoseLogNormalFitObject(Data) in a spreadsheet cell. Name this cell FittedDistribution.

2. Write =VoseSimulate(FittedDistribution) in n cells, generating a random sample from the fitted LogNormal distribution on each recalculation.

3. Calculate the desired statistic of this sample (e.g. using the AVERAGE function from Excel). Over a large number of recalculations (simulation iterations), the values generated in this cell form the uncertainty distribution of the population statistic.
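For readers working outside a spreadsheet, the same example can be approximated in Python. This is a rough sketch under stated assumptions, not a reproduction of VoseLogNormalFitObject: the data values are invented, and the LogNormal is parameterized here by the mean and standard deviation of log(x), which may differ from the parameterization the spreadsheet function uses.

```python
import math
import random
from statistics import fmean, pstdev

# Hypothetical data assumed to come from a LogNormal distribution
data = [12.1, 8.4, 15.3, 9.9, 22.7, 7.6, 11.2, 18.5, 10.4, 13.8]

# Step 1: fit by maximum likelihood (MLEs are the mean and
# divide-by-n standard deviation of the logged data)
logs = [math.log(x) for x in data]
mu_hat = fmean(logs)
sigma_hat = pstdev(logs)

# Steps 2 and 3: simulate n values per iteration and average them
rng = random.Random(1)             # seeded only to make the sketch repeatable
n = len(data)
boot_means = []
for _ in range(5000):
    replicate = [rng.lognormvariate(mu_hat, sigma_hat) for _ in range(n)]
    boot_means.append(fmean(replicate))

# The sorted means describe the uncertainty about the population mean
boot_means.sort()
print("5th percentile :", round(boot_means[int(0.05 * len(boot_means))], 2))
print("median         :", round(boot_means[len(boot_means) // 2], 2))
print("95th percentile:", round(boot_means[int(0.95 * len(boot_means))], 2))
```

Each pass through the loop plays the role of one spreadsheet recalculation, and the list boot_means corresponds to the values collected from the cell holding the AVERAGE formula.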