The non-parametric Bootstrap | Vose Software

The non-parametric Bootstrap

The non-parametric Bootstrap is used to estimate one or more parameters of a population or probability distribution from a set of observations {xi} when we do not wish to assume a distributional form (e.g. Normal, Gamma, Lognormal). The non-parametric Bootstrap has three stages:

1. Estimate the population (or probability) distribution from the data set;

2. Simulate the sampling from the population distribution that led to the set of observations {xi};

3. For each simulated sample, calculate the sample statistic of interest.

For the non-parametric Bootstrap, we simply use the frequency distribution of the n data values as our best guess of the population (or probability) distribution. In other words, the list of values {xi} is assumed to be our population distribution. Clearly, the fewer the observations, the poorer this estimate will be.
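
Treating the list of observations as the population means that simulating the original sampling reduces to drawing n values from that list with replacement. A minimal Python sketch, assuming a small made-up data set (the name draw_replicate and the values in observations are hypothetical, chosen only for illustration):

```python
import random

def draw_replicate(data, rng):
    """Draw one Bootstrap replicate: len(data) values sampled
    with replacement from the observed data."""
    return rng.choices(data, k=len(data))

# Hypothetical observations, used only for illustration.
observations = [4.1, 5.6, 3.8, 7.2, 6.0, 5.1, 4.9, 6.4]

rng = random.Random(42)  # fixed seed so the sketch is repeatable
replicate = draw_replicate(observations, rng)

# The replicate has the same size as the data set, and every value
# in it is one of the original observations (repeats are allowed).
print(replicate)
```

Because sampling is with replacement, some observations will typically appear more than once in a replicate and others not at all.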

Calculate the sample statistic

We now run a large number of iterations, each one generating a new Bootstrap replicate, and for each Bootstrap replicate we calculate the sample estimate of the statistic in question. If we are interested in the population standard deviation, for example, we simply calculate the sample standard deviation (STDEV in Excel) of each Bootstrap replicate.

In summary, the non-parametric Bootstrap proceeds as follows:

• Collect the data set of n observations {x1, ..., xn}

• Create B Bootstrap replicates {x1*, ..., xn*}, where each xi* is drawn at random with replacement from {x1, ..., xn}

• For each Bootstrap replicate {x1*, ..., xn*}, calculate the required statistic, giving one estimate of θ. The distribution of these B estimates of θ represents the Bootstrap estimate of uncertainty about the true value of θ.
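
The procedure summarized above can be sketched in Python as follows. This is an illustrative outline rather than a definitive implementation: the function name bootstrap_distribution, the data values, the choice of B = 2000 and the percentile summary at the end are all assumptions made for the example.

```python
import random
import statistics

def bootstrap_distribution(data, statistic, B=2000, seed=None):
    """Return B Bootstrap estimates of `statistic`.

    Each estimate is computed from one replicate of len(data) values
    drawn with replacement from `data`.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(B):
        replicate = rng.choices(data, k=n)      # resample with replacement
        estimates.append(statistic(replicate))  # statistic for this replicate
    return estimates

# Hypothetical observations, used only for illustration.
observations = [4.1, 5.6, 3.8, 7.2, 6.0, 5.1, 4.9, 6.4]

# statistics.stdev is the sample standard deviation (n - 1 denominator).
estimates = bootstrap_distribution(observations, statistics.stdev, B=2000, seed=1)

# One possible summary of the uncertainty: the 5th and 95th percentiles
# of the B estimates (first and last of the 19 cut points for n=20).
cuts = statistics.quantiles(estimates, n=20)
lower, upper = cuts[0], cuts[-1]
print(f"5th percentile: {lower:.3f}, 95th percentile: {upper:.3f}")
```

Passing the statistic as a function keeps the resampling loop unchanged when a different statistic, such as statistics.mean or statistics.median, is of interest.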

ModelRisk offers the VoseNBoot functions that perform non-parametric bootstrap analyses directly for a range of different statistics.

Example

To estimate the uncertainty about the population standard deviation using the non-parametric Bootstrap, you can simply use the VoseNBootStdev function. So if the data is in an array named Data, =VoseNBootStDev(Data) will directly return, on each recalculation, the standard deviation of a sample taken randomly with replacement from the data.
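
For readers working outside a spreadsheet, the behaviour described here (one resample and one standard deviation per recalculation) can be imitated roughly in Python. This sketch is not ModelRisk's implementation: the function nboot_stdev is a hypothetical analogue, the data values are made up, and it assumes the sample standard deviation with an n - 1 denominator, as Excel's STDEV uses.

```python
import random
import statistics

def nboot_stdev(data, rng):
    """Hypothetical analogue of one recalculation: resample the data
    with replacement and return that resample's sample standard deviation."""
    replicate = rng.choices(data, k=len(data))
    return statistics.stdev(replicate)  # n - 1 denominator (assumption)

# Hypothetical data array, used only for illustration.
data = [4.1, 5.6, 3.8, 7.2, 6.0, 5.1, 4.9, 6.4]

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Each call plays the role of one recalculation and generally
# returns a different value.
values = [nboot_stdev(data, rng) for _ in range(5)]
print(values)
```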