Linear regression non-parametric Bootstrap

A slight modification to the linear regression parametric Bootstrap allows one to use a non-parametric Bootstrap, i.e. one that drops the assumption of Normally distributed residuals, which is often not very accurate. For the non-parametric model, we must first develop a non-parametric distribution of residuals by rescaling them to have constant variance. We define the modified residual $r_i$ as follows:

$$r_i = \frac{y_i - \hat{y}_i}{\sqrt{1 - h_i}}$$

where $\hat{y}_i$ is the fitted value at $x_i$ and the leverage $h_i$ is given by:

$$h_i = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}$$

The mean $\bar{r}$ of the modified residuals is calculated. Then a Bootstrap sample $r_j^*$ is drawn with replacement from the set of $r_i$ values and used to determine the quantity $\hat{y}_j + (r_j^* - \bar{r})$ for each $x_j$ value, which is used in step 2 of the algorithm above.
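The procedure just described can be sketched in Python with NumPy. This is a minimal illustration rather than a definitive implementation: the function name `residual_bootstrap`, the slope symbol `m`, the default of 2000 replicates and the fixed seed are all assumptions made for the example, and the leverage and modified-residual expressions are the standard ones for a straight-line fit with an intercept.

```python
import numpy as np

def residual_bootstrap(x, y, n_boot=2000, seed=0):
    """Non-parametric residual Bootstrap for y = m*x + c (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    rng = np.random.default_rng(seed)

    # Least-squares fit to the observed data; polyfit returns [slope, intercept]
    m_hat, c_hat = np.polyfit(x, y, 1)
    y_hat = m_hat * x + c_hat

    # Leverage of each point for a straight line with an intercept
    sxx = np.sum((x - x.mean()) ** 2)
    h = 1.0 / n + (x - x.mean()) ** 2 / sxx

    # Modified residuals, rescaled to have constant variance, and their mean
    r = (y - y_hat) / np.sqrt(1.0 - h)
    r_bar = r.mean()

    slopes = np.empty(n_boot)
    intercepts = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the modified residuals with replacement
        r_star = rng.choice(r, size=n, replace=True)
        # Simulated responses: fitted value plus a mean-corrected residual
        y_star = y_hat + (r_star - r_bar)
        # Refit the regression to the simulated data set
        slopes[b], intercepts[b] = np.polyfit(x, y_star, 1)
    return slopes, intercepts
```

The returned arrays are the Bootstrap replicates of the slope and intercept; percentiles of each array (for example via `np.percentile`) give one simple way to summarise the uncertainty.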

In certain problems, it is logical that the y-intercept value c be set to zero. In this situation, the leverage values are different:

$$h_i = \frac{x_i^2}{\sum_{j=1}^{n} x_j^2}$$

The modified residuals are thus also different and won't sum to zero, so it is essential to mean-correct the residuals before they are used to simulate random errors.
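A short sketch of the zero-intercept case may help show why the mean-correction matters. The function name `modified_residuals_through_origin` and the slope symbol `m` are assumptions for illustration; the leverage used is the standard one for a regression line forced through the origin, $h_i = x_i^2 / \sum_j x_j^2$.

```python
import numpy as np

def modified_residuals_through_origin(x, y):
    """Modified residuals for y = m*x with no intercept (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Least-squares slope for a line through the origin
    m_hat = np.sum(x * y) / np.sum(x ** 2)

    # Leverage when the intercept is fixed at zero
    h = x ** 2 / np.sum(x ** 2)

    # Modified residuals; without an intercept these need not average to zero
    r = (y - m_hat * x) / np.sqrt(1.0 - h)

    # Mean-corrected residuals, ready to be resampled as random errors
    r_centred = r - r.mean()
    return m_hat, h, r, r_centred
```

Comparing `r.mean()` with `r_centred.mean()` on a small data set shows the offset that the mean-correction removes.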

Bootstrapping the data pairs is more robust than Bootstrapping the residuals, as it is less sensitive to any deviation from the regression assumptions, but it won't be as accurate where the assumptions are correct. However, as the data set increases in size, the results from Bootstrapping the pairs approach those from Bootstrapping the residuals, and Bootstrapping the pairs is also easier to execute. These techniques can be extended to non-linear regression, to non-constant variance and to multiple linear regression, as described in detail in Efron and Tibshirani (1993) and Davison and Hinkley (1997).
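For comparison, Bootstrapping the data pairs can be sketched as follows. The function name `pairs_bootstrap`, the replicate count and the seed are assumptions for illustration only. Each replicate resamples whole (x, y) pairs with replacement and refits the line, so no residuals or leverages are needed.

```python
import numpy as np

def pairs_bootstrap(x, y, n_boot=2000, seed=0):
    """Bootstrap of (x, y) pairs for y = m*x + c (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    rng = np.random.default_rng(seed)

    slopes = np.empty(n_boot)
    intercepts = np.empty(n_boot)
    for b in range(n_boot):
        # Draw n row indices with replacement, keeping each x with its own y
        idx = rng.integers(0, n, size=n)
        # Refit the regression to the resampled pairs
        slopes[b], intercepts[b] = np.polyfit(x[idx], y[idx], 1)
    return slopes, intercepts
```

With very small data sets a resample can contain too few distinct x values to fit a line reliably, which is one practical reason the residual approach may be preferred when its assumptions hold.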
