Fitting distributions to data (by David Vose)
Abstract
A common problem in risk analysis is fitting a probability distribution to a set of observations of a variable, usually so that one can make forecasts about the future. The most common case is fitting a distribution to a single variable (such as the lifetime of a mechanical or electrical component), but some problems require fitting a multivariate distribution: for example, to predict the weight and height of a randomly selected person, or the simultaneous change in price of two stocks. A number of software tools on the market will fit distributions to a data set, and most risk analysis tools include a component that does this. Unfortunately, the methods they use to measure goodness of fit are wrong, and they are very limited in the types of data they can handle. This paper explains why, and describes a method that is both correct and sufficiently flexible to handle any type of data set.