Fitting a discrete non-parametric second-order distribution to data

Uncertainty can be added to the discrete probabilities of a first-order non-parametric distribution fit to give a second-order discrete distribution. Assuming that the variable in question is stable (i.e. is not varying with time), there is a constant (i.e. multinomial) probability p_i that any observation will take a particular value x_i (i = 1 to t). If k_i of the n observations have taken the value x_i, then our joint estimate of the probabilities {p_i} is given by a Dirichlet distribution with parameters (k_1 + 1, ..., k_t + 1), which corresponds to a uniform prior over the probabilities.
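The joint uncertainty described above can be illustrated with a minimal sketch. It assumes Dirichlet parameters k_i + 1 (a uniform prior, consistent with the Beta(1,n+t-1) marginal quoted for unobserved values), and it samples the Dirichlet by normalising independent Gamma variates. The counts and the function names are hypothetical, chosen only for illustration.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) by normalising Gamma variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]

def sample_probabilities(counts, n_draws=2000, seed=1):
    """Sample joint sets {p_i} given observed counts k_i (assumes a uniform prior)."""
    rng = random.Random(seed)
    alphas = [k + 1 for k in counts]  # assumption: Dirichlet(k_1 + 1, ..., k_t + 1)
    return [sample_dirichlet(alphas, rng) for _ in range(n_draws)]

# Hypothetical data: t = 4 possible values, n = 10 observations.
counts = [3, 0, 5, 2]
n, t = sum(counts), len(counts)

draws = sample_probabilities(counts)

# Simulated mean of each p_i, compared with the exact mean (k_i + 1) / (n + t).
mean_p = [sum(d[i] for d in draws) / len(draws) for i in range(t)]
exact = [(k + 1) / (n + t) for k in counts]

for x, m, e in zip(range(1, t + 1), mean_p, exact):
    print(f"x_{x}: simulated mean {m:.3f}, exact mean {e:.3f}")
```

Each element of `draws` is one plausible first-order distribution, so the spread across draws represents the second-order uncertainty about the probabilities.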

There remains a difficulty in selecting the range of this distribution: it is a matter of judgement how far one extends the range beyond the observed values, and how one treats any intermediate values that were not observed. For every possible value x_i with no observations (k_i = 0), the Dirichlet assigns the confidence distribution Beta(1,n+t-1), with mean 1/(n+t), to the corresponding p_i, no matter how extreme its position in the distribution's tail. This is unrealistic, since one would normally expect values further into the tail to be less probable. If it is important to recognise the possibility of a long tail beyond the observed data, a modification is necessary.
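The effect of the chosen range can be checked with a short calculation. This is a sketch only: it assumes the Dirichlet parameters are k_i + 1, so that the marginal for each p_i is Beta(k_i + 1, n + t - k_i - 1). The helper names and the values of n and t are hypothetical.

```python
from fractions import Fraction

def marginal_beta(k_i, n, t):
    """Parameters (a, b) of the marginal Beta for p_i, assuming Dirichlet(k_1 + 1, ..., k_t + 1)."""
    return k_i + 1, n + t - k_i - 1

def beta_mean(a, b):
    """Exact mean of a Beta(a, b) distribution."""
    return Fraction(a, a + b)

n = 10  # hypothetical number of observations

# Compare a narrow range (t = 4) with a range extended far into the tail (t = 20).
for t in (4, 20):
    a, b = marginal_beta(0, n, t)  # a value x_i that was never observed
    print(f"t = {t}: Beta({a},{b}), mean {beta_mean(a, b)}")
```

Every unobserved value receives the same marginal whatever its position, and the total mean probability given to unobserved values is their number multiplied by 1/(n+t), so it grows as the range is extended.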
