Simulating a Bayesian inference calculation


The difficulty with many statistical analyses is that the stochastic process linking the parameter we wish to estimate to the observations we have can be very complicated, to the point where it is impractical or impossible to calculate the likelihood function.

Monte Carlo simulation, however, provides an alternative. The principle is as follows:

1. Create a simulation model of the stochastic process;
2. Specify a distribution for the prior of the parameter to be estimated;
3. Run a simulation. In each iteration, sample a value from the prior and simulate an outcome from it. If the simulated outcome matches your observation, accept the sampled value; otherwise reject it;
4. The distribution of accepted values is your posterior distribution.
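
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the model from this topic: the function names, the Uniform(0, 1) prior, and the observation (7 successes in 20 trials) are all assumptions chosen so that the result can be checked against the known Beta(8, 14) posterior.

```python
import random

def simulate_posterior(sample_prior, simulate, observed, iterations, rng):
    """Keep the prior samples whose simulated outcome equals the observation."""
    accepted = []
    for _ in range(iterations):
        theta = sample_prior(rng)              # step 2: draw from the prior
        if simulate(theta, rng) == observed:   # steps 1 and 3: simulate, compare
            accepted.append(theta)
    return accepted                            # step 4: sample from the posterior

# Hypothetical example: unknown success probability p, 20 trials, 7 successes seen.
def sample_prior(rng):
    return rng.random()                        # Uniform(0, 1) prior (assumed)

def simulate(p, rng):
    return sum(rng.random() < p for _ in range(20))

rng = random.Random(1)
iterations = 50_000
posterior = simulate_posterior(sample_prior, simulate, 7, iterations, rng)
posterior_mean = sum(posterior) / len(posterior)
acceptance_rate = len(posterior) / iterations
```

With a Uniform prior, the algebraic posterior here is Beta(8, 14), whose mean is 8/22, so `posterior_mean` should land close to 0.36. Only about one iteration in 21 is accepted, which already hints at how wasteful the method becomes when a match is rare.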

Bayesian inference simulation like this is an exact parallel to the algebraic method of specifying a prior density, multiplying by the likelihood function, and normalizing. However, the frequency of generating values and accepting values replaces the use of a prior density and a likelihood function:

The prior density is replaced by the relative frequency with which a value is sampled from the prior distribution;
The likelihood function for q is replaced by the fraction of the iterations that are accepted when the value q is generated from the prior.
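
The correspondence can be checked numerically when q takes only a few values. The sketch below is an illustration under assumed inputs (three candidate values of q, made-up prior weights, and an observation of 3 successes in 5 Binomial trials). It counts how often each q is generated and accepted, then compares those frequencies with the algebraic likelihood and posterior.

```python
import random
from math import comb

rng = random.Random(7)
values = [0.2, 0.5, 0.8]     # hypothetical candidate values for q
weights = [0.5, 0.3, 0.2]    # hypothetical prior probabilities
n, observed = 5, 3           # hypothetical data: 3 successes in 5 trials
iterations = 200_000

generated = {q: 0 for q in values}
accepted = {q: 0 for q in values}
for _ in range(iterations):
    q = rng.choices(values, weights)[0]     # frequency of sampling = prior
    generated[q] += 1
    successes = sum(rng.random() < q for _ in range(n))
    if successes == observed:
        accepted[q] += 1

# Fraction accepted for each q, to compare with the Binomial likelihood.
accept_fraction = {q: accepted[q] / generated[q] for q in values}
likelihood = {q: comb(n, observed) * q**observed * (1 - q)**(n - observed)
              for q in values}

# Share of accepted iterations per q, to compare with prior * likelihood, normalized.
total_accepted = sum(accepted.values())
simulated_posterior = {q: accepted[q] / total_accepted for q in values}
norm = sum(w * likelihood[q] for q, w in zip(values, weights))
exact_posterior = {q: w * likelihood[q] / norm for q, w in zip(values, weights)}
```

Up to sampling noise, `accept_fraction` reproduces `likelihood` and `simulated_posterior` reproduces `exact_posterior`, without the simulation ever evaluating the likelihood formula.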

An example of this technique is given below:

You are an R&D company that is planning to commercialize a new technology for oil refineries to help them meet a new EPA NOx emission requirement that will come into effect one year from now. However, you only want to develop this technology further if there are at least 55 refineries in the US that could benefit from it. Recently, a competing technology (technology X) was brought to market that also helps refineries meet the emission requirement, and quite a few refineries are already using it ahead of the EPA requirement coming into force, but it is a lot more expensive. You are convinced that a refinery that uses technology X is as likely to buy your new technology (if you develop it) as a refinery that doesn't use any technology yet. The total market for your new technology (Y) is therefore any refinery that currently doesn't meet the EPA requirement plus any refinery that uses technology X. What is the probability that this total market is more than 55 refineries?

The solution to this problem (see Market_Size_Estimation.xls) requires over a million iterations to produce a reliable output distribution. The reason is that the model uses two uninformed prior distributions, each with quite a wide range. Because the priors are not correlated, they are sampled independently, which makes a valid scenario rather unlikely: most of the time the combination of generated prior values produces an 'unacceptable scenario' and the iteration is rejected.

This technique has enormous potential, but can be extremely inefficient if a large fraction of the generated prior values are rejected. It works best when the observations can take only a limited number of values, or when we can assign bands around the observed values so that the prior values have a reasonable chance of being accepted. The latter approach allows you to run a simulation, see approximately where the posterior is located, and then restrict the prior to that range without changing its shape within the range. For Uniform priors this is trivial; for non-Uniform priors the contraction can be achieved with ModelRisk's VoseXBounds function.
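
The band-and-contract idea can be sketched as follows. Everything specific here is an assumption made for illustration: a Normal measurement model with known standard deviation, an observed sample mean of 4.2 from 10 measurements, a Uniform(-50, 50) prior on the unknown mean, an acceptance band of ±0.1, and a padding rule for the narrowed range. It is plain Python, not a reproduction of the spreadsheet model or of VoseXBounds.

```python
import random
from statistics import fmean

def run(low, high, iterations, rng, observed=4.2, band=0.1, n=10, sd=1.0):
    """Rejection sampling with a Uniform(low, high) prior and a tolerance band."""
    accepted = []
    for _ in range(iterations):
        mu = rng.uniform(low, high)
        sample_mean = fmean([rng.gauss(mu, sd) for _ in range(n)])
        # An exact match has probability zero for a continuous outcome,
        # so accept anything within the band around the observation.
        if abs(sample_mean - observed) <= band:
            accepted.append(mu)
    return accepted

rng = random.Random(42)

# Pass 1: wide, uninformed prior. Most iterations are rejected.
n1 = 100_000
first = run(-50.0, 50.0, n1, rng)
rate1 = len(first) / n1

# Pass 2: contract the Uniform prior to where the posterior was found,
# padded on each side (assumed 25% of the range) so the tails are not cut off.
pad = 0.25 * (max(first) - min(first))
low2, high2 = min(first) - pad, max(first) + pad
n2 = 20_000
second = run(low2, high2, n2, rng)
rate2 = len(second) / n2
posterior_mean = fmean(second)
```

Because a Uniform prior is flat, restricting its range leaves its shape unchanged inside that range, so the second pass targets the same posterior while accepting a much larger fraction of iterations. The band itself is an approximation: a wider band accepts more iterations but blurs the posterior.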
