Predicting results of a random survey, and uncertainty about results
An example Monte Carlo simulation model for general risk analysis
News reports often cite a recent poll to predict how a population will vote on some issue or in an election. If the question is a simple "yes" or "no", and the respondents are randomly and representatively sampled from the population, then the poll is a binomial process. In this case, our uncertainty about the fraction p of voters who will ultimately vote "yes" is described by the following distribution:
p = VoseBeta(s+1,n-s+1)
where n is the number of people surveyed and s is the number among them who stated they would vote "yes". Built into this analysis is the assumption that people won't change their minds between the time the poll was conducted and the date of the vote - which is always a tricky assumption!
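The Beta formula above can be explored outside ModelRisk with a short simulation. The following is a minimal sketch in Python with NumPy, not the ModelRisk implementation: the function name `yes_fraction_posterior` and the poll figures (540 "yes" answers out of 1000) are hypothetical, chosen only for illustration. The Beta(s+1, n-s+1) form corresponds to starting from a Uniform(0,1) prior for p.

```python
import numpy as np

def yes_fraction_posterior(s, n, size=100_000, seed=0):
    """Draw samples of p from Beta(s + 1, n - s + 1).

    s: number of respondents who said "yes"
    n: total number of respondents
    """
    rng = np.random.default_rng(seed)
    return rng.beta(s + 1, n - s + 1, size=size)

# Hypothetical poll (illustrative numbers, not from the article):
# 540 of 1000 respondents said they would vote "yes".
samples = yes_fraction_posterior(s=540, n=1000)

# Confidence that "yes" gets a majority, assuming nobody changes their mind.
prob_majority = float(np.mean(samples > 0.5))

# Central 95% uncertainty interval for the "yes" fraction p.
low, high = np.percentile(samples, [2.5, 97.5])

print(f"P(p > 0.5) ~ {prob_majority:.3f}")
print(f"95% interval for p: ({low:.3f}, {high:.3f})")
```

Fixing the seed makes the run reproducible; increasing `size` reduces the Monte Carlo noise in the estimated probability.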
A more interesting case is when there are more than two possible outcomes, for example, an election where there are three or more significant competing parties. This is a multinomial process, and we would therefore employ the Dirichlet distribution to represent our uncertainty about the fraction of the population who would vote for each party.
For example, imagine that we have surveyed 1027 people, asking each which party they intend to vote for. The results are as follows:
Using the Dirichlet distribution and assuming that people don't change their minds between the poll and election time, we can answer questions like:
- How confident are we that SMP will win (get more votes than any other party)?
- If the SDP join forces with the EDP, and the SMP join forces with the PSM, how confident are we that SDP/EDP will get more votes than SMP/PSM?
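Both questions can be estimated by sampling vote shares from a Dirichlet distribution and counting how often each event occurs. The sketch below uses Python with NumPy rather than ModelRisk. Two things in it are assumptions: the per-party counts are hypothetical placeholders chosen only so that they sum to 1027, and the Dirichlet parameters are taken as each count plus one, by analogy with the Beta(s+1, n-s+1) formula for the two-outcome case.

```python
import numpy as np

# HYPOTHETICAL counts for illustration only; they are not the survey
# results from the article, merely four numbers that sum to 1027.
counts = {"SMP": 380, "SDP": 290, "EDP": 210, "PSM": 147}

parties = list(counts)
idx = {name: i for i, name in enumerate(parties)}

# Assumed Dirichlet parameters: count + 1 per party (uniform prior),
# the multi-party analogue of Beta(s + 1, n - s + 1).
alpha = np.array([counts[name] for name in parties]) + 1

rng = np.random.default_rng(0)
# Each row is one plausible set of population vote shares (sums to 1).
shares = rng.dirichlet(alpha, size=100_000)

# Question 1: how often does SMP have the largest share?
p_smp_wins = float(np.mean(shares.argmax(axis=1) == idx["SMP"]))

# Question 2: how often does SDP + EDP beat SMP + PSM?
sdp_edp = shares[:, idx["SDP"]] + shares[:, idx["EDP"]]
smp_psm = shares[:, idx["SMP"]] + shares[:, idx["PSM"]]
p_coalition = float(np.mean(sdp_edp > smp_psm))

print(f"P(SMP wins) ~ {p_smp_wins:.3f}")
print(f"P(SDP/EDP beats SMP/PSM) ~ {p_coalition:.3f}")
```

Replacing the placeholder counts with the actual survey results gives the confidence levels for the poll in question. Because each sampled row is a complete set of shares, any other comparison between parties or coalitions can be evaluated on the same samples.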