# The Chi-Squared Goodness-of-Fit Statistic

The Chi-squared (χ²) statistic measures how well the expected frequencies from the fitted distribution compare with the observed frequencies of a histogram of the observed data. The Chi-squared test makes the following assumptions:

1. The observed data consists of a random sample of n independent data points.

2. The measurement scale can be nominal (i.e. non-numeric) or numerical.

3. The n data points can be arranged into histogram form with N non-overlapping classes or bars that cover the entire possible range of the variable.

The Chi-squared statistic is calculated as follows:

$$\chi^2 = \sum_{i=1}^{N} \frac{(O_i - E_i)^2}{E_i}$$

where $O_i$ is the observed frequency of the ith histogram class or bar and $E_i$ is the expected frequency, from the fitted distribution, of x-values falling within the x range of the ith histogram bar. $E_i$ is calculated as:

$$E_i = n\left[F\!\left(x_i^{\max}\right) - F\!\left(x_i^{\min}\right)\right]$$

where:

$F(x)$ = distribution function of the fitted distribution

$x_i^{\max}$ = the x-value upper bound of the ith histogram bar

$x_i^{\min}$ = the x-value lower bound of the ith histogram bar
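As a sketch of the calculation (assuming a Normal candidate distribution fitted with scipy; the data and parameters are hypothetical):

```python
import numpy as np
from scipy import stats

# Hypothetical sample: 500 points from a Normal(10, 2) population.
rng = np.random.default_rng(42)
data = rng.normal(loc=10.0, scale=2.0, size=500)
n = len(data)

# Arrange the data into N non-overlapping histogram classes.
N = 10
observed, edges = np.histogram(data, bins=N)

# Fit the candidate distribution, then E_i = n * [F(x_i_max) - F(x_i_min)].
mu, sigma = stats.norm.fit(data)
cdf_at_edges = stats.norm.cdf(edges, loc=mu, scale=sigma)
expected = n * np.diff(cdf_at_edges)

# Chi-squared statistic: sum over classes of (O_i - E_i)^2 / E_i.
chi2_stat = np.sum((observed - expected) ** 2 / expected)
print(chi2_stat)
```

`scipy.stats.chisquare` can turn the same observed and expected frequencies into a p-value, with its `ddof` argument adjusted for the number of fitted parameters.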

A ChiSq(ν) distribution is the sum of ν independent Normal(0,1)² random variables. The formula above is based on the idea that, of a total of n observations, the number of observations $O_i$ falling within the ith histogram bar is Binomially distributed:

$$O_i \sim \text{Binomial}(n,\ p_i), \qquad p_i = F\!\left(x_i^{\max}\right) - F\!\left(x_i^{\min}\right)$$

If n is sufficiently large, the Binomial is well approximated by a Normal distribution:

$$O_i \approx \text{Normal}\!\left(np_i,\ \sqrt{np_i(1-p_i)}\right)$$

Allowing:

$$\text{Normal}(0,1) \approx \frac{O_i - np_i}{\sqrt{np_i(1-p_i)}}$$

and then rearranging, noting that $E_i = np_i$, we get:

$$\text{Normal}(0,1)^2 \approx \frac{(O_i - E_i)^2}{E_i(1-p_i)} \approx \frac{(O_i - E_i)^2}{E_i} \quad \text{when each } p_i \text{ is small, so that } (1-p_i) \approx 1$$

Summing this expression over all N histogram classes is then approximately equivalent to a ChiSq distribution (strictly, with N − 1 degrees of freedom, since the $O_i$ are constrained to sum to n, and one fewer for each distribution parameter estimated from the data).
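The derivation can be checked by simulation. The sketch below (hypothetical parameters) draws multinomial class counts with known probabilities and compares the average of the summed statistic against the chi-squared degrees of freedom N − 1, one being lost because the counts must sum to n:

```python
import numpy as np

rng = np.random.default_rng(0)
n, N, trials = 1000, 8, 5000
p = np.full(N, 1.0 / N)  # known, equal class probabilities p_i
expected = n * p         # E_i = n * p_i

# Each row is one simulated histogram of counts O_1 .. O_N.
counts = rng.multinomial(n, p, size=trials)
statistics = ((counts - expected) ** 2 / expected).sum(axis=1)

# The mean of a ChiSq(k) distribution is k; here k = N - 1 = 7.
print(statistics.mean())
```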

In order for the Chi-squared GOF statistic to be robust, therefore, we must have two conditions:

1. Each interval probability $p_i$ is small, so that $(1 - p_i) \approx 1$.

2. The Normal distribution is a good approximation to the Binomial, which happens when the Binomial probability is near 0.5 (inconsistent with the first condition), or when the number of trials is large (very large if the probability is very small).

In other words, for the Chi-squared statistic to work correctly we need a very large number of data values, and the intervals have to be selected so that each contains a significant number of observations (making the intervals equally spaced in percentile terms of the fitted distribution is ideal).
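One way to achieve equal spacing in percentile terms is to place the class boundaries at equally spaced percentiles of the fitted distribution, so every class has the same expected frequency n/N. A sketch, again assuming a Normal fit on hypothetical data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
data = rng.normal(5.0, 1.5, size=400)
n, N = len(data), 10

mu, sigma = stats.norm.fit(data)

# Interior class boundaries at the 10%, 20%, ..., 90% points of the fitted
# CDF (the two outer classes extend to -inf and +inf).
inner_edges = stats.norm.ppf(np.linspace(0.0, 1.0, N + 1)[1:-1],
                             loc=mu, scale=sigma)

# Count observations per class by locating each point among the boundaries.
observed = np.bincount(np.searchsorted(inner_edges, data), minlength=N)
expected = np.full(N, n / N)  # identical expected frequency in every class

chi2_stat = np.sum((observed - expected) ** 2 / expected)
print(chi2_stat)
```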

χ² is the most commonly used of the goodness-of-fit statistics described here. From the explanation above, you'll realize that it is very dependent on the number of bars N that are used. By changing the value of N, one can quite easily switch the ranking between two distribution types. Unfortunately, there are no hard and fast rules for selecting the value of N. A good guide, however, is Scott's Normal Approximation, which generally appears to work very well:

$$N = (4n)^{2/5}$$

where n is the number of data points.
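As a small helper (rounding to the nearest whole number of classes is my own assumption; the source gives only the formula):

```python
# Scott's Normal Approximation for the number of histogram classes:
# N = (4n)^(2/5), rounded here to the nearest whole number of classes.
def scott_bins(n: int) -> int:
    return max(1, round((4 * n) ** (2 / 5)))

print(scott_bins(100))  # a sample of 100 points suggests 11 classes
```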