See also: Determining the width of histogram bars

Sturges' rule is a rule for choosing the width of the bars (i.e. of
the *bins*) when visually representing data by a histogram. It says
the data range should be split into *k* equally spaced classes, where

$$k = \lceil 1 + \log_2 n \rceil$$

Here *n* is the number of data values and $\lceil \cdot \rceil$ is the
ceiling operator (meaning round the calculated value up to the nearest integer).
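
As an illustration of the rule, here is a minimal Python sketch. The function names `sturges_bins` and `sturges_edges` and the sample data are hypothetical, chosen only for this example; the edges are simply the data range divided into *k* equal classes.

```python
import math

def sturges_bins(n):
    """Number of classes k = ceil(1 + log2(n)) for n data values."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return math.ceil(1 + math.log2(n))

def sturges_edges(data):
    """Split [min, max] into k equally spaced classes; returns k + 1 edges."""
    lo, hi = min(data), max(data)
    k = sturges_bins(len(data))
    width = (hi - lo) / k
    return [lo + i * width for i in range(k + 1)]

# Hypothetical sample: 100 evenly spread values
data = [float(x) for x in range(100)]
print(sturges_bins(len(data)))  # number of classes for n = 100
print(sturges_edges(data))      # class boundaries
```

With 100 values, 1 + log2(100) is roughly 7.64, so the rule rounds up to 8 classes.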

Though not stated in Sturges
(1926), Herbert Sturges considered an idealized histogram of *k* bins where
the number of data values in the *i*th bin (*i* = 0, ..., *k* - 1)
is given by the binomial coefficient

$$\binom{k-1}{i} = \frac{(k-1)!}{i!\,(k-1-i)!}$$

Summing over all bins we get the total number of data values *n*:

$$n = \sum_{i=0}^{k-1} \binom{k-1}{i} \qquad \text{(Equation 1)}$$

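The binomial expansion identity says:

$$(p + q)^{k-1} = \sum_{i=0}^{k-1} \binom{k-1}{i} p^i q^{k-1-i}$$

(replacing *q* with 1 - *p* makes the left-hand side equal to 1, which is
the statement that the binomial probabilities sum to one).
Setting *p* = *q* = 1 in this identity, the right-hand side becomes the
sum in Equation 1, so we get:

$$n = 2^{k-1}$$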

Solving for *k* we get Sturges' formula:

$$k = 1 + \log_2 n$$

and then we round this value up to the next integer. Implicit in Sturges'
rule is the assumption of a Normally distributed data set that is
well approximated by a Binomial distribution with probability 0.5 (which
gives a symmetric distribution). To see that, the expected number of data
points falling into the *i*th class is, from the __binomial
probability mass function__:

$$n \binom{k-1}{i} p^i (1-p)^{k-1-i}$$

Setting $n = 2^{k-1}$ and $p = 0.5$ from above, the factors $2^{k-1}$ and
$0.5^{k-1}$ cancel and this reduces
to $\binom{k-1}{i}$, which is Sturges' idealized histogram.
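
The derivation can be checked numerically. The sketch below is illustrative only: the helper names `idealized_counts` and `expected_counts` and the choice *k* = 8 are assumptions made for this example. It builds the binomial-coefficient bin counts, confirms they sum to a power of two, and confirms that the binomial expected counts with probability 0.5 reproduce them.

```python
from math import comb, log2

def idealized_counts(k):
    """Bin counts C(k-1, i) for i = 0..k-1 (the idealized histogram)."""
    return [comb(k - 1, i) for i in range(k)]

def expected_counts(n, k, p=0.5):
    """Expected count per bin: n times the Binomial(k-1, p) probability."""
    m = k - 1
    return [n * comb(m, i) * p**i * (1 - p)**(m - i) for i in range(k)]

k = 8                        # hypothetical number of bins
counts = idealized_counts(k)
n = sum(counts)              # total number of data values

print(counts)
print(n, 2 ** (k - 1))       # the two totals agree
print(1 + log2(n))           # recovers k
print(expected_counts(n, k)) # matches counts when p = 0.5
```

Because every factor here is a power of two or an integer, the floating-point results match the integer counts exactly for this choice of *k*.
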
Note: Sturges' paper actually gives a class width *w* as:

$$w = \frac{R}{1 + 3.322 \log_{10} n}$$

where *R* is the data range and 3.322 is $1/\log_{10}(2)$,
so that $3.322 \log_{10} n = \log_2 n$ and *R*/*w* gives the formula quoted above for *k*.
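
To see that the two forms agree, the following sketch computes the class width and compares *R*/*w* with the base-2 formula. The function name `sturges_class_width` and the values of `R` and `n` are hypothetical, picked only for illustration.

```python
import math

def sturges_class_width(data_range, n):
    """Class width w = R / (1 + 3.322 * log10(n))."""
    return data_range / (1 + 3.322 * math.log10(n))

R, n = 70.0, 100                 # hypothetical data range and sample size
w = sturges_class_width(R, n)

print(w)                         # class width
print(R / w)                     # implied number of classes before rounding up
print(1 + math.log2(n))          # base-2 form of the same quantity
```

The two printed class counts differ only because 3.322 is a rounded value of 1/log10(2).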