Saturday, April 2, 2016

Statistics: Arithmetic Mean by the Regular, Deviation and Coding Methods, Formula Derivation


If $ x_1, x_2, ..., x_k $ have frequencies $ f_1, f_2, ..., f_k $ and their deviations from a constant $ A $ (a guessed or assumed arithmetic mean) are $ d_1 = x_1 - A, d_2 = x_2 - A, ..., d_k = x_k - A $, then \begin{align} \bar{x} & = \frac{\sum\limits_{j=1}^k f_j d_j}{\sum\limits_{j=1}^k f_j} + A \\ & = A + \frac{\sum fd}{N} \quad \text{where } \sum\limits_{j=1}^k f_j = \sum f = N \end{align} Only the formula with frequencies is shown here. If the data are not grouped by frequency, set every $ f_j = 1 $, so that $ N = k $; see a statistics textbook for more detail.
The arithmetic mean formula,
If the numbers $ x_1, x_2, x_3,..., x_k $ occur $ f_1, f_2, f_3,..., f_k $ times respectively, so that there are $ N = \sum f $ numbers in all, then the arithmetic mean is defined as, \begin{align} \bar{x} & = \frac{x_1 f_1 + x_2 f_2 + x_3 f_3 +...+ x_k f_k}{f_1 + f_2 + f_3 +...+ f_k} = \frac{\sum\limits_{j=1}^k f_j x_j}{\sum\limits_{j=1}^k f_j} = \frac{\sum f x}{\sum f} = \frac{\sum f x}{N} \end{align}
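As a quick numerical check of this definition, here is a minimal Python sketch. The function name `weighted_mean` and the data are made up for illustration; they are not from any particular textbook table.

```python
def weighted_mean(values, freqs):
    """Arithmetic mean of values x_j that occur f_j times each."""
    n = sum(freqs)  # N = sum of all frequencies
    return sum(f * x for x, f in zip(values, freqs)) / n


# Illustrative (made-up) data: values x_j and their frequencies f_j.
x = [61, 64, 67, 70, 73]
f = [5, 18, 42, 27, 8]   # N = 100

print(weighted_mean(x, f))  # sum(f x) / N = 6745 / 100
```

With these numbers $ \sum f x = 6745 $ and $ N = 100 $, so the mean comes out to 67.45.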
Getting the deviation formula,
\begin{align} \bar{x} & = \frac{\sum\limits_{j=1}^k f_j x_j}{\sum\limits_{j=1}^k f_j} = \frac{\sum f_j(d_j + A)}{N} = \frac{\sum{f_j d_j} + \sum{f_j A} }{N} = \frac{\sum{f d} + AN }{N} = A + \frac{\sum{f d} }{N} \end{align} where $ x_j = d_j + A $ (from $ d_j = x_j - A $) and $ \sum f_j A = A \sum f_j = AN $.
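The derivation implies that the result does not depend on which constant $ A $ is picked. A small Python sketch can confirm this numerically; the function name `mean_by_deviation` and the data are illustrative assumptions, not part of the derivation itself.

```python
def mean_by_deviation(values, freqs, assumed_mean):
    """Mean computed as A + sum(f d) / N, with d_j = x_j - A."""
    n = sum(freqs)
    deviations = [x - assumed_mean for x in values]  # d_j = x_j - A
    return assumed_mean + sum(f * d for f, d in zip(freqs, deviations)) / n


# Illustrative (made-up) data.
x = [61, 64, 67, 70, 73]
f = [5, 18, 42, 27, 8]   # N = 100

# Different guesses for A should all give approximately 67.45.
for a in (0, 60, 67, 70):
    print(a, mean_by_deviation(x, f, a))
```

For $ A = 67 $ the deviations are $ -6, -3, 0, 3, 6 $, so $ \sum f d = 45 $ and $ \bar{x} = 67 + 45/100 $. A guess close to the true mean keeps the numbers small, which is the practical point of the method when computing by hand.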
Getting the coding method formula,
Let $ d_j = x_j - A $ denote the deviation of any class mark $ x_j $ in a frequency distribution from a given or assumed class mark $ A $.

If all of the class intervals have the same size $ c $, then the deviations are all multiples of $ c $:
$ d_j = c u_j \quad $ where, $ u_j = 0, \pm 1, \pm 2, ... $

So, if $ x_1, x_2, ... $ are the class marks of successive classes, then neighbouring class marks differ by $ c $,
$ x_2 - x_1 = c $

which can be rewritten as,
$ x_2 = x_1 + c $

similarly, \begin{align} x_3 & = x_2 + c \\ & = x_1 + c + c \\ & = x_1 + 2c \end{align} for $ x_4 $ it is,
$ x_4 = x_1 + 3c $

So, it can be formulated as, $ x_j = x_1 + (j - 1) c $

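Now, any two class marks (midpoints of class intervals) $ x_p $ and $ x_q $ differ by \begin{align} x_p - x_q & = [x_1 + (p - 1) c] - [x_1 + (q - 1) c] \\ & = [p - q] c \end{align} which is a multiple of $ c $. Since $ A $ is itself chosen to be a class mark, each deviation $ d_j = x_j - A $ is a multiple of $ c $, which is why $ d_j = c u_j $ with integer $ u_j $.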
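The spacing argument can be checked with a few lines of Python. The first class mark `x1 = 61` and width `c = 3` below are arbitrary values assumed only for this illustration.

```python
c = 3     # assumed common class width
x1 = 61   # assumed first class mark

# Build the class marks from x_j = x_1 + (j - 1) c for j = 1..5.
marks = [x1 + (j - 1) * c for j in range(1, 6)]
print(marks)

# Every pairwise difference x_p - x_q should equal (p - q) c.
for p, xp in enumerate(marks, start=1):
    for q, xq in enumerate(marks, start=1):
        assert xp - xq == (p - q) * c
```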

Finally, \begin{align} \bar{x} & = A + \frac{\sum f_j d_j}{N} = A + \frac{\sum f_j(c u_j)}{N} = A + \left( \frac{\sum{f u}}{N} \right) c \end{align}
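The coding method can be sketched in Python as follows. The function name `mean_by_coding` and the data are assumptions made for illustration; the sketch also assumes equally spaced class marks and that $ A $ is one of them.

```python
def mean_by_coding(class_marks, freqs, assumed_mark, width):
    """Mean computed as A + (sum(f u) / N) * c, with u_j = (x_j - A) / c."""
    n = sum(freqs)
    # Codes u_j are whole numbers when the class marks are equally spaced
    # and A is itself a class mark.
    codes = [(x - assumed_mark) / width for x in class_marks]
    return assumed_mark + (sum(f * u for f, u in zip(freqs, codes)) / n) * width


# Illustrative (made-up) grouped data with class width c = 3.
x = [61, 64, 67, 70, 73]   # class marks
f = [5, 18, 42, 27, 8]     # class frequencies, N = 100

print(mean_by_coding(x, f, assumed_mark=67, width=3))  # approximately 67.45
```

Here the codes are $ u = -2, -1, 0, 1, 2 $, so $ \sum f u = 15 $ and $ \bar{x} = 67 + (15/100) \cdot 3 = 67.45 $, the same value the direct formula $ \sum f x / N = 6745/100 $ gives.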
