Thursday, August 3, 2017

Standard Error

In the quantification of uncertainty, there is an important distinction that's often overlooked. This is the distinction between the dispersion of a distribution and the dispersion of the mean of that distribution.

By 'dispersion of a distribution,' I mean how poorly the mass of that probability distribution is localized in hypothesis space. If half the employees in Company A are aged between 30 and 40, and half the employees in Company B are aged between 25 and 50, then (all else equal) the probability distribution over the age of a randomly sampled employee from Company B has a wider dispersion than the corresponding distribution for Company A.

A common measure of dispersion is the standard deviation, which is the root-mean-square distance between the parts of the distribution and its mean: the square root of the average squared distance from the mean.
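
As a rough illustration, here is a minimal Python sketch of the standard deviation, computed as the square root of the average squared distance from the mean. The function name `standard_deviation` and the two lists of ages are hypothetical, invented only to echo the Company A and Company B example; they are not real data. The sketch treats each list as the whole population, so it divides by the number of values rather than by one less.

```python
import math


def standard_deviation(values):
    """Population standard deviation: root-mean-square distance from the mean."""
    mean = sum(values) / len(values)
    # Average of the squared distances from the mean (the variance).
    mean_squared_distance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(mean_squared_distance)


# Hypothetical ages, made up purely for illustration.
company_a_ages = [30, 32, 35, 38, 40]
company_b_ages = [25, 30, 35, 45, 50]

sd_a = standard_deviation(company_a_ages)
sd_b = standard_deviation(company_b_ages)

print(f"Company A standard deviation: {sd_a:.2f}")
print(f"Company B standard deviation: {sd_b:.2f}")
```

With these made-up numbers, the ages for Company B are spread more widely around their mean, so its standard deviation comes out larger than Company A's.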