Saturday, April 25, 2015

Mean vs median - a careful balancing act

Two common measures of the location of a probability distribution are the mean and the median. In general they are quite different things, but some familiar distributions have their mean and median at the same point (every symmetric distribution with a well-defined mean has this property, though, as a comment below points out, the converse does not hold).

The mean of a distribution, as we all know, is its average, while the median is, roughly speaking, the point at which the amount of probability mass to one side is the same as the amount on the other side. Upon hasty consideration, these definitions can appear to denote the same thing, and so confusion between the two concepts is common. Annoyingly, my own PhD thesis contains a sentence[1] that explicitly confuses the mean with the median. Furthermore, none of the half dozen eminent scientists whose job it was to assess my thesis (and who otherwise all did an excellent job!) reported noticing this blunder.

Confusion between the mean and the median is highly analogous to a difficulty experienced by many young children when they try to balance asymmetric blocks on top of one another, as has been reported by cognitive scientist Annette Karmiloff-Smith[2].

In a radio interview a couple of years ago (BBC Radio 4, 'The Life Scientific', Jan. 22, 2013), Karmiloff-Smith briefly described the finding (starting about 12 minutes into the interview): young children were asked to try to balance various blocks - some symmetric, others invisibly loaded on one side - on a narrow beam. Children of a particular age group were, it seems, old enough to expect the balance point to correspond to the geometric midpoint of the object, and tried first to balance the blocks there. Obviously, in the case of the asymmetrically weighted blocks, the midpoint would not work, and the blocks would fall. Despite repeated attempts with the same outcome, however, a child would often remain apparently unshakable in its faith in the midpoint, and continue to try to balance the item there.

Interestingly, many slightly younger children, perhaps not yet old enough to have learned the significance of the midpoint of an object, had an easier time adjusting from the geometric centre to the actual centre of mass after a few trials.

The task of finding the centre of mass of a physical object is mathematically identical to the matter of locating the mean of a probability distribution. The mean, μ, of a distribution over x with probability density p(x), given as

$$\mu = \int_{-\infty}^{\infty} x \, p(x) \, dx,$$

is the point at which (treating distances from the mean, to the left, as negative and distances to the right as positive) the products of the individual probabilities with their corresponding distances from the mean sum up to zero. (If we shifted the origin of our coordinate system to the mean of the distribution, in the above formula, the integral would be zero.)
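The parenthetical remark can be spelled out in one line. Writing p(x) for the probability density and \mu for the mean (notation assumed here for illustration), the probability-weighted signed distances from the mean cancel because the density integrates to one:

```latex
\int_{-\infty}^{\infty} (x - \mu)\, p(x)\, dx
  = \int_{-\infty}^{\infty} x\, p(x)\, dx \;-\; \mu \int_{-\infty}^{\infty} p(x)\, dx
  = \mu - \mu \cdot 1
  = 0 .
```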

By the law of the lever, the turning effect (torque) that a mass exerts on an object resting on a fulcrum is proportional to the product of the mass and its distance from the fulcrum. Since an object balances when the torques on one side have the same total magnitude as those on the other side, the centre of balance is the point at which the sum of these signed mass × distance products comes to zero. So, the centre of mass is also the mean of the mass distribution.
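The balancing argument is easy to check numerically. The following is a minimal sketch, assuming a made-up block split into five segments with one segment secretly weighted; the positions, masses and function names are all hypothetical and chosen only for illustration.

```python
def centre_of_mass(positions, masses):
    """Mass-weighted mean position, i.e. the mean of the mass distribution."""
    total = sum(masses)
    return sum(x * m for x, m in zip(positions, masses)) / total


def net_torque(positions, masses, fulcrum):
    """Sum of mass * signed distance from the fulcrum (gravity factored out)."""
    return sum(m * (x - fulcrum) for x, m in zip(positions, masses))


# Hypothetical block, 10 units long, cut into five segments of equal length.
positions = [1.0, 3.0, 5.0, 7.0, 9.0]  # centres of the segments
masses = [1.0, 1.0, 1.0, 1.0, 6.0]     # the rightmost segment hides a weight

midpoint = 5.0                          # geometric midpoint of the block
balance_point = centre_of_mass(positions, masses)

print("centre of mass:", balance_point)
print("net torque about midpoint:", net_torque(positions, masses, midpoint))
print("net torque about centre of mass:",
      net_torque(positions, masses, balance_point))
```

With these numbers the net torque about the geometric midpoint is non-zero, so the block tips, while about the centre of mass it vanishes.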

The midpoint of an object is also closely analogous to the median. If we reduce an object to a one-dimensional mental model, then the correspondence becomes exact. At the median, m, (assuming a continuous distribution with density p(x)) the amounts of mass on either side are equal:

$$\int_{-\infty}^{m} p(x) \, dx = \int_{m}^{\infty} p(x) \, dx.$$

In the one-dimensional model, where length stands in for mass, the median is the point equidistant from each end.
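The equal-mass condition can also be located numerically and compared with the mean. Here is a rough sketch, assuming a simple midpoint-rule integration on a finite grid; the helper `mean_and_median`, the grid limits and the two example densities are illustrative choices, not anything prescribed by the discussion above.

```python
import math


def mean_and_median(density, lo, hi, steps=200_000):
    """Approximate the mean and median of a density on [lo, hi] (midpoint rule)."""
    dx = (hi - lo) / steps
    xs = [lo + (i + 0.5) * dx for i in range(steps)]
    weights = [density(x) * dx for x in xs]
    total = sum(weights)  # close to 1 if [lo, hi] holds nearly all the mass

    # Mean: probability-weighted average position.
    mean = sum(x * w for x, w in zip(xs, weights)) / total

    # Median: walk rightwards until half of the mass lies to the left.
    running = 0.0
    median = hi
    for x, w in zip(xs, weights):
        running += w
        if running >= total / 2:
            median = x
            break
    return mean, median


def gaussian(x):
    """Standard normal density: symmetric about zero."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


def exponential(x):
    """Exponential density with unit rate: a tail on the right only."""
    return math.exp(-x)


sym_mean, sym_median = mean_and_median(gaussian, -10.0, 10.0)
skew_mean, skew_median = mean_and_median(exponential, 0.0, 50.0)

print("gaussian:    mean, median =", sym_mean, sym_median)
print("exponential: mean, median =", skew_mean, skew_median)
```

For the symmetric density the two values agree to within the grid spacing. For the unit-rate exponential the mean is close to 1 while the median is close to ln 2 (about 0.693), so the two measures come apart.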

Note, however, that even after accounting for differences in density along the length of an object / distribution, the mean and the median are only guaranteed to coincide in the case of symmetry about the centre of mass. The median involves the integral of mass, while the mean integrates the product of mass and distance. The mean pays more attention to masses situated further out, thanks to this product, while the median cares only about which side of it the mass lies on, not how far away it is. If a distribution has an extended tail on one side only, then the mean will typically be positioned further out into the tail than the median.

At some point in their development, children seem to learn to expect symmetry in the objects and phenomena they encounter. This is quite reasonable, as without symmetry, there can be no physics (all physical laws are realizations of symmetry of one kind or another).

The devil is in the details, though, and the symmetry need not always be of the simplest forms. As we approach adulthood, we presumably come to appreciate this, and I suspect that as adults we can look forward to much faster success in balancing exercises such as the ones those children described earlier struggled with. No doubt our continually built up experiences of mechanical interaction with reality contribute much to this attainment of maturity.

But in the course of our day-to-day existence, we have far less cause to experience and interact with explicit probability distributions, so the lessons pertaining to them can be harder to win (particularly given our excess exposure to symmetric distributions such as the Gaussian). An intuitive grasp of the difference between mean and median is one presumably almost all adults possess, when it comes to simple physical objects, but banishing this confusion can be more than child's play when it comes to statistics. Hopefully, by noting (as I've tried to do here) the similarities between the mechanical and the abstract, we can ease the process.


 [1]  Really, you think I'm going to give you a page number? Go find it yourself!
 [2]  Annette Karmiloff-Smith and Bärbel Inhelder, 'If you want to get ahead, get a theory,' Cognition, volume 3, issue 3, pp. 195-212 (1975).


  1. This comment has been removed by the author.

  2. "(all such distributions are symmetric, and vice versa)."

    Really? All symmetric distributions have the same mean and median, but the reverse is in general not true. Say income is distributed as a Gaussian and each person earns a whole number of pounds. Let the top earner earn a pound more. This moves the mean to the right but not the median. Move the mean back to its original position by giving 100 people to the left of the mean one penny less. The resulting distribution has the same mean and median, but is not symmetric.

    1. Quite right, thanks for pointing it out.

      Thinking about it drew a couple of other points to mind, which I've also reflected in a minor edit or two:

      (1) I tacitly assumed the uniqueness of the median, throughout, in effect assuming a continuous distribution.

      (2) The relationship between the mean and median is not uniquely determined by the direction of skew. Hence, I added the word 'typically' in the sentence:

      "If a distribution has an extended tail on one side only, then the mean will typically be positioned further out into the tail than the median. "
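The counterexample raised in the comments can be checked on a toy data set. Below is a scaled-down sketch of the same idea, assuming five made-up values rather than a Gaussian income distribution: raise the top value, then lower values below the median by the same total, so that the mean returns to the median without symmetry being restored. The helper `is_symmetric` is hypothetical and written just for this check.

```python
from statistics import mean, median


def is_symmetric(values):
    """True if the sorted values mirror each other about their midrange."""
    ordered = sorted(values)
    centre = (ordered[0] + ordered[-1]) / 2
    return all(
        abs((lo - centre) + (hi - centre)) < 1e-12
        for lo, hi in zip(ordered, reversed(ordered))
    )


# Symmetric starting point: mean and median both equal 30.
symmetric = [10, 20, 30, 40, 50]

# Top value raised by 2; the two lowest values lowered by 1 each to compensate.
skewed = [9, 19, 30, 40, 52]

for name, data in (("symmetric", symmetric), ("skewed", skewed)):
    print(name, "mean:", mean(data), "median:", median(data),
          "symmetric:", is_symmetric(data))
```

Both data sets have mean and median equal to 30, but only the first is symmetric.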