Maximum Entropy: 2015

Saturday, October 31, 2015

Multi-level modeling

In a post last year, I went through some inference problems concerning a hypothetical medical test. For example, using the known rate of occurrence of some disease, and the known characteristics of a diagnostic test (false-positive and false-negative rates), we were able to obtain the probability that a subject has the disease, based on the test result.

In this post, I'll demonstrate some hierarchical modeling, in a similar context of medical diagnosis. Suppose we know the characteristics of the diagnostic test, but not the frequency of occurrence of the disease. Can we infer that frequency from a set of test results?

A medical screening test has a false-positive rate of 0.15 and a false-negative rate of 0.1. One thousand randomly sampled subjects were tested, resulting in 213 positive test results. What is the posterior distribution over the background prevalence of the disease in this population?
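One way to get a feel for the answer is a simple grid approximation. The sketch below is not necessarily the method used in the full post: it assumes a uniform prior over the prevalence and a binomial likelihood for the number of positive results, and the function name `prevalence_posterior` and the grid size are hypothetical choices made for illustration. For prevalence p, the probability that a randomly chosen subject tests positive is p(1 - 0.1) + (1 - p)(0.15).

```python
import math

def prevalence_posterior(n_tests, n_positive, fp_rate, fn_rate, grid_size=1001):
    """Grid approximation to the posterior over prevalence.

    Assumes a uniform prior on [0, 1] (an assumption, not stated in the post)
    and that the error rates keep P(positive) strictly between 0 and 1.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    log_post = []
    for p in grid:
        # Probability of a positive result: true positives plus false positives.
        q = p * (1 - fn_rate) + (1 - p) * fp_rate
        # Binomial log-likelihood, dropping the constant binomial coefficient.
        log_post.append(n_positive * math.log(q)
                        + (n_tests - n_positive) * math.log(1 - q))
    # Subtract the maximum before exponentiating to avoid underflow.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]

grid, posterior = prevalence_posterior(1000, 213, fp_rate=0.15, fn_rate=0.1)
posterior_mode = grid[posterior.index(max(posterior))]
posterior_mean = sum(p * w for p, w in zip(grid, posterior))
print(posterior_mode, posterior_mean)
```

Under these assumptions the posterior peaks where the predicted positive rate matches the observed 213/1000, that is, at a prevalence of (0.213 - 0.15)/0.75 = 0.084.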

Mean vs median - a careful balancing act

Two common measures of the location of a probability distribution are the mean and the median. While they are, in general, quite different things, some familiar distributions have their mean and median at the same point (~~all such distributions are symmetric~~ (see comment, below), and vice versa).

The mean of a distribution, as we all know, is its average, while the median is, roughly speaking, the point at which the amount of probability mass to one side is the same as the amount on the other side. Upon hasty consideration, these definitions can appear to denote the same thing, and so confusion between the two concepts is common. Annoyingly, my own PhD thesis contains a sentence¹ that explicitly confuses the mean for the median (and furthermore, none of the half dozen eminent scientists whose job it was to assess my thesis (who otherwise all did an excellent job!) reported noticing this blunder).
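A small numerical sketch makes the difference concrete. The data sets below are made up purely for illustration: one symmetric, one strongly skewed, and one asymmetric set whose mean and median nevertheless coincide.

```python
from statistics import mean, median

# Hypothetical data sets, chosen only to illustrate the point.
symmetric = [1, 2, 3, 4, 5]            # mean and median coincide
skewed = [1, 2, 3, 4, 100]             # one large value drags the mean upward
asymmetric_equal = [1, 2, 6, 7, 14]    # not symmetric, yet mean equals median

def summarize(data):
    """Return (mean, median) for a list of numbers."""
    return mean(data), median(data)

for name, data in [("symmetric", symmetric),
                   ("skewed", skewed),
                   ("asymmetric_equal", asymmetric_equal)]:
    print(name, summarize(data))
```

The skewed set has a median of 3 but a mean of 22: the median only counts how many values lie on each side, while the mean also weighs how far away they are. The third set shows that equal mean and median does not by itself imply symmetry.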

Confusion between the mean and the median is highly analogous to a difficulty experienced by many young children when they try to balance asymmetric blocks on top of one another, as has been reported by cognitive scientist Annette Karmiloff-Smith².

The Fundamental Confidence Fallacy

The title of this post comes from an excellent recent paper (as far as I can tell, still in draft form) on misunderstandings of confidence intervals. The paper, 'The fallacy of placing confidence in confidence intervals', by R. D. Morey et al.¹, is by almost exactly the same set of authors whose earlier paper on a very similar topic I criticized before, but the current paper does a far better job of explaining the authors' position and arguing for it.

The authors identify the fundamental confidence fallacy (FCF) as believing automatically that,

If the probability that a random interval contains the true value is X%, then the plausibility (or probability) that a particular observed interval contains the true value is also X%.
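A short simulation can show why this inference fails. The setup below is a standard textbook-style illustration and is not taken from this post: two observations are drawn uniformly from within 0.5 of an unknown value, and the interval between them is used as a 50% confidence interval. The function name `simulate`, the seed, and the trial count are hypothetical choices.

```python
import random

def simulate(n_trials=100_000, theta=0.0, seed=0):
    """Return (overall coverage, coverage among intervals wider than 0.5)."""
    rng = random.Random(seed)
    covered = 0        # intervals containing theta
    wide = 0           # intervals wider than 0.5
    wide_covered = 0   # wide intervals containing theta
    for _ in range(n_trials):
        x1 = theta + rng.uniform(-0.5, 0.5)
        x2 = theta + rng.uniform(-0.5, 0.5)
        lo, hi = min(x1, x2), max(x1, x2)
        hit = lo <= theta <= hi
        covered += hit
        if hi - lo > 0.5:
            wide += 1
            wide_covered += hit
    return covered / n_trials, wide_covered / wide

overall, among_wide = simulate()
print(overall, among_wide)
```

Across all trials the interval contains the true value about half the time, so the procedure has 50% coverage. Yet any observed interval wider than 0.5 must contain the true value, because both observations lie within 0.5 of it. Assigning 50% plausibility to such an interval, merely because the procedure has 50% coverage, is exactly the fallacy described above.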

Maximum Entropy

Saturday, October 31, 2015

Multi-level modeling

Saturday, April 25, 2015

Mean vs median - a careful balancing act

Saturday, April 18, 2015

The Fundamental Confidence Fallacy