Andrew Gelman and Cosma Rohilla Shalizi have just published an interesting paper [1], 'Philosophy and the practice of Bayesian statistics,' which is all about the underlying nature of science and Bayesian statistics, as well as practical elements of scientific inference that some researchers seem to be reluctant to indulge in. The paper comes accompanied by an introduction, five comments, and a final authors' response to comments (link to journal issue). I found the paper to be a highly thought-provoking read, and its extensive list of references serves as a summary of much of what's worth knowing about at the cutting edge of epistemology research. While I'm recommending reading this article, there are major components of its thesis that I disagree with.

Firstly, I love the title. Many would have been satisfied with 'The philosophy and practice of Bayesian statistics,' but clearly the authors have broader ambitions than that. Actually, I really applaud the sentiment.

In terms of their practical guidelines, Gelman and Rohilla Shalizi are spot on. They are talking about model checking - graphical or statistical techniques using simulated data generated according to the model supported by a statistical analysis, with the goal of assessing the appropriateness of that model. This is necessary, since any probability calculation is dependent on the context of the hypothesis space within which the problem is formulated. These hypothesis spaces, however, are not derived from some divine, infallible formula, but find their genesis in the whims of our imaginations. There is no guarantee, or even strong reason to suppose, that the chosen set of alternative propositions actually contains a single true statement. A totally inappropriate theory can therefore attain a very high posterior probability, depending on the environment of models in which it finds itself competing.
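To make this concrete, here is a minimal sketch in Python. The numbers are made up for illustration and are not taken from the paper: 70 successes are observed in 100 trials, and the hypothesis space is assumed to contain only two candidate success rates, 0.1 and 0.2. Bayes' theorem hands almost all the posterior probability to 0.2, yet a simple check against data simulated from that model shows it is hopeless. The function names and the tail-area check statistic are my own choices, not a prescribed procedure.

```python
import math
import random

def log_binom_lik(k, n, p):
    """Log-likelihood of k successes in n Bernoulli(p) trials."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def posterior(k, n, hypotheses, priors):
    """Posterior over a finite set of candidate values of p (Bayes' theorem)."""
    log_w = [math.log(pr) + log_binom_lik(k, n, p)
             for p, pr in zip(hypotheses, priors)]
    m = max(log_w)                       # subtract the max for numerical stability
    w = [math.exp(lw - m) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]

def predictive_check(k, n, p, n_rep=2000, seed=0):
    """Fraction of simulated data sets with at least as many successes as observed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        k_rep = sum(rng.random() < p for _ in range(n))  # one replicated data set
        if k_rep >= k:
            hits += 1
    return hits / n_rep

# Hypothetical data and a deliberately poor hypothesis space.
k_obs, n_obs = 70, 100
post = posterior(k_obs, n_obs, [0.1, 0.2], [0.5, 0.5])
tail = predictive_check(k_obs, n_obs, 0.2)

print("posterior for p=0.1, p=0.2:", post)
print("fraction of replications at least as extreme as the data:", tail)
```

The posterior says p = 0.2 is a near certainty, but only relative to its single, even worse competitor. The simulated replications essentially never look like the observed data, which is the warning that the posterior alone cannot give.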

*Within a given system of models*, Bayes' theorem has no way to alert us to such calamities, and something additional to the standard Bayesian protocol is appropriate.
Some researchers committed to the validity of the Bayesian program feel, apparently, that this additional process is inappropriate, because it seems to step outside the confines of Bayesian logic. I contend that this is mistaken, which I will explain in laying out my disagreement with the paper under discussion.

Edwin Jaynes (in honour of whose work the title of this blog was chosen) was also a strong advocate of model checking, and he pointed out many times that Bayesian methodology exerts its greatest power when it shows us that we need to discard a theory that can serve us no more. I'm reminded of one of my favorite Jaynes quotes [2]:

To reject a Bayesian calculation because it has given us an incorrect prediction is like disconnecting a fire alarm because that annoying bell keeps ringing.

In terms of the philosophical discussion, Gelman and Rohilla Shalizi argue that model checking, while vital, is outside Bayesian logic and furthermore is not part of inductive inference. They claim that falsification of scientific models is deductive in nature. These three claims represent the main points of departure between my understanding and theirs.

I posted a comment on Professor Gelman's blog, so I'll just paste it in directly from there (the details of the specific model I refer to are not important; it's just an example they used in their discussion):

You wrote:

“It turned out that this varying-intercept model did not fit our data, … We found this not through any process of Bayesian induction but rather through model checking.”

I agree on the value of model checking, but I wonder if this is really distinct from inductive inference. In order to say that the model was inappropriate, don’t you think that you must, at least informally, have assigned it a low probability? In which case, your model checking procedure seems to be a heuristic designed to mimic efficiently the logic of Bayesian induction.

Even if you didn’t formally specify a hypothesis space, what you seem to have done is to say ‘look, this model mis-matches the data so much that it must be easy to find an alternate model that would achieve a much higher posterior.’ As such, the process of model checking attains absolutely strict validity only when that extended hypothesis space is explicitly examined, and your intuition numerically confirmed. Granted, many cases will be so obvious that the full analysis isn’t needed, but hopefully you get the point.

There certainly is a strong asymmetry between verification and falsification, but I can’t accept your thesis that falsification is deductive. Sure, it’s typically harder for a model with an assigned probability near zero to be brought back into contention than it is for a model with currently very high probability to be crushed by new evidence, but it’s not in principle impossible. Newtonian mechanics might be the real way of the world, and all that evidence against it might just have been a dream. The problem is that this requires not just Newtonian mechanics, but Newtonian mechanics + some other implausible stuff, which as intuition warns (and mathematics can confirm) deserves very small prior weight. (A currently favorable model can always be superseded by another model with not significantly greater complexity, which accounts for the asymmetry between falsification and verification.) The mathematics that verifies this is Bayesian and, it seems to me, inductive.

That we can apparently falsify a theory without considering alternatives seems to be simply this strong asymmetry allowing Bayesian logic to be reliably but informally approximated without specifying the entire (super)model.

By the way, the mathematics that confirms the low prior for propositions like Newton + all that extra weird stuff is Ockham's razor (a.k.a. Bayes' theorem).
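For anyone who wants the symbols, the argument can be sketched with the product rule and the odds form of Bayes' theorem. The labels are my own: N stands for Newtonian mechanics, A for the extra weird assumptions needed to rescue it, D for the data, I for background information, and H_1, H_2 for any two competing hypotheses.

```latex
% Product rule: a conjunction can never be more probable than either conjunct.
P(N \wedge A \mid I) = P(N \mid I)\, P(A \mid N, I) \le P(N \mid I)

% Odds form of Bayes' theorem: posterior odds = prior odds x likelihood ratio.
\frac{P(H_1 \mid D, I)}{P(H_2 \mid D, I)}
  = \frac{P(H_1 \mid I)}{P(H_2 \mid I)} \times \frac{P(D \mid H_1, I)}{P(D \mid H_2, I)}
```

If A is implausible, then P(A | N, I) is tiny, so the rescued theory enters the odds calculation with a heavy prior penalty that its likelihood ratio must overcome. The penalty is large but finite, which is why resurrection is difficult rather than logically impossible.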

I have pointed out previously that non-Bayesian techniques certainly have their usefulness, but that their validity is limited to the extent that they succeed in replicating the result of a Bayesian calculation. Model checking seems to me to be no exception. Indeed, the recommended model checks can only work when the outcome is so obvious that the full, rigorous analysis is not needed. If the case is too close to call by these techniques, then you must roll out Bayes' theorem again, or stick with your current model. This should be obvious.
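Here is a minimal sketch of what rolling out Bayes' theorem again might look like, using a toy binomial problem with hypothetical numbers: 70 successes in 100 trials, a narrow hypothesis space containing success rates 0.1 and 0.2, and an extended space that grudgingly adds a third candidate, 0.7, with a prior of only 1%. All of these values are assumptions chosen purely for illustration.

```python
import math

def log_lik(k, n, p):
    """Binomial log-likelihood up to a constant (the coefficient cancels across hypotheses)."""
    return k * math.log(p) + (n - k) * math.log(1 - p)

def posterior(k, n, priors):
    """Posterior over candidate success rates; priors maps each rate to its prior probability."""
    log_w = {p: math.log(pr) + log_lik(k, n, p) for p, pr in priors.items()}
    m = max(log_w.values())              # subtract the max for numerical stability
    w = {p: math.exp(lw - m) for p, lw in log_w.items()}
    total = sum(w.values())
    return {p: x / total for p, x in w.items()}

k_obs, n_obs = 70, 100                   # hypothetical data

# Narrow hypothesis space: 0.2 wins by default.
narrow = posterior(k_obs, n_obs, {0.1: 0.5, 0.2: 0.5})

# Extended hypothesis space: the newcomer starts with only 1% prior weight.
extended = posterior(k_obs, n_obs, {0.1: 0.495, 0.2: 0.495, 0.7: 0.01})

print("narrow:  ", narrow)
print("extended:", extended)
```

In the narrow space the rate 0.2 looks unassailable; once the space is extended, its posterior collapses and the newcomer dominates despite its small prior. This is the numerical confirmation that an informal model check anticipates, and when the case is closer than this one, the explicit calculation is the only way to settle it.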

I have laid out the Bayesian basis for falsificationism elsewhere, but I did not discuss this asymmetry between falsifying and verifying theories, which I also think is important. Some Bayesian methodologists seem to hold the view that they have equal status, but they do not. They are not, however, as asymmetric as Popper felt - he did not accept that any form of verification was ever valid. One must wonder, then, why he had any interest whatsoever in science. Science does not, however, verify by making absolute statements about a theory's truth, but rather, its statements are to be seen as of the 'less wrong' type.

The idea that falsification is deductive seems to me indefensible. It is entirely statistical, and is just as incapable of absolute certainty as any other reasoning about phenomena in the real world (except perhaps where propositions are falsified on the basis that they are incoherently expressed, though perhaps it is better not to say such things are false, but rather null statements). If falsification is deductive, how big do the residuals between data and model need to be in order to reach that magic tipping point?

Oh, and Professor Gelman's response to my comment?

'Fair enough.'

Actually, he said a little more than that, but I think it's fair to say he conceded that, strictly, I had a good point. (You may judge for yourself if you wish.)

Anyway, I'm looking forward to dipping into some of the published comments and responses, as well as some of the many materials cited in this paper. Recommended reading for anyone interested in epistemology - what we can know, and under what circumstances we can know it. By the way, all the articles in that journal seem to be open access, so bravo, hats off, and three cheers to the British Journal of Mathematical and Statistical Psychology.

[1] 'Philosophy and the practice of Bayesian statistics,' by A. Gelman and C.R. Shalizi, British Journal of Mathematical and Statistical Psychology, February 2013, Vol. 66, Issue 1, pages 8-38 (link)

[2] 'Clearing up mysteries, the original goal,' by E.T. Jaynes, in 'Maximum entropy and Bayesian methods,' edited by J. Skilling, Kluwer Publishing, 1989

Regarding this topic, you may be interested in John Kruschke's article:

Posterior predictive checks can and should be Bayesian: Comment on Gelman and Shalizi, ‘Philosophy and the practice of Bayesian statistics’

http://www.indiana.edu/~kruschke/articles/Kruschke2012BJMSP.pdf

-YIF