Thursday, February 28, 2013

Legally Insane

David Spiegelhalter's blog, Understanding Uncertainty, informs us of a recent insane ruling from the England and Wales Court of Appeal, concerning the usability of probabilities as evidence in court cases. Lord Justice Toulson's ruling contains the following wisdom:
The chances of something happening in the future may be expressed in terms of percentage... But you cannot properly say that there is a 25 per cent chance that something has happened... Either it has or it has not.
This is wrong. Shockingly, scarily wrong. The judge is saying that probabilities only apply to future events, and not to past events, and is effectively decreeing that such evidence is inadmissible in a court of law. This ruling is based, it seems, on an earlier case, in which a judge ruled:
It is not, in my opinion, correct to say that on arrival at the hospital he had a 25 per cent. chance of recovery. If insufficient blood vessels were left intact by the fall he had no prospect of avoiding complete avascular necrosis, whereas if sufficient blood vessels were left intact on the judge's findings no further damage to the blood supply would have resulted if he had been given immediate treatment, and he would not have suffered the avascular necrosis.
It seems that Justice Toulson is not alone among judges in his profound ignorance of probability, logic and the basic principles of how knowledge is acquired. In fact, for almost 30 years, the legal profession has had its own special probabilistic fallacy, the prosecutor's fallacy, named after it. You might think that by now they should have made an effort to get to grips with basic methodology, but instead, they just keep on making stupid judgements.
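For anyone who hasn't met the prosecutor's fallacy, here is a minimal sketch of it. Every number is hypothetical and chosen only for illustration (the population size, the random-match probability, and the assumption that the true source always matches). The fallacy is to read the small probability of a match given innocence as if it were the probability of innocence given a match.

```python
# Prosecutor's fallacy: confusing P(match | innocent) with P(innocent | match).
# All numbers below are hypothetical, chosen only for illustration.

population = 1_000_000          # assumed pool of possible sources of the sample
p_match_if_innocent = 1e-5      # assumed random-match probability
p_match_if_guilty = 1.0         # assumed: the true source matches for certain

prior_guilty = 1 / population   # before the match, everyone is equally likely
prior_innocent = 1 - prior_guilty

# Bayes' theorem applied to a past event: did this person leave the sample?
evidence = (p_match_if_guilty * prior_guilty
            + p_match_if_innocent * prior_innocent)
p_guilty_given_match = p_match_if_guilty * prior_guilty / evidence
p_innocent_given_match = 1 - p_guilty_given_match

print(f"P(match | innocent) = {p_match_if_innocent:.6f}")
print(f"P(innocent | match) = {p_innocent_given_match:.3f}")
```

With these made-up inputs, the match alone still leaves innocence much the more probable conclusion, even though a match is very unlikely for any particular innocent person.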

I have pointed out that a judge who can claim that probabilities in general can't be assigned to past events is incompetent and not fit to perform their duties. Though it might seem harsh, I stand by this analysis. You might think that this is some obscure point of epistemology with little or no practical importance, but it is much more than that.

There are three main reasons blunders like this, made by a high court judge, should scare the crap out of society at large. Firstly, as pointed out, it demonstrates a serious ignorance of what probabilities are. Probability theory is completely symmetric with regard to time, and probabilities are simply a systematic way of quantifying our current knowledge. Just an obscure mathematical point? Not at all. Such ignorance shows a complete disregard for what knowledge is, what it means to be rational, and what is actually involved when evidence is evaluated. What has been asserted is that calculations of the kind I performed when working out the probability that somebody has contracted a certain disease are completely meaningless. Maybe the judge won't find it meaningless next time he is in his doctor's office trying to plan his future. Call me overly strict, but I expect somebody in such a position of power, whose job consists to such a high degree of evaluating evidence, to be able to wield a modest understanding of what evidence actually is. How can any reasonable standard of statistical evidence be enforced, when judges are so ignorant about probability?
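To make the disease example concrete, here is a small sketch of such a calculation. The function name and all three input values (prevalence, sensitivity, false-positive rate) are assumptions picked for illustration, not figures from any real test. The point is only that the question 'has this person already contracted the disease?' concerns a past event, and Bayes' theorem answers it without complaint.

```python
def posterior_disease(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test), from Bayes' theorem."""
    p_positive_and_disease = sensitivity * prevalence
    p_positive_and_healthy = false_positive_rate * (1 - prevalence)
    return p_positive_and_disease / (p_positive_and_disease
                                     + p_positive_and_healthy)


# Hypothetical inputs: 1% prevalence, 95% sensitivity, 5% false positives.
p = posterior_disease(prevalence=0.01, sensitivity=0.95,
                      false_positive_rate=0.05)
print(f"P(disease already contracted | positive test) = {p:.3f}")
```

The patient either has the disease or has not, exactly as the judge says, but the number that comes out describes what we know about it, which is the only thing a doctor (or a court) ever has to work with.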

But it's not just ignorance that's on display here - also a shocking disrespect for logic. In the same paragraph as his pronouncement on the meaninglessness of probabilities, when applied to past events, the judge manages to immediately contradict himself rather blatantly:
In deciding a question of past fact the court will, of course, give the answer which it believes is more likely to be (more probably) the right answer than the wrong answer 
How can anybody capable of holding such obviously incompatible positions at the same time, on any topic, be capable of presiding over a court? What is clear is that every single judgement of fact, in every single sphere of life, relies on some kind of probability assessment. The question that remains, then, is whether we want to make that assessment as systematic and rigorous as we can, or are we happy relying on unexamined instinct and faulty logic? For high-ranked judges to favour the latter is a nightmare scenario.

Secondly, this ruling, if implemented, would make an enormous variety of important types of evidence impossible to use in legal cases, and would severely hinder the capacity of courts to efficiently determine what is the likely truth. Genetic evidence, for example, is based on Bayesian calculations, as it must be, in order to attain validity. 

To present probabilities is to make the highest quality of inference available. Why is this judge against high-quality inference? Indeed, why is he so overtly opposed to scientific method? In Toulson's ruling, we also have this:
When judging whether a case for believing that an event was caused in a particular way is stronger that the case for not so believing, the process is not scientific...
Why not scientific? Why not demand the highest standards of logic and inference? Why is he not complaining that the process is not scientific enough, instead of insisting that we rely on some inefficient and non-systematic procedure? The mind boggles. There is only one correct way to assess the implications of evidence, and to quantitatively combine multiple pieces of evidence, and that is Bayes' theorem (techniques that successfully replicate its outcomes can occasionally be used also). Judge Toulson's ruling constitutes a rejection of Bayesian reasoning, and thereby demands that the legal profession turn its back on the rational evaluation of empirical facts.
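As a sketch of what quantitatively combining multiple pieces of evidence can look like, here is Bayes' theorem in its odds form: posterior odds equal prior odds times the product of the likelihood ratios. This form assumes the pieces of evidence are conditionally independent given each hypothesis, which is my simplifying assumption for the example, and the prior and the three likelihood ratios are invented for illustration.

```python
from math import prod


def posterior_probability(prior, likelihood_ratios):
    """Combine evidence using the odds form of Bayes' theorem.

    Each likelihood ratio is P(evidence | H) / P(evidence | not H).
    Assumes the pieces of evidence are conditionally independent.
    """
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * prod(likelihood_ratios)
    return posterior_odds / (1 + posterior_odds)


# Hypothetical case: a 1% prior, two pieces of evidence favouring the
# hypothesis (ratios 20 and 5) and one counting against it (ratio 0.5).
p = posterior_probability(prior=0.01, likelihood_ratios=[20.0, 5.0, 0.5])
print(f"P(H | all evidence) = {p:.3f}")
```

The order in which the evidence is fed in makes no difference to the result, which is one of the features that makes the procedure systematic rather than a matter of instinct.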

Thirdly, the judges on this case have made a serious blunder with a technical issue, while obviously being unaware of their incompetence to reason about the topic. Certainly, a judge doesn't need to be an expert in all the technical subjects that may be relevant to any given case. But then they must be able to appreciate that the technical issues are beyond them. They cannot perform technical analyses that they are unqualified to perform. If they want to base their decisions on philosophy, probability, mathematical theorems, or whatever, they must damn well get it right, or ask somebody else, who knows what they are doing. In what other technical forensic issues are these judges hopelessly unaware of their complete lack of understanding? A society that aspires to be a free and enlightened society must not tolerate such oblivious overconfidence among people with such an important job.

Saturday, February 9, 2013

Inductive inference or deductive falsification?

Andrew Gelman and Cosma Rohilla Shalizi have just published an interesting paper [1], 'Philosophy and the practice of Bayesian statistics,' which is all about the underlying nature of science and Bayesian statistics, as well as practical elements of scientific inference that some researchers seem to be reluctant to indulge in. The paper comes accompanied by an introduction, five comments, and a final authors' response to comments (link to journal issue). I found the paper to be a highly thought-provoking read, and its extensive list of references serves as a summary of much of what's worth knowing about at the cutting edge of epistemology research. While I'm recommending reading this article, there are major components of its thesis that I disagree with.

Firstly, I love the title. Many would have been satisfied with 'The philosophy and practice of Bayesian statistics,' but clearly the authors have broader ambitions than that. Actually, I really applaud the sentiment.

In terms of their practical guidelines, Gelman and Rohilla Shalizi are spot on. They are talking about model checking - graphical or statistical techniques using simulated data generated according to the model supported by a statistical analysis, with the goal of assessing the appropriateness of that model. This is necessary, since any probability calculation is dependent on the context of the hypothesis space within which the problem is formulated. These hypothesis spaces, however, are not derived from some divine, infallible formula, but find their genesis in the whims of our imaginations. There is no guarantee, or even strong reason to suppose, that the chosen set of alternative propositions actually contains a single true statement. A totally inappropriate theory, therefore, can attain a very high posterior probability, depending on the environment of models in which it finds itself competing. Within a given system of models, Bayes' theorem has no way to alert us to such calamities, and something additional to the standard Bayesian protocol is appropriate.
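Here is a toy illustration of that calamity, under assumptions that are entirely mine: a coin-tossing data set of 70 heads in 100 tosses, and a deliberately poor hypothesis space containing only the biases 0.1 and 0.2. Instead of simulating replicated data, the check below computes the corresponding tail probability exactly, which is a simplification of the simulation-based checks described above.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k heads in n tosses with head-probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


n_tosses, n_heads = 100, 70        # hypothetical data
candidate_biases = [0.1, 0.2]      # deliberately poor hypothesis space
prior = [0.5, 0.5]

# Standard Bayesian update within the given hypothesis space.
likelihoods = [binomial_pmf(n_heads, n_tosses, b) for b in candidate_biases]
evidence = sum(pr * lk for pr, lk in zip(prior, likelihoods))
posterior = [pr * lk / evidence for pr, lk in zip(prior, likelihoods)]

# Model check: how surprising is the data under the winning model?
best_bias = candidate_biases[posterior.index(max(posterior))]
tail_probability = sum(binomial_pmf(k, n_tosses, best_bias)
                       for k in range(n_heads, n_tosses + 1))

print(f"posterior for bias {best_bias}: {max(posterior):.6f}")
print(f"P(at least {n_heads} heads | bias {best_bias}) = {tail_probability:.1e}")
```

The bias of 0.2 wins with a posterior indistinguishable from 1, yet the data are wildly improbable under it. Nothing inside the two-model calculation raises the alarm; only the check does.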

Some researchers committed to the validity of the Bayesian program feel, apparently, that this additional process is inappropriate, because it seems to step outside the confines of Bayesian logic. I contend that this is mistaken, which I will explain in laying out my disagreement with the paper under discussion.

Edwin Jaynes (in honour of whose work the title of this blog was chosen) was also a strong advocate of model checking, and he pointed out many times that Bayesian methodology exerts its greatest power when it shows us that we need to discard a theory that can serve us no more. I'm reminded of one of my favorite Jaynes quotes [2]:

To reject a Bayesian calculation because it has given us an incorrect prediction is like disconnecting a fire alarm because that annoying bell keeps ringing. 

In terms of the philosophical discussion, Gelman and Rohilla Shalizi argue that model checking, while vital, is outside Bayesian logic and furthermore is not part of inductive inference. They claim that falsification of scientific models is deductive in nature. These three claims represent the main points of departure between my understanding and theirs.

I posted a comment on Professor Gelman's blog, so I'll just paste in directly from there (the details of the specific model I refer to from their discussion are not important, it's just an example they used):

You wrote: 
“It turned out that this varying-intercept model did not fit our data, … We found this not through any process of Bayesian induction but rather through model checking.” 
I agree on the value of model checking, but I wonder if this is really distinct from inductive inference. In order to say that the model was inappropriate, don’t you think that you must, at least informally, have assigned it a low probability? In which case, your model checking procedure seems to be a heuristic designed to mimic efficiently the logic of Bayesian induction. 
Even if you didn’t formally specify a hypothesis space, what you seem to have done is to say ‘look, this model mis-matches the data so much that it must be easy to find an alternate model that would achieve a much higher posterior.’ As such, the process of model checking attains absolutely strict validity only when that extended hypothesis space is explicitly examined, and your intuition numerically confirmed. Granted, many cases will be so obvious that the full analysis isn’t needed, but hopefully you get the point. 
There certainly is a strong asymmetry between verification and falsification, but I can’t accept your thesis that falsification is deductive. Sure, it’s typically harder for a model with an assigned probability near zero to be brought back into contention than it is for a model with currently very high probability to be crushed by new evidence, but it’s not in principle impossible. Newtonian mechanics might be the real way of the world, and all that evidence against it might just have been a dream. The problem is that this requires not just Newtonian mechanics, but Newtonian mechanics + some other implausible stuff, which as intuition warns (and mathematics can confirm) deserves very small prior weight. (A currently favorable model can always be superseded by another model with not significantly greater complexity, which accounts for the asymmetry between falsification and verification.) The mathematics that verifies this is Bayesian and, it seems to me, inductive. 
That we can apparently falsify a theory without considering alternatives seems to be simply this strong asymmetry allowing Bayesian logic to be reliably but informally approximated without specifying the entire (super)model.

By the way, the mathematics that confirms the low prior for propositions like Newton + all that extra weird stuff is Ockham's razor (a.k.a. Bayes' theorem).
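A rough numerical sketch of the Newton argument, with every prior and likelihood invented purely for illustration: plain Newtonian mechanics predicts the anomalous data very badly, 'Newton + weird stuff' predicts it as well as the rival theory does but pays for that with a tiny prior, and the rival starts on an equal footing with plain Newton.

```python
# Hypothetical priors; the ad hoc rescue is penalised for its extra baggage.
priors = {
    "newton": 0.499999,
    "newton_plus_weird_stuff": 0.000001,
    "rival": 0.5,
}

# Hypothetical probabilities of the observed data under each hypothesis.
likelihoods = {
    "newton": 1e-12,                 # plain Newton fits the data terribly
    "newton_plus_weird_stuff": 0.9,  # the rescue fits as well as the rival
    "rival": 0.9,
}

evidence = sum(priors[h] * likelihoods[h] for h in priors)
posteriors = {h: priors[h] * likelihoods[h] / evidence for h in priors}

for hypothesis, probability in posteriors.items():
    print(f"{hypothesis}: {probability:.3e}")
```

Both Newtonian options end up with very small posteriors, but neither is exactly zero, so the falsification is overwhelming without being deductive.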

I have pointed out previously that non-Bayesian techniques certainly have their usefulness, but that their validity is limited to the extent that they succeed in replicating the result of a Bayesian calculation. Model checking seems to me to be no exception. Indeed, the recommended model checks can only work when the outcome is so obvious that the full, rigorous analysis is not needed. If the case is too close to call by these techniques, then you must roll out Bayes' theorem again, or stick with your current model. This should be obvious.

I have laid out the Bayesian basis for falsificationism elsewhere, but I did not discuss this asymmetry between falsifying and verifying theories, which I also think is important. Some Bayesian methodologists seem to hold the view that they have equal status, but they do not. They are not, however, as asymmetric as Popper felt - he did not accept that any form of verification was ever valid. One must wonder, then, why he had any interest whatsoever in science. Science does not, however, verify by making absolute statements about a theory's truth, but rather, its statements are to be seen as of the 'less wrong' type.

The idea that falsification is deductive seems to me indefensible. It is entirely statistical, and is just as incapable of absolute certainty as any other reasoning about phenomena in the real world (except perhaps where propositions are falsified on the basis that they are incoherently expressed, though perhaps it is better not to say such things are false, but rather null statements). If falsification is deductive, how big do the residuals between data and model need to be in order to reach that magic tipping point? 
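To illustrate the absence of any magic tipping point, here is a sketch comparing a sharp model, which predicts a residual of zero with unit standard deviation, against a vaguer alternative with standard deviation five. Both models, their parameters and the equal priors are assumptions made up for this example.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


def posterior_sharp_model(residual, prior_sharp=0.5,
                          sd_sharp=1.0, sd_vague=5.0):
    """P(sharp model | residual) against a single vaguer alternative."""
    l_sharp = normal_pdf(residual, 0.0, sd_sharp)
    l_vague = normal_pdf(residual, 0.0, sd_vague)
    numerator = prior_sharp * l_sharp
    return numerator / (numerator + (1 - prior_sharp) * l_vague)


residuals = [0, 1, 2, 3, 4, 5, 6]
posteriors = [posterior_sharp_model(r) for r in residuals]
for r, p in zip(residuals, posteriors):
    print(f"residual = {r}: P(sharp model | data) = {p:.2e}")
```

The posterior falls smoothly as the residual grows and never reaches zero. Where to declare the model 'falsified' is a decision about how small a probability we are prepared to ignore, not a point at which deduction takes over.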

Oh, and Professor Gelman's response to my comment?

'Fair enough.'

Actually, he said a little more than that, but I think it's fair to say he conceded that strictly, I had a good point. (You may judge for yourself if you wish.)

Anyway, I'm looking forward to dipping into some of the published comments and responses, as well as some of the many materials cited in this paper. Recommended reading for anyone interested in epistemology - what we can know, and under what circumstances we can know it. By the way, all the articles in that journal seem to be open access, so bravo, hats off, and three cheers to the British Journal of Mathematical and Statistical Psychology.

[1]  'Philosophy and the practice of Bayesian statistics,' British Journal of Mathematical and Statistical Psychology, February 2013, Vol. 66, Issue 1, pages 8 - 38 (link)

[2] 'Clearing up mysteries, the original goal,' by E.T. Jaynes, in 'Maximum entropy and Bayesian methods,' edited by J. Skilling, Kluwer Publishing, 1989