Maximum Entropy

Saturday, September 2, 2017

Disruptive Writing Style

In the pursuit of science, under whose umbrella I consider all intellectually rigorous activity to fall, the formulation and communication of ideas are critical. Here, I'll outline aspects of my own attitude to scientific communication.

In an earlier post on jargon, I advocated against over-reliance on familiar terminology, as this can often give a false impression of understanding. I recommended occasionally throwing out unusual pieces of vocabulary, in the hope of ensuring that one's audience is engaged in the concepts, and not just semi-consciously following the signposts. This technique is a significant part of a communication strategy we might call 'disruptive writing style.'

When I say disruptive, I'm not talking about the content of an essay, article, speech, or whatever. I don't mean that the matter under discussion is disruptive, the way the birth of digital electronics represented a disruptive technology. Instead, I'm talking about the vehicle by which one conveys one's ideas to a wider appreciation. I'm talking about a style that occasionally strives to prevent the smooth progress of the reader (or listener) from beginning to end of your piece, in order to ensure that your thesis is being taken in.

Sums and Differences of Random Numbers

Here's a problem that cropped up in the course of some calculations I have been working on recently. In stating the problem, I'll include the physical context, though this context isn't important for the rest of the discussion, which applies generally to certain manipulations on probability distributions:

A high-energy massive particle (e.g. a proton or α-particle), whose initial kinetic energy is governed by a certain probability distribution, passes through a slab of material. As it travels through the material, it scatters some of the electrons inside the slab, dissipating a small fraction of its energy. The amount of its energy that it loses also has some probability distribution (the so-called 'straggling function'). What is the probability distribution over the energy of the particle as it exits the slab of material?

I'm sure all of us wrestle with exactly this question several times each day. The problem concerns the probability distribution over the difference between two independent random variables¹ (in this case the particle's initial energy and the energy it deposits in the slab of material).

The route to solving this problem involves utilizing the solution to a related problem, namely the probability distribution over the sum of two random variables, so first let's look at that.
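As a minimal sketch of the idea (not necessarily the derivation the full post follows), the distribution of a sum of two independent discrete variables is a convolution of their distributions, and a difference X - Y can be treated as the sum of X and -Y. The energy values and probabilities below are hypothetical toy numbers, chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict


def distribution_of_sum(p_x, p_y):
    """Convolution: P(X + Y = z) = sum over x of P(X = x) * P(Y = z - x).

    Assumes X and Y are independent and discrete, given as {value: probability}.
    """
    p_z = defaultdict(float)
    for x, px in p_x.items():
        for y, py in p_y.items():
            p_z[x + y] += px * py
    return dict(p_z)


def distribution_of_difference(p_x, p_y):
    """X - Y is the sum of X and (-Y): negate Y's support, then reuse the sum."""
    p_neg_y = {-y: py for y, py in p_y.items()}
    return distribution_of_sum(p_x, p_neg_y)


# Hypothetical toy distributions, in arbitrary energy units.
initial_energy = {10: 0.2, 11: 0.5, 12: 0.3}
energy_loss = {1: 0.6, 2: 0.3, 3: 0.1}

exit_energy = distribution_of_difference(initial_energy, energy_loss)
for energy in sorted(exit_energy):
    print(energy, round(exit_energy[energy], 3))
```

For continuous distributions the double loop becomes an integral, but the structure of the calculation is the same.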

Standard Error

In the quantification of uncertainty, there is an important distinction that's often overlooked. This is the distinction between the dispersion of a distribution, and the dispersion of the mean of the distribution.

By 'dispersion of a distribution,' I mean how poorly the mass of that probability distribution is localized in hypothesis space. If half the employees in Company A are aged between 30 and 40, and half the employees in Company B are aged between 25 and 50, then (all else equal) the probability distribution over the age of a randomly sampled employee from Company B has a wider dispersion than the corresponding distribution for Company A.

A common measure of dispersion is the standard deviation, which is a kind of average distance between all the parts of the distribution and the mean of that distribution (strictly, the square root of the mean squared distance from the mean).
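The distinction can be checked numerically. The sketch below assumes a hypothetical normally distributed population (the mean and spread are invented for illustration) and compares the standard deviation of one sample with the standard error of its mean, estimated as the standard deviation divided by the square root of the sample size. It then repeats the experiment many times to see how much the sample mean actually varies.

```python
import math
import random
import statistics

random.seed(0)

# Hypothetical population parameters, invented for illustration.
POP_MEAN, POP_SD = 40.0, 8.0
n = 100

# One sample: its standard deviation describes the spread of individual values.
sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
sd = statistics.stdev(sample)

# The standard error describes the spread of the sample mean, not of the values.
standard_error = sd / math.sqrt(n)

# Check by brute force: repeat the experiment and look at how the means scatter.
means = []
for _ in range(2000):
    repeat = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    means.append(statistics.fmean(repeat))
spread_of_means = statistics.stdev(means)

print("standard deviation of one sample:", round(sd, 2))
print("estimated standard error of the mean:", round(standard_error, 2))
print("observed spread of repeated means:", round(spread_of_means, 2))
```

The last two printed numbers should be close to each other, and both roughly ten times smaller than the first, since n = 100.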

Multi-level modeling

In a post last year, I went through some inference problems concerning a hypothetical medical test. For example, using the known rate of occurrence of some disease, and the known characteristics of a diagnostic test (false-positive and false-negative rates), we were able to obtain the probability that a subject has the disease, based on the test result.

In this post, I'll demonstrate some hierarchical modeling, in a similar context of medical diagnosis. Suppose we know the characteristics of the diagnostic test, but not the frequency of occurrence of the disease. Can we figure this out from a set of test results?

A medical screening test has a false-positive rate of 0.15 and a false-negative rate of 0.1. One thousand randomly sampled subjects were tested, resulting in 213 positive test results. What is the posterior distribution over the background prevalence of the disease in this population?
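One rough way to attack this numerically is a grid approximation. The sketch below assumes a uniform prior over the prevalence and a binomial likelihood for the number of positive results. Both are assumptions made here for illustration, and the full post may set the problem up differently. With the stated error rates, the probability of a positive result is 0.9 × prevalence + 0.15 × (1 - prevalence).

```python
import math

FALSE_POSITIVE_RATE = 0.15
FALSE_NEGATIVE_RATE = 0.10
N_TESTED = 1000
N_POSITIVE = 213


def prob_positive(prevalence):
    """Marginal probability that a randomly sampled subject tests positive."""
    true_positives = prevalence * (1 - FALSE_NEGATIVE_RATE)
    false_positives = (1 - prevalence) * FALSE_POSITIVE_RATE
    return true_positives + false_positives


# Candidate prevalence values from 0 to 1 (uniform prior: equal weight each).
grid = [i / 1000 for i in range(1001)]

# Binomial log-likelihood; the binomial coefficient is constant and is dropped.
log_like = [
    N_POSITIVE * math.log(prob_positive(p))
    + (N_TESTED - N_POSITIVE) * math.log(1 - prob_positive(p))
    for p in grid
]

# Subtract the maximum before exponentiating, to avoid numerical underflow.
max_ll = max(log_like)
weights = [math.exp(ll - max_ll) for ll in log_like]
total = sum(weights)
posterior = [w / total for w in weights]

posterior_mode = grid[posterior.index(max(posterior))]
posterior_mean = sum(p * w for p, w in zip(grid, posterior))
print("posterior mode:", posterior_mode)
print("posterior mean:", round(posterior_mean, 4))
```

Under these assumptions the posterior peaks at a prevalence of 0.084, which is where the predicted positive rate, 0.15 + 0.75 × prevalence, matches the observed 213 in 1000.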

Mean vs median - a careful balancing act

Two common measures of the location of a probability distribution are the mean and the median. While generally, they are quite different things, some familiar distributions have their mean and median at the same point (~~all such distributions are symmetric~~, (see comment, below) and vice versa).

The mean of a distribution, as we all know, is its average, while the median is, roughly speaking, the point at which the amount of probability mass to one side is the same as the amount on the other side. Upon hasty consideration, these definitions can appear to denote the same thing, and so confusion between the two concepts is common. Annoyingly, my own PhD thesis contains a sentence¹ that explicitly confuses the mean for the median (and furthermore, none of the half dozen eminent scientists whose job it was to assess my thesis (who otherwise all did an excellent job!) reported noticing this blunder).

Confusion between the mean and the median is highly analogous to a difficulty experienced by many young children when they try to balance asymmetric blocks on top of one another, as has been reported by cognitive scientist Annette Karmiloff-Smith².
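A small numerical example may help to keep the two apart. The data below are hypothetical. The mean is the balance point (deviations from it sum to zero), while the median splits the data into two halves of equal count, so a single distant value drags the mean but not the median.

```python
import statistics

# Hypothetical skewed data: one value sits far out to the right.
skewed = [1, 2, 2, 3, 3, 3, 4, 20]
skewed_mean = statistics.mean(skewed)      # 38 / 8 = 4.75
skewed_median = statistics.median(skewed)  # middle values are 3 and 3, so 3.0

# The mean is the balance point: deviations from it cancel exactly.
net_deviation = sum(x - skewed_mean for x in skewed)

# The median balances counts instead: as many values below as above.
below = sum(x < skewed_median for x in skewed)
above = sum(x > skewed_median for x in skewed)

# Hypothetical symmetric data: here mean and median coincide.
symmetric = [1, 2, 3, 4, 5]
symmetric_mean = statistics.mean(symmetric)
symmetric_median = statistics.median(symmetric)

print(skewed_mean, skewed_median, net_deviation, below, above)
print(symmetric_mean, symmetric_median)
```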

The Fundamental Confidence Fallacy

The title of this post comes from an excellent recent paper (as far as I can tell, still in draft form) on misunderstandings of confidence intervals. The paper, 'The fallacy of placing confidence in confidence intervals', by R. D. Morey et al.¹, is by almost exactly the same set of authors whose earlier paper on a very similar topic I criticized before, but the current paper does a far better job of explaining the authors' position, and arguing for it.

The authors identify the fundamental confidence fallacy (FCF) as believing automatically that,

If the probability that a random interval contains the true value is X%, then the plausibility (or probability) that a particular observed interval contains the true value is also X%.
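A toy simulation shows why this inference can fail. The setup here is chosen for illustration and is not claimed to be the example used in the paper: two observations are drawn uniformly from within half a unit of a hypothetical true value, and the interval between them is reported. Over many repetitions this is a 50% confidence interval, yet whenever the two observations happen to be more than half a unit apart, the interval is certain to contain the true value.

```python
import random

random.seed(1)

THETA = 10.0        # hypothetical true value, unknown to the experimenter
N_TRIALS = 100_000

covered = 0         # intervals that contain THETA
wide = 0            # intervals wider than 0.5
wide_covered = 0    # wide intervals that contain THETA

for _ in range(N_TRIALS):
    x1 = random.uniform(THETA - 0.5, THETA + 0.5)
    x2 = random.uniform(THETA - 0.5, THETA + 0.5)
    lo, hi = min(x1, x2), max(x1, x2)
    hit = lo <= THETA <= hi
    covered += hit
    if hi - lo > 0.5:
        wide += 1
        wide_covered += hit

overall_coverage = covered / N_TRIALS
coverage_given_wide = wide_covered / wide

print("coverage over all intervals:", round(overall_coverage, 3))
print("coverage among wide intervals:", coverage_given_wide)
```

The procedure has 50% coverage in the long run, but the plausibility that a particular observed interval contains the true value depends on what that interval looks like.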

Science is for Everyone

In the previous post, I explained that science is suitable for investigating all matters. Pursuing a similar theme, I want now to discuss how science is for all people, not just bearded academics with white lab coats. (Pardon the stereotype, and let me emphasize that there is no good reason why 50% of all scientists should not be women.)

I mentioned something in that last post that is also central to this discussion: scientific method is a graded affair - not black or white. Whatever we can learn by implementing a low level of scientific rigour, we can learn a little more, in a little more detail, and with a little more confidence, by applying a slightly more systematic procedure.

Scientism

It perplexes me that the word 'scientism' is predominantly used as a slur to put people down and criticize their world view and methodology. I realized something recently, however, that helped me understand the error that is often being made, and how that error compounds the problem that is often being called out when people make the accusation of scientism.

First off, let's settle what scientism is. Wikipedia gives a good definition that fits well with the contexts in which I see the term used:

Scientism is belief in the universal applicability of the scientific method and approach, and the view that empirical science constitutes the most authoritative worldview or most valuable part of human learning to the exclusion of other viewpoints.

Probability Trees and Marginal Distributions

In a blog post earlier this year about medical screening, On the hazards of significance testing. Part 1: the screening problem, statistical expert David Colquhoun demonstrates a simple way of visualizing the structure of certain probabilistic problems. This diagram, which we might call a probability tree, makes the sometimes counter-intuitive solutions to such problems far more easy to grasp (and in the process, helps put over-inflated claims about the effectiveness of screening into perspective).
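The structure of such a tree is easy to sketch in code. The numbers below are hypothetical, picked only for illustration and not taken from the post being discussed. Each leaf of the tree is a joint probability, obtained by multiplying along a branch, and a marginal probability is the sum of the leaves that share an outcome.

```python
# Hypothetical screening-test numbers, for illustration only.
PREVALENCE = 0.01    # P(disease)
SENSITIVITY = 0.80   # P(positive | disease)
SPECIFICITY = 0.95   # P(negative | no disease)

# Each leaf: multiply the probabilities along one branch of the tree.
leaves = {
    ("disease", "positive"): PREVALENCE * SENSITIVITY,
    ("disease", "negative"): PREVALENCE * (1 - SENSITIVITY),
    ("healthy", "positive"): (1 - PREVALENCE) * (1 - SPECIFICITY),
    ("healthy", "negative"): (1 - PREVALENCE) * SPECIFICITY,
}

# Marginal distribution over the test result: sum the leaves with that result.
p_positive = sum(p for (_, result), p in leaves.items() if result == "positive")

# Conditional probability of disease, given a positive result.
p_disease_given_positive = leaves[("disease", "positive")] / p_positive

print("P(positive) =", round(p_positive, 4))
print("P(disease | positive) =", round(p_disease_given_positive, 4))
```

With these invented numbers, only about 14% of positive results correspond to actual disease, which is the kind of counter-intuitive answer the tree makes easy to see.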

Fear of Science

Many people react negatively to the idea that moral principles can be inferred entirely using scientific method. There is a general feeling that this is impossible. This seems to be partly why quite a lot of people view the decline of traditional sources of moral instruction as a serious threat. This is a major, double mistake.

In August last year, I attended an event, 'Answers in Science,' at Houston Museum of Natural Science, aimed at raising awareness of the way that a number of Christian fundamentalists have been trying to sabotage the quality of scientific education in Texas schools. Among the several who spoke there, two people raised points that struck me as highly significant, given the line of thought I've been pursuing for some time, with regard to the relationship between science and morality. They were Kathy Miller, from Texas Freedom Network, and Mike Aus, a former pastor.

Pass / Fail Mentality

(Following on from The Calibration Problem: Why Science Is Not Deductive)

Recently, I was talking about calibration (here and here), and how it should be more than just identifying the most likely cause of the output of a measuring instrument. The calibration process should strive to characterize the range of true conditions that might produce such an output, along with any major asymmetries (bias) in the relationship between the truth and the instrument's reading. In short, we need to identify all the major characteristics of the probability distribution over states of the world, given the condition of our measuring device.

Failure to go beyond simply identifying each possible instrument reading with a single most probable cause is a special case of a very general problem that in my opinion plagues scientific practice. Such a major failure mode should have a special name, so let's call it 'pass / fail mentality.' It is the most extreme possible form of a fallacy known as spurious precision, and involves needlessly throwing away information.

Announcing: Moral Science Index

Continuing the paradigm established by my glossary and my mathematical index, I've put together an index to and summary of the material I've accumulated on the topic of moral science. The index can be reached here, or from the link, 'Moral Science', on the right-hand side, beneath my profile.

The idea is simply to provide a point of entry for people interested in knowing what I have to say on this topic. People can see everything I have presented on this theme, the order in which the different pieces were published (and hence, approximately their dependency), a short description of each piece's function, together with some global motivating and qualifying remarks.

The relationship between science and morality represents a significant percentage of the material on my blog. It's an important (by definition) and highly overlooked topic, so I think it is important for people to have a single point of access to this material, the same way that the mathematical index provides a consolidated resource for learning about statistics, and the same way that the glossary represents the most definitive statement of my philosophy available, anywhere. (In some respects, I now view the blog as secondary to the glossary.)

I will try to keep the moral science index current - as I release more material, I'll update the index accordingly.

As always, I welcome your comments, questions, criticisms, outraged indignation, etc. If anything needs clarification, the fault is mine. If you're curious about some detail I can help with, then I'm delighted to do so (that's the whole point of the website, actually). Comments are open here and on the index itself, and alternative contact details exist on the right-hand side of this page.

Some Highlights:

For your convenience, I'll reproduce here some of the major points from the moral science page.

(1) As of the publication date of this blog post, the index stands at:

Blog entries on this topic (in order of publication):

Scientific Morality

Crime and Punishment

Is Rationality Desirable?

Practical Morality, Part 1

Practical Morality, Part 2

Glossary entries on relevant concepts:

Absolutism

Consequentialism

Morality

Rationality

(2) To disclaim any extraordinary expertise in any specific realm of moral decision:

My writing on ethics is not to prescribe how to behave, but to inform on how to know how to behave.

(3) Quoting from the overview:

The founding principle behind my writing on this blog is that there is no better method to learn about anything than science. If a thing is meaningful - has consequences - science can measure it, by virtue of those consequences....

It is often said that science has nothing to say on the matter of what constitutes moral behaviour. If correct, this leaves us with only one option: morality has no meaning, it is a non-concept. It seems to me absurdly trivial that this is not so. Anyway, only a moderate amount of reflection is required to prove it. Thus, it is equally trivial to prove that science can guide us - in fact, is the optimal guide - concerning moral prescription.

(4) Another feature on the moral science page is a short list of blog articles I expect to write on the topic in the near future, covering (in no particular order):

the correspondence, if any, between correct consequentialism and classic utilitarianism

the correspondence, if any, between correct consequentialism and political libertarianism

(Spoiler alert: the answer in both cases is, not so much.)

some necessary aspects of the nature of human decision criteria

the limited insight offered by the classic thought experiments in the philosophy of ethics

the potential for correct moral realism to significantly reduce reliance on superstition, leading to a better informed and more rationally directed society

Saturday, May 17, 2014

The Calibration Problem: Why Science Is Not Deductive

Here is perhaps the most important fact about scientific method that anybody can ever learn: the optimal course of a scientific investigation is to provide probability assignments for propositions about the universe, and when scientific method deviates from this optimum path, it is valid only to the extent that it successfully approximates this ideal. There is a simple reason for this:

We would love to be able to say that we are 100% certain about X, that Y is guaranteed to be true, or that fact Z about the universe has somehow entered my head and impressed infallible knowledge of its necessary truth on my mind, but of course, except for the most trivial propositions, none of these is possible.

Firstly, every measurement is subject to noise, so there will always be a degree of uncertainty about what caused a particular experience.

Secondly, and far more fundamentally, calibration of any instrument requires certain symmetries of physical law to be hypothesized. Here's what I mean:

Calibrating an X-ray Spectrometer - Spectral Distortion

Calibration is a process whereby a relationship is inferred between the output of some measuring instrument and the physical process responsible for that output. An instrument may be something as simple as a ruler, or something as complicated as the Human Genome Project or the Planck cosmic background survey. Calibration is fundamental to science. We might even say that it is science.

When we think about calibration, we often think simply about finding the most probable value for some physical parameter, given some reading from an instrument. In the previous part, I described this simple process for a device used to characterize the distribution of photon energies in a stream of x-rays.

But we really ought to think of calibration as more than this. To make the best inferences possible from a reading, we should formulate the entire probability distribution, not just the location of its maximum, for the state of the world when the machine goes "bing," or when the display reads "42." When the readout says 7, it's good to know that I've most probably just found a black hole (perhaps), but it's also good to know what alternative explanations there are, and what amounts of probability mass they command.

Calibrating an X-ray Spectrometer - First Steps

Recently, I've been working with a borrowed piece of equipment - an x-ray spectrometer - whose response I need to understand, so I can take measurements with it. This is a special case of the general problem of calibration, which is a crucial topic in science, so I'd like to take some time to describe the procedure I went through. As you'll see later, the problem is not fully solved yet, which I suppose illustrates the trial-and-error nature of scientific work. Regardless of the degree of ultimate success, though, the process I'll describe strikes me as a fine illustration of the basic logic of experimental science.

The Exponential Distribution

The exponential distribution holds a special significance for me. My PhD thesis was all about optical transients, the simplest mathematical models of which are exponential distributions. Currently, I work in x-ray science, which is heavily concerned with the depletion of an (x-ray) optical field as it traverses some distribution of matter (both in an object being imaged, and in the detector) - this time the exponential distribution is over space, rather than time, but the mathematics is the same.

Any kind of involvement with mathematical science quickly brings us into intimate contact with exponential functions, as these arise left, right, and centre, in the solutions of differential equations. The reason for this is related to the fact that the exponential is (up to a constant factor) the only mathematical function that is its own derivative. This is closely related to a special property of the exponential distribution, known as memorylessness (what will happen next - its rate of change - is entirely governed by the current state). So let's take a quick look into how the exponential distribution comes about, and what its major characteristics are.
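Memorylessness can be stated as P(T > s + t | T > s) = P(T > t): having already waited a time s tells us nothing about the remaining wait. The sketch below checks this two ways, using the survival function exp(-rate × t) and a simulation. The rate and the times s and t are arbitrary values chosen for illustration.

```python
import math
import random

random.seed(2)

RATE = 0.5        # arbitrary rate parameter, chosen for illustration
s, t = 1.0, 2.0   # arbitrary elapsed time and additional time


def survival(time):
    """P(T > time) for an exponential distribution with the given RATE."""
    return math.exp(-RATE * time)


# Analytic check: the conditional survival equals the unconditional one.
conditional = survival(s + t) / survival(s)
unconditional = survival(t)

# Simulation check: among samples that exceed s, count those exceeding s + t.
samples = [random.expovariate(RATE) for _ in range(200_000)]
survivors = [x for x in samples if x > s]
sim_conditional = sum(x > s + t for x in survivors) / len(survivors)

print("P(T > s + t | T > s), analytic: ", round(conditional, 4))
print("P(T > t), analytic:             ", round(unconditional, 4))
print("P(T > s + t | T > s), simulated:", round(sim_conditional, 4))
```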

Whose confidence interval is this?

This week, yet again, I was confronted by yet another facet of the nonsensical nature of the frequentist approach to statistics. The blog of Andrew Gelman drew my attention to a recent peer-reviewed paper studying the extent of misunderstanding of the meaning of confidence intervals, among students and researchers. What shocked me, though, was not only the findings of the study.

Confidence intervals are a relatively simple idea in statistics, used to quantify the precision of a measurement. When a measurement is subject to statistical noise, the result is not going to be exactly equal to the parameter under investigation. For a high quality measurement, where the impact of the noise is relatively low, we can expect the result of the measurement to be close to the true value. We can express this expected closeness to the truth by supplying a narrow confidence interval. If the noise is more dominant, then the confidence interval will be wider - we will be less sure that truth is close to the result of the measurement. Confidence intervals are also known as error bars.
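The link between noise and interval width can be sketched as follows. This is one common textbook recipe (a normal approximation for the mean of a sample), used here as an assumption and not as the only way to construct a confidence interval. The true value and the two noise levels are hypothetical.

```python
import math
import random
import statistics


def mean_confidence_interval(sample, level=0.95):
    """Normal-approximation confidence interval for the mean of a sample."""
    n = len(sample)
    centre = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    # z is about 1.96 for a 95% interval.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return centre - z * standard_error, centre + z * standard_error


random.seed(3)
TRUE_VALUE = 5.0   # hypothetical parameter under investigation

# Same measurement, two hypothetical noise levels.
quiet = [random.gauss(TRUE_VALUE, 0.1) for _ in range(50)]
noisy = [random.gauss(TRUE_VALUE, 2.0) for _ in range(50)]

quiet_interval = mean_confidence_interval(quiet)
noisy_interval = mean_confidence_interval(noisy)

print("low noise: ", [round(v, 3) for v in quiet_interval])
print("high noise:", [round(v, 3) for v in noisy_interval])
```

The noisier measurement produces the wider interval, as described above.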

The Full Adder Circuit

I recently wrote a very brief introduction to Boolean algebra for the glossary, so I thought it would be worth describing a very simple but important application example. There are two main reasons why I'm interested in Boolean algebra. The first is that in probability theory, the hypotheses we investigate are assumed to be Boolean in character (true or false, with no intermediates allowed). The second is that Boolean algebra is an important branch of logic, and therefore intimately linked to science and rationality.

In an earlier post, I discussed how all transfer of information comes down to a sequence of answers to yes/no questions. In this spirit, therefore, consider the following:

By answering only yes/no type questions, calculate the sum 234 + 111. In other words, if you were a digital computer, how would you perform this calculation?
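One way to sketch the answer is a ripple-carry adder built from one-bit full adders, each of which uses only Boolean operations. The function names below are hypothetical, and the layout (least significant bit first, eight input bits) is a choice made for this illustration and not necessarily how the full post presents it.

```python
def full_adder(a, b, carry_in):
    """Add three one-bit Boolean inputs; return (sum_bit, carry_out)."""
    sum_bit = (a ^ b) ^ carry_in
    carry_out = (a and b) or (carry_in and (a ^ b))
    return sum_bit, carry_out


def to_bits(number, width):
    """Answers to 'is bit i set?' for i = 0 .. width - 1 (least significant first)."""
    return [bool((number >> i) & 1) for i in range(width)]


def from_bits(bits):
    """Rebuild an integer from least-significant-first Boolean bits."""
    return sum(1 << i for i, bit in enumerate(bits) if bit)


def ripple_add(x, y, width=8):
    """Chain full adders, feeding each carry into the next bit position."""
    carry = False
    out = []
    for a, b in zip(to_bits(x, width), to_bits(y, width)):
        sum_bit, carry = full_adder(a, b, carry)
        out.append(sum_bit)
    out.append(carry)  # the final carry becomes the most significant bit
    return from_bits(out)


result = ripple_add(234, 111)
print(result)  # prints 345
```

Both inputs fit in eight bits, and the final carry supplies the ninth bit needed to hold 345.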

Practical Morality, Part 2

It has been said that democracy is the worst form of government, except all those others that have been tried.

Winston Churchill

(The second of two parts. Read the first installment here.)

Politics & Science

I have a funny little feeling that Churchill actually knew a small bit about politics. According to dear, old Winston, democracy sucks. But why does it suck? And does it necessarily suck?

A full analysis of these questions could run into thousands of pages, and obviously stretches far beyond any area in which I could claim expertise, but for now at least, I want to point out just one aspect of democracy's poor performance to date that can most definitely be fixed. That is, the failure so far of both politicians and the electorate to explicitly recognize the necessarily rational basis for morality.

Practical Morality, Part 1

(The first of two parts. Part 2 is here.)

The Social Contract

Wherever you are right now, take a quick look around. Do a quick survey of all the stuff you can see. Think about the number of things you have around you that other people have made. If you are in your own home, then great, the experiment works even better - the things around you probably belong to you, you make some kind of use of them, and quite possibly your life would be less satisfying without them. Some of these things may even be, if not essential for life, indispensable for a comfortable modern existence.

Saturday, September 2, 2017

Disruptive Writing Style

Friday, August 18, 2017

Sums and Differences of Random Numbers

Thursday, August 3, 2017

Standard Error

Saturday, October 31, 2015

Multi-level modeling

Saturday, April 25, 2015

Mean vs median - a careful balancing act

Saturday, April 18, 2015

The Fundamental Confidence Fallacy

Friday, December 12, 2014

Science is for Everyone

Scientism

Saturday, November 8, 2014

Probability Trees and Marginal Distributions

Saturday, September 20, 2014

Fear of Science

Saturday, May 24, 2014

Pass / Fail Mentality

Tuesday, May 20, 2014

Announcing: Moral Science Index

Saturday, May 17, 2014

The Calibration Problem: Why Science Is Not Deductive

Tuesday, May 6, 2014

Calibrating an X-ray Spectrometer - Spectral Distortion

Saturday, May 3, 2014

Calibrating an X-ray Spectrometer - First Steps

Saturday, April 26, 2014

The Exponential Distribution

Saturday, March 22, 2014

Whose confidence interval is this?

Thursday, February 27, 2014

The Full Adder Circuit

Sunday, February 2, 2014

Practical Morality, Part 2

Tuesday, January 28, 2014

Practical Morality, Part 1