Maximum Entropy: Calibrating an X-ray Spectrometer

Recently, I've been working with a borrowed piece of equipment - an x-ray spectrometer - whose response I need to understand, so I can take measurements with it. This is a special case of the general problem of calibration, which is a crucial topic in science, so I'd like to take some time to describe the procedure I went through. As you'll see later, the problem is not fully solved yet, which I suppose illustrates the trial-and-error nature of scientific work. Regardless of the degree of ultimate success, though, the process I'll describe strikes me as a fine illustration of the basic logic of experimental science.

Digital x-ray detectors work because when x-rays are absorbed in the detector, the energy goes into liberating large numbers of electrons, which get collected in the detector's read-out circuitry. To make a detector that can record the energy of the incoming x-ray, we can exploit the fact that on average, each electron gets a certain amount of energy, so that the number of electrons liberated is proportional to the energy in the absorbed x-ray photon. This will work as long as the detector's sampling rate is high compared to the rate at which photons arrive (a condition whose violation we call 'pile-up').

Each absorbed x-ray, therefore, creates a current pulse proportional to the x-ray's energy. For a multi-channel x-ray spectrometer, each current pulse is analyzed and assigned by the electronics to one of a range of available channels, each corresponding to a particular range of energies. Each time a pulse is assigned to a channel, a counter corresponding to that channel is incremented by 1. The calibration problem for such an instrument, therefore, is to find the relationship between the channel being incremented and the energy of the photon.
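To make the channel-counting idea concrete, here is a minimal Python sketch of the sorting step. It assumes equal-width channels starting at zero energy; the function name `assign_channels`, the channel width, and the simulated line are all made up for illustration and are not taken from the real instrument's electronics.

```python
import numpy as np

def assign_channels(pulse_energies_kev, kev_per_channel, n_channels):
    """Return the per-channel counters after sorting pulses into channels.

    Assumes equal-width channels starting at 0 keV (an illustrative choice).
    """
    energies = np.asarray(pulse_energies_kev, dtype=float)
    # Each pulse lands in the channel whose energy range contains it.
    channels = np.floor(energies / kev_per_channel).astype(int)
    # Pulses outside the available channels are simply not counted.
    in_range = (channels >= 0) & (channels < n_channels)
    # bincount increments one counter per pulse, exactly like the hardware counters.
    return np.bincount(channels[in_range], minlength=n_channels)

# Hypothetical example: 5000 pulses from one emission line, smeared by detector noise.
rng = np.random.default_rng(1)
pulses = rng.normal(loc=25.27, scale=0.4, size=5000)  # keV, made-up resolution
spectrum = assign_channels(pulses, kev_per_channel=0.1, n_channels=1024)
print("busiest channel:", int(np.argmax(spectrum)))
```

Calibration then amounts to running this mapping backwards: given only the channel counters, work out what energies the photons had.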

Most often, the calibration problem is considered to consist of finding the mean (or very often, just the mode) of the probability distribution, P(energy | channel), though a more complete calibration consists of characterizing the entire distribution, not just its peak or mean. This becomes particularly important if the probability distribution is notably asymmetric.

Finding the peak is a good place to start, however. In the present case, one method to do this relies on the phenomenon of K-shell fluorescence, which I'll briefly explain. The diagram below represents the spectrum of energies available to the electrons in an atom:

The number subscripts on the right indicate the so-called principal quantum number, n. At n = 1, the electron has the lowest energy it can have for that atom - its orbit is also closest (on average) to the nucleus. Higher energy levels exist, getting closer together (energetically) as n increases, until a certain critical energy, at which the electron is no longer bound to the nucleus - the electron is liberated to wander the vacuum, hence the 'V' subscript.

The n = 1 orbital is also referred to as the K-shell, n = 2 is known as the L-shell, and so on. If an electron in the K-shell absorbs a photon and is given enough energy to exceed E_V, then the electron leaves the atom altogether. The minimum energy required for this, the difference between E₁ and E_V, is known as the K-edge.

If an atom with several electrons is ionized in this way, an electron from a higher orbital must drop down to the vacated level to restore equilibrium, and this process often produces K-shell fluorescence - the excess energy of the electron that moves down to fill the K-shell is released in the form of a photon. Most often, the relaxing electron will come from either the n = 2 or the n = 3 orbital, and these are the transitions I've marked on the diagram - the emitted light is depicted as the green oscillations. For the transition n = 2 to n = 1, a K_α photon is emitted, while relaxation from n = 3 to n = 1 produces a K_β photon.
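Hydrogen is the one atom where the level energies have a simple closed form, E_n = -13.6 eV / n², so it makes a handy illustration of the level structure described above. The sketch below uses that formula to show the gaps shrinking with n, and the energies of the n = 2 to n = 1 and n = 3 to n = 1 transitions. Heavier atoms such as tin or tungsten follow the same qualitative pattern, but their energies need screening corrections that this sketch ignores, so it should not be used to predict x-ray line energies.

```python
RYDBERG_EV = 13.6057  # hydrogen ground-state binding energy, in eV

def level_energy_ev(n):
    """Energy of hydrogen level n, relative to the vacuum level (0 eV)."""
    return -RYDBERG_EV / n**2

def transition_energy_ev(n_upper, n_lower):
    """Photon energy released when the electron drops from n_upper to n_lower."""
    return level_energy_ev(n_upper) - level_energy_ev(n_lower)

# The gap to the next level shrinks as n grows.
for n in range(1, 6):
    gap = level_energy_ev(n + 1) - level_energy_ev(n)
    print(f"n = {n}: E = {level_energy_ev(n):8.3f} eV, gap to n + 1 = {gap:6.3f} eV")

# Hydrogen's analogues of the K-edge, K_alpha and K_beta energies.
print(f"n = 1 to vacuum: {-level_energy_ev(1):.2f} eV")
print(f"n = 2 to n = 1:  {transition_energy_ev(2, 1):.2f} eV")
print(f"n = 3 to n = 1:  {transition_energy_ev(3, 1):.2f} eV")
```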

For hydrogen, the transitions terminating at the K-shell produce ultraviolet photons, and are termed the Lyman series (the Balmer series consists of the transitions terminating at the L-shell, n = 2, and so on). For larger atoms, the K-photon energies are in the x-ray range. Because of the discrete nature of the energy levels participating in these fluorescence events, a fluorescence spectrum for a pure metal will consist of a series of very sharp lines, whose energies are unvarying properties of the atoms of the metal. Here is a fluorescence spectrum I measured for a pure sample of tin using my borrowed cadmium-telluride (CdTe) spectrometer. The tin was exposed to the photon flux coming from my tungsten x-ray tube, and the fluorescence was collected in a 90° back-scattering geometry:

The energies of these peaks can be looked up (for example, in tables in this x-ray data booklet, from Lawrence Berkeley Lab) and compared to the channels at which the recorded signal peaks. Repeating this for several fluorescent metals (zinc, zirconium, and tungsten, in my case) gives a series of channel-energy pairs, which can be fitted with some calibration model using maximum likelihood, or some other method. The spectrometer I was using is quite well designed, with the consequence that a linear fitting model was suitable for finding the expected energy for photons registered in each channel.

Because the measurements are noisy, simply taking the channel at which the signal is maximum is not the best way to find the peak channel. Instead, the fluorescence spectra were fitted with reasonable line shape functions, in this case a Gaussian function for each emission line, using maximum likelihood. In each case, the fitting software I used gave an error bar for the fitted mean of each Gaussian, which gives the standard deviation of the assumed Gaussian error distribution for each inferred peak position. From this information, the following table was drawn up:

One thing to notice is that the α and β emissions actually can have substructure, such that for tungsten, three different β lines take part, though two of them are not resolved (that's why I used their average position for the calibration).

The third and fourth columns in the table give the fitted peak positions and their associated standard deviations. The known peak energies are plotted against these fitted peak channels, and fitted with a linear model. The linear model uses the a and b parameters given in the little box on the right of the main table. The sixth column in the table has the values of the linear model at each channel number in the third column.
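For readers who want to try the peak-finding step themselves, here is a rough sketch. It is not the maximum-likelihood fit done by my fitting software: it swaps in a simpler technique that relies on a Gaussian being a parabola in log(counts), and fits a weighted quadratic to the logged counts around the raw maximum. The function name, the window size, and the simulated spectrum are all assumptions for illustration, and the sketch gives no error bar on the peak position.

```python
import numpy as np

def fit_peak_log_parabola(counts, half_width=8):
    """Estimate the centre and width (in channels) of the tallest peak."""
    counts = np.asarray(counts, dtype=float)
    top = int(np.argmax(counts))
    lo = max(top - half_width, 0)
    hi = min(top + half_width + 1, counts.size)
    x = np.arange(lo, hi) - top   # channels relative to the raw maximum
    y = counts[lo:hi]
    keep = y > 0                  # log is undefined for empty channels
    # For Poisson counts, var(log N) is roughly 1/N, so the polyfit weights
    # (which multiply the residuals) are sqrt(N).
    c2, c1, _ = np.polyfit(x[keep], np.log(y[keep]), 2, w=np.sqrt(y[keep]))
    if c2 >= 0:
        raise ValueError("no peak-shaped maximum found in the fit window")
    # log G = const - (x - mu)^2 / (2 sigma^2), so c2 = -1/(2 sigma^2), c1 = mu/sigma^2.
    centre = top - c1 / (2.0 * c2)
    sigma = np.sqrt(-1.0 / (2.0 * c2))
    return centre, sigma

# Simulated emission line with Poisson counting noise (made-up parameters).
rng = np.random.default_rng(0)
channels = np.arange(200)
expected = 2000.0 * np.exp(-0.5 * ((channels - 120.3) / 4.0) ** 2)
spectrum = rng.poisson(expected)
centre, sigma = fit_peak_log_parabola(spectrum)
print(f"centre = {centre:.2f}, sigma = {sigma:.2f}")
```

The fitted centre is not restricted to whole channel numbers, which is the main advantage over just picking the channel with the most counts.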

From my earlier description of parameter estimation, the joint likelihood function for any set of model parameters, θ, can often be calculated from

P(D | θ) ∝ exp[ -½ Σᵢ (dᵢ - yᵢ)² / σᵢ² ]

where the d's are the data (the known peak energies, corresponding to the measured channels), the y's are the model values, and the σ's are the standard deviations. We can therefore maximize the likelihood function by minimizing the sum of the squared residuals, each divided by the square of its standard deviation (from column 4). These weighted residuals are in the last column, and their sum is given as the χ² parameter, in the little box. This χ² is minimized numerically (it can also be done analytically, using linear algebra), by adjusting the a and b parameters until χ² is at its minimum. The resulting fit is given in the table, and is shown in the plot below:

Each data point has been plotted with its associated error bar, but most of the error bars are smaller than the data markers.
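The weighted fit described above can be sketched in a few lines. This version takes the analytic, linear-algebra route rather than adjusting a and b numerically. The energies are approximate Kα₁ values for zinc, zirconium, tin and tungsten, but the peak channels and standard deviations are placeholders I made up for the example, not the values from my table, and the standard deviations are assumed to be expressed in energy units.

```python
import numpy as np

def weighted_line_fit(x, d, sigma):
    """Minimise chi^2 = sum(((d - (a + b*x)) / sigma)^2) over a and b."""
    x, d, sigma = (np.asarray(v, dtype=float) for v in (x, d, sigma))
    w = 1.0 / sigma**2
    # Weighted sums that appear in the normal equations.
    S, Sx, Sd = w.sum(), (w * x).sum(), (w * d).sum()
    Sxx, Sxd = (w * x * x).sum(), (w * x * d).sum()
    det = S * Sxx - Sx**2
    a = (Sxx * Sd - Sx * Sxd) / det   # intercept
    b = (S * Sxd - Sx * Sd) / det     # slope
    chi2 = (w * (d - (a + b * x)) ** 2).sum()
    return a, b, chi2

# Placeholder inputs: hypothetical peak channels and uncertainties.
peak_channel = [86.9, 158.1, 252.4, 593.6]   # made up for illustration
energy_kev = [8.64, 15.78, 25.27, 59.32]     # approx. K-alpha1: Zn, Zr, Sn, W
sigma_kev = [0.02, 0.02, 0.03, 0.05]         # made up, assumed in keV

a, b, chi2 = weighted_line_fit(peak_channel, energy_kev, sigma_kev)
print(f"a = {a:.4f} keV, b = {b:.5f} keV/channel, chi2 = {chi2:.2f}")
```

Because the model is linear in a and b, the minimum of χ² is unique, so the numerical and analytic routes should agree.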

That was the easy part. The difficulty appears when we look to see if there is any systematic distortion of a measured spectrum - that asymmetry in P(energy | channel) I was talking about. Take a look at this spectrum I measured directly for the tungsten x-ray tube:

The spectrum has quite a bit of structure, and most of it reflects the true nature of the source very well. The x-rays are produced by firing a stream of electrons at a tungsten target. In this case, the electrons are accelerated by a 100 kV potential difference (giving each of them 100 kiloelectronvolts of energy). These electrons lose energy continuously as they fly through the metal, causing a continuum of radiation to be emitted. That's the main, broad peak in the spectrum, known as 'bremsstrahlung.'

These accelerated electrons can also knock away inner electrons from the tungsten atoms, leading to rearrangement of the outer electrons, and associated fluorescence emissions, exactly as described above for photo-absorption. At a little over 10 keV, there are two sharp emission lines corresponding to the L-shell characteristic fluorescence from the tungsten in the x-ray tube. At about 59 and 67 keV, two more groups of lines appear, due to the K-shell transitions for tungsten.

There are, however, some step-like artifacts in the spectrum, at about 27 and 32 keV, which are not characteristics of the spectrum from the x-ray tube. Instead, these are properties of the detector. These energies happen to match the K-edges for the cadmium and tellurium atoms in the detector, and it's a safe bet that these step-like drops in intensity are due to fluorescent photons carrying absorbed energy out of the detector, before it gets a chance to be collected at the readout electrode. In the next part, I'll describe my efforts so far to correct such effects, by calculating sampling distributions for these and a number of other spectral distortion mechanisms that I confidently believe to be influencing the detected signal.
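As a small illustration of the escape mechanism, the sketch below works out what energy would be recorded if a K_α fluorescence photon from cadmium or tellurium carried its energy out of the detector. The edge and line energies are approximate looked-up values, the function name is made up, and the sketch is only my reading of the mechanism: it is not the correction procedure itself, and it ignores K_β escapes and the probability of escape.

```python
# Approximate values in keV for the two elements in the CdTe detector.
K_EDGE_KEV = {"Cd": 26.71, "Te": 31.81}
K_ALPHA_KEV = {"Cd": 23.17, "Te": 27.47}

def escape_energies_kev(photon_kev):
    """Energies recorded if a K-alpha fluorescence photon escapes the detector.

    Escape is only possible when the incoming photon is above the K-edge,
    since otherwise it cannot ionize the K-shell in the first place.
    """
    recorded = {}
    for element, edge in K_EDGE_KEV.items():
        if photon_kev > edge:
            recorded[element] = photon_kev - K_ALPHA_KEV[element]
    return recorded

for energy in (20.0, 30.0, 59.32):
    print(energy, "keV ->", escape_energies_kev(energy))
```

Below the cadmium K-edge nothing can escape this way, which is consistent with the steps in the spectrum appearing right at the two K-edge energies.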

Big thanks to Charles Willis and Bill Erwin at M.D. Anderson for lending me their spectrometer. It's a nice piece of kit.