Last year I wrote an essay about cladistics and parsimony. It made some rounds and then the deadline for the following month’s essay came up and I turned my thoughts to other things. I suppose this hit-and-run mentality is what we call “blogging.”
This year I took a midterm exam for the first time in more than a decade. While I did well (as one would hope that a full-time professor might), confronting the errors I made on the exam has forced me to revisit my thoughts on parsimony. Today I have to admit: I committed a small error on the exam that’s since revealed a greater error in my approach to cladistics. So, you know, mea culpa.
(That’s right! Philosophers do admit mistakes. The admissions are just expressed in languages they don’t speak.)
Here’s what you’re now getting into: first I’m going to talk about significant figures in scientific measurement and draw out their broader meaning in scientific research; then I’m going to turn back to cladistic parsimony and argue, against the ghost of my 2016 self, for an a priori reason to favor that approach in paleontology.
Some day will be my last and on that day I’ll be able to tell you the convention for significant figures in science writing. I’ll be able to do that because I forgot it on my midterm and took some knocks for the forgetting. In that sense my error was a productive one: I won’t make it again. I can’t imagine wanting to talk about this on my last day, but at least the option is there.
What is the convention? Scientists aim for precision as well as accuracy in their measurements, but different fields have different standards of precision. A geologist studying rock strata laid down over thousands of years can hardly be expected to resolve measurements to the same timescale as a particle physicist studying nearly instantaneous subatomic reactions. Hoping to standardize reporting of numbers in scientific publications, Eisenhart (1968) recommended the following: state a measurement’s potential error to two significant digits, and state the measurement itself to the same decimal place as the last digit given for the potential error.
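Read as a rounding procedure, the convention is short enough to sketch in code. What follows is my own illustration and not anything from Eisenhart's paper: the function name `report_measurement`, its parameters, and the choice of half-up rounding are all assumptions. It takes the value and its uncertainty as strings so that decimal arithmetic stays exact.

```python
from decimal import Decimal, ROUND_HALF_UP


def report_measurement(value: str, uncertainty: str, sig_digits: int = 2) -> str:
    """Round the uncertainty to `sig_digits` significant digits, then round
    the value to the same decimal place. (Hypothetical helper; half-up
    rounding is an assumption, not part of the stated convention.)"""
    v = Decimal(value)
    u = Decimal(uncertainty)
    if u <= 0:
        raise ValueError("uncertainty must be positive")

    # adjusted() is the exponent of the leading digit, so this is the
    # place value of the last digit we keep in the uncertainty.
    quantum = Decimal(1).scaleb(u.adjusted() - sig_digits + 1)
    u_rounded = u.quantize(quantum, rounding=ROUND_HALF_UP)

    # Rounding can carry into a new leading digit (0.0996 -> 0.100),
    # which would leave one digit too many; recompute in that case.
    if u_rounded.adjusted() != u.adjusted():
        quantum = Decimal(1).scaleb(u_rounded.adjusted() - sig_digits + 1)
        u_rounded = u_rounded.quantize(quantum, rounding=ROUND_HALF_UP)

    # The value is reported to the same decimal place as the uncertainty.
    v_rounded = v.quantize(quantum, rounding=ROUND_HALF_UP)
    return f"{v_rounded:f} ± {u_rounded:f}"


# Made-up raw numbers, chosen only to show the rounding.
print(report_measurement("66.04312", "0.04318"))
print(report_measurement("12.3456", "0.0996"))
```

The point of the design is that the uncertainty is rounded first and the value follows it, never the other way around.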
In enrolling for paleontology classes, my goal was not only to learn what paleontologists do, but why they do it. So I did some digging into why this should be the convention.
Alas! Like a bad field season, my digging didn’t turn much up. Eisenhart didn’t justify his recommendation. His paper is a master class in the “it’s right because it is said to be right (by me, who would never use the word ‘I’ in print)” school of science writing.
For my own part, however, I see two goals accomplished through maintenance of Eisenhart’s convention. First, it relativizes scientific measures to the appropriate degree of uncertainty in the discipline. If a discipline’s measurements are uncertain within a calculated range, then that range of uncertainty determines the discipline’s most precise measurements (i.e., the ones that ought to be reported). The second goal accomplished is to ensure that the expression of a measurement’s uncertainty is never trivial. All scientific results are uncertain to some degree and reporting of those results should reflect that.
Bear in mind that “error” and “uncertainty” are terms of art in scientific practice. They aren’t assessments of truth function or attitudes towards other beliefs. They are, in fact, measurements. When Renne et al. (2013) measured the age of the Cretaceous-Paleogene event at 66.043 ± 0.043 Ma, the uncertainty (± 0.043 million years) measured nothing about the given value of 66.043 itself, but rather the range of values that the natural world would yield. Very roughly, scientific error and uncertainty measure just how much our perceptions (i.e., statistical samples) represent the fullness of reality (i.e., populations).
This is (perhaps) another reason why Eisenhart might have thought that significant figures should be determined by uncertainty measures rather than by reported values. A realist assumes that reality determines perception and not the other way around, after all.
What I got wrong in my (relative) youth
In my parsimony post, I wrote:
…we would recognize paleontology as a distinct and matured discipline (rather than as a handmaiden to other life sciences) if we could find some kinds of information uniquely valued in paleontological research. Qualitative similarity and temporal placement seem to be two such kinds of information. Inability to account for these kinds of information also happens to be the greatest weakness with cladistic parsimony.
In other words: phylogenetic reconstruction of extinct taxa depends on features that are difficult to quantify (although some cladists have given it the old college try), and so using cladistic parsimony to reconstruct evolutionary relations puts paleontologists at a unique disadvantage. Put yourselves in a position to succeed, people!
I maintain that there are philosophical problems in defining the populations from which fossil samples are drawn. But this is not to say that phylogenetic reconstruction of extinct taxa is purely a matter of perception. Rather, it's a matter of which element of reality we're trying to describe. It might be the relations between features of the fossils themselves (as I've hinted elsewhere) or it might be the evolutionary histories of the once-living animals that left those fossils behind. I'm leaning towards the former view because extinct populations are unobservable in principle rather than in practice, but in either case our phylogenetic reconstructions attempt to capture something real.
In scientific practice measurement of reality (rather than of perception) is expressed in terms of error and uncertainty. If paleontology is to be recognized as a mature and distinctive science, then, its research outcomes must include uncertainty measures. Those uncertainty measures relativize the degree of precision possible in paleontological measurements. All well and good so far.
Cladistic parsimony is one among several analytical methods in phylogenetic reconstruction. The purpose of these methods is to quantify and to measure relatedness between taxa. But among those paleontologists who don't share my esoteric views regarding the metaphysics of fossils, the reality to be captured in phylogenetic reconstruction is not relatedness per se. Relatedness is only a proxy for evolutionary history.
What parsimony measures do that other measures don't is attempt to quantify macroevolution itself. They do this by counting the number of evolutionary changes required to generate a hypothesized evolutionary history. In so doing, the uncertainty of the analysis (that is, its relation to the unobserved, real evolutionary history) can be quantified. Without that measure, hypotheses about evolutionary history would lack an appropriate quantified context.
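For readers who want to see what that counting looks like, here is a minimal sketch. It assumes Fitch's algorithm, one standard way of finding the minimum number of changes for a single unordered character on a rooted binary tree; nothing above commits to that particular algorithm. The function name `fitch_changes`, the nested-tuple trees, and the four taxa with their character states are all made up for illustration.

```python
def fitch_changes(tree, states):
    """Count the minimum number of state changes one character needs on a
    rooted binary tree (Fitch's algorithm, bottom-up pass only).

    tree   -- nested 2-tuples, with taxon names (str) at the tips
    states -- dict mapping each taxon name to its observed state
    """
    changes = 0

    def visit(node):
        nonlocal changes
        if isinstance(node, str):      # a tip: its state is observed
            return {states[node]}
        left, right = node
        a, b = visit(left), visit(right)
        shared = a & b
        if shared:                     # the two branches can agree
            return shared
        changes += 1                   # they cannot: one change is required
        return a | b

    visit(tree)
    return changes


# Hypothetical data: one binary character scored for four taxa.
states = {"A": 1, "B": 1, "C": 0, "D": 0}
tree_1 = (("A", "B"), ("C", "D"))      # groups the taxa that share a state
tree_2 = (("A", "C"), ("B", "D"))      # splits them up

print(fitch_changes(tree_1, states))   # 1
print(fitch_changes(tree_2, states))   # 2
```

On this toy data the first hypothesis requires one change and the second requires two, so parsimony prefers the first. A real analysis sums such counts over many characters and searches over many candidate trees.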
My original argument was that paleontological hypotheses about evolutionary history would always be imprecise relative to hypotheses generated by other life sciences. That's still true: molecular biologists, for example, can work with much higher-resolution data about changes in the genome. But this is only a problem if paleontologists are held to the same standards of precision as (say) molecular biologists. I now recognize this assumption as false. Maintaining the significant figure convention ensures that measurements in a scientific discipline are held to a standard of precision appropriate to that discipline. The defense against unfair critiques of imprecision (which was my original concern) is built into reporting practices.
I still think that there are problems with the integration of paleontological and neontological data. Nothing that I've studied so far has allayed those suspicions, and if anything I'm now more convinced of that. But the harsh lesson about significant figures (delivered with the deduction of a full point of credit!) has at least tempered my views in this one respect: maybe paleontology can be more amenable to parsimony analysis than I originally thought.
- Eisenhart, C. (1968). Expression of the Uncertainties of Final Results: Clear statements of the uncertainties of reported values are needed for their critical evaluation. Science, 160(3833), 1201-1204.
- Renne, P. R., Deino, A. L., Hilgen, F. J., Kuiper, K. F., Mark, D. F., Mitchell, W. S. 3rd, Morgan, L. E., & Smit, J. (2013). Time scales of critical events around the Cretaceous-Paleogene boundary. Science, 339(6120), 684-687.