Colorful Betting Practices


Joyce writes…

We are all wrong sometimes, and it is a mark of a great thinker and good person to be honest about the intellectual mistakes one has made.  Our colleague here at Extinct, Derek Turner, has a fantastic (2016) paper in which he not only admits to being wrong, but also explores what can be learned from his error.

In earlier (2005, 2007) work, Derek had predicted that we would probably never determine or know what the colors of long-extinct dinosaurs once were.  This turned out to be an erroneous thing to say at precisely the wrong time—just as significant advances in the fields of molecular paleontology (see Schweitzer 2003; Gilbert, Bandelt, Hofreiter, and Barnes 2005; Willerslev and Cooper 2005; alternatively called molecular taphonomy) and experimental taphonomy (see Briggs 1995; Raff et al. 2006) had been, were being, and were about to be made.

Shortly thereafter, the first molecular (rather than speculative) work on long-extinct avian and dinosaur coloration was published (e.g., Vinther et al. 2008, 2010; Zhang et al. 2010), and other philosophers of science did not hesitate to point out that Derek had been proved wrong (e.g., Jeffares 2010; Stanford 2010; Cleland 2011).  In his (2016) paper, Derek asks: what are the implications of this failed epistemic bet?  He considers several such implications: for taking epistemic bets on science; for adopting either optimism or pessimism about the historical sciences; for making predictions at different temporal scales; for generating self-fulfilling pessimistic prophecies; and more. 

It is a fantastic paper, on a neglected topic, and one that is filled with plenty of novel and compelling work.  (I agree with Derek’s claim that his failed epistemic bet supports neither optimism nor pessimism about the historical sciences, for instance.)  I am going to set a lot of that lovely work aside, however, in order to focus on two main points of contention.

(1) Philosophers of science are sometimes concerned with the problem of underdetermination (Duhem 1906; Quine 1951).  Underdetermination occurs when it is not possible to discriminate between rival scientific hypotheses on the basis of the evidence (for an excellent introduction to this topic, please see the relevant SEP article by P. Kyle Stanford).  There are global underdetermination problems and local underdetermination problems.  As Derek puts it in an earlier (2005) piece, “Whereas local underdetermination problems arise during the course of scientific inquiry, global underdetermination problems are imposed upon science by philosophers” (Turner 2005, 219).  Local underdetermination problems constrain scientific investigation even within the bounds of what scientists aim to know.

Throughout his recent (2016) article, Derek maintains that “it was probably correct to say, pre-2008, that dinosaur coloration was an example of a local underdetermination problem” (Turner 2016, 64).  But I am not sure about this.  The claim rests on what Derek calls condition (d) for identifying local underdetermination problems: “Background theories give us some reason to think that H and H* are also strongly empirically equivalent” (63).

Derek thinks condition (d) is quite weak.  This is because, applying it to the case at hand, all Derek has to establish is that, pre-2008, differing hypotheses about the colors of dinosaurs “are (or would be) equally well supported by all the empirical evidence that will ever be available to us” (Turner 2005, 217).  He argues in his (2016) paper that unexpected methodological innovations and startling taphonomic revelations are jointly responsible for the post-2008 change in our epistemic situation.  Prior to those innovations and revelations, Derek thinks that background theory did indeed provide us with “some reason” to believe that hypotheses about dinosaur coloration were empirically equivalent.

I agree that, pre-2008, some background theory provided us with “some reason” to believe that hypotheses about dinosaur coloration were empirically equivalent.  But I also think that, pre-2008, other background theory provided us with “some reason” to believe that hypotheses about dinosaur coloration were not empirically equivalent.  So, I am not sure that condition (d) is satisfied—it depends on the scope of the “background theories” being characterized by the condition.  Is the condition met by considering just some background theory?  Or must all relevant background theory be considered?

Here is some other, potentially relevant, pre-2008, background theory: scientists have known for a long time that certain molecules are more stable than others, and that pigments and dyes can be especially long-lasting.  Think for a second about some of the most prized coloration agents, how they work, and why they have been prized for so long—throughout human history, and since well before the advent of modern chemistry.  Think of fabrics with colors still bright after many washings, and manuscripts radiant with illustration despite the passing of centuries.

Now consider melanin, an especially important molecular component of animal pigmentation.  Melanin comes in three basic types: black/brown eumelanin, dark brown neuromelanin, and yellow-to-red pheomelanin.  Melanin is essentially insoluble, and eumelanin is especially stable (Liu and Simon 2003).  The special stability of eumelanin means that hypotheses about black/brown coloration patterns in animal pigmentation are not empirically equivalent to hypotheses about lighter coloration patterns.  Empirical evidence of darker coloration is more likely to be available than is evidence of lighter coloration, because of the relative differences in molecular stability.

So, I think Derek’s claim—that pre-2008 background theory gives us “some reason” for thinking we will probably never know about dinosaur coloration—attends to some relevant background theory (of the obviously paleontological variety) while neglecting other relevant background theory (of what might be termed the physical or biochemical variety).  I think this point has implications for the general project of establishing local underdetermination problems, as it makes it harder than expected to satisfy the supposedly weak condition (d).  But I look forward to hearing what others think about this.

(2) Now, I want to transition to my second point of contention, by discussing what might be termed a sort of “stability gradient” for ancient biomolecules.  At one end of the potential-for-preservation spectrum—the favorable end—are certain structural macromolecules (like lignin) and some lipids (like carotenoids, steroids, and triterpenoids).  At the other, unfavorable end of the potential-for-preservation spectrum are nucleic acids (like DNA) and many proteins.  In between are many aromatics and carbohydrates (like cellulose and chitin).  Of course, the set-up of a simple spectrum or gradient like this is complicated by molecular idiosyncrasies, the potential for contamination, what are called “cross-linking processes,” and many more factors (please see Briggs and Summons 2014 for an excellent introduction to ancient biomolecules and their preservation).

Derek lost his epistemic bet against us ever knowing about dinosaur coloration in part because certain biological components and structures are quite a bit more stable than others.  As it turns out, traces of eumelanin can last for hundreds of millions of years (e.g., Tanaka et al. 2014), so placing a bet specifically against us ever knowing about the color of dinosaurs is an especially unfavorable move.  We can use the stability gradient along with other bits of related background theory to gauge the likelihood of scientific progress being made on other aspects of dinosaur physiology as well.

Consider the possibility of future work on dinosaur endocrinology.  Several factors work in favor of these efforts: the location of steroids on the stability gradient, the general allure of hormones, and the excitement generated by claims of ever-more-ancient molecules.  But other factors work against the ability of scientists to ever detect and study dinosaur hormones: the relative scarcity of these molecules, their small size, and the fact that they are not so densely packed into particular, protective locations (the way melanin is packed into melanosomes).  I think we should expect plenty of further work on questions of dinosaur physiology, and that particular aspects of dinosaur physiology (such as coloration or endocrinology) will probably be differentially targeted due to differences in relative epistemic accessibility.

And I think we should expect this work on dinosaur physiology to continue even if such questions seem narrow or silly to us.  One theme of Derek’s (2016) paper is the supposed (dare I say it!) triviality of work on dinosaur coloration, especially in contrast with purportedly grander paleontological work on larger-scale questions.  In the introduction he writes that “Inferring the colors of the dinosaurs is not too relevant to the big questions about evolutionary patterns and processes that many paleontologists care most about” (60), and in the conclusion he writes that “Figuring out the colors of the dinosaurs is somewhat peripheral to paleontologists’ efforts to reconstruct the big picture of evolutionary history” (67).

But in between, Derek also acknowledges that “It’s plausible that our epistemic resources inform our judgments about what counts as interesting. In cases where we know we have no scientific tools that give us any traction we might be more likely to dismiss questions as trivial or uninteresting.  On the other hand, the fact that we do have tools that give us some empirical traction with respect to some question can make that question seem interesting and important, if only because it affords us an opportunity to put our epistemic tools to work” (65).

I would like to suggest (as my second point of contention) that our epistemic resources might inform our judgments about what counts as interesting to a much stronger extent than Derek countenances here—and I want to use his own epistemic interests to support my conjecture.  The “big questions about evolutionary patterns and processes” that Derek seems so keen on—the ones he considers constitutive of paradigmatic paleontological concern—are questions whose ascendance dates back to the paleobiological revolution of the 1960s and 1970s (see Sepkoski 2012 for more).  And the asking of those questions, at that time, was driven by a methodological revolution in modeling capabilities.

So I just want to playfully enquire: are we sure that questions about dinosaur coloration are (merely, contingently) interesting because of their empirical traction, while questions about larger-scale paleontological phenomena are (more than merely, independently) interesting, despite their parallel emergence from a comparable period of enhanced empirical traction?  Note that we are already seeing work on dinosaur coloration extended to ecological hypotheses (e.g., pattern of coloration on dinosaur tail indicates residence in open rather than closed habitat, challenging regional assumptions of predominately forest ecology; Smithwick, Nicholls, Cuthill, and Vinther 2017).

In his (2016) paper, Derek cautions against adopting a no-betting policy, even though his own epistemic bet regarding dinosaur coloration failed.  This allows me to shamelessly place a pair of epistemic bets of my own—one based on each of the two main points of contention outlined here.  (1) I bet that asking narrow, physiological questions will only become more popular in upcoming paleontological practice, and that we can use background theory in biochemistry, experimental taphonomy, and molecular paleontology to gauge the epistemic accessibility of particular physiological questions.  And (2) I bet that those narrow, physiological questions will start to seem ever more interesting and central to paleontologists—just as the “big” questions started to seem ever more interesting and central to paleontologists, as their ability to ask and answer them grew.

Adrian writes…

Derek’s paper has influenced me, like, a lot, and I think it’s a great example of how to carry out (his term) philosophical error analysis.  When we philosophers muck up, instead of using our well-honed argumentative skills to in some sense double down on the error, why not just admit the mistake and turn those skills to figuring out why the mistake was made, and what the philosophical upshots of it are?  (Also, to be quite frank, it bugged me how unreflectively some philosophers were willing to dump on Derek’s bad bet: yeah, it was an ironic turn of events, but as Derek points out it is very unclear what the philosophical upshots are supposed to be – bagging on philosophers when they stick their necks out and get unlucky is hardly a good way to foster productive, risky work.)  In addition to the error-analysis, what made Derek’s paper important for my own intellectual development was a big-picture upshot he draws when considering the nature of bets about science’s future.  Against the idea that we should be agnostic regarding inferences about future scientific success (or failure), he doesn’t simply point out that scientists themselves need to make such bets, but opens the door to this betting being a properly-speaking epistemic activity that is central to scientific practice.  Why is this important?  Derek captures it well:

Most of the recent work done by philosophers of historical science has focused on the ways in which scientists confirm or disconfirm claims about the past. (61)

If you go read most of the work we philosophers of historical science do, the questions we’re interested in are things like ‘why believe this is true’, ‘what evidence is there for this hypothesis’ and so on…  These are, no doubt, important questions, especially if you want to know when you should believe in a hypothesis.  But they in no way exhaust the epistemology of historical science:

But what about the conclusions that historical scientists draw about the future?  Historical scientists and the institutions that fund their work have to make decisions about which questions are worth pursuing and which are best left for another day, or bypassed completely. (61)

Derek is suggesting we shift our analysis from what has been called the context of justification (what is the evidential relationship between some set of scientific observations or data and some set of scientific hypotheses?) to the context of pursuit (which hypotheses should I examine further?).  And the context of pursuit is, in my view, a thrilling prospect for us philosophers of science.  First, it relatively smoothly allows the discussion of (what are traditionally thought of as) non-epistemic values alongside epistemic values.  One reason for pursuit is ‘I reckon this might be true’, while another is ‘I’m likely to get funding if I do this’, and another is ‘if I can answer this question it will truly help the world’, for instance.  Considerations of pursuit involve re-orientating our conception of values in science.  Second, it allows us to think about science in terms of resource distribution.  Given my available resources, how should I spend my scientific time to maximize bang-for-my-buck?  Third, it highlights the problems with thinking about science in terms of resource distribution: just what is it to maximize scientific bang-for-my-buck?  What counts as ‘bang’?  Fourth, it allows us to analyze scientific research strategies and the skills involved in picking and developing those strategies.  How do scientists make decisions about pursuitability?  Is there a kind of skill or rationality involved, or are they simply buffeted by the winds of fate and fashion?  I’ll return to this final point in my discussion of Joyce’s discussion of Derek’s paper…

A big part of Derek’s pessimism about our knowledge of the past is drawn from a kind of optimism about our background theories.  He thinks that our knowledge of processes like fossilization is solid, and moreover grants solid grounds for pessimistic bets concerning the fossil record.  We know it is super hard for biological squishy bits to fossilize, so we shouldn’t expect much help from the fossil record vis-à-vis squishy bits.  I and others have responded to Derek by saying that he’s making a mistake by betting against the ingenuity of future science.  I’ve in particular agreed that we shouldn’t expect our basic knowledge of fossilization to change, but argued that there are plenty of examples of new kinds of preservation being uncovered.  Even if our existing stock of background theory doesn’t change, this doesn’t mean that the stock won’t increase.  I think Joyce’s point is original and interesting here.  While the focus of the argument has been on what might change in the future, she points out that actually working out which knowledge is relevant for making such bets is really tricky.  Even if our knowledge from taphonomy gives us little reason to think color is preserved, our understanding of pigment in animals and the kinds of structures involved might.  Who is to say that there isn’t some area of science that you hadn’t thought of where, if you were to look, you’d see lots of reason for hope in our uncovering past knowledge?  And, I want to add: paleontologists are often really good at hunting this stuff down, which brings me to Joyce’s second point.

Joyce suggests that what explains the pursuitworthiness of hypotheses in historical science is not really the importance of the questions (their significance) but that we’ve the goods required to make some progress on those questions.  Derek is quick to point out that many of the really big, important questions in paleontology are not really the domain of vertebrate paleontology but of invertebrate paleontology: only with those great, big databases of inverts can we really get an empirical grip on macro-evolutionary process and pattern.  But why think that those questions are more important or significant than the color of dinosaurs?  Arguing about scientific significance ain’t easy.  But moreover, Joyce points out, there’s another explanation.  Scientific questions get interesting and exciting when the ‘epistemic resources’ available make those questions accessible, answerable.  No surprise that macroevolution came to the fore once we had the tech-game to run computational simulations of those processes (and then later the databases to couple these with empirical studies).

We might worry that Joyce’s suggestion has a whiff of technological determinism about it.  Technological determinism is the idea that history is driven by technology: that social, political and economic movements can all be understood as reacting to changes in tech.  That’s a very unpopular idea in history concerned with the social, political and economic spheres, but perhaps might be a bit more tempting in science.  No doubt, the development of computational techniques transformed how we might think about and study the deep past.  But a strict technological determinism I find really unattractive for science as well: for one thing, it deemphasizes the role that wider society plays in how science develops; for another (and more relevantly here), it deemphasizes the skills, practices and hunches that I suspect drive pursuit-decisions in sciences like paleontology.  Another way of putting this latter idea is that technological determinism denies the autonomy and creativity of scientific communities.  Happily, the term ‘epistemic resources’ need not just mean the technologies at the scientists’ disposal; it can also mean their skill and training in figuring out what is pursuitworthy.

Something which strikes me about paleontologists is their often highly creative, highly opportunistic (‘methodologically omnivorous’) approach to pursuit.  The sense I’ve often got is that they are attracted to hypotheses not because they find them plausible, but because they get the sense that ‘if I test that hypothesis, cool stuff will happen’.  The upshots are often not direct, but indirect.  Quite often, the hypothesis being tested turns out to be false, but this doesn’t mean that the only epistemic benefit is knowing that something isn’t true: often new techniques, perspectives, and understandings arise from the process of exploring the ultimately false hypothesis.  This point, I think, shows how approaching historical science from the perspective of justification fails to understand paleontological practice.  Justification leads us to narrowly ask ‘well, is the hypothesis true or not?’ but in pursuit, we ask ‘what do we get from studying this hypothesis (or using this technique, or doing this fieldwork, etc…)?’ and I think it is this latter question which more often drives paleontological practice, and paleontological success.

For me, then, Derek’s paper is a timely intervention in how we philosophers think about paleontology in particular and science in general: let’s shift from justification to pursuit!


Briggs, D. E. G. 1995. Experimental taphonomy. PALAIOS 10(6): 539–550.

Briggs, D. E. G.; Summons, R. E. 2014. Ancient biomolecules: Their origins, fossilization, and role in revealing the history of life. Bioessays 36: 482–490.

Cleland, C. 2011. Prediction and explanation in historical natural science. British Journal for Philosophy of Science 62(3): 551–582.

Duhem, P. 1906. La Théorie Physique, son objet et sa structure. Paris: Chevalier et Rivière.

Gilbert, M. T. P.; Bandelt, H.-J.; Hofreiter, M.; Barnes, I. 2005. Assessing ancient DNA studies. TRENDS in Ecology and Evolution 20(10): 541–544.

Jeffares, B. 2010. Guessing the future of the past. Biology and Philosophy 25: 125–142.

Liu, Y.; Simon, J. D. 2003. Isolation and Biophysical Studies of Natural Eumelanins: Applications of Imaging Technologies and Ultrafast Spectroscopy. Pigment Cell Res 16: 606–618.

Quine, W. V. O. 1951. Two Dogmas of Empiricism. The Philosophical Review 60(1): 20–43.

Raff, E. C.; Villinski, J. T.; Turner, F. R.; Donoghue, P. C. J.; Raff, R. A. 2006. Experimental taphonomy shows the feasibility of fossil embryos. PNAS 103(15): 5846–5851.

Schweitzer, M. H. 2003. The future of molecular paleontology. Palaeontologia Electronica 5(2)/e2: 1–11.

Sepkoski, D. 2012. Rereading the fossil record: The growth of paleobiology as an evolutionary discipline. Chicago: University of Chicago Press.

Smithwick, F. M.; Nicholls, R.; Cuthill, I. C.; Vinther, J. 2017. Countershading and Stripes in the Theropod Dinosaur Sinosauropteryx Reveal Heterogeneous Habitats in the Early Cretaceous Jehol Biota. Current Biology 27: 3337–3343.

Stanford, P. K. 2010. Getting real: The hypothesis of organic fossil origins. Modern Schoolman LXXXVII: 219–241.

Stanford, P. K. 2017. Underdetermination of Scientific Theory. The Stanford Encyclopedia of Philosophy.

Tanaka, G.; Parker, A. R.; Hasegawa, Y.; Siveter, D. J.; Yamamoto, R.; Miyashita, K.; Takahashi, Y.; Ito, S.; Wakamatsu, K.; Mukuda, T.; Matsuura, M.; Tomikawa, K.; Furutani, M.; Suzuki, K.; Maeda, H. 2014. Mineralized rods and cones suggest colour vision in a 300 Myr-old fossil fish. Nature Communications 5(5920)/6920: 1–6.

Turner, D. D. 2005. Local underdetermination in historical science. Philosophy of Science 72: 209–230.

Turner, D. D. 2007. Making prehistory: Historical Science and the scientific realism debate. Cambridge: Cambridge University Press.

Turner, D. D. 2016. A second look at the color of the dinosaurs. Studies in History and Philosophy of Science 55: 60–68.

Vinther, J.; Briggs, D. E. G.; Clarke, J.; Mayr, G.; Prum, R. O. 2010. Structural coloration in a fossil feather. Biology Letters 6: 128–131.

Vinther, J.; Briggs, D. E. G.; Prum, R. O.; Saranathan, V. 2008. The colour of fossil feathers. Biology Letters 4: 522–525.

Willerslev, E.; Cooper, A. 2005. Ancient DNA. Proceedings of the Royal Society B 272: 3–16.

Zhang, F.; Kearns, S. L.; Orr, P. J.; Benton, M. J.; Zhou, Z.; Johnson, D.; Xu, X.; Wang, X. 2010. Fossilized melanosomes and the colour of Cretaceous dinosaurs and birds. Nature 463(25): 1075–1078.