Picturing Data, Narrating History

Guest blogger David Sepkoski writes...

A sketch of the author with Hallucigenia.

For well over a century, the visual “language” used for narrating patterns in the history of life has settled on a few kinds of emblematic images [note 1].  The first is the classic genealogical “tree,” made famous especially in Darwin’s and Ernst Haeckel’s evolutionary studies:

Darwin’s “tree of life” from Origin of Species, 1859.

A tree is a good kind of diagram for showing phylogenetic or cladistic relationships, but evolutionary trees like this aren’t especially useful for conveying information about quantitative changes in phenomena like diversification or extinction—a tree simply records whether a lineage is present and when it ends, without telling much about what else is happening at the time. 

A better kind of image for telling that sort of story is a line graph, which can “narrate” the kinds of patterns in data that give us an idea of what’s happening in the overall ebb and flow of the history of life.  For example, this graph depicts changes in “faunal diversity” (i.e., the number of groups of animals of different kinds alive at any given moment) among marine invertebrates over the Phanerozoic eon:

“A Kinetic Model of Phanerozoic Taxonomic Diversity: III. Post-Paleozoic Families and Mass Extinctions” Paleobiology 10 (1984).

This particular graph was published by my father, Jack Sepkoski, in 1984.  My dad built the first comprehensive electronic database for the fossil record, and this image is a good summary of the major finding of his data analysis: that life appears to have diversified in a “logistic” (s-shaped) pattern predicted by equations developed a century ago for population demography, but that this pattern has also been “perturbed” several times by major diversity drops—which we interpret to be mass extinctions.  The good news, though, is that this visual story tells us that life always seems to recover quickly afterwards.
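For readers who want to see what "logistic" means here, this is the textbook logistic growth equation from population ecology and its s-shaped solution. The symbols are illustrative assumptions rather than the notation of the 1984 paper: D is the number of taxa, r is the initial rate of diversification, K is the equilibrium diversity, and D_0 is the diversity at time zero.

```latex
\frac{dD}{dt} = r D \left(1 - \frac{D}{K}\right),
\qquad
D(t) = \frac{K}{1 + \left(\frac{K - D_0}{D_0}\right) e^{-rt}}
```

When D is small the curve rises almost exponentially, and as D approaches K the growth levels off, which gives the "s" shape.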

My dad used computers and fairly complex statistical analysis to generate his data narrative, but in fact paleontologists have been doing this kind of thing for a long time.  In 1860, the English geologist John Phillips published the first diversity graph, based on an analysis of paper data collections:

John Phillips, Life on the Earth; Its Origin and Succession  (Cambridge; London: Macmillan, 1860).

One thing you should notice is that it looks very similar, in broad outline, to the Sepkoski graph, although you need to mentally rotate the image and flip it so that it’s oriented left-to-right like Sepkoski’s diagram.  The reason for the different orientation is that in the 19th century, the standard geological practice was to orient figures vertically, with time flowing up from the bottom, in the manner of a classic depiction of an idealized stratigraphic column like this one:

Georges Cuvier’s idealization of the stratigraphy of the Paris basin, from Ossemens Fossiles (Paris, 1812).

Line graphs obviously aren’t unique to paleontology—they’re used to represent changes in data in two dimensions that can stand for anything.  The axes of a diversity graph happen to represent time and number of taxonomic groups present.  Interestingly, line graphs like this weren’t used much in any context before the mid-19th century, when they suddenly took off as the standard idiom for depicting phenomena like economic growth or population changes.  More on this later.
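To make the two axes concrete, here is a minimal sketch of how a diversity curve can be tallied from a list of stratigraphic ranges. Everything in it is a made-up illustration: the taxon names, the interval numbers, and the helper `diversity_by_interval` are hypothetical, and it is not a reconstruction of how Phillips or Sepkoski actually compiled their data.

```python
# Hypothetical stratigraphic ranges: (taxon, first interval, last interval).
# Intervals are numbered from oldest (0) to youngest; all values are invented.
RANGES = [
    ("taxon_a", 0, 2),
    ("taxon_b", 1, 4),
    ("taxon_c", 1, 1),
    ("taxon_d", 3, 4),
]


def diversity_by_interval(ranges, n_intervals):
    """Count how many taxa are present in each time interval."""
    counts = [0] * n_intervals
    for _name, first, last in ranges:
        # A taxon counts as present in every interval of its range.
        for i in range(first, last + 1):
            counts[i] += 1
    return counts


if __name__ == "__main__":
    # Crude text version of a diversity graph: one row per interval.
    for i, n in enumerate(diversity_by_interval(RANGES, 5)):
        print(f"interval {i}: {'#' * n} ({n})")
```

The resulting list of counts is the vertical axis of a diversity graph, and the interval index is the time axis.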

A third kind of diagram commonly used in paleontology is what’s called a “spindle diagram” (since it looks a little bit like the spindle of a loom):

J. John Sepkoski Jr., “A Factor Analytic Description of the Phanerozoic Marine Fossil Record,” Paleobiology 7 (1981).

This diagram was also published in the 1980s by my father, and it shows the pattern of diversification for each group of marine fossil invertebrates as the varying thickness of each spindle.  It’s another way to use data to narrate the fossil record, and a particularly effective one for showing how particular groups’ histories correlate with one another.  For example, in this diagram we can see that the trilobites (in the middle column) diversified rapidly early in the history of life, and then plummeted to sudden extinction, while the gastropods and bivalves rose to replace them as dominant marine lifeforms.

Spindles are also a fairly old visual idiom.  They’ve been popular since the early 20th century, and are sometimes called “romerograms” after the vertebrate paleontologist Alfred Romer (1894-1973), who popularized their use.  In fact, spindles go back much earlier in paleontology.  They first appeared around the 1840s, in images like this one, from the German paleontologist Heinrich Georg Bronn’s 1849 Index Palaeontologicus, a massive taxonomic data compilation and analysis of the entire fossil record:

H.G. Bronn, Index Palaeontologicus, Oder, Übersicht Der Bis Jetzt Bekannten Fossilen Organismen (Stuttgart: E. Schweizerbart, 1848).

Bronn (1800-1862) is a fascinating figure.  At a time long before the development of computers, or even probabilistic statistics, he developed a “data-driven” approach to paleontology that involved making a census of the global fossil record (by scouring taxonomic catalogs and compendia), converting that information to numerical data, and analyzing those data for patterns.  He expressed these patterns both as images like the spindle diagram above and in complex numerical tables, like this one (also from his Index) showing the relationship between fossil (extinct) and living genera of plants and animals:

H.G. Bronn, Index Palaeontologicus, Oder, Übersicht Der Bis Jetzt Bekannten Fossilen Organismen (Stuttgart: E. Schweizerbart, 1848).

How Bronn came to this approach to the history of life—which was highly unusual for its day—is an interesting story.  While at the time of his death in 1862 Bronn was considered one of the leading paleontologists in Europe, he wasn’t originally trained as a paleontologist or geologist.   Rather, his background was in the science of “cameralism,” an 18th and early 19th century approach to rational state administration popular in central and northern Europe.

Cameralists were famous—and infamous—for collecting massive amounts of data about state populations, economic products, natural resources, and the like, and publishing them in mind-numbingly boring tables.  The idea was that this data could tell them—and the kings and princes they worked for—something useful about how to maximize profit and resources, but cameralists developed a fairly deserved reputation for fetishizing data accumulation without having much idea of what to do with it.

Bronn, however, was a member of a more cutting-edge school of cameralism at the University of Heidelberg, where he received his degree and joined the faculty in the early 1820s.  Heidelberg cameralists objected to this more limited ambition for state statistics, and instead promoted an approach they described as “statics,” in which statistical data collection was accompanied by analysis in an attempt to determine regularities or even empirical “laws.”  Bronn taught this approach in courses on agriculture and forestry throughout his career. 

At the same time, he developed an interest in geology and paleontology, which over the course of a decade went from being a sideline to the main focus of his research.  After a fossil collecting trip to Italy and the Alps, Bronn became interested in a central problem that occupied European geologists: how to date the earth’s strata relative to one another.

William Smith, A Memoir to the Map and Delineation of the Strata of England and Wales, With Part of Scotland (London: John Cary, 1815).

The science of stratigraphy had been established more than a decade earlier thanks to the work of geologists like Cuvier and the English surveyor William Smith, whose famous geological map of England demonstrated that the earth’s distinctive layers are “universal”—that is, they appear in the same order and arrangement all over the globe.  The strata themselves were identified on the basis of the composition of the rock (e.g., granite, sandstone, etc.) and the “characteristic” fossils