Fossil cross-validation of the dated ant phylogeny (Hymenoptera: Formicidae).

Abstract--As molecular clock methods become more widely used, it has become apparent that careful consideration of fossil minimum calibrations is essential. Not only is it necessary to be certain of the taxonomic identity of the fossils and their correct placement within the phylogenetic tree; recent studies have also suggested that when multiple fossils are available, conflict among fossils must be taken into account. In this study we investigate whether any of the 43 fossils used by Moreau, et al. (2006) are "inconsistent" and how this affects the results of molecular clock dating analyses and inferred diversification patterns. After considering each of the 43 fossils in turn, following the methods of Near and colleagues (Near & Sanderson, 2004; Near et al., 2005), we found that five fossils are considered to be "inconsistent." After removing these fossils and reanalyzing the data, we found that excluding these minimum age fossil calibration points did not have a considerable effect on the results. Comparisons of lineages-through-time plots demonstrate that not only are similar ages recovered, but also that the previously inferred significant shift in diversification rates within the ant phylogeny is not an artifact of the "inconsistent" fossils. These findings suggest that all available fossil information should be included in molecular clock analyses.


Recent advances in molecular phylogenetic analysis have permitted divergence dating for numerous lineages or clades through the use of molecular clock methods (Sanderson, 1997, 2002; Thorne et al., 1998; Huelsenbeck et al., 2000; Drummond et al., 2006). To calibrate the timing of divergence for lineages of interest, information from fossils, geologic events or rates of molecular evolution may be incorporated. This information coupled with molecular clock analyses facilitates the placing of a timeline on the origin of lineages and can also be used for further testing of evolutionary hypotheses. Although only one form of calibration is needed to conduct most molecular clock analyses, given the apparent heterogeneity of rates among lineages (e.g., Smith and Donoghue, 2008), a more ideal situation is when more than one minimum or maximum calibration can be incorporated into the analysis (Ochman and Wilson, 1987; Marshall, 1990; Soltis et al., 2002; Graur and Martin, 2004; Near et al., 2005; Bell et al., 2010). Although few would argue having more information is usually better, this may not be the case if the data are in conflict or if the calibration points were inaccurately placed on the phylogeny.

Conflict involving a particular fossil calibration may arise in many ways. The fossil calibration could be incorrectly placed within the phylogeny, the phylogenetic relationships may be incorrectly inferred so that the fossil is placed on the wrong node within the topology, the fossil could be incorrectly identified, or the geologic strata in which the fossil was found could be incorrectly dated, among other causes (Hug & Roger, 2007; Hugall et al., 2007; Rutschmann et al., 2007; Heath et al., 2008; Marshall, 2008; Parham and Irmis, 2008). Some of these concerns can be addressed by careful examination of the fossils to ensure they are not only correctly identified, but that they are also correctly placed within the phylogeny. Concerns over the inference of the phylogeny and the resulting inferred relationships of the included taxa can be a major source of problems. Recent Bayesian relaxed clock methods offer a promising opportunity to account for uncertainty in phylogenies by simultaneously estimating the topology and branch lengths (Drummond et al., 2006).

Near and colleagues (Near & Sanderson, 2004; Near et al., 2005) highlighted the potential problem of conflicting multiple fossil calibration points. To address the issue of when multiple nodes across a phylogeny are constrained as minimum or maximum ages based on fossil or other data and they are in disagreement, Near and colleagues (Near & Sanderson, 2004; Near et al., 2005) proposed the "fossil cross-validation" method to identify which if any fossils generate inconsistent, and potentially erroneous, molecular age estimates. Once these inconsistent fossils have been determined, the authors advocate that these fossils be excluded and the analysis performed with all remaining non-conflicting fossils included.

The fossil cross-validation method (Near & Sanderson, 2004; Near et al., 2005) is performed on a previously inferred phylogenetic tree: 1) each fossil calibration is fixed in turn, then 2) the difference between the molecular and fossil age estimates is calculated for all other fossil-dated nodes. This two-step method aims to identify and remove inconsistent fossils from the analysis. To identify potential "inconsistent" fossils, we used the average [D.sub.x] statistic of Near, et al. (2005) as a heuristic. Since fossil calibrations (or constraints) are minimum age estimates for nodes, we determined that fossils were inconsistent if average [D.sub.x] values were negative. That is, these are calibrations that consistently yielded molecular age estimates that were younger (more recent) than their fossil age estimates.
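
As a reading aid, the average [D.sub.x] statistic can be written out as follows. This is a sketch based on our reading of Near, et al. (2005); the symbols MA (molecular age), FA (fossil age) and n (number of fossil-dated nodes) are notation assumed here for illustration and do not appear in the text above.

```latex
% With calibration x fixed, D_i is the difference between the molecular (MA)
% and fossil (FA) age estimates at every other fossil-dated node i:
D_i = MA_i - FA_i , \qquad i \neq x
% The average over the remaining n - 1 fossil-dated nodes is
\bar{D}_x = \frac{\sum_{i \neq x} D_i}{n - 1}
% Under the heuristic used in this study, fossil x is flagged as
% "inconsistent" when \bar{D}_x < 0.
```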

Moreau, et al. (2006) published the first large-scale molecular phylogeny of ants (Hymenoptera: Formicidae) based on 4.5 kilobases of sequence data from six gene regions for 139 ant genera. From these data relationships among the major ant lineages were inferred, demonstrating that of the 19 subfamilies included in the analysis all were recovered as monophyletic with the exception of Cerapachyinae. In addition, Moreau, et al. (2006) incorporated 43 ant fossils as minimum age calibrations for divergence dating analyses. To account for the fact that 12 of the 43 fossils are from formations of uncertain stratigraphic ages, Moreau, et al. (2006) performed two separate molecular clock analyses. The first analysis used the minimum age for each of the formations to which the 12 fossils belonged plus all 31 remaining fossils, with a maximum constraint on the root age for all ingroup and outgroup taxa (excluding Apis mellifera) at 200 million years ago (Ma) (minimum fossil ages dataset). The second analysis used the maximum age for the same formations for the 12 fossils plus all 31 remaining fossils, with a maximum constraint on the root age for all ingroup and outgroup taxa (excluding Apis mellifera) at 250 Ma (maximum fossil ages dataset). These analyses yielded a range of dates for the origin of the extant ant lineages (140-168 million years ago). In both analyses the complete set of fossils was incorporated as minimum age calibrations without taking into account whether any of the fossils were in conflict within the phylogeny. Based on these molecular clock and lineages-through-time (LTT) divergence time analyses, Moreau, et al. (2006) found that much of the diversification of the major ant lineages occurred from the early Paleocene to the late Cretaceous (60 to 100 Ma) and may be correlated with the rise of the flowering plants (angiosperms).

To test if any of the 43 fossils used by Moreau, et al. (2006) are potentially in conflict with one another, we performed Near, et al.'s fossil cross-validation procedure on all 43 fossils under both the minimum and maximum datasets. We compare these results to those obtained by Moreau, et al. (2006) and assess whether removing "inconsistent" fossils affects the inferred patterns of diversification across the ants.


The maximum likelihood topology of Moreau, et al. (2006) was used for testing inconsistent fossil calibrations [all files from the Moreau, et al. (2006) paper can be downloaded from]. The original dataset was composed of 4.5 kb of sequence data from five nuclear and one mitochondrial gene from 139 ant genera and six Hymenoptera outgroups (Moreau et al., 2006). Fossil calibrations and age constraints for all 43 fossils follow those outlined in Moreau, et al. (2006) with fossils used as minimum ages for the lineage to which it belongs plus the sister lineage. Molecular clock analysis was performed using the penalized likelihood method (Sanderson, 2002) as implemented in the software package r8s v1.7 (Sanderson, 2003).

Following the methods of Near and colleagues, the fossil cross-validation analysis (Near & Sanderson, 2004; Near et al., 2005) was performed on the previously inferred maximum likelihood phylogenetic tree. Each fossil-dated node was fixed in turn, and the difference between the molecular and fossil age estimates was calculated for all other fossil-dated nodes. To determine if any of the observed differences were significant, potentially demonstrating that the calibration point is in conflict with the other fossil calibrations, we used the average [D.sub.x] statistic of Near, et al. (2005) as a heuristic. Since fossil calibrations (or constraints) are minimum age estimates for nodes, we determined that fossils were inconsistent if average [D.sub.x] values were negative.
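
The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in this study: the function names, the input layout and the three-node toy ages are all hypothetical. In practice, each set of molecular estimates would come from a separate dating run (for example in r8s) with a single calibration fixed.

```python
def average_dx(fossil_ages, molecular_ages_given):
    """Average D_x for each calibration x.

    fossil_ages: dict mapping node name -> fossil age (Ma).
    molecular_ages_given: dict mapping x -> {node: molecular age (Ma)},
        the ages estimated for the other nodes when calibration x is fixed.
        (Hypothetical layout chosen for this sketch.)
    """
    result = {}
    for x, estimates in molecular_ages_given.items():
        # Difference between molecular and fossil age at every node except x.
        diffs = [estimates[i] - fossil_ages[i] for i in fossil_ages if i != x]
        result[x] = sum(diffs) / len(diffs)
    return result


def flag_inconsistent(avg_dx):
    """Heuristic from the text: a calibration is flagged if its average D_x < 0."""
    return sorted(x for x, d in avg_dx.items() if d < 0)


# Toy example with made-up ages (not data from this study).
fossil_ages = {"A": 10.0, "B": 20.0, "C": 50.0}
molecular_ages_given = {
    "A": {"B": 25.0, "C": 60.0},  # estimates when A is fixed
    "B": {"A": 12.0, "C": 55.0},  # estimates when B is fixed
    "C": {"A": 6.0, "B": 15.0},   # estimates when C is fixed
}
avg = average_dx(fossil_ages, molecular_ages_given)
flagged = flag_inconsistent(avg)
```

In this toy case only calibration "C" has a negative average, so it is the only one flagged.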

Once the "inconsistent" fossils were determined using the fossil cross-validation method, these fossils were excluded from the final molecular clock analyses. Again the maximum likelihood topology (Moreau et al., 2006) was used for the divergence dating of the "inconsistent fossils removed" dataset using r8s (Sanderson, 2003). Like Near, et al. (2005), we assessed the significance of the change in variance before and after "inconsistent" fossils were removed by using a one-tailed F-test with n-1 degrees of freedom, where n is the number of nodes in a rooted tree.
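
A one-tailed F-test on a ratio of variances can be sketched as below, using only the Python standard library. Several points are assumptions made for illustration: the function names are hypothetical, the variances are taken as already computed (the text does not spell out exactly how they are derived from the [D.sub.x] values), and the upper-tail probability is obtained by simple midpoint integration, which is adequate for degrees of freedom of 2 or more.

```python
import math


def f_survival(f_stat, df1, df2, steps=20000):
    """Upper-tail probability P(F >= f_stat) for an F(df1, df2) distribution.

    Uses the identity P(F >= f) = I_x(df2/2, df1/2) with
    x = df2 / (df2 + df1 * f), where I_x is the regularized incomplete
    beta function, evaluated here by midpoint integration.
    """
    if f_stat <= 0:
        return 1.0
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_stat)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h  # midpoint of the k-th sub-interval of (0, x)
        total += math.exp(
            log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t)
        )
    return total * h


def variance_reduction_test(var_all, var_reduced, n_all, n_reduced):
    """One-tailed test of whether removing fossils reduced the variance.

    Degrees of freedom are n - 1 for each treatment, as stated in the text.
    Returns the F statistic and its one-tailed p-value.
    """
    f_stat = var_all / var_reduced
    return f_stat, f_survival(f_stat, n_all - 1, n_reduced - 1)
```

A small p-value would indicate that the variance dropped significantly after the fossils were removed; a large one indicates no significant reduction.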

To visualize the effect of removing fossils that were deemed to be inconsistent, the ultra-metric trees obtained from the penalized likelihood analyses were used to calculate proportional non-log transformed lineages-through-time (LTT) plots for the ants. These LTT plots were then compared to those recovered by Moreau, et al. (2006) where all 43 fossils (including those deemed "inconsistent" in this study) were included in the divergence time analyses.
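
A proportional, non-log-transformed LTT curve can be derived from the internal node ages of an ultrametric tree. The sketch below assumes a fully bifurcating tree, so that each split adds exactly one lineage; the function name and the example ages are hypothetical and are not taken from the analyses reported here.

```python
def ltt_points(node_ages):
    """Proportional lineages-through-time points for a bifurcating tree.

    node_ages: ages (Ma) of all internal nodes, including the root.
    Returns (age, proportion of extant lineages) pairs, oldest first.
    """
    ages = sorted(node_ages, reverse=True)  # oldest split first
    n_tips = len(ages) + 1                  # bifurcating tree: tips = splits + 1
    # The root split leaves 2 lineages; every later split adds one more.
    return [(age, (k + 2) / n_tips) for k, age in enumerate(ages)]


# Made-up node ages for a four-tip tree.
points = ltt_points([20.0, 100.0, 60.0])
```

Curves built this way for two dating treatments can be overlaid on the same axes to compare both their shapes and the ages at which lineages accumulate.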


Each of the 43 minimum fossil calibrations used by Moreau, et al. (2006) was investigated in turn using the fossil cross-validation method of Near and colleagues (Near & Sanderson, 2004; Near et al., 2005) to test for inconsistency. Based on this method, five fossils were deemed "inconsistent" and therefore removed from the final molecular clock analyses (Table 1), which means these calibrated clades yielded molecular age estimates that were younger than their fossil age estimates. The "inconsistent fossils removed" dataset consisted of the remaining 38 fossils used as minimum calibration points on the maximum likelihood topology of Moreau, et al. (2006). The F-tests ([P.sub.min] = 0.18, [P.sub.max] = 0.09) suggested that removing these fossils did not significantly reduce the variance seen across all the dated nodes in the phylogeny for either the maximum or minimum age treatment. For a graphical representation of this lack of significance, the ultra-metric trees obtained from the penalized likelihood analyses of the two "inconsistent fossils removed" datasets, which account for the fossils from formations of uncertain stratigraphic ages (minimum fossil ages dataset and maximum fossil ages dataset), were subjected to LTT analyses for comparison to the original results found by Moreau, et al. (2006) when all 43 fossils were included without regard to potential inconsistency.

The diversification rates recovered when the "inconsistent" fossils were removed were very similar to the original results for both the minimum fossil ages dataset and the maximum fossil ages dataset (Fig. 1). In the case of the minimum fossil ages dataset the overall shape of the LTT curve is nearly identical, although once the five "inconsistent" fossils were removed a slightly older age (Fig. 1--dotted line: five inconsistent fossils removed versus dash-dotted line: all 43 fossils included) was inferred for some lineages nested within the ant phylogeny. For the maximum fossil ages dataset not only were the shapes of the LTT curves very similar, but the ages for all ant clades were nearly identical (Fig. 1--dashed line: five inconsistent fossils removed versus solid line: all 43 fossils included).


Molecular clock and divergence time analyses have advanced our understanding of the timeline of evolution for many groups. With a rich fossil record our understanding of the age and diversification of modern ants has benefited from the use of these molecular clock tools (Brady et al., 2006; Moreau et al., 2006; Moreau, 2009). Not only have these methods and analyses resulted in estimates for the age of the modern crown group ants, but have also allowed for investigation into changes in rates of diversification. Based on a large-scale molecular phylogeny and incorporating 43 fossils as minimum age constraints in molecular clock analyses Moreau, et al. (2006) found a burst in the diversification of the ants around 100 Ma, which seems to be correlated with the rise of the flowering plants (angiosperms) and many sap-feeding insects.

In regard to molecular clock analyses, one might expect that it is always better to include all available fossil data into an analysis, but Near and colleagues (Near & Sanderson, 2004; Near et al., 2005) highlighted the fact that in some cases a number of fossil calibration points may be "inconsistent" and should be excluded. To test if any of the 43 fossils included by Moreau, et al. (2006) fit into the "inconsistent" category and are potentially affecting the divergence dating results, we applied this method to these data. Even after excluding five fossils deemed "inconsistent" according to the methods of Near and colleagues (Near and Sanderson, 2004; Near et al., 2005) we did not find that this affected the overall diversification patterns recovered by Moreau, et al. (2006) (Fig. 1).


Although we did not find that excluding the "inconsistent" fossils changed our overall findings regarding the diversification patterns in the ants (Fig. 1), we acknowledge that this could be due to the large number of fossils available in the ants to serve as minimum calibration points. The potential negative effect of "inconsistent" fossils could be greater with fewer calibration points, although the validity of the method proposed by Near and colleagues (Near & Sanderson, 2004; Near et al., 2005) has been questioned (Marshall, 2008; Parham & Irmis, 2008). In most divergence dating analyses fossils are treated as minimum age constraints, and by definition can be redundant but not inconsistent. Fossil calibrations may be found to be inconsistent if maximum ages are implemented or if the fossils are treated as point estimates, as in the fossil cross-validation test (Parham & Irmis, 2008), but this again will be affected by the amount of rate heterogeneity among branches (Graur & Martin, 2004). Another shortcoming of the fossil cross-validation method is that it tends to discard calibrations until the remaining ones are mutually consistent, which could result in discarding the most informative and accurate calibrations (Marshall, 2008; Ho & Phillips, 2009). For these reasons, among others, Ho and Phillips (2009) advocate retaining as many minimum calibration points as possible in dating analyses.

Although we did not find that excluding any of the "inconsistent" fossils from our divergence dating analyses had much effect on the results, we note that careful consideration and examination of all calibration information is necessary since incorrect use or placement could have large negative effects on molecular clock analyses (Hug & Roger, 2007; Hugall et al., 2007; Rutschmann et al., 2007; Heath et al., 2008; Marshall, 2008; Parham & Irmis, 2008; Ware et al., 2010). Our findings suggest that excluding fossils deemed "inconsistent" may not be necessary and may not affect dating analyses. In addition, the validity of the fossil cross-validation method has been questioned (Marshall, 2008; Parham & Irmis, 2008; Ho & Phillips, 2009), suggesting that excluding any fossil information may not be a good idea, which is reassuring for many taxonomic groups since multiple fossil calibrations are not available for molecular clock analyses.


Special thanks to Jessica Thomas and Jessica Ware for the invitation to participate in the Northeastern Symposium on Evolutionary Divergence Time hosted at Rutgers University in January 2010, which led to this publication. We thank four anonymous reviewers for comments that helped improve this paper. This work was funded in part by the Department of Zoology at the Field Museum of Natural History and the Office of Research and Sponsored Projects at the University of New Orleans.


Bell, C. D., D. E. Soltis and P. S. Soltis. 2010. The age and diversification of the angiosperms re-revisited. American Journal of Botany 97: 1296-1303.

Brady, S. G., T. R. Schultz, B. L. Fisher and P. S. Ward. 2006. Evaluating alternative hypotheses for the early evolution and diversification of the ants. Proceedings of the National Academy of Sciences of the USA 103: 18172-18177.

Drummond, A. J., S. Y. Ho, M. J. Phillips and A. Rambaut. 2006. Relaxed phylogenetics and dating with confidence. PLoS Biology 4: e88.

Graur, D. and W. Martin. 2004. Reading the entrails of chickens: molecular timescales of evolution and the illusion of precision. Trends in Genetics 20: 80-86.

Heath, T. A., S. M. Hedtke and D. M. Hillis. 2008. Taxon sampling and the accuracy of phylogenetic analyses. Journal of Systematics and Evolution 46: 239-257.

Ho, S. Y. and M. J. Phillips. 2009. Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Systematic Biology 58: 367-380.

Huelsenbeck, J. P., B. Larget and D. Swofford. 2000. A compound Poisson process for relaxing the molecular clock. Genetics 154: 1879-1892.

Hug, L. A. and A. J. Roger. 2007. The impact of fossils and taxon sampling on ancient molecular dating analyses. Molecular Biology and Evolution 24: 1889-1897.

Hugall, A. F., R. Foster and M. S. Lee. 2007. Calibration choice, rate smoothing, and the patterns of tetrapod diversification according to the long nuclear gene RAG-1. Systematic Biology 56: 543-563.

Marshall, C. R. 1990. Confidence intervals on stratigraphic ranges. Paleobiology 16: 1-10.

Marshall, C. R. 2008. A simple method for bracketing absolute divergence times on molecular phylogenies using multiple fossil calibration points. American Naturalist 171: 726-742.

Moreau, C. S., C. D. Bell, R. Vila, S. B. Archibald and N. E. Pierce. 2006. Phylogeny of the ants: diversification in the age of angiosperms. Science 312: 101-104.

Moreau, C. S. 2009. Inferring ant evolution in the age of molecular data (Hymenoptera: Formicidae). Myrmecological News 12: 201-210.

Near, T. J. and M. J. Sanderson. 2004. Assessing the quality of molecular divergence time estimates by fossil calibrations and fossil-based model selection. Philosophical Transactions of the Royal Society of London Series B-Biological Sciences 359: 1477-1483.

Near, T. J., P. A. Meylan and H. B. Shaffer. 2005. Assessing concordance of fossil calibration points in molecular clock studies: an example using turtles. American Naturalist 165: 137-146.

Ochman, H. and A. C. Wilson. 1987. Evolution in bacteria: evidence for a universal substitution rate in cellular genomes. Journal of Molecular Evolution 26: 74-86.

Parham, J. F. and R. B. Irmis. 2008. Caveats on the use of fossil calibrations for molecular dating: a comment on Near et al. American Naturalist 171: 132-136.

Rutschmann, F., T. Eriksson, K. Abu Salim and E. Conti. 2007. Assessing calibration uncertainty in molecular dating: the assignment of fossils to alternative calibration points. Systematic Biology 56: 591-608.

Sanderson, M. J. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Molecular Biology and Evolution 14: 1218-1231.

Sanderson, M. J. 2002. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Molecular Biology and Evolution 19: 101-109.

Sanderson, M. J. 2003. r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19: 301-302.

Smith, S. A. and M. J. Donoghue. 2008. Rates of molecular evolution are linked to life history in flowering plants. Science 322: 86-89.

Soltis, P. S., D. E. Soltis, V. Savolainen, P. R. Crane and T. G. Barraclough. 2002. Rate heterogeneity among lineages of tracheophytes: integration of molecular and fossil data and evidence for molecular living fossils. Proceedings of the National Academy of Sciences of the USA 99: 4430-4435.

Thorne, J. L., H. Kishino and I. S. Painter. 1998. Estimating the rate of evolution of the rate of molecular evolution. Molecular Biology and Evolution 15: 1647-1657.

Ware, J. L., D. A. Grimaldi and M. S. Engel. 2010. The effects of fossil placement and calibration on divergence times and rates: an example from the termites (Insecta: Isoptera). Arthropod Structure and Development 39: 204-219.


(1) Field Museum of Natural History, Department of Zoology, 1400 South Lake Shore Drive, Chicago, Illinois 60605, USA, Phone: (312) 665-7743, E-mail:

(1a) Email address for correspondence:

(2) University of New Orleans, Department of Biological Sciences, 2000 Lakeshore Drive, New Orleans, Louisiana 70148, USA, Phone: (504) 280-7040, E-mail:

Table 1. Fossil cross-validation results for the 43 fossils
originally used by Moreau, et al. (2006). The fossils found to be
"inconsistent" are denoted by X. Fossil dates with an asterisk indicate
that molecular clock analyses were done with both lower and upper
dates to ensure confidence in dating of fossils. Average Dx values are
the mean differences between the molecular and fossil age estimates.

Node/taxon          Oldest        Fossil locality                      Inconsistent   Average Dx,   Average Dx,
                    fossil (Ma)                                        in this study  minimum age   maximum age
                                                                       (N = 5)        dataset       dataset

Leptogenys          15.5          Shanwang Formation, China                           54.1          66.3
Myopopone           15.5          Shanwang Formation, China                           60.7          76.2
Acropyga            15.0-20.0 *   Dominican Amber, Dominican Republic                 29.2          27.4
Azteca              15.0-20.0 *   Dominican Amber, Dominican Republic                 61.1          66.5
Cephalotes          15.0-20.0 *   Mexican Amber, Mexico                               58.2          62.4
Discothyrea         15.0-20.0 *   Mexican Amber, Mexico                               75.6          92.2
Neivamyrmex         15.0-20.0 *   Dominican Amber, Dominican Republic                 14.4          14.4
Odontomachus        15.0-20.0 *   Dominican Amber, Dominican Republic                 35.9          39.5
Pyramica            15.0-20.0 *   Dominican Amber, Dominican Republic                 27.7          26.9
Trachymyrmex        15.0-20.0 *   Dominican Amber, Dominican Republic                 30.2          30.2
Crematogaster       28.4-33.9 *   Sicilian Amber, Italy                               48.4          53.6
Podomyrma           28.4-33.9 *   Sicilian Amber, Italy                               36.9          38.6
Pheidole            34.0          Florissant Formation, USA                           45.8          56.6
Pogonomyrmex        34.0          Florissant Formation, USA                           42.1          53.2
Agroecomyrmecinae   44.1          Baltic Amber                                        35.9          90.1
Anonychomyrma       44.1          Baltic Amber                                        10.81         12.6
Aphaenogaster       44.1          Baltic Amber                                        10.6          15.9
Camponotus          44.1          Baltic Amber                         X              -4.8          -2.9
Cerapachys          44.1          Baltic Amber                                        43.5          59.4
Formica             44.1          Baltic Amber                                        0.27          2.7
Iridomyrmex         44.1          Baltic Amber                         X              -14.5         -13.7
Lasius              44.1          Baltic Amber                         X              -8.8          -7.8
Monomorium          44.1          Baltic Amber                         X              -3.6          -4.2
Myrmica             44.1          Baltic Amber                                        36.1          48.1
Oligomyrmex         44.1          Baltic Amber                                        18.8          17.6
Plagiolepis         44.1          Baltic Amber                                        4.5           7.57
Proceratiinae       44.1          Baltic Amber                                        67.8          95.2
Rhytidoponera       44.1          Baltic Amber                                        28.9          37.9
Solenopsis          44.1          Baltic Amber                                        28.9          38.3
Stenamma            44.1          Baltic Amber                                        35.9          47.7
Tetramorium         44.1          Baltic Amber                                        22.15         31.1
Tetraponera         44.1          Baltic Amber                                        6.78          17.5
Vollenhovia         44.1          Baltic Amber                                        25.5          34.1
Myrmicinae          52.0          Hat Creek Amber, Canada                             46.6          63.3
Tapinoma            52.0          Hat Creek Amber, Canada                             20.8          31.3
Dolichoderus        48.5-53.5 *   Green River Formation, USA                          31.8          39.0
Pachycondyla        48.5-53.5 *   Green River Formation, USA                          9.4           14.8
Myrmeciinae         54.5          Ølst Formation, Denmark                             53.5          79.7
Ponerinae           60.0          Sakhalin Amber, Russia                              51.2          77.4
Dolichoderinae      79.0          Canadian Amber, Canada                              8.2           22.3
Ectatomminae        79.0          Canadian Amber, Canada                              2.16          12.9
Formicinae          92.0          New Jersey Amber, USA                X              -1.74         -5.6
Aneuretinae         100.0         Burmese Amber, Myanmar                              7.2           29.7

COPYRIGHT 2011 New York Entomological Society
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2011 Gale, Cengage Learning. All rights reserved.

Article Details
Author:Moreau, Corrie S.; Bell, Charles D.
Publication:Entomologica Americana
Article Type:Report
Geographic Code:1USA
Date:Jul 1, 2011