Fossil cross-validation of the dated ant phylogeny (Hymenoptera: Formicidae).
Recent advances in molecular phylogenetic analysis have permitted divergence dating for numerous lineages or clades through the use of molecular clock methods (Sanderson, 1997, 2002; Thorne et al., 1998; Huelsenbeck et al., 2000; Drummond et al., 2006). To calibrate the timing of divergence for lineages of interest, information from fossils, geologic events or rates of molecular evolution may be incorporated. This information, coupled with molecular clock analyses, makes it possible to place a timeline on the origin of lineages and can also be used for further testing of evolutionary hypotheses. Although only one calibration is needed to conduct most molecular clock analyses, given the apparent heterogeneity of rates among lineages (e.g., Smith and Donoghue, 2008), it is preferable to incorporate more than one minimum or maximum calibration into the analysis (Ochman and Wilson, 1987; Marshall, 1990; Soltis et al., 2002; Graur and Martin, 2004; Near et al., 2005; Bell et al., 2010). Although few would dispute that having more information is usually better, this may not be the case if the data are in conflict or if the calibration points were inaccurately placed on the phylogeny.
Conflict involving a particular fossil calibration may arise in many ways: the fossil could be incorrectly placed within the phylogeny; the phylogenetic relationships may be incorrectly inferred, so that the fossil is assigned to the wrong node of the topology; the fossil could be incorrectly identified; or the geologic strata in which the fossil was found could be incorrectly dated, among other causes (Hug & Roger, 2007; Hugall et al., 2007; Rutschmann et al., 2007; Heath et al., 2008; Marshall, 2008; Parham & Irmis, 2008). Some of these concerns can be addressed by careful examination of the fossils to ensure that they are not only correctly identified but also correctly placed within the phylogeny. Uncertainty in the inferred phylogeny, and hence in the relationships of the included taxa, can be a major source of problems. Recent Bayesian relaxed clock methods offer a promising opportunity to account for uncertainty in phylogenies by simultaneously estimating the topology and branch lengths (Drummond et al., 2006).
Near and colleagues (Near & Sanderson, 2004; Near et al., 2005) highlighted the potential problem of conflicting multiple fossil calibration points. To address cases in which multiple nodes across a phylogeny are constrained to minimum or maximum ages based on fossil or other data and these constraints disagree, they proposed the "fossil cross-validation" method to identify which fossils, if any, generate inconsistent, and potentially erroneous, molecular age estimates. Once these inconsistent fossils have been identified, the authors advocate excluding them and performing the analysis with all remaining non-conflicting fossils.
The fossil cross-validation method (Near & Sanderson, 2004; Near et al., 2005) is performed on a previously inferred phylogenetic tree: 1) each fossil calibration is fixed in turn, and 2) the difference between the molecular and fossil age estimates is calculated for all other fossil-dated nodes. This two-step method aims to identify and remove inconsistent fossils from the analysis. To identify potentially "inconsistent" fossils, we used the average D_x statistic of Near, et al. (2005) as a heuristic. Since fossil calibrations (or constraints) are minimum age estimates for nodes, we considered a fossil inconsistent if its average D_x value was negative, that is, if the calibration consistently yielded molecular age estimates that were younger (more recent) than the corresponding fossil age estimates.
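For reference, the average statistic described above can be written compactly. The notation here is our own shorthand and is an assumption rather than a quotation from Near, et al. (2005): n is the number of fossil-dated nodes, FA_i is the fossil age of node i, and MA_i is the molecular age estimated for node i when only fossil x is fixed.

```latex
\bar{D}_x = \frac{1}{n-1} \sum_{i \neq x} \left( MA_i - FA_i \right)
```

Under this sign convention, with ages measured in Ma, a negative value means that fixing fossil x makes the other dated nodes appear younger on average than their fossils.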
Moreau, et al. (2006) published the first large-scale molecular phylogeny of ants (Hymenoptera: Formicidae) based on 4.5 kilobases of sequence data from six gene regions for 139 ant genera. From these data, relationships among the major ant lineages were inferred, demonstrating that all of the 19 subfamilies included in the analysis were recovered as monophyletic with the exception of Cerapachyinae. In addition, Moreau, et al. (2006) incorporated 43 ant fossils as minimum age calibrations for divergence dating analyses. To account for the fact that 12 of the 43 fossils are from formations of uncertain stratigraphic age, Moreau, et al. (2006) performed two separate molecular clock analyses. The first analysis (minimum fossil ages dataset) used the minimum age of each formation to which the 12 fossils belonged plus all 31 remaining fossils, with a maximum constraint of 200 million years ago (Ma) on the root age for all ingroup and outgroup taxa (excluding Apis mellifera). The second analysis (maximum fossil ages dataset) used the maximum age of the same formations for the 12 fossils plus all 31 remaining fossils, with a maximum constraint of 250 Ma on the root age for all ingroup and outgroup taxa (excluding Apis mellifera). These analyses yielded a range of dates for the origin of the extant ant lineages (140-168 Ma). In both analyses the complete set of fossils was incorporated as minimum age calibrations without taking into account whether any of the fossils were in conflict within the phylogeny. Based on these molecular clock and lineages-through-time (LTT) divergence time analyses, Moreau, et al. (2006) found that much of the diversification of the major ant lineages occurred from the late Cretaceous to the early Paleocene (100 to 60 Ma) and may be correlated with the rise of the flowering plants (angiosperms).
To test whether any of the 43 fossils used by Moreau, et al. (2006) are potentially in conflict with one another, we performed the fossil cross-validation procedure of Near and colleagues (Near & Sanderson, 2004; Near et al., 2005) on all 43 fossils under both the minimum and maximum datasets. We then compared these results with those obtained by Moreau, et al. (2006) and assessed whether removing inconsistent fossils affects the inferred patterns of diversification across the ants.
MATERIALS AND METHODS
The maximum likelihood topology of Moreau, et al. (2006) was used for testing inconsistent fossil calibrations [all files from the Moreau, et al. (2006) paper can be downloaded from www.moreaulab.org]. The original dataset was composed of 4.5 kb of sequence data from five nuclear genes and one mitochondrial gene for 139 ant genera and six Hymenoptera outgroups (Moreau et al., 2006). Fossil calibrations and age constraints for all 43 fossils follow those outlined in Moreau, et al. (2006), with each fossil used as a minimum age for the lineage to which it belongs plus the sister lineage. Molecular clock analysis was performed using the penalized likelihood method (Sanderson, 2002) as implemented in the software package r8s v1.7 (Sanderson, 2003).
Following the methods of Near and colleagues, the fossil cross-validation analysis (Near & Sanderson, 2004; Near et al., 2005) was performed on the previously inferred maximum likelihood phylogenetic tree. In turn, a single fossil-dated node was fixed and the difference between the molecular and fossil age estimates was calculated for all other fossil-dated nodes. To determine whether the observed differences indicated that a calibration point is in conflict with the other fossil calibrations, we used the average D_x statistic of Near, et al. (2005) as a heuristic. Since fossil calibrations (or constraints) are minimum age estimates for nodes, we considered a fossil inconsistent if its average D_x value was negative.
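The heuristic described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scripts used in this study: the function names, the dictionary layout, and the three toy nodes with their ages are all hypothetical, and the molecular ages are assumed to have been obtained beforehand (for example from separate r8s runs, one per fixed fossil).

```python
from statistics import mean


def average_dx(fossil_ages, molecular_ages_when_fixed):
    """Average difference (molecular age - fossil age) over all other nodes.

    fossil_ages: {node: fossil age in Ma}
    molecular_ages_when_fixed: {fixed_node: {other_node: molecular age in Ma}}
    Both structures are hypothetical containers chosen for this sketch.
    """
    result = {}
    for fixed, estimates in molecular_ages_when_fixed.items():
        diffs = [estimates[node] - fossil_ages[node]
                 for node in fossil_ages if node != fixed]
        result[fixed] = mean(diffs)
    return result


def inconsistent_fossils(dx):
    """Fossils whose average D_x is negative (the heuristic used in the text)."""
    return sorted(node for node, value in dx.items() if value < 0)


# Toy, made-up ages (Ma) for three hypothetical calibrated nodes.
fossil_ages = {"A": 20.0, "B": 44.0, "C": 90.0}
molecular_ages_when_fixed = {
    "A": {"B": 60.0, "C": 120.0},  # fixing A: other nodes older than their fossils
    "B": {"A": 25.0, "C": 95.0},
    "C": {"A": 10.0, "B": 40.0},   # fixing C: other nodes younger than their fossils
}

dx = average_dx(fossil_ages, molecular_ages_when_fixed)
flagged = inconsistent_fossils(dx)
```

In this toy case only the hypothetical node C would be flagged, because fixing it pulls the other two nodes below their minimum fossil ages.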
Once the "inconsistent" fossils were determined using the fossil cross-validation method, these fossils were excluded from the final molecular clock analyses. Again, the maximum likelihood topology (Moreau et al., 2006) was used for the divergence dating of the "inconsistent fossils removed" dataset using r8s (Sanderson, 2003). Following Near, et al. (2005), we assessed the significance of the change in variance before and after "inconsistent" fossils were removed using a one-tailed F-test with n - 1 degrees of freedom, where n is the number of nodes in a rooted tree.
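As a rough illustration of the one-tailed F-test mentioned above, the sketch below computes a variance ratio and its upper-tail probability using only the Python standard library. It is an assumption-laden example rather than the procedure actually run here: the function names are hypothetical, the caller must supply the two variance estimates and their degrees of freedom, and the incomplete beta function is evaluated with a textbook continued-fraction routine that a statistics package would normally replace.

```python
from math import exp, lgamma, log


def _nonzero(value, tiny=1e-300):
    """Guard against division by zero inside the continued fraction."""
    return tiny if abs(value) < tiny else value


def _beta_continued_fraction(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 / _nonzero(1.0 - qab * x / qap)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 / _nonzero(1.0 + aa * d)
        c = _nonzero(1.0 + aa / c)
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 / _nonzero(1.0 + aa * d)
        c = _nonzero(1.0 + aa / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b


def f_test_one_tailed(var_before, df_before, var_after, df_after):
    """Return (F, p) where p = P(F_dist >= var_before / var_after)."""
    f_ratio = var_before / var_after
    x = df_after / (df_after + df_before * f_ratio)
    p_value = regularized_beta(df_after / 2.0, df_before / 2.0, x)
    return f_ratio, p_value


# Toy, made-up variances and degrees of freedom, for illustration only.
f_ratio, p_value = f_test_one_tailed(3.0, 2, 1.0, 2)
```

A small p-value would indicate that the variance dropped significantly after the flagged fossils were removed; the toy numbers above carry no biological meaning.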
To visualize the effect of removing fossils that were deemed to be inconsistent, the ultra-metric trees obtained from the penalized likelihood analyses were used to calculate proportional non-log transformed lineages-through-time (LTT) plots for the ants. These LTT plots were then compared to those recovered by Moreau, et al. (2006) where all 43 fossils (including those deemed "inconsistent" in this study) were included in the divergence time analyses.
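The points of a proportional, non-log-transformed LTT curve can be derived directly from the internal node ages of an ultrametric tree. The sketch below assumes a rooted, fully bifurcating tree and takes a plain list of node ages as input; the function name and the four example ages are hypothetical, and reading the ages out of the r8s output is left aside.

```python
def proportional_ltt(node_ages):
    """Return [(age in Ma, proportion of extant lineages)], oldest node first.

    node_ages: ages of all internal nodes (root included) of a rooted,
    bifurcating, ultrametric tree. Each split adds one lineage, so a tree
    with k internal nodes ends with k + 1 tips.
    """
    ages = sorted(node_ages, reverse=True)
    n_tips = len(ages) + 1
    points = []
    lineages = 1  # a single lineage exists before the root split
    for age in ages:
        lineages += 1
        points.append((age, lineages / n_tips))
    return points


# Toy, made-up node ages (Ma) for a hypothetical five-tip tree.
curve = proportional_ltt([100.0, 140.0, 60.0, 90.0])
```

Plotting the proportion against age on linear axes for each dataset gives curves that can be overlaid for comparison, as in Fig. 1.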
RESULTS
Each of the 43 minimum fossil calibrations used by Moreau, et al. (2006) was investigated in turn using the fossil cross-validation method of Near and colleagues (Near & Sanderson, 2004; Near et al., 2005) to test for inconsistency. Based on this method, five fossils were deemed "inconsistent" (Table 1), meaning that these calibrations yielded molecular age estimates that were younger than the fossil age estimates, and were therefore removed from the final molecular clock analyses. The "inconsistent fossils removed" dataset consisted of the remaining 38 fossils used as minimum calibration points on the maximum likelihood topology of Moreau, et al. (2006). The F-tests (P_min = 0.18, P_max = 0.09) suggested that removing these fossils did not significantly reduce the variance seen across all the dated nodes in the phylogeny for either the minimum or the maximum age treatment. For a graphical representation of this lack of significance, the ultrametric trees obtained from the penalized likelihood analyses of the two "inconsistent fossils removed" datasets accounting for the fossils from formations of uncertain stratigraphic ages (minimum fossil ages dataset and maximum fossil ages dataset) were subjected to LTT analyses for comparison with the original results of Moreau, et al. (2006), in which all 43 fossils were included without regard to potential inconsistency.
The diversification patterns recovered with the "inconsistent" fossils removed were very similar to those recovered with all 43 fossils, for both the minimum fossil ages dataset and the maximum fossil ages dataset (Fig. 1). For the minimum fossil ages dataset the overall shape of the LTT curve is nearly identical, although once the five "inconsistent" fossils were removed slightly older ages were inferred for some lineages nested within the ant phylogeny (Fig. 1; dotted line: five inconsistent fossils removed, versus dash-dotted line: all 43 fossils included). For the maximum fossil ages dataset not only were the shapes of the LTT curves very similar, but the ages for all ant clades were nearly identical (Fig. 1; dashed line: five inconsistent fossils removed, versus solid line: all 43 fossils included).
DISCUSSION
Molecular clock and divergence time analyses have advanced our understanding of the timeline of evolution for many groups. Because ants have a rich fossil record, our understanding of the age and diversification of modern ants has benefited from these molecular clock tools (Brady et al., 2006; Moreau et al., 2006; Moreau, 2009). These methods and analyses have not only yielded estimates for the age of the modern crown group ants but have also allowed investigation of changes in rates of diversification. Based on a large-scale molecular phylogeny and incorporating 43 fossils as minimum age constraints in molecular clock analyses, Moreau, et al. (2006) found a burst in the diversification of the ants around 100 Ma, which seems to be correlated with the rise of the flowering plants (angiosperms) and many sap-feeding insects.
With regard to molecular clock analyses, one might expect that it is always better to include all available fossil data in an analysis, but Near and colleagues (Near & Sanderson, 2004; Near et al., 2005) highlighted the fact that in some cases a number of fossil calibration points may be "inconsistent" and should be excluded. To test whether any of the 43 fossils included by Moreau, et al. (2006) fall into the "inconsistent" category and potentially affect the divergence dating results, we applied this method to these data. Even after excluding the five fossils deemed "inconsistent" according to the methods of Near and colleagues (Near & Sanderson, 2004; Near et al., 2005), the overall diversification patterns recovered by Moreau, et al. (2006) were unaffected (Fig. 1).
[FIGURE 1 OMITTED]
Although we did not find that excluding the "inconsistent" fossils changed our overall findings regarding the diversification patterns in the ants (Fig. 1), we acknowledge that this could be due to the large number of fossils available in the ants to serve as minimum calibration points. The potential negative effect of "inconsistent" fossils could be greater with fewer calibration points, although the validity of the method proposed by Near and colleagues (Near & Sanderson, 2004; Near et al., 2005) has been questioned (Marshall, 2008; Parham & Irmis, 2008). In most divergence dating analyses fossils are treated as minimum age constraints, and by definition can be redundant but not inconsistent. Fossil calibrations may be found to be inconsistent if maximum ages are implemented or if the fossils are treated as point estimates, as in the fossil cross-validation test (Parham & Irmis, 2008), but this again will be affected by the amount of rate heterogeneity among branches (Graur & Martin, 2004). Another shortcoming of the fossil cross-validation method is that it tends to discard calibrations until the remaining ones are mutually consistent, which could result in discarding the most informative and accurate calibrations (Marshall, 2008; Ho & Phillips, 2009). For these reasons, among others, Ho and Phillips (2009) advocate retaining as many minimum calibration points as possible in dating analyses.
Although we did not find that excluding any of the "inconsistent" fossils from our divergence dating analyses had much effect on the results, we note that careful consideration and examination of all calibration information is necessary, since incorrect use or placement could have large negative effects on molecular clock analyses (Hug & Roger, 2007; Hugall et al., 2007; Rutschmann et al., 2007; Heath et al., 2008; Marshall, 2008; Parham & Irmis, 2008; Ware et al., 2010). Our findings suggest that excluding fossils deemed "inconsistent" may not be necessary and may not affect dating analyses. In addition, the validity of the fossil cross-validation method has been questioned (Marshall, 2008; Parham & Irmis, 2008; Ho & Phillips, 2009), suggesting that excluding any fossil information may not be a good idea. This is reassuring for the many taxonomic groups for which multiple fossil calibrations are not available for molecular clock analyses.
ACKNOWLEDGMENTS
Special thanks to Jessica Thomas and Jessica Ware for the invitation to participate in the Northeastern Symposium on Evolutionary Divergence Time hosted at Rutgers University in January 2010, which led to this publication. We thank four anonymous reviewers for comments that helped improve this paper. This work was funded in part by the Department of Zoology at the Field Museum of Natural History and the Office of Research and Sponsored Projects at the University of New Orleans.
LITERATURE CITED
Bell, C. D., D. E. Soltis and P. S. Soltis. 2010. The age and diversification of the angiosperms re-revisited. American Journal of Botany 97: 1296-1303.
Brady, S. G., T. R. Schultz, B. L. Fisher and P. S. Ward. 2006. Evaluating alternative hypotheses for the early evolution and diversification of the ants. Proceedings of the National Academy of Sciences of the USA 103: 18172-18177.
Drummond, A. J., S. Y. Ho, M. J. Phillips and A. Rambaut. 2006. Relaxed phylogenetics and dating with confidence. PLoS Biology 4: e88.
Graur, D. and W. Martin. 2004. Reading the entrails of chickens: molecular timescales of evolution and the illusion of precision. Trends in Genetics 20: 80-86.
Heath, T. A., S. M. Hedtke and D. M. Hillis. 2008. Taxon sampling and the accuracy of phylogenetic analyses. Journal of Systematics and Evolution 46: 239-257.
Ho, S. Y. and M. J. Phillips. 2009. Accounting for calibration uncertainty in phylogenetic estimation of evolutionary divergence times. Systematic Biology 58: 367-380.
Huelsenbeck, J. P., B. Larget and D. Swofford. 2000. A compound Poisson process for relaxing the molecular clock. Genetics 154: 1879-1892.
Hug, L. A. and A. J. Roger. 2007. The impact of fossils and taxon sampling on ancient molecular dating analyses. Molecular Biology and Evolution 24: 1889-1897.
Hugall, A. F., R. Foster and M. S. Lee. 2007. Calibration choice, rate smoothing, and the patterns of tetrapod diversification according to the long nuclear gene RAG-1. Systematic Biology 56: 543-563.
Marshall, C. R. 1990. Confidence intervals on stratigraphic ranges. Paleobiology 16: 1-10.
Marshall, C. R. 2008. A simple method for bracketing absolute divergence times on molecular phylogenies using multiple fossil calibration points. American Naturalist 171: 726-742.
Moreau, C. S., C. D. Bell, R. Vila, S. B. Archibald and N. E. Pierce. 2006. Phylogeny of the ants: diversification in the age of angiosperms. Science 312: 101-104.
Moreau, C. S. 2009. Inferring ant evolution in the age of molecular data (Hymenoptera: Formicidae). Myrmecological News 12: 201-210.
Near, T. J. and M. J. Sanderson. 2004. Assessing the quality of molecular divergence time estimates by fossil calibrations and fossil-based model selection. Philosophical Transactions of the Royal Society of London Series B-Biological Sciences 359: 1477-1483.
Near, T. J., P. A. Meylan and H. B. Shaffer. 2005. Assessing concordance of fossil calibration points in molecular clock studies: an example using turtles. American Naturalist 165: 137-146.
Ochman, H. and A. C. Wilson. 1987. Evolution in bacteria: evidence for a universal substitution rate in cellular genomes. Journal of Molecular Evolution 26: 74-86.
Parham, J. F. and R. B. Irmis. 2008. Caveats on the use of fossil calibrations for molecular dating: a comment on Near et al. American Naturalist 171: 132-136.
Rutschmann, F., T. Eriksson, K. Abu Salim and E. Conti. 2007. Assessing calibration uncertainty in molecular dating: the assignment of fossils to alternative calibration points. Systematic Biology 56: 591-608.
Sanderson, M. J. 1997. A nonparametric approach to estimating divergence times in the absence of rate constancy. Molecular Biology and Evolution 14: 1218-1231.
Sanderson, M. J. 2002. Estimating absolute rates of molecular evolution and divergence times: a penalized likelihood approach. Molecular Biology and Evolution 19: 101-109.
Sanderson, M. J. 2003. r8s: inferring absolute rates of molecular evolution and divergence times in the absence of a molecular clock. Bioinformatics 19: 301-302.
Smith, S. A. and M. J. Donoghue. 2008. Rates of molecular evolution are linked to life history in flowering plants. Science 322: 86-89.
Soltis, P. S., D. E. Soltis, V. Savolainen, P. R. Crane and T. G. Barraclough. 2002. Rate heterogeneity among lineages of tracheophytes: integration of molecular and fossil data and evidence for molecular living fossils. Proceedings of the National Academy of Sciences of the USA 99: 4430-4435.
Thorne, J. L., H. Kishino and I. S. Painter. 1998. Estimating the rate of evolution of the rate of molecular evolution. Molecular Biology and Evolution 15: 1647-1657.
Ware, J. L., D. A. Grimaldi and M. S. Engel. 2010. The effects of fossil placement and calibration on divergence times and rates: an example from the termites (Insecta: Isoptera). Arthropod Structure and Development 39: 204-219.
CORRIE S. MOREAU (1) AND CHARLES D. BELL (2)
(1) Field Museum of Natural History, Department of Zoology, 1400 South Lake Shore Drive, Chicago, Illinois 60605, USA, Phone: (312) 665-7743, E-mail: firstname.lastname@example.org
(1a) Email address for correspondence: email@example.com
(2) University of New Orleans, Department of Biological Sciences, 2000 Lakeshore Drive, New Orleans, Louisiana 70148, USA, Phone: (504) 280-7040, E-mail: firstname.lastname@example.org
Table 1. Fossil cross-validation results for the 43 fossils originally used by Moreau, et al. (2006). Fossils found to be "inconsistent" in this study (N = 5) are denoted by X. Fossil dates with an asterisk indicate that molecular clock analyses were done with both the lower and upper dates to ensure confidence in the dating of fossils. Average D_x values are the average differences between the molecular and fossil age estimates, given for the minimum age dataset (min) and the maximum age dataset (max).

Node/taxon         Oldest fossil (Ma)  Fossil locality                      Inconsistent  Avg D_x (min)  Avg D_x (max)
Leptogenys         15.5                Shanwang Formation, China                          54.1           66.3
Myopopone          15.5                Shanwang Formation, China                          60.7           76.2
Acropyga           15.0-20.0 *         Dominican Amber, Dominican Republic                29.2           27.4
Azteca             15.0-20.0 *         Dominican Amber, Dominican Republic                61.1           66.5
Cephalotes         15.0-20.0 *         Mexican Amber, Mexico                              58.2           62.4
Discothyrea        15.0-20.0 *         Mexican Amber, Mexico                              75.6           92.2
Neivamyrmex        15.0-20.0 *         Dominican Amber, Dominican Republic                14.4           14.4
Odontomachus       15.0-20.0 *         Dominican Amber, Dominican Republic                35.9           39.5
Pyramica           15.0-20.0 *         Dominican Amber, Dominican Republic                27.7           26.9
Trachymyrmex       15.0-20.0 *         Dominican Amber, Dominican Republic                30.2           30.2
Crematogaster      28.4-33.9 *         Sicilian Amber, Italy                              48.4           53.6
Podomyrma          28.4-33.9 *         Sicilian Amber, Italy                              36.9           38.6
Pheidole           34.0                Florissant Formation, USA                          45.8           56.6
Pogonomyrmex       34.0                Florissant Formation, USA                          42.1           53.2
Agroecomyrmecinae  44.1                Baltic Amber                                       35.9           90.1
Anonychomyrma      44.1                Baltic Amber                                       10.81          12.6
Aphaenogaster      44.1                Baltic Amber                                       10.6           15.9
Camponotus         44.1                Baltic Amber                         X             -4.8           -2.9
Cerapachys         44.1                Baltic Amber                                       43.5           59.4
Formica            44.1                Baltic Amber                                       0.27           2.7
Iridomyrmex        44.1                Baltic Amber                         X             -14.5          -13.7
Lasius             44.1                Baltic Amber                         X             -8.8           -7.8
Monomorium         44.1                Baltic Amber                         X             -3.6           -4.2
Myrmica            44.1                Baltic Amber                                       36.1           48.1
Oligomyrmex        44.1                Baltic Amber                                       18.8           17.6
Plagiolepis        44.1                Baltic Amber                                       4.5            7.57
Proceratiinae      44.1                Baltic Amber                                       67.8           95.2
Rhytidoponera      44.1                Baltic Amber                                       28.9           37.9
Solenopsis         44.1                Baltic Amber                                       28.9           38.3
Stenamma           44.1                Baltic Amber                                       35.9           47.7
Tetramorium        44.1                Baltic Amber                                       22.15          31.1
Tetraponera        44.1                Baltic Amber                                       6.78           17.5
Vollenhovia        44.1                Baltic Amber                                       25.5           34.1
Myrmicinae         52.0                Hat Creek Amber, Canada                            46.6           63.3
Tapinoma           52.0                Hat Creek Amber, Canada                            20.8           31.3
Dolichoderus       48.5-53.5 *         Green River Formation, USA                         31.8           39.0
Pachycondyla       48.5-53.5 *         Green River Formation, USA                         9.4            14.8
Myrmeciinae        54.5                Ølst Formation, Denmark                            53.5           79.7
Ponerinae          60.0                Sakhalin Amber, Russia                             51.2           77.4
Dolichoderinae     79.0                Canadian Amber, Canada                             8.2            22.3
Ectatomminae       79.0                Canadian Amber, Canada                             2.16           12.9
Formicinae         92.0                New Jersey Amber, USA                X             -1.74          -5.6
Aneuretinae        100.0               Burmese Amber, Myanmar                             7.2            29.7
Author: Moreau, Corrie S.; Bell, Charles D.
Date: Jul 1, 2011