Getting the details right: Gene signatures for cancer therapy.
Baggerly and Coombes are at the forefront of the emerging discipline of "forensic bioinformatics," an effort to reconstruct and validate analytical results that have been reported in the literature. Although one might think that this exercise would be straightforward, given that the published methods have passed peer review, it frequently requires a laborious reconstruction of unreported parameters, methodologies, and transformations. The authors argue that without a clear understanding and documentation of analytical methods, one can easily miss important errors that compromise experimental interpretation and, by extension, may impair the treatment of patients.
In this recent work, Baggerly and Coombes have reanalyzed a set of experiments aimed at predicting the response of a tumor to chemotherapy. In brief, a number of reports from 2006 onward (2-4) explored the appealing concept of combining cell line drug-sensitivity data with microarray profiles to predict the therapeutic response of a given tumor. If certain cell lines are known to be sensitive to a chemotherapeutic agent, then genes that are up- or down-regulated in these sensitive cells relative to resistant cell lines may serve as a "sensitivity signature." By measuring gene expression in a primary human tumor, one might then be able to predict whether that tumor is likely to respond to chemotherapy. To understand how best to use this strategy in collaboration with their clinical colleagues, Baggerly and Coombes undertook to reproduce analyses that had identified signatures for resistance to a variety of drug classes along with clinical data suggesting that these signatures can predict tumor response in vivo. Surprisingly, their analysis revealed something quite different, yet equally important: that simple errors of data management compromised these reports and that these simple errors appeared to arise frequently within complex bioinformatics workflows.
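The signature strategy described above can be sketched in a few lines. This is a deliberately minimal illustration, not the method used in the original reports: the gene names, expression values, and difference threshold below are all invented, and real analyses use formal statistics rather than a simple mean difference.

```python
from statistics import mean

# Toy expression data: gene -> replicate values in drug-sensitive and
# drug-resistant cell lines. All names and numbers are illustrative.
sensitive = {"G1": [5.0, 5.2, 4.8], "G2": [1.0, 1.1, 0.9], "G3": [3.0, 3.1, 2.9]}
resistant = {"G1": [1.0, 1.2, 0.8], "G2": [4.0, 4.1, 3.9], "G3": [3.0, 2.9, 3.1]}

def build_signature(sens, res, threshold=1.0):
    """Keep genes whose mean expression differs between sensitive and
    resistant lines by more than `threshold`; record the direction
    (+1 = up in sensitive lines, -1 = down)."""
    signature = {}
    for gene in sens:
        diff = mean(sens[gene]) - mean(res[gene])
        if abs(diff) > threshold:
            signature[gene] = 1 if diff > 0 else -1
    return signature

def score_tumor(signature, tumor_profile):
    """Signed sum over signature genes: a high score means the tumor's
    profile resembles the sensitive lines, suggesting likely response."""
    return sum(sign * tumor_profile[gene] for gene, sign in signature.items())

signature = build_signature(sensitive, resistant)  # G1 up, G2 down; G3 dropped
tumor = {"G1": 4.9, "G2": 1.2, "G3": 3.0}
print(score_tumor(signature, tumor))  # 4.9 - 1.2, roughly 3.7
```

The point of the sketch is only to show why every downstream conclusion depends on the sample labels: swap `sensitive` and `resistant`, and every sign in the signature flips.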
Baggerly and Coombes identified a number of problems with either the initial data sets or various processing/classification steps. These problems included the inadvertent use of duplicate samples, reversed sample labels (sensitive vs resistant), incorrect gene lists, incorrect figures, and experimental data sets confounded by nonrandom factors. The authors provide extensive documentation of how they identified these problems, which samples or genes were affected, and how the results were affected by errors in data management. In the case of gene lists, they demonstrate that many were simply incorrect. A large cohort of genes were "offset by one" because of an incorrectly handled file-header line, and a small set of genes appeared despite their absence from the microarray platform reportedly used. Importantly, although many of these problems were due to simple errors, the complexity of the data analysis pipeline and the time-consuming nature of recapitulating the analysis made it difficult to spot that the errors had arisen. An initial report describing a subset of these issues led to published corrections by the original authors. Taken together with reports in which Baggerly and Coombes have found data management problems arising from other groups, it appears that current common practice (beyond any single laboratory) in -omics-scale data management and reporting is insufficiently robust.
The role and responsibility of the individual investigator in such work are, of course, paramount. Given the large quantity of data and the resulting difficulty in determining if something "looks wrong," individuals within a laboratory must check and recheck results to ensure their correctness. The common use of spreadsheets to manipulate data sets by cutting and pasting can easily produce misaligned data (this result might be designated the "curse of Excel"). In a high-throughput context where intellectual attention is justifiably focused on designing and understanding results from -omics experiments, it is therefore increasingly important to plan and implement a more robust infrastructure for data management. Baggerly and Coombes report good success with freely available software tools such as R (The R Project for Statistical Computing; http://www.r-project.org/) and Sweave (a report-generation function distributed with R) to maintain data labels and to create transparent, reproducible analytical reports, respectively. Although there is clearly an up-front cost to standardizing a laboratory in this way, the authors make a strong case for the critical nature of this investment. Furthermore, this lesson extends to any situation in which large quantities of data are in play. An academic translational laboratory handling large numbers of clinical samples may encounter many of the same challenges and errors, and this situation may be another area in which the clinical laboratory can bring its management experience to bear in strengthening the biomedical research enterprise.
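The "curse of Excel" comes down to pairing rows by position rather than by an explicit key. The toy example below (sample IDs and values invented) contrasts positional pasting with a keyed join; the same discipline applies whether one works in R, as Baggerly and Coombes recommend, or in any other environment.

```python
# Sample labels and expression summaries arrive as two separate tables,
# and the second has been re-sorted somewhere along the way.
labels = [("S1", "sensitive"), ("S2", "resistant"), ("S3", "sensitive")]
exprs  = [("S2", 4.1), ("S3", 3.0), ("S1", 5.2)]

# Fragile: pasting the columns side by side silently assumes identical
# row order, so S1's label is paired with S2's expression value.
pasted = [(label, value) for (_, label), (_, value) in zip(labels, exprs)]

# Robust: join on the sample ID itself, so order no longer matters.
expr_by_id = dict(exprs)
joined = [(sid, label, expr_by_id[sid]) for sid, label in labels]

print(pasted[0])  # ('sensitive', 4.1) -- wrong pairing, no error raised
print(joined[0])  # ('S1', 'sensitive', 5.2) -- correct
```

As with the off-by-one example, the misaligned result raises no error and looks entirely plausible; only carrying the identifier through every step makes the mistake detectable.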
For the broader scientific community, it is equally critical to understand how best to identify potential flaws ahead of publication. To take but one example, many of the gene lists as originally presented were essentially meaningless and would not have provided any predictive power for another group hoping to learn from their publication. Although peer reviewers must assume some responsibility for the quality of the literature, a full exercise of "forensic bioinformatics" is, by any measure, an extremely time-consuming enterprise. Such an endeavor requires not only domain expertise but also a substantial degree of institutional support, and it is not clear that a sufficient pool of appropriate groups exists to critically review manuscripts and data before acceptance for publication. At some level, reviewers are asked to trust that the reported methods have been correctly implemented; any requirement for a full-fledged reconstruction of data analysis pipelines may bring the current system to its knees.
To support reviewers in this task, journals themselves can undertake at least 2 supportive measures. First, the use of open, widely available analytical pipelines should be encouraged. Although it may be too much to require the uniform use of open source analytical tools such as R, journals should require, at a minimum, a baseline listing of processing steps and settings that yielded the reported data. It is not unheard of for proteomics results to appear in the literature a year or more before the publication and release of software used to analyze the primary data. In such cases, the burden will fall squarely on the editorial leadership to produce clear software-availability guidelines that mandate reviewer access. When more-difficult issues arise (for example, what if the analysis requires proprietary package X and very few reviewers can afford this software?), it will again fall to journals to take the lead in setting policy. If this endeavor requires a cohort of bioinformatics "supercenters" with the necessary expertise and software resources, perhaps a coordinated effort with the NIH (which surely has a vested interest in the outcome) would be in order. Second, a more formal description of the mandated level of detail truly needed to reconstruct each analysis pipeline would be helpful. Given that the microarray community developed the MIAME ("minimal information about a microarray experiment") criteria and the proteomics community followed suit with MIAPE ("minimal information about a proteomics experiment"), it is surely time for a full-fledged MIABA ("minimal information about bioinformatics analysis") effort.
It is important to note that the critical work of Baggerly and Coombes does not, in itself, invalidate the use of gene expression signatures for predicting chemotherapy response. A panel of external reviewers recently engaged by Duke University to review clinical trials based on this work has allowed the trials to proceed (5). Furthermore, the original researchers continue to maintain that the errors identified do not affect their fundamental findings or the utility of profile-based prediction of tumor response. Supporting this contention, the work of Baggerly and Coombes reproduces the separations originally reported after gene lists were corrected to account for an offset (although Baggerly and Coombes assert that predictions made from the corrected data are not significantly better than chance).
Perhaps the most disturbing aspect of Baggerly and Coombes' analysis is the frequency with which such errors have been uncovered. Of course, a substantial selection bias is undoubtedly present; the discovery that investigators accurately handle their data as described in a clearly written methods section would not be a particularly interesting, publishable result. Nonetheless, "forensic bioinformatics" has now identified major issues with multiple -omics studies that are related at least in part to data management, and the time-consuming nature of these analyses raises the disturbing possibility that this finding may represent the tip of an iceberg poised to sink promising lines of biomarker development. No investigator or diagnostics developer can afford to ignore this important issue.
Author Contributions: All authors confirmed they have contributed to the intellectual content of this paper and have met the following 3 requirements: (a) significant contributions to the conception and design, acquisition of data, or analysis and interpretation of data; (b) drafting or revising the article for intellectual content; and (c) final approval of the published article.
Authors' Disclosures of Potential Conflicts of Interest: Upon manuscript submission, all authors completed the Disclosures of Potential Conflict of Interest form. Potential conflicts of interest:
Employment or Leadership: None declared.
Consultant or Advisory Role: S.R. Master, Bio-Rad Laboratories.
Stock Ownership: None declared.
Honoraria: S.R. Master, Roche, Invitrogen, and Bio-Rad Laboratories.
Research Funding: None declared.
Expert Testimony: None declared.
Role of Sponsor: The funding organizations played no role in the design of study, choice of enrolled patients, review and interpretation of data, or preparation or approval of manuscript.
(1). Baggerly KA, Coombes KR. Deriving chemosensitivity from cell lines: forensic bioinformatics and reproducible research in high-throughput biology. Ann Appl Stat 2009; 3:1309-34.
(2). Potti A, Dressman HK, Bild A, Riedel RF, Chan G, Sayer R, et al. Genomic signatures to guide the use of chemotherapeutics. Nat Med 2006; 12:1294-300.
(3). Hsu DS, Balakumaran BS, Acharya CR, Vlahovic V, Walters KS, Garman K, et al. Pharmacogenomic strategies provide a rational approach to the treatment of cisplatin-resistant patients with advanced cancer. J Clin Oncol 2007; 25:4350-7.
(4). Augustine CK, Yoo JS, Potti A, Yoshimoto Y, Zipfel PA, Friedman HS, et al. Genomic and molecular profiling predicts response to temozolomide in melanoma. Clin Cancer Res 2009; 15:502-10.
(5). Hutson S. Data handling errors spur debate over clinical trial [News]. Nat Med 2010; 16:618.
Stephen R. Master *
Department of Pathology and Laboratory Medicine, University of Pennsylvania, Philadelphia, PA.
* Address correspondence to the author at: Department of Pathology and Laboratory Medicine, University of Pennsylvania, 613A Stellar-Chance Labs, 422 Curie Blvd., Philadelphia, PA 19104. Fax 215-746-4650; e-mail srmaster@mail.med.upenn.edu.
Received June 7, 2010; accepted June 11, 2010.
Previously published online at DOI:10.1373/clinchem.2010.147686
Published in print September 1, 2010.