Validation of (Q)SAR models.
From a practitioner's point of view (although I was not part of the workshop), I feel compelled to comment on "Summary of a Workshop on Regulatory Acceptance of (Q)SARs for Human Health and Environmental Endpoints" by Jaworska et al. (2003).
There are a variety of quantitative structure-activity relationship [(Q)SAR] models available for a variety of purposes, and, as stated by Jaworska et al. (2003), predictive power is a critical issue in evaluating any model. Regrettably, the accompanying articles by Eriksson et al. (2003) and Cronin et al. (2003a, 2003b) fail to mention any of the recent publications on the application of probabilistic neural networks (PNNs) for the modeling of toxicity endpoints. Highly effective PNN models have been demonstrated for the fathead minnow (Kaiser and Niculescu 1999), the waterflea Daphnia magna (Kaiser and Niculescu 2001a), the ciliate Tetrahymena pyriformis (Niculescu et al. 2000), the Microtox bacterium Vibrio fischeri (Kaiser and Niculescu, in press), and estrogen receptor binding affinity (Kaiser and Niculescu 2001b). Indeed, Moore et al. (2003) have shown that the fathead minnow PNN has superior performance in essentially all aspects when compared to the other methods. Other types of neural networks have similarly been shown to be robust and to provide optimal predictions (e.g., Burden and Winkler 1999).
Although representativity or domain of a model are good concepts in theory, they are difficult to define or use in practice. Moreover, the statistical descriptors of a model's performance, such as goodness of fit, specificity, sensitivity, transparency, and similarity, are often misleading because the applied data set(s) for many (Q)SARs are narrow, skewed, or otherwise nonrepresentative of the chemical world existing in reality. In most cases, a model user cannot ascertain whether a particular model may or may not be used for a particular compound and end point to be estimated. Without tests of comparative performance, this conundrum exists for users of most models. Even for quite similar compounds, model outputs can differ by several orders of magnitude, both from one model to another and from measured values. For example, predictions of octanol/water partition coefficients (a physical property) for a small set of quite similar compounds by commonly used models show a large divergence of values (Vrakas et al. 2003). Therefore, the (only) proof of model accuracy is in the testing of each model's performance against a broad spectrum of measured data that are not part of the training set of each model. In practice, this means that performance of a model should be the driving force for its acceptability in the regulatory world, not its statistics.
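The kind of test meant here can be sketched in a few lines. What follows is only a minimal illustration, not the procedure of any study cited in this letter: the compound identifiers, the log-scale values, and the function name external_rmse are all hypothetical, and the numbers are made up. It scores each model only on measured compounds that were not in that model's own training set.

```python
import math

def external_rmse(predictions, measured, training_ids):
    """Return (RMSE, n) over measured compounds NOT in the model's training set."""
    external = [cid for cid in measured
                if cid in predictions and cid not in training_ids]
    if not external:
        raise ValueError("no external compounds to test against")
    squared = [(predictions[cid] - measured[cid]) ** 2 for cid in external]
    return math.sqrt(sum(squared) / len(squared)), len(external)

# Made-up log-scale toxicity values for four hypothetical compounds.
measured = {"c1": 1.0, "c2": 2.0, "c3": 3.0, "c4": 4.0}

# Two hypothetical models, each with its own predictions and training set.
model_a = {"predictions": {"c1": 1.0, "c2": 2.0, "c3": 3.5, "c4": 4.5},
           "training_ids": {"c1", "c2"}}
model_b = {"predictions": {"c1": 1.2, "c2": 2.0, "c3": 3.0, "c4": 5.0},
           "training_ids": {"c1"}}

rmse_a, n_a = external_rmse(model_a["predictions"], measured, model_a["training_ids"])
rmse_b, n_b = external_rmse(model_b["predictions"], measured, model_b["training_ids"])
print(f"model A: RMSE {rmse_a:.3f} on {n_a} external compounds")
print(f"model B: RMSE {rmse_b:.3f} on {n_b} external compounds")
```

Model A fits its two training compounds perfectly in this toy case, yet that says nothing about the two compounds it has never seen. Only the external error counts in the comparison.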
Regular scrutiny of performance has been commonplace in other areas. For example, the performance of Canadian environmental analytical laboratories is regularly checked with round robin testing. The predictive power of carcinogenicity and mutagenicity models has been evaluated in several rounds of testing, with the biological testing subsequent to the models' predictions. There is a great need for such comparative testing of the usefulness of various existing (Q)SAR models. The valiant performance testing of several toxicity-prediction (Q)SAR models by Moore et al. (2003) shows some surprising results and gives further credence to this thought. Indeed, Jaworska et al. (2003) also stress the need for an independent organization to validate data and models irrespective of any model's claims.
The author is the director of research and a principal of TerraBase, Inc.
Klaus L.E. Kaiser
Hamilton, Ontario, Canada
Burden FR, Winkler DA. 1999. Robust QSAR models using Bayesian regularized neural networks. J Med Chem 42:3183-3187.
Cronin MTD, Jaworska JS, Walker JD, Comber MH, Watts CD, Worth AP. 2003a. Use of QSARs in international decision-making frameworks to predict health effects of chemical substances. Environ Health Perspect 111:1391-1401.
Cronin MTD, Walker JD, Jaworska JS, Comber MH, Watts CD, Worth AP. 2003b. Use of QSARs in international decision-making frameworks to predict ecologic effects and environmental fate of chemical substances. Environ Health Perspect 111:1376-1390.
Eriksson L, Jaworska J, Worth AP, Cronin MTD, McDowell RM, Gramatica P. 2003. Methods for reliability and uncertainty assessment and for applicability evaluations of classification and regression-based QSARs. Environ Health Perspect 111:1361-1375.
Jaworska JS, Comber M, Auer C, Van Leeuwen CJ. 2003. Summary of a workshop on regulatory acceptance of (Q)SARs for human health and environmental endpoints. Environ Health Perspect 111:1358-1360.
Kaiser KLE, Niculescu SP. 1999. Using probabilistic neural networks to model the toxicity of chemicals to the fathead minnow (Pimephales promelas): a study based on 885 compounds. Chemosphere 38:3237-3245.
--. 2001a. Modeling the acute toxicity of chemicals to Daphnia magna: a probabilistic neural network approach. Environ Toxicol Chem 20:420-431.
--. 2001b. On the PNN modelling of estrogen receptor binding data for carboxylic acid esters and organochlorine compounds. Water Qual Res J Canada 36:619-630.
--. In press. Neural network modeling of Vibrio fischeri toxicity data with structural physico-chemical parameters and molecular indicator variables. In: QSARs for Predicting Ecological Effects of Chemicals (Walker JD, ed). Pensacola, FL:SETAC Press.
Moore DRJ, Breton RL, MacDonald DB. 2003. A comparison of model performance for six quantitative structure-activity relationship packages that predict acute toxicity to fish. Environ Toxicol Chem 22:1799-1809.
Niculescu SP, Kaiser KLE, Schultz TW. 2000. Modeling the toxicity of chemicals to Tetrahymena pyriformis using molecular fragment descriptors and probabilistic neural networks. Arch Environ Contam Toxicol 39:289-298.
TerraBase, Inc. 2002. TerraQSAR--FHM. Fish Toxicity Computation Program. Available: http://www.terrabase-inc.com//tq-4fhm.htm [accessed 7 January 2004].
--. 2003a. TerraQSAR--RMIV. Rat/Mouse Intravenous LD50 Computation Program. Available: http://www.terrabase-inc.com//tq-4rmiv.htm [accessed 7 January 2004].
--. 2003b. TerraQSAR--E2-RBA. Estrogen Receptor Binding Affinity Computation Program. Available: http://www.terrabase-inc.com//tq-4e2rba.htm [accessed 7 January 2004].
Vrakas D, Tsantili-Kakoulidou A, Hadjipavlou-Litina D. 2003. Exploring the consistency of logP estimation for substituted coumarins. QSAR Combin Sci 22:622-629.