Methods for reliability and uncertainty assessment and for applicability evaluations of classification- and regression-based QSARs.


This article provides an overview of methods for reliability assessment of quantitative structure-activity relationship (QSAR) models in the context of regulatory acceptance of human health and environmental QSARs. Useful diagnostic tools and data analytical approaches are highlighted and exemplified. Particular emphasis is given to the question of how to define the applicability borders of a QSAR and how to estimate parameter and prediction uncertainty. The article ends with a discussion regarding QSAR acceptability criteria. This discussion contains a list of recommended acceptability criteria, and we give reference values for important QSAR performance statistics. Finally, we emphasize that rigorous and independent validation of QSARs is an essential step toward their regulatory acceptance and implementation. Key words: QSAR acceptability criteria, QSAR applicability domain, QSAR reliability, QSAR uncertainty estimation, QSAR validation.

Introduction

General Considerations

Quantitative structure-activity relationships (QSARs) are mathematical models approximating the often complex relationships between chemical properties and biological activities of compounds. Common objectives of such models are a) to allow prediction of biological activity of untested and sometimes yet unavailable compounds and b) to extract clues of which chemical properties of compounds are likely determinants for their biological activities. It is convenient to distinguish between QSARs and SARs: QSARs are typically quantitative in nature, producing categorical or continuous prediction scales; SARs are qualitative in nature, often occurring in the form of structural alerts that include molecular substructures or fragment counts related to the presence or absence of biological activity.

In this article, we review QSARs. The most common techniques for establishing QSARs are based on regression analysis, neural nets, and classification approaches. Among the regression-based approaches, the methods of multiple linear regression (MLR) and partial least squares (PLS) regression are prime examples. Classification methods include, for example, discriminant analysis and decision trees. It is important to observe that classification is a central concept also in regression-based QSARs. A molecule that is not satisfactorily classified in a model--that is, it does not "fit" the model--should be handled with care, and the model's predictions should be considered with some skepticism. Hence, methods and tools for classification are ubiquitous in QSARs, regardless of the final form of the "equation."

QSARs are increasingly used by authorities, industries, and other institutions for assessing the risks of chemicals released to the environment (Anonymous 1995, 1999). An important reason for this is the increasing awareness that completing even the most basic biological testing of compounds of concern would take decades. Therefore, predictive models (PMs) such as QSARs are necessary for aiding in chemicals management because they may considerably reduce costs, avoid animal testing, and speed up managerial decisions. In addition, safety of new chemicals, often already in the preproduction phase, can be assessed via QSARs (Anonymous 1995, 1999; Wahlström 1988). This may guide the design of compounds with fewer unwanted side effects by optimizing their relevant properties. However, these potential benefits of QSARs can be fulfilled only if QSAR results are accepted at the regulatory level. The decision to accept QSAR results relies on assessing the reliability and uncertainty of the predictions as well as assessing the applicability domain of a QSAR.

Analogy Models

The goal in any QSAR modeling is to obtain the mathematical expression that best portrays the relationship between chemistry and biology. To adequately describe the often complex nature of such phenomena, it is often necessary to use a battery of relevant and consistent chemical descriptors (Dunn 1989; Eriksson et al. 2001; Wold and Dunn 1983). The assumption, or expectation, is then that the factors governing the events in a biological test system are represented in the multitude of descriptors characterizing the compounds. In other words, within a series of compounds--in which biological activity is expressed via the same mechanism--it is anticipated that a small change in chemical structure will be accompanied by a proportionally small shift in biological activity, and that the set of descriptors will reveal these analogies. Hence, QSARs are sometimes referred to as analogy models (Eriksson et al. 2001; Wold and Dunn 1983).

Analogy models can be regarded as linearizations of the real, complicated SARs. Wold and Dunn (1983) have shown that such analogy models normally have local validity only, that is, can embrace only compounds with similar chemical and biological data. It is noted, however, that the substances must be disparate enough to cause some systematic change in biological activity.

The nature of the biological response variable under study has a strong impact on the degree of chemical diversity that can be accommodated by a QSAR model; that is, there is a trade-off between chemical diversity in the training set and complexity of the biological response variable (Wold and Dunn 1983; Eriksson et al. 2001). For an endpoint variable where measured data involve a specific and selective mechanism, it is expected that the resulting QSAR model cannot tolerate too much structural diversity in chemicals (Anonymous 1995, 1999). On the other hand, in less complicated cases, dealing with less "demanding" biological response variables, for example, acute toxicity of narcotics to aquatic organisms, QSAR models are usually possible for a much broader and more diverse set of chemical structures.

The Role of Pattern Recognition in QSARs

Pattern recognition (PARC) is often described as a procedure for formulating rules of classification (Albano et al. 1978; Wold et al. 1983) in multivariate data. PARC has been used in a wide variety of applications such as analytical chemistry, food research, and process monitoring in manufacturing. PARC methods are useful also in QSARs (Wold and Dunn 1983). Based on a set of given classes, each of which contains a number of observations (in QSARs, compounds) mapped by a multitude of variables, guidelines and rules are developed that make it possible to classify new observations (compounds) as similar or dissimilar to the members of the existing classes.

Experience shows that nature often seems to organize itself in a clustered, rather discontinuous way. Inside a class or a cluster, the observations (compounds) are rather similar to each other, so if we know the class membership of an observation, we can potentially infer a great deal about it. The similarity among observations within each class is considerably greater than among observations of different classes. This is the basis for the principle of analogy. If we know that a compound is a hydrocarbon, for instance, we can confidently predict how the compound reacts or fails to react with various "reagents" because we know from experience that almost all hydrocarbons behave similarly, analogously, when subjected to various "treatments."

It is therefore often practical to formulate a QSAR problem in terms of similarities and classes. One tries to find a battery of easily accessible properties (variables) that can be used to predict the class of an unknown observation (compound). One then infers that all observations within a class behave similarly and that there are no outliers or further subclustering endangering the foundation of the class model. Once such information is known, it is also possible to determine which among existing--perhaps competing--QSAR models will best accommodate a candidate chemical for which prediction of biological and environmental data is sought.

Scope of Review

The objective of this article is to review existing methods for assessing the reliability and uncertainty of QSARs, particularly regarding predictive power and applicability domain. In so doing, the objective is also to distill some indicators that can be used as acceptability criteria. In the section, "Conditions for Applicability and Validity of QSARs," we outline basic conditions for the applicability of QSARs. In "Modeling Techniques" we review common modeling techniques in QSARs, with emphasis on regression-based methods. In "Assessing/Enhancing Model Reliability, Interpretability, and Predictive Power," we describe various tools that aid the development and use of QSARs. In "Bayesian Methods for Reliability Testing," we consider Bayesian approaches and their applicability in QSAR reliability assessment, and, last, in the "Discussion," we provide concluding remarks with recommendations for acceptability criteria.

We make very clear that we are addressing important matters of QSARs from a statistical perspective. Thus, the main focus lies on discussing methods, procedures, and diagnostic tools--mostly statistical in nature--aiding us in developing statistically and informationally sound QSARs. However, this very strong emphasis on data analytical aspects of QSARs does not mean that we refrain entirely from touching upon related and important items that deal with, for example, compilation of chemical and biological data, configuration of data tables, and so forth. It should be emphasized, however, that we do not intend to delve deep into detailed and practical issues regarding procedures for gathering the necessary chemical, biological, and toxicologic data.

Conditions for Applicability and Validity of QSARs

Homogeneity

Any data analysis, including QSAR modeling, is based on an assumption of homogeneity and absence of influential outliers (Wold et al. 1993; Eriksson and Johansson 1996). This means that the investigated system, that is, series of compounds, must be in a similar "state"--have rather similar properties--throughout the investigation, and the mechanism of influence of X on Y must be the same. This, in turn, corresponds to having some limits on the variability and diversity of X and Y. These limits may be wide if the biological activity is unspecific (e.g., acute toxicity to fish for narcotic chemicals), or narrow if the biological endpoint involves a very specific mechanism of action (e.g., binding of substrates to the active site of an enzyme).

Hence, it is essential that the data analysis provide diagnostics about how well these assumptions indeed are fulfilled. Much of the recent progress in applied statistics has concerned diagnostics, and many of these diagnostics can be used also in QSAR modeling as discussed later.

In many cases, QSAR modeling in risk assessment involves large databases of clustered compounds. Here the term "clustered" corresponds to a data set in which several classes of chemical compounds are encountered. These classes may be partially overlapped, barely separated, or completely resolved in the chemical descriptor (X-) space and/or biological property (Y-) space of the compounds in question. To conduct proper QSAR modeling, it is important to understand the nature of the clustering that occurs.

The extent to which data are clustered will be a function of the compounds and descriptors chosen, and can be checked by simply plotting the data and/or model parameters. In the ideal case, compounds will have an even spread in such plots. Moreover, there should be no influential outliers or strong clustering. If there is strong clustering in the data, it is often not realistic to fit only one model. Such a model would be able to describe only systematic variation among the groups and would be unable to resolve what is happening within a group. We also note that, from a modeling point of view, overly severe clustering will violate the assumption of homogeneity; that is, if a data set is clustered with large separation between groups, it no longer has a homogeneous distribution.

Therefore, with selective and specific biological or environmental responses, and a strongly clustered data set--a chemical property space containing several dense regions (clusters) of compounds with empty space between--it is often appropriate to treat each cluster/class independently and make separate QSAR models for each homogeneous cluster (Andersson et al. 2000; Eriksson et al. 2000a).

However, with nonspecific responses, often resulting from measurement in aquatic environments and with less strong clustering in the chemical properties, that is, clusters that are partially overlapped or barely resolved from one another in chemical space, the approach is a bit more complicated. Although a single QSAR model is still conceivable, care must be exercised to assure all chemical classes are represented in the training set (Andersson et al. 2000; Eriksson et al. 2000a). Otherwise, there is an apparent risk that small clusters with few members will not be represented in the final training set.

Representativity

As should be apparent from the discussion above, the selection of the training set is crucially important in QSAR analysis. A representative selection of compounds that well span the chemical domain of interest should be included in this set. One way to accomplish a representative training set is through multivariate design (Wold et al. 1986). This methodology is also frequently used in medicinal chemistry and combinatorial approaches and is known as statistical molecular design (SMD). It results in a test series of compounds in which all major structural and chemical properties are systematically varied at the same time (Giraud et al. 2000; Linusson et al. 2000).
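As a rough illustration of the idea behind multivariate design, the sketch below summarizes a descriptor table by its first principal component scores, lays a two-level factorial design plus a centre point over those scores, and picks the compound nearest each design point. This is a simplified sketch under stated assumptions, not the procedure of the cited authors: the function name `select_by_factorial_design`, the autoscaling step, the rescaling of scores to [-1, 1], and the random descriptor table are all hypothetical choices made for illustration.

```python
import itertools
import numpy as np

def select_by_factorial_design(X, n_components=2):
    """Pick one compound near each corner (and the centre) of a two-level
    factorial design laid out in principal component score space."""
    X = np.asarray(X, dtype=float)
    # Autoscale descriptors (assumption: unit variance, zero mean).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Principal component scores via singular value decomposition.
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]
    # Rescale each score vector so that its largest absolute value is 1.
    T_scaled = T / np.abs(T).max(axis=0)
    # Design points: all combinations of -1/+1, plus the centre point.
    design = list(itertools.product([-1.0, 1.0], repeat=n_components))
    design.append((0.0,) * n_components)
    chosen = []
    for point in design:
        dist = np.linalg.norm(T_scaled - np.array(point), axis=1)
        # Take the nearest compound that has not been selected already.
        for idx in np.argsort(dist):
            if int(idx) not in chosen:
                chosen.append(int(idx))
                break
    return chosen

# Hypothetical descriptor table: 12 compounds, 4 descriptors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(12, 4))
training_set = select_by_factorial_design(X_demo, n_components=2)
```

With two components the design has four corners and one centre point, so five compounds are selected; the selected indices refer to rows of the descriptor table.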

A point of some controversy is how to define the chemical space appropriately. This is not a trivial issue. Because it is often difficult to know beforehand exactly which type and combination of chemical descriptors will be found useful in the QSAR modeling, the general advice given is to include a broad and stable set of descriptors.

The ensuing data analysis will then reveal whether the data set contains groups, outliers, and so forth, and care must then be exercised to modify the data set accordingly. Moreover, QSAR practitioners are sometimes anxious regarding the consequences of forgetting to include important chemical descriptors when compiling the initial set of descriptors. Frequently, however, this is not a big problem. If extra variables are added to the data set during the QSAR analysis, and if these are few compared with the total number of descriptors used, the structure of the training set in terms of its latent variables usually is little affected.

Moreover, it is important to understand the range of validity of the QSAR model-to-be, both in terms of the range of biological response data within which it will predict reliably, and also in terms of the type of chemical structure on which it is based. Diagnostic tools aiding us in the assessment of such model validity ranges are discussed.

Demands on the X-Data (Chemical Descriptors) and Y-Data (Biological Responses)

The intuitive belief of many environmental chemists and toxicologists is that measuring many variables provides more information about the chemical and biological properties of compounds than measuring just a few variables. Indeed, a rich description of chemical properties of compounds will facilitate the detection of groups (classes) of compounds with markedly different properties and help in unraveling chemical outliers. Outliers are compounds that do not fit a QSAR. It is important not to simply mechanically delete such compounds from a data set; rather, they should be analyzed carefully because their existence might lead to new, unexpected discoveries.

The compilation of data for use in QSARs requires consideration of some important aspects. First of all, because all our QSAR modeling efforts rest critically on the assumption of chemical similarity and biological homogeneity of compounds, we must analyze data that are rich enough to allow an adequate testing of this important assumption. This means that we must use chemical descriptors that are meaningful, interpretable, and reversible.

Descriptors that are often found useful in QSARs mirror fundamental physicochemical factors that in some way relate to the biological endpoint(s) under study. Examples of such molecular properties are hydrophobicity, steric and electronic properties, molecular weight, pKa, and so forth. These descriptors provide valuable insight into plausible mechanistic properties. It is also desirable that the chemical description be reversible; that is, it must be possible to convert model information into understandable chemical properties. For a deeper treatment of chemical descriptors and their use in QSARs, the reader is advised to consult the literature (e.g., Andersson et al. 2000; Cronin and Schultz 2003).

Furthermore, as emphasized by Cronin and Schultz (2003), knowledge about the biological data is essential in QSARs:
   Reliable data are required to build reliable predictive
   models. In terms of biological activities, such data
   should ideally be measured by a single protocol, ideally
   even the same laboratory and by the same workers.
   High quality biological data will have lower
   experimental error associated with them. Biological
   data should ideally be from well standardized assays,
   with a clear and unambiguous endpoint.


The article of Cronin and Schultz (2003) also discusses in depth the importance of appreciating the quality of biological data and of knowing the uncertainty with which the biological data were measured.

Interestingly, QSAR analysis may involve modeling of more than one endpoint, that is, a matrix (Y) of several endpoints. This will lead to the determination of biological response profiles of compounds (Nendza and Muller 2000). Measurement of multivariate biological data leads to statistically beneficial properties of the QSAR and improved possibilities of exploring the biological similarity of the studied substances. The absence of outliers in multivariate biological data is a very valuable indication of homogeneity of the biological response profiles among the compounds (Eriksson et al. 2001, 2002).

The use of multiple endpoints is becoming increasingly widespread in QSARs, in both drug design and environmental sciences (Deneer et al. 1987, 1989; Nendza and Muller 2000; Sjostrom et al. 1997; Verhaar et al. 1994). And, as discussed above, a multitude of chemical descriptors is often favorable and tends to stabilize the description of the chemical properties of the compounds.

Modeling Techniques

Multiple Linear Regression

MLR is the classical approach to regression problems in QSARs. MLR assumes the predictor variables, normally called X, to be mathematically independent (orthogonal). Mathematical independence means that the rank of X is K (the number of X-variables).
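The rank condition can be checked numerically before fitting. The following is a minimal sketch, assuming the descriptors are held in a NumPy array with one row per compound and one column per descriptor; the helper name `has_full_column_rank` and the small data tables are hypothetical.

```python
import numpy as np

def has_full_column_rank(X):
    """Return True if the rank of X equals K, the number of X-variables."""
    X = np.asarray(X, dtype=float)
    n_descriptors = X.shape[1]
    return bool(np.linalg.matrix_rank(X) == n_descriptors)

# Hypothetical table: 5 compounds, 3 descriptors with no exact dependency.
X_independent = np.array([[1.0, 0.2, 3.1],
                          [2.0, 0.1, 1.0],
                          [3.0, 0.5, 2.2],
                          [4.0, 0.3, 0.7],
                          [5.0, 0.9, 1.5]])

# Same table, but the third descriptor is the sum of the first two,
# so the rank drops below K.
X_dependent = X_independent.copy()
X_dependent[:, 2] = X_dependent[:, 0] + X_dependent[:, 1]
```

Here `has_full_column_rank(X_independent)` holds while `has_full_column_rank(X_dependent)` does not. Note that full rank only rules out exact linear dependencies; strongly correlated descriptors can still pass this check.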

A limitation of MLR is the sensitivity to correlated descriptors. One practical workaround is to use long and lean data matrices--matrices where the number of compounds substantially exceeds the number of chemical descriptors--where interrelatedness among variables usually drops. It has been recommended that the ratio of compounds to variables should be at least 5 (Topliss and Edwards 1979). We note that one way to introduce orthogonality or near-orthogonality among the X-variables is through SMD.
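The recommended ratio is simple to verify for a given data table. The sketch below assumes a compounds-by-descriptors array; the function names and the placeholder tables are hypothetical, and the default threshold of 5 is the value quoted above.

```python
import numpy as np

def compounds_per_descriptor(X):
    """Ratio of the number of compounds (rows) to descriptors (columns)."""
    n_compounds, n_descriptors = np.asarray(X).shape
    return n_compounds / n_descriptors

def meets_minimum_ratio(X, minimum=5.0):
    """True if the table is 'long and lean' enough by the quoted guideline."""
    return compounds_per_descriptor(X) >= minimum

# Placeholder tables (contents are irrelevant here, only the shapes matter).
X_lean = np.zeros((30, 4))   # 30 compounds, 4 descriptors
X_wide = np.zeros((12, 6))   # 12 compounds, 6 descriptors
```

The first table has 7.5 compounds per descriptor and satisfies the guideline; the second has only 2 and does not.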

MLR can be applied satisfactorily in QSAR studies provided that the central problem of variable selection is addressed.

MLR is usually used to fit the regression model (Equation 1), which models a response variable, y, as a linear combination of the X-variables, with the coefficients b. The deviations between the data (y) and the model (Xb) are called residuals, and are denoted by e:

[1] y = Xb + e

For many response variables (columns in the response matrix Y), regression normally forms one model for each of the M y-variables, that is, M separate models.
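Equation 1 can be fitted by ordinary least squares. The sketch below is one possible illustration using NumPy, with assumptions flagged: the leading column of ones (an intercept) is not part of Equation 1 as written, and the function name `fit_mlr`, the descriptor values, the coefficients, and the perturbations are all hypothetical.

```python
import numpy as np

def fit_mlr(X, Y):
    """Least-squares estimate of b in y = Xb + e.

    Y may hold several responses; each column of the returned coefficient
    matrix B is then the separate model for one y-variable.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    B, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
    E = Y - X @ B  # residuals e, one column per response
    return B, E

# Hypothetical data: 6 compounds, an intercept column plus 2 descriptors.
X = np.column_stack([np.ones(6),
                     [0.5, 1.0, 1.5, 2.0, 2.5, 3.0],
                     [1.2, 0.7, 2.1, 1.9, 0.3, 1.1]])
B_true = np.array([[1.0, -0.5],
                   [2.0, 0.3],
                   [-1.0, 0.8]])
noise = np.array([[0.05, -0.02], [-0.03, 0.04], [0.02, 0.01],
                  [-0.04, -0.03], [0.01, 0.02], [-0.01, -0.02]])
Y = X @ B_true + noise  # two responses, M = 2

B, E = fit_mlr(X, Y)
```

Fitting the first response on its own, `fit_mlr(X, Y[:, 0])`, returns the same coefficients as the first column of `B`, which is the sense in which regression forms M separate models.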

If MLR is applied to data sets exhibiting collinearities among the X-variables, the calculated regression coefficients become unstable and their interpretability breaks down (Draper and Smith 1981; Lindgren 1994; Topliss and Edwards 1979). For example, certain coefficients may be much larger than expected, or they may even have the wrong sign (Eriksson et al. 1995; Lindgren 1994; Mullet 1976).
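One common way to screen for such collinearity is the variance inflation factor (VIF). It is not named in the text above and is brought in here only as an illustrative diagnostic. The sketch computes the VIFs as the diagonal of the inverse correlation matrix of the descriptors; the function name and the data are hypothetical, with the second descriptor constructed to be nearly identical to the first.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each descriptor: diagonal of the inverse correlation matrix."""
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

# Hypothetical descriptors for 8 compounds.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
# x2 is x1 plus a small perturbation, i.e. almost collinear with x1.
x2 = x1 + np.array([0.1, -0.1, 0.05, -0.05, 0.1, -0.1, 0.05, -0.05])
# x3 is only weakly related to the other two.
x3 = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])

vif = variance_inflation_factors(np.column_stack([x1, x2, x3]))
```

In this example the first two descriptors receive very large VIFs while the third stays close to 1. A frequently quoted rule of thumb treats values above about 10 as a warning sign, but that threshold is a convention rather than something stated in this article.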

Another key feature of MLR is that it exhausts the X-matrix, that is, uses all (100%) of its variance (i.e., there will be no X-matrix error term in the regression model). Hence, it is assumed that the X-variables are exact and completely (100%) relevant for the modeling of Y.

Other Approaches

Multivariate projection methods such as principal component analysis (PCA), principal component regression (PCR), and PLS are other approaches that are increasingly used in QSAR analysis in the environmental sciences (Langer 1994; Sjostrom et al. 1997; Tosato et al. 1992; Tysklind et al. 1995; Verhaar et al. 1994). These methods are particularly apt when the number of variables equals or exceeds the number of compounds. This is because projections to latent variables in multivariate space tend to become more distinct and stable as more variables are involved (Eriksson et al. 2001; Hoskuldsson 1996; Lindgren 1994; Wold et al. 1993).

Geometrically, PCA, PCR, PLS, and similar methods can be seen as the projection of the observation points (compounds) in variable space down on an A-dimensional hyperplane. The positions of the observation points on this hyperplane are given by the scores, and the orientation of the plane in relation to the original variables is indicated by the loadings. In contrast to MLR, PLS and similar approaches do not exhaust the X-matrix; that is, they do not assume that the X-variables are exact and 100% relevant for modeling of Y.
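For PCA, the projection described above can be written out with a singular value decomposition. The following is a minimal sketch, assuming mean-centred descriptors and no further scaling; the function name `pca_project`, the symbols T, P, and E, and the small data table are illustrative choices not taken from the article.

```python
import numpy as np

def pca_project(X, n_components):
    """Project compounds onto an A-dimensional hyperplane (A = n_components).

    Returns scores T (positions on the hyperplane), loadings P (orientation
    of the hyperplane relative to the original variables), and the residual
    matrix E, i.e. the part of X left outside the model.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # mean-centre each descriptor
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]
    P = Vt[:n_components].T
    E = Xc - T @ P.T
    return T, P, E

# Hypothetical table: 6 compounds described by 3 descriptors.
X_demo = np.array([[2.1, 0.5, 1.0],
                   [1.9, 0.7, 1.3],
                   [3.2, 1.1, 0.4],
                   [2.8, 0.9, 0.6],
                   [1.2, 0.3, 1.8],
                   [3.5, 1.4, 0.2]])
T, P, E = pca_project(X_demo, n_components=2)
```

Because only two of the three possible components are retained, E is generally nonzero, which illustrates the point that projection methods do not exhaust the X-matrix.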

Some other methods are canonical correspondence analysis (CCA), correspondence analysis scaling (for discrete data), redundancy analysis, and ridge regression (Jackson 1991; Jongman et al. 1987).

MLR, PLS, and the other methods discussed above are usually applied to data sets where a linear relationship between X and Y is anticipated. However, there are also many other methods that are used in the analysis of nonlinear QSAR data, for example, neural networks (Burden et al. 1997), nonlinear versions of genetic algorithms (Vankeerberghen et al. 1995), and nonlinear extensions of PLS (Eriksson et al. 2000a; Martin et al. 1995; Wold 1992). All these methods contain more adjustable model parameters than do linear modeling techniques. As a consequence, nonlinear modeling methods are usually very flexible and adapt to almost anything, including outliers, inhomogeneities, discontinuities, and other anomalies in the data. Because of the high degree of flexibility of such methods, very many observations (compounds) are required for these techniques to work reliably and produce stable models.

A recent article by Worth and Cronin (2003) describes the use of alternative techniques in QSARs, such as discriminant analysis, logistic regression, and classification tree analysis. The reader is referred to this article for a more in-depth discussion of the classification problem and how to categorize compounds as active/inactive or potent/nonpotent using these approaches.

Assessing/Enhancing Model Reliability, Interpretability, and Predictive Power

A QSAR analyst must master many elements of data analysis. There are many tools and diagnostics available that will give better, more reliable, and more useful PMs. In this section we provide an overview of some of these tools and diagnostics.

Preprocessing Techniques

Scaling and centering. Pretreatment of measured data is carried out to reshape ("transform") the data to facilitate data analysis and model interpretation. The two most common preprocessing procedures are centering and scaling (Eriksson et al. 2001). Subtracting the mean (mean-centering) facilitates model interpretation and may in certain situations also remove some numerical instability.

An initial scaling of data, often to a variance of 1 for each variable (unit-variance scaling), is done to ensure that all variables have the same chance to influence a regression model (Eriksson et al. 2001). This type of preprocessing is especially useful when the variables considered are of different origin and display considerably different numerical ranges. Without any scaling, variables with a large numerical range would dominate over variables with a small numerical range. Figure 1 shows an example involving two variables where one variable can be made to dominate over the other when scaling is not done appropriately.

[FIGURE 1 OMITTED]

One additional approach of considerable interest for the future is called Pareto scaling (Eriksson et al. 2001), whereby each variable is given a variance equal to its standard deviation rather than unit variance. It can be seen as a compromise between no scaling (risk: "small" variables will be masked by "large" variables) and unit-variance scaling (risk: noise is inflated because noisy variables are up-weighted).
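The three options discussed above (centering only, unit-variance scaling, and Pareto scaling) can be sketched as follows. This is a minimal illustration, not the procedure of any particular software package; the function name `scale_columns` and the small data matrix are hypothetical.

```python
import numpy as np

def scale_columns(X, method="uv"):
    """Mean-center each column of X, then scale it.

    method: "none" (centering only), "uv" (unit variance), or "pareto".
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    sd = X.std(axis=0, ddof=1)           # sample standard deviation per column
    if method == "none":
        return centered
    if method == "uv":
        return centered / sd             # each column gets variance 1
    if method == "pareto":
        return centered / np.sqrt(sd)    # each column gets variance equal to its SD
    raise ValueError(f"unknown method: {method}")

# Hypothetical data: column 0 spans thousands, column 1 spans tenths.
X = np.array([[1200.0, 0.10],
              [1500.0, 0.35],
              [ 900.0, 0.20],
              [2100.0, 0.05],
              [1800.0, 0.30]])

uv = scale_columns(X, "uv")
pareto = scale_columns(X, "pareto")
```

Dividing by the square root of the standard deviation is what gives each Pareto-scaled column a variance equal to its original standard deviation, so large variables keep some of their dominance while small ones are not inflated to full unit variance.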

In summary, scaling can be done in many different ways, depending on the modeling objectives and the level of prior knowledge about the properties of the data. Also, if the uncertainties of the X- and Y-data are estimable, such information may be used to modify the scaling weights. For instance, if in a given situation it is known that the standard deviation of an X-variable is three times higher than that for any other X-variable, a down-weighting by one-third would seem reasonable, thus giving this X-variable a variance of one-third rather than unity.

Data correction and compression. Data pretreatment often has wider scope than just scaling and centering. In spectroscopically based QSAR applications, occurring mainly in the pharmaceutical industry, spectral data are often transformed to remove undesired systematic behavior ("signal correction") (Wold and Josefson 2000). Such undesired variation may arise from light-scattering effects, baseline drift, nonlinearities, and so forth, which influence the shape of the spectral data without really being relevant to the Y-data modeled. Therefore, there is an interest in "correcting" the spectral data and removing from the X-matrix the variation that does not relate to the Y-data. The "corrected" or "filtered" X-matrix then contains the variation that correlates with Y, and hence the QSAR model is better focused.

Signal correction improves the interpretability of a QSAR model and may also improve its predictive power. A facilitated transfer of a model from one site to another, so-called calibration transfer, may also be the result (Sjoblom et al. 1998). Furthermore, when very large sets of spectral data are investigated, the pretreatment phase may also involve measures to reduce the size of the data material (signal compression), for instance, by using wavelet compression (Wold and Josefson 2000). For a discussion of useful correction and compression approaches, see Eriksson et al. (2001).

Transformations. Another situation for preprocessing of raw data is when a variable contains one or a few extreme measurements that may unduly influence model building. Consider Figure 2A, which shows the histogram of a variable, Var1. One out of the 40 measurements in this variable is substantially larger than the others. If this extreme measurement is not manipulated in some way before the data analysis, it will exert a large influence (have high leverage) on the model and dominate over the other measurements. A simple logarithmic transformation will in this case remedy the situation (Figure 2B). If the transformation does not increase the model's goodness of prediction, it should be avoided.

[FIGURE 2 OMITTED]

Informative Model Parameters

Depending on the data analytical technique used, QSAR analysis will result in a set of model parameters that is useful in the interpretation phase. With straightforward MLR, a regression equation consisting of coefficients is produced. These coefficients have an intuitively simple and therefore appealing meaning. However, one should be aware that--depending on the choice of regression method--there are other model parameters and diagnostics available that also deserve attention when interpreting a QSAR model. Our goal in this subsection is to highlight a few of these parameters and diagnostics. In so doing, we will use two data sets drawn from the literature.

Interpretation with emphasis placed on regression coefficients, Y-residuals, and model performance statistics. The first data set deals with toxicity data (IG[C.sub.50], concentration causing 50% growth inhibition to Tetrahymena pyriformis) taken from the literature (Cronin et al. 2000). The complete data set comprises 140 compounds, two X-variables [log P and energy of the lowest unoccupied molecular orbital (LUMO)], and one Y-variable (the endpoint). The two X-variables are almost uncorrelated, with a squared correlation coefficient of 0.044. The data set is known to contain five outliers. The MLR results are summarized in Figure 3.

[FIGURE 3 OMITTED]

Figure 3A shows the relationship between observed and predicted endpoint data. This is a standard plot in QSAR analysis. In this case, a few outliers are identifiable, but sometimes the situation can be a bit trickier.

A diagnostic tool that is specifically designed to pinpoint outliers is the normal probability plot of residuals (Box et al. 1978). All observation points that lie on an imagined straight line that goes through the point (zero residual, 0.5 probability) have approximately normally distributed residuals. Any point that falls off such an imagined straight line has a residual, that is, a difference between measured and predicted endpoint data that is much larger or smaller than would be expected based on the assumption of nearly normally distributed residuals. A normal probability plot of the example data set is shown in Figure 3B. It is immediately evident from this plot that there are five outliers in the data set that all fall off the straight line.

After removal of the five outliers and refitting of the model, the normal probability plot looks much nicer (Figure 3C). We emphasize that deleting outliers should be done with caution so that the model is not overtrained. For the updated model (devoid of the five outliers) the explained Y-variation is [R.sup.2]Y = 0.85 and the predicted Y-variation, estimated with cross-validation, is [Q.sup.2]Y = 0.84, which are excellent performance statistics. The regression coefficients are plotted in Figure 3D. We have chosen to plot them to simplify comparison with the PLS model (see Figure 4C). The regression equation is listed in the figure legend. As seen from the 95% confidence intervals, the uncertainty in these coefficients is very small.

[FIGURE 4 OMITTED]

Thus, in conclusion, we have a very good MLR model. Minor improvement in [R.sup.2]Y and [Q.sup.2]Y (2% in each parameter) is accomplished if the cross-term log P*LUMO is included in the model; however, the significance of this slight improvement remains unclear.

Interpretation with a bit wider scope: defining the range of the QSAR model. It is possible to calculate the applicability domain of a QSAR model, that is, the range within which it "tolerates" a new molecule. This specification can be made regarding both the X- and the Y-data as long as not all initial variance is used in the model. We will now illustrate this possibility.

Reanalyzing the above example with PLS yields a similar model with [R.sup.2]Y = 0.84 and [Q.sup.2]Y = 0.84 (Figure 4A-C). However, the PLS model uses only 42% [[R.sup.2]X = 1 - RS[S.sub.X]/S[S.sub.X] (residual sum of squares/sum of squares; "explained X-variation") = 0.42] of X to explain and predict the Y-data, not 100% as does MLR. The X-residuals are of diagnostic interest. They can be used to calculate the typical distance to the model in the X-data (here abbreviated DModX) for a compound (Figure 4A). Figure 4B shows DModX for each compound. We can also see the critical distance corresponding to the 0.05 probability level. This critical distance indicates the "tolerance volume" around the model, that is, the range of the model in the X-data (Eriksson et al. 2001). Apparently, a few compounds are positioned outside the range of the model; that is, they do not fit the model well. Hence, predictions for any of these should be considered with caution. Finally, Figure 4C provides the PLS loadings, which are reminiscent of the MLR regression coefficients (compare with Figure 3D).
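To make the explained X-variation and the distance to the model concrete, the following sketch fits a one-component PLS model to simulated data and computes [R.sup.2]X and a normalized residual distance per compound. It is a simplified illustration under stated assumptions: the data are simulated, the function names are hypothetical, and the distance is the plain ratio of a compound's residual standard deviation to the pooled residual standard deviation, without the correction factors or the F-based critical distance that commercial implementations of DModX may apply.

```python
import numpy as np

def pls1_one_component(X, y):
    """One-component PLS fit for a single response; X and y must be centered."""
    w = X.T @ y
    w = w / np.linalg.norm(w)        # X-weights
    t = X @ w                        # X-scores
    p = X.T @ t / (t @ t)            # X-loadings
    E = X - np.outer(t, p)           # X-residuals left after one component
    return t, p, E

def r2x_and_dmodx(X, E, n_components=1):
    """Explained X-variation and a simplified, normalized distance to the model."""
    n, k = X.shape
    r2x = 1.0 - (E ** 2).sum() / (X ** 2).sum()       # 1 - RSS_X / SS_X
    # Residual SD of each compound (row), on K - A degrees of freedom.
    s_row = np.sqrt((E ** 2).sum(axis=1) / (k - n_components))
    # Pooled residual SD of the whole training set.
    s_pooled = np.sqrt((E ** 2).sum() / ((n - n_components - 1) * (k - n_components)))
    return r2x, s_row / s_pooled

# Simulated training data (assumption): 12 compounds, 3 descriptors.
rng = np.random.default_rng(0)
n, k = 12, 3
X = rng.normal(size=(n, k))
y = X[:, 0] - 0.5 * X[:, 1] + 0.2 * rng.normal(size=n)
X = X - X.mean(axis=0)
y = y - y.mean()

t, p, E = pls1_one_component(X, y)
r2x, dmodx = r2x_and_dmodx(X, E)
```

Compounds with a distance well above 1 are fitted worse than the typical training compound; deciding where to draw the critical limit requires a distributional assumption that is deliberately left out of this sketch.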

A companion parameter to DModX, DModY, is also calculable. DModY is especially useful in the situation when more than one Y-variable is modeled by the same QSAR. This will be illustrated by our second example, where QSAR modeling is attempted for a set of 15 mononitrobenzene derivatives (Eriksson et al. 1995). The goal in this study was to be able to model and predict the aquatic toxicity profiles of the 15 chemicals based on information concerning their chemical properties. The 15 compounds were characterized using eight descriptor variables [boiling point (Bp), melting point (Mp), density (D), log P, [[sigma].sup.-], HOMO (energy of the highest occupied molecular orbital), LUMO, and hardness ([eta])]. In total, eight biological responses were available (Deneer et al. 1987, 1989). These are primarily related to toxicity toward the four aquatic species Poecilia reticulata, Daphnia magna, Chlorella pyrenoidosa, and Photobacterium phosphoreum.

The data analysis resulted in a QSAR with [R.sup.2]X = 0.84, [R.sup.2]Y = 0.76, and [Q.sup.2]Y = 0.67, which are excellent performance statistics considering that eight responses are handled simultaneously. For the interpretation of this QSAR model, we may consider the model coefficients (scores and loadings) to see how the compounds and the X- and Y-variables are interrelated (Figures 5A, B).

[FIGURE 5 OMITTED]

Figure 5A indicates that all X-variables load strongly in the model, and that D, Mp, [[sigma].sup.-], and LUMO are closely related. A second group is formed by log P, Bp, and [eta], whereas HOMO provides information different from these two groups. Overall, log P is the most important X-variable. Altogether, nitrobenzene (Figure 5A-C, point 1) is the least toxic compound to these aquatic organisms, and it is also the least hydrophobic compound (lowest value of log P).

Figure 5B shows the model scores. There are no outliers in the score space because all compounds lie inside the elliptic 95% tolerance volume depicted in the plot. This tolerance volume is given by a diagnostic called Hotelling's [T.sup.2]. Hotelling's [T.sup.2] is a multivariate generalization of Student's t-test. It provides a check for compounds adhering to multivariate normality (Jackson 1991).

Plots of DModX and DModY are given in Figure 5C and D. These parameters suggest that this data set contains no outliers in either the X- or the Y-data. This absence of outliers is a valuable indication of chemical similarity and biological homogeneity among the studied compounds.

Thus, as shown here, there are two complementary diagnostic tools available, Hotelling's [T.sup.2] and DModX/DModY, that jointly assess the range of a QSAR model. The difference lies in the fact that, whereas DModX and DModY are derived from the unexplained X- and Y-variances (residuals), Hotelling's [T.sup.2] is based on the explained variances. Further, through these diagnostics it is also possible to discriminate between strong (Hotelling's [T.sup.2]) and moderate (DModX/DModY) outliers, depending on which tool is used for their detection.

A similar way of defining the range of a QSAR model is according to the leverage of a compound. The leverage h (Atkinson 1985) of a compound measures its influence on the model. It is noted that the leverage h and Hotelling's [T.sup.2] are, apart from a proportionality constant, identical. The leverage of a compound in the original variable space is defined as:

[2] $h_i = x_i^{T} (X^{T} X)^{-1} x_i \quad (i = 1, \ldots, n)$,

where [x.sub.i] is the descriptor vector of the considered compound and X is the model matrix derived from the training set descriptor values. The warning leverage [h.sup.*] is defined as follows:

[3] $h^{*} = 3 \bar{h} = 3 \sum_{i} h_i / n = 3 p'/n \quad (i = 1, \ldots, n)$,

where n is the number of training compounds and p' is the number of model parameters.

Leverage values can be calculated for both training compounds and new compounds. In the first case, they are useful for finding training compounds that influence model parameters to a marked extent, resulting in an unstable model. In the second case, they are useful for checking the applicability domain of the model. A leverage greater than the warning leverage [h.sup.*] means that the compound's predicted response is the result of substantial extrapolation of the model, and therefore the predicted value must be used with great care. Only predicted data for chemicals belonging to the chemical domain of the training set should be proposed. The kind of leverage plot seen in Figure 6 allows a graphical detection of both the outliers and the influential chemicals in a model.
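Equations [2] and [3] can be sketched numerically as follows. The training matrix is simulated, the query compound is invented for illustration, and the inclusion of an intercept column in the model matrix is an assumption of this sketch (it makes p' equal to the number of descriptors plus one).

```python
import numpy as np

def leverages(X_train, X_query):
    """h = x^T (X^T X)^{-1} x for each row x of X_query (equation [2])."""
    xtx_inv = np.linalg.inv(X_train.T @ X_train)
    return np.einsum("ij,jk,ik->i", X_query, xtx_inv, X_query)

# Simulated training set (assumption): intercept column plus two descriptors.
rng = np.random.default_rng(1)
n = 20
X_train = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
n_params = X_train.shape[1]             # p' in equation [3]

h_train = leverages(X_train, X_train)
h_warning = 3.0 * n_params / n          # warning leverage h* (equation [3])

# A hypothetical new compound lying far from the training descriptors.
x_new = np.array([[1.0, 6.0, -6.0]])
h_new = leverages(X_train, x_new)[0]
outside_domain = h_new > h_warning
```

Because the training leverages sum to p', their mean is p'/n, which is why the two right-hand expressions in equation [3] agree. A new compound whose leverage exceeds the warning value would be flagged as lying outside the descriptor range covered by the training set.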

[FIGURE 6 OMITTED]

Yet another way of defining the range of a PM is according to the principles of optimum prediction space (OPS) by Gombar (1996). Although OPS has a similar scope and objective to the combination of Hotelling's [T.sup.2] and DModX, the implementation is somewhat different. OPS is defined in the original variable space, whereas Hotelling's [T.sup.2] is usually based on calculation in the score space of latent variable projection methods (Eriksson et al. 2001; Jackson 1991).

Assessing Predictive Power

Realizing the difference between fit and predictive power. In any modeling, including QSAR modeling, it is easy to manipulate data such that an apparently good model can be formulated. The most drastic step here is removal of observations (compounds) and variables that "do not fit" according to some subjective criterion. Furthermore, variables might be unduly transformed, and model complexity might be driven beyond pertinent limits. Such an inappropriate model often arises when one is merely interested in the fit of the model to the underlying data, and neglects its performance with new compounds. The problem with this kind of model is that it is not representative for other, additional compounds. Predictive validation is one way to reliably assess model adequacy for new compounds.

In this context, it is of crucial importance to realize the difference between a model's fit and prediction ability. The fit, usually estimated as [R.sup.2]Y, tells how well we are able to mathematically reproduce the endpoint data of the training set. The problem with the goodness of fit is that with sufficiently many free parameters in the model, [R.sup.2]Y can be made arbitrarily close to the optimal value of 1.0. Fortunately, the prediction ability is not as easy to manipulate. It measures how accurately we can predict the data of new compounds not previously used in the model training. The predictive power of a model may be estimated by the goodness of prediction parameter [Q.sup.2]Y.

The most demanding way to predictively validate a model is by external validation, which consists of making predictions for an independent set of data not used in the model calibration. Such a prediction set may well be selected according to the principles of multivariate design. However, external validation might not always be tractable, for example, in QSARs because of the resources needed to make a test set of new compounds. Hence, alternatives for predictive validation are of interest, for example, methods such as cross-validation and permutation testing (Eriksson et al. 1997).

Cross-validation. In the two examples above, no external validation set was available, so we used cross-validation with seven cross-validation groups instead. Basically, cross-validation is performed by dividing the data into a number of groups and then developing a number of parallel models from reduced data with one of the groups deleted. It should be noted that increasing the number of cross-validation groups to N (the number of compounds), that is, the so-called leave-one-out (LOO) approach, is not recommended because the estimated [Q.sup.2]Y then becomes too similar to [R.sup.2]Y (Eriksson et al. 2001; Shao 1993).

After developing the reduced model, the omitted data are used as a test set, and the differences between actual and predicted Y-values are calculated for these data points. The sum of squares of these differences from all the parallel models is used to form PRESS (predictive residual sum of squares). This is a measure of the predictive ability of the model and is often reexpressed as [Q.sup.2]Y (the "cross-validated" [R.sup.2]Y), a statistic that is similar to [R.sup.2]Y.
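The procedure just described can be sketched for an MLR model with seven cross-validation groups. The data are simulated and the helper names are hypothetical; the sketch also assumes the common convention [Q.sup.2]Y = 1 - PRESS/SS, with SS taken about the overall mean of Y, which is one of several conventions in use.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares coefficients for y ~ [1, X]."""
    A = np.column_stack([np.ones(len(X)), X])
    return np.linalg.lstsq(A, y, rcond=None)[0]

def predict_mlr(coef, X):
    return np.column_stack([np.ones(len(X)), X]) @ coef

def r2_and_q2(X, y, n_groups=7):
    """R2Y from the full fit; Q2Y from PRESS accumulated over the left-out groups."""
    ss_total = ((y - y.mean()) ** 2).sum()
    rss = ((y - predict_mlr(fit_mlr(X, y), X)) ** 2).sum()
    groups = np.arange(len(y)) % n_groups        # assign compounds to groups in rotation
    press = 0.0
    for g in range(n_groups):
        held_out = groups == g
        coef = fit_mlr(X[~held_out], y[~held_out])   # parallel model without group g
        press += ((y[held_out] - predict_mlr(coef, X[held_out])) ** 2).sum()
    return 1.0 - rss / ss_total, 1.0 - press / ss_total

# Simulated data (assumption): 35 compounds, two descriptors, noisy linear response.
rng = np.random.default_rng(0)
n = 35
X = rng.normal(size=(n, 2))
y = 0.8 * X[:, 0] - 0.5 * X[:, 1] + 0.3 * rng.normal(size=n)

r2, q2 = r2_and_q2(X, y)
```

For a least-squares model, PRESS can never be smaller than the residual sum of squares of the full fit, so [Q.sup.2]Y computed this way cannot exceed [R.sup.2]Y.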

Without a high [R.sup.2]Y, it is impossible to obtain a high [Q.sup.2]Y. Generally, a [Q.sup.2]Y > 0.5 is regarded as good and a [Q.sup.2]Y > 0.9 as excellent, but these guidelines are of course heavily application dependent (Eriksson et al. 2001). Differences between [R.sup.2]Y and [Q.sup.2]Y larger than 0.2-0.3 indicate the presence of many irrelevant model terms or a few outlying data points.

We note that the [R.sup.2]Y and [Q.sup.2]Y measures can be equivalently expressed as residual standard deviations and predictive residual standard deviations (PRESDs). The latter is often called SDEP (standard error of prediction). These standard deviations should be of sizes similar to those of the known or expected "noise" in the system, for example, [+ or -] 0.3 units for log(1/C) in QSAR investigations.

Response permutation testing. One limitation of cross-validation is that it assesses only the predictive power and provides no statement of the statistical significance of the estimated predictive power. To obtain an estimate of the significance of a [Q.sup.2]Y value, one may develop a number of parallel models based on fit to randomly reordered Y-data, and then evaluate the real [Q.sup.2]Y in light of a distribution of [Q.sup.2]Y values of reordered response data. A good description of permutation testing can be found in Van der Voet (1994).

This validation option works as follows: For the training set, the X-data are left intact, whereas the Y-data are permuted to appear in a different order. This means that the Y-data remain numerically the same, but their positions are shifted by random shuffling. A QSAR model is then fitted to the permuted Y-data, and by using cross-validation, both [R.sup.2]Y and [Q.sup.2]Y values are computed for the derived model. These "permuted" values may then be compared with the estimates of [R.sup.2]Y and [Q.sup.2]Y of the "real" model to get a first indication of the significance of the latter values.

In the next round, a second model is fitted to another permuted version of the Y-data, and new estimates of "permuted" [R.sup.2]Y and [Q.sup.2]Y values are thus formed. By repeating this permutation procedure a number of times, say, between 50 and 100 times, and by establishing an equivalent number of parallel QSAR models, it is possible to achieve reference distributions of [R.sup.2]Y and [Q.sup.2]Y based on random data. Such reference distributions are useful for appraising the statistical significance of the [R.sup.2]Y and [Q.sup.2]Y parameters of the parent QSAR model. If the "real" [Q.sup.2]Y and [R.sup.2]Y are found outside such reference distributions, this constitutes a strong indication of a valid model.

Furthermore, because the numerical values of the "permuted" versions of [R.sup.2]Y and [Q.sup.2]Y depend, at least partly, on the extent of perturbation inflicted by the permutation procedure, it is advisable to keep track of the correlation coefficient between original and permuted Y-variables. Should an original Y-variable be only mildly perturbed by permutation, the permuted Y-variable will by necessity display a high correlation coefficient with the original Y-variable. By jointly assessing such correlation coefficients and "permuted" [R.sup.2]Y/[Q.sup.2]Y numbers, it is possible to understand and explain the existence of occasionally high [R.sup.2]Y and [Q.sup.2]Y values for permuted Y-data.
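The permutation procedure described above can be sketched as follows for an MLR model. The data are simulated, the function names are hypothetical, and 100 permutations with seven cross-validation groups are used simply to match the numbers mentioned in the text. For each permuted response, the sketch stores the correlation with the original response together with the "permuted" [R.sup.2]Y and [Q.sup.2]Y.

```python
import numpy as np

def r2_q2(X, y, n_groups=7):
    """R2Y from the full MLR fit and Q2Y from cross-validation."""
    A = np.column_stack([np.ones(len(X)), X])
    ss_total = ((y - y.mean()) ** 2).sum()
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    rss = ((y - A @ coef) ** 2).sum()
    groups = np.arange(len(y)) % n_groups
    press = 0.0
    for g in range(n_groups):
        out = groups == g
        c = np.linalg.lstsq(A[~out], y[~out], rcond=None)[0]
        press += ((y[out] - A[out] @ c) ** 2).sum()
    return 1.0 - rss / ss_total, 1.0 - press / ss_total

def permutation_test(X, y, n_permutations=100, seed=0):
    """Rows of (correlation with original y, permuted R2Y, permuted Q2Y)."""
    rng = np.random.default_rng(seed)
    rows = []
    for _ in range(n_permutations):
        y_perm = rng.permutation(y)              # X is left intact, only y is shuffled
        corr = np.corrcoef(y, y_perm)[0, 1]
        rows.append((corr, *r2_q2(X, y_perm)))
    return np.array(rows)

# Simulated data (assumption): 40 compounds, two descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = 0.8 * X[:, 0] - 0.5 * X[:, 1] + 0.3 * rng.normal(size=40)

real_r2, real_q2 = r2_q2(X, y)
permuted = permutation_test(X, y)
```

Plotting the second and third columns of `permuted` against the first, and adding the real model at correlation 1.0, gives the kind of summary plot discussed for Figure 7.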

An informative way of summarizing results of response permutation testing was recently published (Eriksson et al. 2001). Figure 7 shows such a plot for the first data set. It manifests the validity of that QSAR because the "real" model parameters are consistently much higher than their permuted counterparts. The plot in Figure 7 was constructed by letting the y-axis represent the [R.sup.2]Y/[Q.sup.2]Y values of all MLR models, including the "real" one, and by assigning the x-axis to the correlation coefficients between permuted and original response variables. Observe that the points of [R.sup.2]Y and [Q.sup.2]Y for the original model are always found in the right-hand part of the plot at correlation 1.0 (because 1.0 is the correlation coefficient obtained when correlating a variable with itself).

[FIGURE 7 OMITTED]

Assessing Parameter Uncertainty

Confidence intervals. When estimating a parameter, for example, a regression coefficient, we would like to know the significance of this parameter--we would like to know not only the estimated value of the statistic but also how precise it is. In other words, we want to be able to state some reference limits within which we may reasonably declare the true value of the statistic lies. Such statements may assert that the true value is unlikely to exceed some upper limit, or it is unlikely to be less than some lower limit, or it is unlikely to lie outside a pair of limits. Such a pair of limits is often known as confidence limits or a confidence interval and is just as important as the estimated statistic itself. The degree of confidence used is usually set at 95%, but higher or lower levels may be chosen by the user.

Usually, in QSAR modeling parameter uncertainty is given in terms of 95% confidence intervals. Such intervals are easily calculated in MLR when applied to well-conditioned data sets. For other methods and more challenging data sets, more elaborate calculations are often necessary (Burnham et al. 1996, 1999, 2001; Denham 1997).

Jack-knifing. One way to estimate standard errors and confidence intervals directly from the data is to use jack-knifing (Efron 1982; Efron and Gong 1983). This is useful for data where the assumptions of regression analysis are not fulfilled. The objective of jack-knifing is to estimate variability of model parameters.

Interestingly, cross-validation, where the objective is to estimate the model complexity giving the optimal predictive power, produces results that can be fed directly to jack-knifing. In this way, the various submodels generated by cross-validation are used to calculate the standard errors of the model parameters, which are then converted into confidence intervals via the t-distribution. This connection between cross-validation and jack-knifing was highlighted by Herman Wold in 1982 and has recently been revived in Martens and Martens (2000).
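As a rough sketch of this idea, the leave-one-out submodels of a one-descriptor regression can be reused to obtain a jack-knife standard error for the slope. The toy data, the function names, and the choice of leave-one-out (rather than leave-several-out) groups are assumptions made for illustration; the critical t-value is passed in by the caller (2.262 is the two-sided 95% value for 9 degrees of freedom).

```python
from math import sqrt

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def jackknife_slope(x, y, t_crit):
    """Jack-knife standard error and confidence interval for the slope.

    Each leave-one-out submodel is exactly what leave-one-out
    cross-validation would have fitted anyway.
    """
    n = len(x)
    full = slope(x, y)
    subs = [slope(x[:i] + x[i + 1:], y[:i] + y[i + 1:]) for i in range(n)]
    mean_sub = sum(subs) / n
    se = sqrt((n - 1) / n * sum((b - mean_sub) ** 2 for b in subs))
    return full, se, (full - t_crit * se, full + t_crit * se)

# Toy data (invented for illustration only)
x = [0.5, 1.1, 1.9, 3.2, 4.0, 5.1, 5.8, 7.2, 8.1, 9.0]
y = [1.2, 2.1, 3.8, 6.5, 8.3, 9.9, 12.1, 14.2, 16.5, 17.8]
full, se, (lo, hi) = jackknife_slope(x, y, t_crit=2.262)
```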

Bootstrapping. Another way to estimate confidence intervals for model parameters is to use the method of bootstrap resampling. The basic premise of this method is that the data set is representative of the population from which it was drawn. Because there is only one data set, bootstrapping simulates what would happen if the population were resampled by randomly resampling the data set (Efron and Tibshirani 1993; Wehrens et al. 2000). An illustration of the use of bootstrap resampling to derive confidence intervals for the parameters of a classification model is provided in Worth and Cronin (2000).
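A minimal sketch of bootstrap resampling for a regression parameter is given below. It assumes a one-descriptor model, invented toy data, and the simple percentile interval; other bootstrap interval constructions exist and may be preferable in practice. All names are hypothetical.

```python
import random

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def bootstrap_slope_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the slope."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    while len(slopes) < n_boot:
        # Resample compounds (x, y pairs) with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # degenerate resample: slope undefined, draw again
        slopes.append(slope(xs, [y[i] for i in idx]))
    slopes.sort()
    lo = slopes[int((alpha / 2) * n_boot)]
    hi = slopes[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy data (invented for illustration only)
x = [0.5, 1.1, 1.9, 3.2, 4.0, 5.1, 5.8, 7.2, 8.1, 9.0]
y = [1.2, 2.1, 3.8, 6.5, 8.3, 9.9, 12.1, 14.2, 16.5, 17.8]
ci_low, ci_high = bootstrap_slope_ci(x, y)
```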

Variable Selection and Reduction

A delicate problem. Yet another approach to improving QSARs is the deletion of uninformative variables. However, one should be very careful when reducing the number of variables because this can be done in so many ways that almost any result is possible. In variable selection, it is important to test the predictive power of the model on real new data, and not just by cross-validation with the training set. In multivariate data, most of the X-variables contain at least some information about Y. Hence, one can hope for only a mild variable reduction; usually not more than 20-30% of the variables have less information than the noise level (Eriksson et al. 2001).

Moreover, because of the correlations among the more important X-variables, one can continue to reduce the X-variables further than these 20-30% with no apparent decrease in fit. This makes the remaining X-variables take over importance from the ones that are deleted, and a serious bias is introduced. Thus, the interpretation of the model shifts, and some variables take the role of being related to Y, while other variables correlated to these have been deleted and hence are forgotten in the interpretation. This also makes the prediction power of the model deteriorate because the correlations are not perfectly stable, and for new samples/molecules, important variables are now missing in the model.

Also, one should remember that even seemingly unimportant variables still have a role in diagnosing outliers. Consider a variable that is almost constant in the training set and that will appear unimportant in the QSAR model. If a new compound has a value of this variable that substantially differs from its values in the training set, this is an indication that this compound is different, and hence predictions of its Y-values are doubtful. If one mechanically deletes all variables that do not contribute to the modeling of Y in the training set, one automatically decreases the possibility of finding outliers among the new observations.

How do we then handle very many variables? Despite all the drawbacks discussed above, and the care that should be taken in selecting variables, variable selection is sometimes necessary to find a simple and predictive QSAR model. Nowadays it is becoming quite common to use a wide set of molecular descriptors of different kinds (experimental and/or theoretical) able to capture all the structural information possibly related to the Y-response. A recent survey of this was published by Livingstone (2000). Many software programs calculate wide sets of different theoretical descriptors, from SMILES (simplified molecular input line entry specification), two-dimensional graphs, and three-dimensional x,y,z-coordinates. Only some of the more complete are mentioned here: ADAPT (Jurs 2002; Stuper and Jurs 1976), OASIS (Mekenyan and Bonchev 1986), CODESSA (Katritzky et al. 1994), and DRAGON (Todeschini et al. 2001). It has been estimated that more than 3,000 molecular descriptors are now available, and most of them are summarized and explained in recently published books (Devillers and Balaban 1999; Karelson 2000; Todeschini and Consonni 2000). The great advantage of theoretical descriptors is that they are calculable for not yet synthesized chemicals.

There are two main steps in QSAR modeling by variable selection: first, statistically validated and robust regression models must be found, and second, the model variables must be interpretable. In principle, all the different possible variable combinations of the X-variables should be investigated to find the most predictive QSAR model. However, this may be quite taxing, mainly for reasons of time. Thus, first, various types of rapid prescreens (discarding constant values, pair-correlated variables, etc.) are often implemented to sort out a limited set of descriptors among which the selection of those really related to the response, not only in fitting but most importantly in prediction, is then performed by alternative variable selection methods.

Several strategies for variable subset selection have been applied in QSARs (among those most widely applied: stepwise regression, forward selection, backward elimination, simulated annealing, and evolutionary and genetic algorithms). A recent comparison (Xu and Zhang 2001) of these methods has given a demonstration of the advantages and success of genetic algorithms as a variable selection procedure for QSAR studies. Below, we discuss genetic algorithms and a few alternatives.

Genetic algorithm strategy for variable selection. Genetic algorithms are a particular kind of evolutionary algorithm shown to be able to solve complex optimization problems in a number of fields, including chemistry (Davis 1991; Goldberg 1989; Hibbert 1993; Wehrens and Buydens 1998). The natural principles of the evolution of species in the biological world are applied: the assumption that conditions that lead to better results will prevail over poorer ones, and that improvement can be obtained by different kinds of recombination of independent variables, that is, reproduction, mutation, and crossover. The goodness of the selected solution is measured by a response function that has to be optimized.

Genetic algorithms, first proposed as a strategy for variable subset selection in multivariate analysis by Leardi et al. (1992), are now widely and successfully applied in QSAR approaches where there are many molecular descriptors as X-variables in various modified versions, depending on the way to perform reproduction, crossover, mutation, and so forth (Devillers 1996; GFA of Rogers and Hopfinger 1994; Leardi 1994; MUSEUM of Kubinyi 1994a, 1994b; MOBY-DIGS of Todeschini 1997).

In variable selection for QSAR studies, each variable (molecular descriptor) is denoted by a bit equal to 1 if present in the regression model or to 0 if excluded. A population constituted by a number of 0/1 bit strings (each of length equal to the total number of variables) is evolved following genetic algorithm rules, maximizing the predictive power of the models (explained variance in prediction, [Q.sup.2]Y, or root mean squared error of prediction). Only the models producing the highest predictive power are finally retained and further analyzed.
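The bit-string encoding can be sketched as follows. This is a deliberately simplified genetic algorithm, not any of the cited implementations: the selection scheme (keep the fitter half), one-point crossover, and per-bit mutation rate are arbitrary choices, and the fitness function is a toy stand-in for [Q.sup.2]Y that rewards three hypothetical "informative" descriptors and penalizes model size.

```python
import random

def evolve(n_vars, fitness, pop_size=30, n_gen=40, p_mut=0.05, seed=0):
    """Evolve a population of 0/1 bit strings; bit i = 1 means descriptor i is in the model."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_vars)] for _ in range(pop_size)]
    for _ in range(n_gen):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]            # reproduction: the fitter half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)       # one-point crossover
            child = a[:cut] + b[cut:]
            # mutation: flip each bit with probability p_mut
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = parents + children
    pop.sort(key=fitness, reverse=True)
    return pop                                   # whole population, best model first

# Toy fitness (assumption): in a real study this would be Q2 of the
# regression model built from the selected descriptors.
INFORMATIVE = {0, 3, 7}

def toy_fitness(bits):
    hits = sum(bits[i] for i in INFORMATIVE)
    return hits - 0.2 * sum(bits)                # reward signal, penalize size

final_pop = evolve(10, toy_fitness)
best = final_pop[0]
```

Returning the whole sorted population, rather than a single winner, mirrors the point that a genetic algorithm yields many models of comparable quality from which the analyst can choose.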

Whereas evolutionary algorithms search for the global optimum and end up with only one or very few results (Kubinyi 1994a, 1994b, 1996), genetic algorithms simultaneously create many different results of comparable quality in larger populations of models. Within a given population, the selected models can differ in number and kind of variables.

Different rules can be adopted to select the final "best" models. Todeschini, Gramatica, and colleagues (Gramatica et al. 1998, 1999, 2000; Gramatica and Papa 2003; Todeschini and Gramatica 1997) use the QUIK rule (Q under influence of K) (Todeschini et al. 1999) to avoid multicollinearity without prediction power or "apparent" prediction power (chance correlation). According to this rule, only models with a K multivariate correlation calculated on the X + Y-block that is at least 5% greater than the K correlation of the X-block are considered statistically significant. Alternatively, one may use the approach of Hopfinger (discussed in a later section).

Model validation is always used to avoid "overfitted" models, that is, models where too many variables have been selected, and to avoid selecting variables randomly correlated with the dependent response. Particular care must be taken against overfitting; therefore, subsets with fewer variables are favored, even though the chance of finding "acceptable" models increases with the number of selected variables. The proportion of random variables selected by chance correlation could also increase (Jouan-Rimbaud et al. 1996).

The collinearity in the original set of molecular descriptors results in many similar models yielding more or less the same predictive power. Therefore, after having selected a set of similar PMs, model validation proceeds via leave-more-out cross-validation, response permutation testing (Y-scrambling), bootstrapping (Efron 1982), or other resampling techniques. This is done to avoid overestimation of the model predictive power by [Q.sup.2.sub.LOO] (Golbraikh and Tropsha 2002; Shao 1993), to verify model predictivity stability, and to select the "best" model. Finally, for the strongest evaluation of model applicability for prediction in new chemicals, external validation (verified by [Q.sup.2.sub.EXT]) of all the models is also recommended, depending on whether the data set is large enough to permit an independent external validation set. The best splitting of the original data set into a representative training set and a validation set can be obtained by applying experimental design (Eriksson et al. 2000b; Marengo and Todeschini 1992).

If after several different runs of genetic algorithms the same subsets of variables have been selected, and if the obtained models pass all the validation procedures above (cross-validation, external testing, Y-scrambling, bootstrapping), there is a reasonable certainty that the models are robust and applicable for prediction. Good predictive properties are also an indication that chance correlation has been avoided.

Because genetic algorithms simultaneously create many different good models in a population, the user can choose the "best model" according to need: the interpretability of the selected molecular descriptors, the possibility of having reliable predictions for some chemicals rather than others, the highlighting of different outliers, and so forth. The need for interpretability depends on the application: a validated mathematical model relating a target property to chemical features may, in some cases, be all that is necessary. It is obviously desirable to attempt some explanation of the 'mechanism' in chemical terms, but this is often not necessary per se (Livingstone 2000). This type of QSAR model follows a path that starts with statistical validation and proceeds to interpretation of the biological and mechanistic meaning (Tropsha et al. 2003). Therefore, the application domain of such models is mainly related to the production of predicted data, verified for their reliability.

Assessing model uniqueness. Hopfinger and colleagues advocate a related approach aiming at defining the best QSAR model. This approach is based on some of the elements described above, notably, cross-validation, response permutation testing, and variable selection (Kulkarni et al. 2001). They strive to maximize [Q.sup.2]Y through elimination of unimportant X-variables. Several different model versions are derived using genetic algorithms and the ones producing the highest [Q.sup.2]Y are retained. In the next step, cross-correlation analysis of the modeling residuals from the set of best models is used to determine how many unique models have been obtained. A unique model will have low correlations of its residuals of fit to those of the alternative top-ranked models.
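The residual cross-correlation step might be sketched as below. The residual vectors are invented, and both the greedy selection rule and the 0.5 correlation threshold are assumptions made for illustration; the source does not state a numerical cutoff.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    saa = sum((u - ma) ** 2 for u in a)
    sbb = sum((v - mb) ** 2 for v in b)
    return sab / sqrt(saa * sbb)

def residual_correlations(residuals):
    """Matrix of pairwise correlations between the residual vectors of the models."""
    return [[pearson(ri, rj) for rj in residuals] for ri in residuals]

def unique_models(residuals, threshold=0.5):
    """Greedily keep models whose residuals correlate weakly with all kept models.

    Models are assumed to be listed from best to worst Q2.
    """
    kept = []
    for i, ri in enumerate(residuals):
        if all(abs(pearson(ri, residuals[k])) < threshold for k in kept):
            kept.append(i)
    return kept

# Toy residuals of fit for three top-ranked models on five compounds
residuals = [
    [0.1, -0.2, 0.3, -0.1, -0.1],   # model A
    [0.2, -0.4, 0.6, -0.2, -0.2],   # model B: same error pattern as A
    [-0.3, 0.1, 0.1, 0.3, -0.2],    # model C: different error pattern
]
corr = residual_correlations(residuals)
kept = unique_models(residuals)
```

Here models A and B make the same errors on the same compounds and therefore count as one model, whereas C is retained as a second, unique model.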

After having selected a set of unique models with highest possible [Q.sup.2]Y, model validation proceeds via response permutation testing and/or external predictive validation, depending on whether the data set is large enough to permit an independent external prediction set. In some cases when there is thought to be considerable noise in the Y-data, the approach of Hopfinger and colleagues also involves studying the stability of the resultant QSAR models as a function of increasing simulated error among the X-variables. The objective with this latter exercise is to investigate whether stable QSARs with respect to the inherent error of the data set have been obtained (Hopfinger AJ and Jaworska J. Personal communication).

GOLPE (generating optimal linear PLS estimations). About a decade ago an advanced variable selection procedure called GOLPE was introduced by Sergio Clementi and colleagues (Baroni et al. 1993) and has found widespread use in three-dimensional QSARs. The objective of this approach is to obtain PLS regression models with the highest prediction ability. The key steps of this approach involve a first preliminary variable selection by means of a D-optimal (determinant optimal) design in the loading space, and an iterative evaluation of the effects of the individual variables on the model predictivity. This is accomplished based on the validation of a number of partial submodels using many combinations of the descriptor variables as dictated by a fractional factorial design strategy. Cruciani and Watson (1994) show the utility of GOLPE in generating three-dimensional QSAR models with good predictive power.

Hierarchical modeling for easier model interpretation and as an alternative to variable selection. In two- and three-dimensional QSAR modeling involving many variables, plots and lists of coefficients, loadings, and so forth, rapidly become messy, and results are therefore difficult to interpret. As discussed above, there may then be a strong temptation to eliminate variables to obtain a smaller data set. Such a reduction of variables, however, often removes information and makes the modeling efforts less reliable. Model interpretation may be misleading, and predictive power may deteriorate.

As reported by Berglund et al. (1997), an interesting alternative is to partition the variables into blocks of logically related variables and apply hierarchical data analysis. All such blocks may be analyzed individually. This modeling forms the base level of the hierarchical modeling setup (Eriksson et al. 2002). The score vectors, often called "super variables," formed on the base level may be concatenated in new matrices amenable for analysis on the top level. On the top level, superficial relationships between the X- and the Y-data are investigated. On the base level, in-depth information is extracted for the different blocks.

Bayesian Methods for Reliability Testing

Bayesian-based methods have been heavily used in reliability engineering and diagnostic medicine where models are used for decision making. These methods are well suited to evaluating QSARs and have been introduced to the field but still are not used broadly (McDowell and Jaworska 2002; Pet-Edwards et al. 1989). One characteristic of Bayesian-based procedures is that they allow both prior information (including expert judgment) and sampling information to be combined in the weighting scheme inherent in Bayes' formula. The second characteristic of Bayesian-based methods is that they can be formulated in a recursive form. This means Bayesian methods allow successive updating of battery interpretation as additional test results are obtained, which is particularly useful if sequential testing procedures are being considered.

The most common and simple application of Bayes' approach is found in evaluating performance statistics for two-way categorical classifications. It uses as inputs sensitivity and specificity. Sensitivity is the fraction of active chemicals that are predicted to be active by the model ([[alpha].sup.+.sub.i]); and specificity is defined as the fraction of nonactive chemicals the model predicts nonactive ([[alpha].sup.-.sub.i]). Sensitivity can also be expressed as Pr(P+|S+), the conditional probability a model predicts a chemical to be active (P+) given that the true state is active (S+). Similarly, specificity is defined as Pr(P-|S-), the conditional probability the model predicts a chemical nonactive (P-) given the true state is nonactive (S-).

We then can use Bayes' formula

[4] Pr([S.sub.i]|[T.sub.j]) = Pr([S.sub.i])Pr([T.sub.j]|[S.sub.i]) / [summation over (i)] Pr([S.sub.i])Pr([T.sub.j]|[S.sub.i])

to obtain Pr([S.sub.i]|[T.sub.j]), the posterior probability of condition [S.sub.i] prevailing given we have test result j from a) the prior probability of [S.sub.i], Pr([S.sub.i]), and b) Pr([T.sub.j]|[S.sub.i]), the likelihood of the jth test result given that the true state is [S.sub.i].
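Equation 4 translates directly into code. The sketch below is illustrative only: the function name and the numerical values (a prior of 0.1 for the active state, sensitivity 0.8, specificity 0.9) are assumptions, not data from any study.

```python
def posterior(priors, likelihoods, j):
    """Bayes' formula (Equation 4).

    priors[i]         = Pr(S_i), prior probability of true state i
    likelihoods[i][j] = Pr(T_j | S_i), probability of test result j given state i
    Returns the list of Pr(S_i | T_j) over all states i.
    """
    joint = [p * row[j] for p, row in zip(priors, likelihoods)]
    total = sum(joint)                 # denominator: sum over all states
    return [v / total for v in joint]

# Two-state example (assumed values): state 0 = active, state 1 = nonactive;
# result 0 = predicted active, result 1 = predicted nonactive.
priors = [0.1, 0.9]
likelihoods = [
    [0.8, 0.2],   # active:    sensitivity 0.8, its complement 0.2
    [0.1, 0.9],   # nonactive: complement 0.1, specificity 0.9
]
post_given_positive = posterior(priors, likelihoods, 0)
```

With these inputs the posterior probability of the active state after a positive prediction is 0.08/0.17, or about 0.47, even though the model's sensitivity is 0.8.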

It is important to note the likelihood value (sensitivity or specificity), Pr([T.sub.j]|[S.sub.i]), is conditional on [S.sub.i] (not known to the observer or analyst), whereas the posterior probability is conditional on the observed result or prediction [T.sub.j]. The posterior probability or predictive value is the appropriate statistic for inferring from test results the probability the modeled chemical has condition [S.sub.i]. Posterior probabilities are statistically precise statements of the likelihood a chemical has a particular state or attribute, conditional on the test evidence obtained.

The predictive value positive (PVP), Pr(S+|T+), denotes the probability a chemical is active (S+) given a model predicts the chemical to be active. Predictive value negative (PVN), Pr(S-|T-), is the probability a chemical lacks the attribute, that is, is S-, given a negative model result is obtained. The terms sensitivity, specificity, PVP, and PVN are sometimes referred to as the Cooper statistics (Cooper et al. 1979). Confidence intervals for the Cooper statistics can be derived by bootstrap resampling (Worth and Cronin 2001a). The computational form for the two-way classification problem using Bayesian revision is presented in Table 1.

This analytical framework is easily extended to an n-way classification problem, that is, a system with n possible states or characteristics. The extension requires the same parameters used in the two-state analysis but describing n possible states and n possible predictions: a) prior probabilities for each possible state, Pr([S.sub.i]), for i = 1, 2, ... n; and b) likelihood values, Pr([P.sub.j]|[S.sub.i]), where [P.sub.j] is a model prediction of the jth state given true state is [S.sub.i]. These likelihood values form an n x n contingency table analogous to the two-state model's 2 x 2 contingency table of likelihood values comprising sensitivity, specificity, and their complements.

The application of Bayes' formula to an n-state model is identical to the two-state case (Equation 4) and the same conditions that hold for the two-state case apply to the n-state case:

[5] [summation over (i)]Pr([S.sub.i]) = 1.0, [summation over (i)]Pr(P|[S.sub.i]) = 1.0

The PVP and PVN are not constant but vary with prior probability, Pr(S+) or Pr(S-). In other words, given fixed sensitivity and specificity, PVP and PVN vary according to the prevalence or proportion of active (toxic) chemicals in a population. This means the predictive capacities of QSAR models should not be judged according to these statistics alone because the investigator can give PVP and PVN almost any values by altering the prevalence of S+ in the test set. Examining these predictive statistics reveals the importance of evaluating the prior for understanding classification probabilities, correct and incorrect (Figure 8). As sensitivity and specificity increase, the probability curves become increasingly nonlinear. The prior probability of active/not active is not a property of an individual chemical; it is the relative frequency of active/not active in a population of chemicals. A chemical is either active or nonactive. It has no probability of being active or not (in a given biological test system under defined exposure conditions)--the probability we call prior in this case is a measure of uncertainty about its true state.

[FIGURE 8 OMITTED]
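The dependence of PVP and PVN on prevalence is easy to verify numerically. The sketch below applies Bayes' formula for the two-state case; the sensitivity, specificity, and prevalence values are arbitrary illustrations, and the function names are hypothetical.

```python
def pvp(prevalence, sensitivity, specificity):
    """Predictive value positive, Pr(S+ | T+)."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)

def pvn(prevalence, sensitivity, specificity):
    """Predictive value negative, Pr(S- | T-)."""
    true_neg = (1.0 - prevalence) * specificity
    false_neg = prevalence * (1.0 - sensitivity)
    return true_neg / (true_neg + false_neg)

# Fixed model performance, varying proportion of actives in the test set
sens, spec = 0.8, 0.8
prevalences = [0.05, 0.1, 0.2, 0.5, 0.8]
pvp_curve = [pvp(p, sens, spec) for p in prevalences]
pvn_curve = [pvn(p, sens, spec) for p in prevalences]
```

For this model the PVP is about 0.31 at a prevalence of 0.1 and exactly 0.8 at a prevalence of 0.5, which illustrates why PVP and PVN reported without the prevalence of the test set say little about the model itself.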

Sequential Use of Models

Sequential testing (QSAR testing) of all chemicals with models of even modest performance characteristics can significantly reduce misclassification rates when compared with single tests or multiple tests where only initial positives are subjected to subsequent tests. In most testing schemes, medical and otherwise, for economic reasons those testing negative to the first screening test are not subjected to confirmatory testing, provided that the screening test has a high sensitivity and therefore low probability of false negative test results among S+ objects. The actual rate or proportion of items testing false negative is not 1 - sensitivity but Pr(S+)Pr(P-|S+). It should be noted this applies in tiered assessment strategies for genotoxicity, but the converse is true in the tiered assessment strategies for skin and eye irritation, where negative findings in vitro are subject to confirmatory testing in animals (OECD 1998). Using QSAR models to classify chemicals does not have this limitation; once the QSAR model is developed, the marginal cost of making a QSAR "test" of a chemical is nearly zero. Thus, analysts using QSAR models have the opportunity to apply multiple tests to all chemicals without financial penalty. This allows great improvement in the reliability of predictions.
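The recursive use of Bayes' formula for several models applied to the same chemical can be sketched as follows. The sketch assumes that the model results are conditionally independent given the true state, which real QSAR models built on overlapping data may violate; all numerical values and names are illustrative.

```python
def update(prior_active, sensitivity, specificity, predicted_active):
    """One Bayesian revision step: Pr(S+) -> Pr(S+ | model result)."""
    if predicted_active:
        like_active, like_inactive = sensitivity, 1.0 - specificity
    else:
        like_active, like_inactive = 1.0 - sensitivity, specificity
    num = prior_active * like_active
    return num / (num + (1.0 - prior_active) * like_inactive)

def sequential_posterior(prior_active, results):
    """Apply the models one after another; each posterior becomes the next prior.

    results: iterable of (sensitivity, specificity, predicted_active).
    Assumes conditional independence of the models given the true state.
    """
    p = prior_active
    for sens, spec, predicted_active in results:
        p = update(p, sens, spec, predicted_active)
    return p

# Three models of modest performance give +, +, - for one chemical
results = [(0.8, 0.8, True), (0.8, 0.8, True), (0.8, 0.8, False)]
posterior_active = sequential_posterior(0.2, results)

# Proportion of all chemicals that are false negatives of a single model:
# Pr(S+) * Pr(P- | S+), not simply 1 - sensitivity
false_negative_share = 0.2 * (1.0 - 0.8)
```

Starting from a prior of 0.2, the two positive results raise the probability of activity to 0.5 and then 0.8, and the negative result lowers it back to 0.5.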

Interestingly, a sequential testing approach for uncovering potential estrogenic endocrine disruptors has recently been adopted by the U.S. Food and Drug Administration (FDA). It is a method based on four phases (Hong et al. 2002; Shi et al. 2002; Tong et al. 2002).

Battery selection method. The reliability of predictions can be enhanced by using information from more than one model. The battery selection method (Pet-Edwards et al. 1989) is a system for evaluating and selecting batteries of tests; it was originally applied to carcinogenicity prediction and is therefore known as the CPBS (carcinogenicity prediction battery selection) method. The CPBS methodology has two main objectives: a) to determine the reliability and predictive capability of a battery of tests that individually may give mixed results; and b) to develop a strategy to formulate and select preferred batteries of tests that are optimal in terms of collective performance, minimum testing time, or costs, or a compromise of these attributes.

The CPBS approach is a collection of methods designed to aid in selecting and interpreting tests used for decision making. The CPBS method relies on Bayesian decision theory to support the sequential nature of the testing, cluster analysis to determine dependencies among the various models used in the battery, multiple-objective decision making to aid in finding the optimal solution using cost, time, and performance criteria, and dynamic programming to optimize the search for the best test battery when a number of tests are available.

The CPBS method consists of a) preliminary data analysis to evaluate and summarize information for use in battery selection with special attention on dependencies among tests and Bayesian prediction, b) battery selection, and c) Bayesian prediction to interpret the results.

The following initial strategy is advised in forming a battery of tests:

* An odd number of tests should be used to make the most decisive package (i.e., to be able to apply the "positive majority" rule). The battery is considered positive for the property if the majority of the results are positive for the property.

* If models with high sensitivity and specificity (both > 0.75) are available and statistically independent, use as many as is cost-effective.

* A model with high sensitivity (> 0.75) and lower specificity (< 0.75) should always be coupled with a model with high specificity (> 0.75) and lower sensitivity (< 0.75).

* Avoid models with low sensitivity and low specificity.
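
The initial strategy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `battery_is_positive` and `model_profile` are hypothetical, the 0.75 cutoff is taken from the list above, and a value of exactly 0.75 (which the list does not cover) is treated here as not high.

```python
def battery_is_positive(results):
    """Apply the "positive majority" rule to a list of boolean test results."""
    if len(results) % 2 == 0:
        # An even (or empty) battery can tie, so the rule is not decisive.
        raise ValueError("use an odd number of tests so that a majority always exists")
    return sum(results) > len(results) // 2


def model_profile(sensitivity, specificity, cutoff=0.75):
    """Classify a model according to the initial battery-forming strategy."""
    high_sens = sensitivity > cutoff
    high_spec = specificity > cutoff
    if high_sens and high_spec:
        return "use if independent and cost-effective"
    if high_sens:
        return "pair with a high-specificity model"
    if high_spec:
        return "pair with a high-sensitivity model"
    return "avoid"


if __name__ == "__main__":
    print(battery_is_positive([True, False, True]))
    print(model_profile(0.90, 0.60))
```

The refinements described by Pet-Edwards et al. (1989) are not covered by this sketch.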

For the further refinement of this initial strategy, see Pet-Edwards et al. (1989). The majority, consensus, or probability limit criteria are used for selecting the best test battery. These decision rules need not guide the inference once a test battery has been selected and test results obtained; the posterior probability values, Pr([S.sub.i]|test results), indicate the appropriate inference. The decision about how to act on this information is a more complicated question involving the consequences of each decision/outcome.

The reason for the third recommendation can be illustrated by considering a battery of just two tests, one with high sensitivity, the other with high specificity. The high-sensitivity test is used to detect the attribute of interest, such as the presence of a certain type of toxicity in a set of chemicals. This test correctly identifies most chemicals that exhibit the toxicity, but it does so at the expense of overpredicting the toxicity of chemicals that lack toxic potential; that is, it generates too many false positives. When such a test is combined with a high-specificity test, the latter test serves to confirm most correct positive predictions of the first test while correctly identifying most false positives from the first test (S-P+) as negative on the second test. The latter occurs because the second test correctly classifies most S- items by virtue of its high specificity, Pr(P-|S-).

In this paired arrangement of two tests, the high-sensitivity test is sometimes called the detection or screening test, whereas the high-specificity test is sometimes called the confirmation test (Feinstein 1975). A chemical would be predicted as negative (nontoxic) if the outcome of the detection test was negative, whereas the chemical would be predicted as positive (toxic) only if the outcomes of both the detection and confirmation tests were positive. For QSARs the marginal cost of making predictions is low, so both positive and negative chemicals are tested through the whole battery. This is demonstrated in the example below.

Predictivity of independent tests. We now focus on the predictivity based on a battery of k tests. Let $\alpha^{+}_{1}, \alpha^{+}_{2}, \ldots, \alpha^{+}_{k}$ be the sensitivities of the tests and $\alpha^{-}_{1}, \alpha^{-}_{2}, \ldots, \alpha^{-}_{k}$ be the specificities of the tests.

For the case where the ith test gave a positive result, the predictivity of the entire battery is calculated using the recursive formula

$$\Pr(S^{+}_{i} \mid P_{i}) = \frac{\Pr(S^{+}_{i-1} \mid P_{i-1})\,\alpha^{+}_{i}}{(1 - \alpha^{-}_{i}) + (\alpha^{+}_{i} + \alpha^{-}_{i} - 1)\,\Pr(S^{+}_{i-1} \mid P_{i-1})}. \tag{6}$$

Similarly, for the case where the ith test gave negative results, the predictivity of the entire battery is given by

$$\Pr(S^{+}_{i} \mid P_{i}) = \frac{\Pr(S^{+}_{i-1} \mid P_{i-1})\,(1 - \alpha^{+}_{i})}{\alpha^{-}_{i} - (\alpha^{+}_{i} + \alpha^{-}_{i} - 1)\,\Pr(S^{+}_{i-1} \mid P_{i-1})}. \tag{7}$$
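
The two recursive updates can be written as a short Python sketch. It assumes statistically independent tests, and the function names (`update_after_positive`, `update_after_negative`, `battery_predictivity`) are hypothetical names chosen for illustration; the sensitivities, specificities, and prior in the usage lines are the ones used in the two-model example discussed in this section.

```python
def update_after_positive(prior, sens, spec):
    """Equation 6: probability of S+ after a positive result on one test."""
    return prior * sens / ((1 - spec) + (sens + spec - 1) * prior)


def update_after_negative(prior, sens, spec):
    """Equation 7: probability of S+ after a negative result on one test."""
    return prior * (1 - sens) / (spec - (sens + spec - 1) * prior)


def battery_predictivity(prior, tests):
    """Refine the probability of S+ test by test.

    tests is an iterable of (sensitivity, specificity, result_is_positive).
    """
    probability = prior
    for sens, spec, result_is_positive in tests:
        if result_is_positive:
            probability = update_after_positive(probability, sens, spec)
        else:
            probability = update_after_negative(probability, sens, spec)
    return probability


if __name__ == "__main__":
    battery = [(0.95, 0.85, True), (0.85, 0.95, True)]
    print(round(battery_predictivity(0.10, battery), 3))
```

Each call uses the posterior from the previous test as the new prior, which is what makes the formula recursive.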

After each test there is a refinement in the predictivity. This can be visualized as in Figure 8, which shows the range of posterior sequential predictions as a function of prior probability with no tests, one test, two tests, and so forth.

The following example demonstrates a two-model sequential classification procedure using QSAR models to classify chemicals and has been previously described in McDowell and Jaworska (2002). Those authors assumed the existence of two QSAR models to predict a particular chemical characteristic. The first test has a sensitivity of 0.95 and a specificity of 0.85. The values are reversed for the second test: sensitivity is 0.85 and specificity is 0.95. All chemicals are subjected to both tests. The resulting predictive values for all the tests are based on a prior probability of "active" of 0.10. The results are summarized in Table 2.

The columns of Table 2 denoted "Prior x likelihood" contain the relative frequencies of each classification outcome for each test; the misclassification rates are summarized in the two "Proportion misclassified" columns. The total misclassification rate for model 1 is 14%: 13.5% is misclassified as false positive and 0.5% as false negative. When model 1 positives are subjected to model 2, 6.2% test false negative and 2.9% test false positive, totaling 9.1% misclassified (individual test). When adjusted to reflect the proportion of the population predicted positive by model 1 (23%), the model 1 positives misclassified by model 2 represent 2.1% of the total population, with 1.4% and 0.7% testing false negative and false positive, respectively. Summing the misclassification rates for model 2 over both branches shows a total misclassification rate of 6%: 1.5% as false negatives and 4.5% as false positives. This represents a 57% decline in the total misclassification rate compared with using one test. Note that this example does not rely on the majority rule introduced above. Rather, the probabilities that a chemical is positive or negative, conditional on the results of the whole battery, are calculated explicitly. Therefore, this approach can be used for two-model batteries and, more generally, for batteries of any n > 1 tests.
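
The arithmetic behind these rates can be retraced with a small Python sketch. It assumes only the values stated in the example (prior 0.10; model 1 with sensitivity 0.95 and specificity 0.85; model 2 with sensitivity 0.85 and specificity 0.95); the helper `joint_table` and the variable names are hypothetical.

```python
def joint_table(prior, sens, spec):
    """Prior x likelihood entries for one test, keyed by (state, result)."""
    return {
        ("S+", "T+"): prior * sens,
        ("S+", "T-"): prior * (1 - sens),
        ("S-", "T+"): (1 - prior) * (1 - spec),
        ("S-", "T-"): (1 - prior) * spec,
    }


PRIOR = 0.10
MODEL_1 = (0.95, 0.85)  # (sensitivity, specificity)
MODEL_2 = (0.85, 0.95)

first = joint_table(PRIOR, *MODEL_1)
# False negatives plus false positives when model 1 is used alone.
one_test_error = first[("S+", "T-")] + first[("S-", "T+")]

false_negative = 0.0
false_positive = 0.0
for outcome in ("T+", "T-"):
    # Share of the population that received this result from model 1.
    branch_share = first[("S+", outcome)] + first[("S-", outcome)]
    # Posterior probability of S+ in this branch becomes the prior for model 2.
    branch_prior = first[("S+", outcome)] / branch_share
    second = joint_table(branch_prior, *MODEL_2)
    # Weight the within-branch error rates by the branch share.
    false_negative += branch_share * second[("S+", "T-")]
    false_positive += branch_share * second[("S-", "T+")]

two_test_error = false_negative + false_positive
relative_decline = 1 - two_test_error / one_test_error

if __name__ == "__main__":
    print(round(one_test_error, 3), round(two_test_error, 3), round(relative_decline, 2))
```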

Predictivity of dependent tests (Pet-Edwards et al. 1989). If the tests are dependent, a correction factor, expressed as conditional dependence, needs to be introduced:

$$K_{P}(r_{1}, r_{2}) = \frac{\Pr(r_{1}, r_{2} \mid P)}{\Pr(r_{1} \mid P)\,\Pr(r_{2} \mid P)} \tag{8}$$

and

$$K_{NP}(r_{1}, r_{2}) = \frac{\Pr(r_{1}, r_{2} \mid NP)}{\Pr(r_{1} \mid NP)\,\Pr(r_{2} \mid NP)} \tag{9}$$

and the batch formula for predictivity is preferred over a recursive one:

$$\Pr(P \mid r_{1}, r_{2}, \ldots, r_{k}) = \frac{1}{1 + \dfrac{\Pr(NP)\,K_{NP}(r_{1}, \ldots, r_{k})\,\Pr(r_{1} \mid NP) \cdots \Pr(r_{k} \mid NP)}{\Pr(P)\,K_{P}(r_{1}, \ldots, r_{k})\,\Pr(r_{1} \mid P) \cdots \Pr(r_{k} \mid P)}} \tag{10}$$

When $r_{i}$ is positive, $\Pr(r_{i} \mid P)$ is the sensitivity of test i and $\Pr(r_{i} \mid NP)$ is 1 - specificity of test i. Similarly, if $r_{i}$ is negative, $\Pr(r_{i} \mid P)$ is 1 - sensitivity of test i and $\Pr(r_{i} \mid NP)$ is the specificity of test i.
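
A minimal Python sketch of the batch formula follows. The names `result_likelihood` and `batch_predictivity` are hypothetical, and the dependence factors are passed in as plain numbers because estimating them requires joint test data that this sketch does not model. With both factors set to 1 the formula reduces to the independent case.

```python
import math


def result_likelihood(result_is_positive, sens, spec, property_present):
    """Pr(r_i | P) or Pr(r_i | NP) for a single test result."""
    if property_present:
        return sens if result_is_positive else 1 - sens
    return (1 - spec) if result_is_positive else spec


def batch_predictivity(prior, tests, k_p=1.0, k_np=1.0):
    """Equation 10 for tests given as (sensitivity, specificity, result_is_positive).

    k_p and k_np are the conditional dependence factors for the observed
    pattern of results; 1.0 corresponds to independent tests.
    """
    weight_p = prior * k_p * math.prod(
        result_likelihood(result, sens, spec, True) for sens, spec, result in tests
    )
    weight_np = (1 - prior) * k_np * math.prod(
        result_likelihood(result, sens, spec, False) for sens, spec, result in tests
    )
    # Assumes weight_p > 0, i.e., the results are possible for a positive chemical.
    return 1 / (1 + weight_np / weight_p)


if __name__ == "__main__":
    battery = [(0.95, 0.85, True), (0.85, 0.95, True)]
    print(round(batch_predictivity(0.10, battery), 3))
    # Hypothetical dependence among false positives lowers the predictivity.
    print(round(batch_predictivity(0.10, battery, k_np=2.0), 3))
```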

Discussion

The Need for Reliability Assessment of QSAR and Related Models

Executive summary reports of two recent QSAR projects in the European Union (Anonymous 1995, 1999) indicate that there are many "environmental" QSAR models available. It is clear that these models cover broad classes of chemicals for many of the environmental endpoints that are used in the risk assessment of existing and virtual chemicals. A primary conclusion of these two projects, however, was that if such models are used for prediction outside their applicability domains, very unreliable predictions may result (Anonymous 1995, 1999). Consequently, in these reports, the reader is frequently reminded about the necessity of clearly defining the boundaries of each model. These reports also point out that our predictive capabilities are limited because, for several classes of compounds or for very specific mechanisms of action, QSAR models are simply not available and progress in establishing such models is slow.

For both existing (published) and putative (still under development) QSAR models, it is important that reliability be assessed carefully and consistently. Reliability assessment procedures of QSARs must consider several aspects, for instance, quality of the underlying data, the chemical domain of the training set, predictivity estimates, and the work flow underpinning the QSAR. The predictions using any given QSAR model should be restricted to the chemicals that belong to the model domain. This emphasizes the importance of understanding the compositions of the intended prediction and validation sets. To have faith in model results, analysts must consider the model and the chemicals tested to determine if they are appropriately matched.

QSAR models based on the mechanism of action approach tend to rely on expert judgment to define the domain. QSAR models based on chemometric or statistical approaches tend to use similarity analysis tools, where the decision is based on a formally defined similarity of the chemicals in the prediction set to the chemicals in the training set. Similarity is measured as the multidimensional distance in the molecular descriptor space used as parameters of the evaluated QSAR model or by matching fragments. If the results of the similarity analysis indicate that the given QSAR model is applicable to the chemicals in the prediction set, then, and only then, should the statistical reliability be evaluated.

Furthermore, as described by Cronin et al. (2003a, 2003b), national and international validation centers have been established in the European Union and in the United States to validate alternative (nonanimal) methods. In this context, validation is seen as the process by which the relevance and reliability of a method for a particular purpose undergo independent assessment (Balls et al. 1995). Alternative methods include not only physicochemical and in vitro tests but also QSAR models and other computer-based systems for predicting toxicity. An alternative test based on physicochemical or in vitro data can be regarded as the combination of a test system that generates experimental data and a PM that provides an objective means of extrapolating the data to an expression of toxicity at the in vivo level (Worth and Balls 2001). Thus, a PM is analogous to a QSAR for an in vivo endpoint: the former is based on experimental physicochemical or in vitro data, whereas the latter is based on physicochemical descriptors. Criteria for the acceptability of PMs, which can also be applied to QSARs, are summarized below.

Acceptability Criteria

As should be evident from the discussion above, the specification of reasonable acceptability criteria for the use of QSARs in risk assessment is a multifaceted task. We try to deal with this task by grouping such criteria according to three uniting principles: a) basic modeling conditions, b) procedural steps, and c) reference values of performance parameters.

Basic QSAR-modeling conditions. Earlier we outlined basic modeling conditions for applicability of QSARs. Checking data for homogeneity and representativity is easily overlooked.

Homogeneity. Homogeneity means that the investigated series of compounds must have rather similar chemical and biological properties, and the mechanism of influence of X on Y must be the same. Sometimes the data set/database in question may contain many classes of compounds. These classes may be partially overlapping, barely separated, or completely resolved in the chemical descriptor (X-) space and/or biological property (Y-) space of the compounds in question. Because very strong clustering violates the assumption of homogeneity, we recommend that any QSAR modeling be commenced by studying how the compounds are clustered. PARC methods and cluster analysis techniques are ideally suited for this. For instance, a plot of the scores of the first few summary latent variables will rapidly reveal groups, trends, discontinuities, outliers, and other anomalies in the data.

Representativity. The composition of the training set and the prediction set is of crucial importance. A representative selection of compounds that well span the chemical domain of interest should be included in these sets. One way to accomplish a representative selection of compounds is through SMD (Wold et al. 1986). With this approach, test series of compounds are defined in which all major structural and chemical properties are systematically varied at the same time.

Taking into account properties of X- and Y-data. Earlier we discussed several aspects that relate to the nature and quality of the X- and the Y-data. It is of utmost importance that any knowledge about measurement noise be used in the model-building process. Any estimated "noise" in response data can be beneficially compared with the predictive power of the model. For example, if the known or expected noise is ±0.3 units for log(1/C), then the PRESD of Y should be of similar size. Also, if uncertainty estimates of many variables are available, this information can be used in the scaling of data.

Procedural steps. Before a PM/QSAR is recommended for regulatory use, it should be mandatory to carry out model validation. First, it is important to make clear what we mean by a valid model. We mean that it predicts much better than chance. In addition, it should have model coefficients that have the correct sign and with size that is proportional to their significance to the modeled process. Finally, it should be consistent with fundamental chemical, biological, and toxicologic knowledge.

To facilitate the handling of real-world data sets, a PM/QSAR should a) be associated with a defined endpoint that it serves to predict; b) take the form of an unambiguous and easily applicable algorithm for predicting a pharmacotoxicologic endpoint; c) ideally have a clear mechanistic basis; d) be accompanied by a definition of the domain of its applicability, for example, the physicochemical classes of chemicals for which it is applicable; e) be associated with a measure of its goodness of fit to a training set of data and of its internal goodness of prediction, estimated with cross-validation or a similar method; and f) be assessed in terms of its predictive power by using data that were not used in the development of the model (external validation).

In the framework of alternative tests, it is considered essential that the validation process be managed under the auspices of an organization such as the European Centre for the Validation of Alternative Methods (ECVAM) in the European Union that is independent of test method developers, who have vested interests in their own methods. Organizations such as ECVAM provide independent advice to the regulatory authorities, who have the responsibility for deciding on modifications to existing legislation (including the addition of new test methods). Additional criteria have also been developed that relate to the experimental protocols of alternative methods (Balls et al. 1995).

Reference values of performance parameters for continuous models. A third part in the compilation of acceptability criteria involves the specification of recommended values for model performance statistics such as [R.sup.2]Y. The values given below must be regarded as a rule of thumb, and it might be necessary, on a case-by-case basis, to reconsider these, taking into account the purpose of the model and the variability of the underlying X- and Y-data (which place limitations on the predictive capacity). However, it is important that the model parameter criteria be defined in advance of the experimental phase of the validation study by the management team of the study. This circumvents the possibility that the criteria could be weakened, with the improper aim of "successfully" validating the method.

Proposed reference values.

* [R.sup.2]Y: This limit is conditional on the [Q.sup.2]Y value.

* [Q.sup.2]Y: [Q.sup.2]Y > 0.5 is generally regarded as good, and [Q.sup.2]Y > 0.9 as excellent. These limits are highly application dependent.

* [R.sup.2]Y - [Q.sup.2]Y: This difference ought not to exceed 0.3. A substantially larger difference indicates a) an overfitted model (i.e., a model modeling noise); b) presence of irrelevant X-variables, or c) outliers in the data.

* "Background" [R.sup.2]Y and [Q.sup.2]Y: This consists of the intercepts of the regression lines of the response permutation testing. Results should be [R.sup.2]Y < 0.3 and [Q.sup.2]Y < 0.05 to indicate a valid model. These intercepts can be understood as indicating the level of "background" [R.sup.2]Y and [Q.sup.2]Y obtainable with random data.
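
The proposed reference values lend themselves to a simple checklist. The sketch below is a hypothetical helper (`check_reference_values` is not part of any QSAR package) that encodes the rule-of-thumb limits listed above; as the text notes, these limits are application dependent and should be fixed before validation starts.

```python
def check_reference_values(r2y, q2y, r2y_background=None, q2y_background=None):
    """Compare model statistics with the rule-of-thumb reference values.

    r2y_background and q2y_background are the intercepts obtained from
    response permutation testing; they are optional.
    """
    checks = {
        "q2y_good": q2y > 0.5,
        "q2y_excellent": q2y > 0.9,
        # A gap above 0.3 suggests overfitting, irrelevant X-variables, or outliers.
        "r2y_q2y_gap_ok": (r2y - q2y) <= 0.3,
    }
    if r2y_background is not None:
        checks["background_r2y_ok"] = r2y_background < 0.3
    if q2y_background is not None:
        checks["background_q2y_ok"] = q2y_background < 0.05
    return checks


if __name__ == "__main__":
    # Hypothetical model statistics, for illustration only.
    for name, passed in check_reference_values(0.85, 0.70, 0.20, -0.10).items():
        print(name, passed)
```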

Condition number. The condition number is defined as the ratio of the largest to the smallest singular value of the X-matrix. A ratio above 10 indicates that the X-variables are significantly correlated, and the user should refrain from using multiple linear regression (MLR). A ratio of 5:1 (compounds:X-variables) has been recommended to diminish the risk of multicollinearities among the X-variables.

SDEP/PRESD. The SDEP should be similar to the experimental variability of an endpoint; for example, if the known noise is ±0.3 units for log(1/C), then the SDEP (also called PRESD) of Y should be of similar size. Alternatively, if, for instance, the variability in the Y-data is 20% (in variance metric), then it seems unlikely that [R.sup.2]Y and [Q.sup.2]Y can exceed 80% (0.8).
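
For readers who want to compute these statistics, the following Python sketch uses one common set of definitions, stated here as assumptions: PRESS is the sum of squared differences between observed values and cross-validated predictions, SDEP is the square root of PRESS divided by the number of compounds, and [Q.sup.2]Y is 1 minus PRESS divided by the total sum of squares around the mean of Y. Variants exist (for example, in which mean is used), so the function names and formulas below are illustrative rather than definitive.

```python
import math


def press(observed, predicted):
    """Predictive residual sum of squares from cross-validated predictions."""
    return sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))


def sdep(observed, predicted):
    """Standard deviation of the error of prediction (also called PRESD)."""
    return math.sqrt(press(observed, predicted) / len(observed))


def q2(observed, predicted):
    """Cross-validated Q2, using the mean of the observed values as reference."""
    mean_y = sum(observed) / len(observed)
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    return 1 - press(observed, predicted) / ss_total


if __name__ == "__main__":
    # Hypothetical log(1/C) values and cross-validated predictions.
    y_observed = [1.0, 2.0, 3.0, 4.0]
    y_predicted = [1.1, 1.9, 3.2, 3.8]
    print(round(sdep(y_observed, y_predicted), 3), round(q2(y_observed, y_predicted), 3))
```

An SDEP obtained this way can then be set against the known experimental noise of the endpoint, as described above.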

Reference values of performance parameters for classification models. Similar considerations apply to the Cooper statistics that are often used to assess the predictive performance of two-group classification models; that is, fixed acceptability criteria for the sensitivity, specificity, and accuracy (concordance) cannot be defined for all types of classification models because the maximal predictive performance achievable will depend on the quality of the predictor and response data. Thus, the acceptance criteria need to be established on a case-by-case basis in advance of the experimental work conducted to test the classification model.

Furthermore, the criteria should take account of the purpose of the model. For example, if the model is intended to serve as a stand-alone test, that is, a complete replacement of an animal test, then, as a minimum requirement, the Cooper statistics should be significantly greater than 50% (for a two-group model) to ensure that the model is producing predictions that are significantly better than chance (Worth and Cronin 2001b). An example is provided by an ECVAM validation study in which classification models based on in vitro data were assessed for their capacity to predict skin corrosion potential (Fentem et al. 1998). In this study, one of the acceptability criteria was that the sensitivity of each test be greater than 70%.

However, if a classification model is being used in a battery of tests, for example, to identify toxic chemicals that act by a certain mechanism of action, the acceptance criteria are likely to be different. For example, classification models for predicting skin corrosion potential can also be based on pH data because chemicals that are acidic or alkaline in solution are expected to be corrosive (Worth and Cronin 2001b). However, not all corrosive chemicals exert their toxic action by a pH-dependent mechanism. Thus, a model based on pH data may detect only a small percentage of known corrosives in a given test set (i.e., the model could have a sensitivity less than 50%), but of those chemicals it does identify as corrosive, there is a high probability of actual corrosivity [i.e., the model would have a high positive predictive value, Pr(S+|P+)]. Tests with low sensitivity can generate high positive predictive values in only two ways: a) the test also has very high specificity, thus generating very few false positives regardless of the prevalence of S- items in the tested population; or b) a high S+ prevalence (thus low S- prevalence) produces few false positives because there are few negatives in the tested population. This kind of performance could be regarded as acceptable because it is understood that the pH model is not a standalone model but is used as one component in a battery of models. Such a model is being used to identify toxic chemicals with a high degree of certainty, but it makes no predictions about nontoxic chemicals (which would need to be identified by another test in the battery).
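
The contrast between low sensitivity and high positive predictive value can be made concrete with a short Python sketch. The function `cooper_statistics` is a hypothetical helper, and the confusion-matrix counts in the usage lines are invented for illustration; they are not data from the skin corrosion studies cited above.

```python
def cooper_statistics(true_pos, false_pos, true_neg, false_neg):
    """Cooper statistics and positive predictive value from a 2 x 2 table.

    Assumes every denominator is nonzero (at least one true positive chemical,
    one true negative chemical, and one positive prediction).
    """
    total = true_pos + false_pos + true_neg + false_neg
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
        "concordance": (true_pos + true_neg) / total,
        "positive_predictive_value": true_pos / (true_pos + false_pos),
    }


if __name__ == "__main__":
    # Hypothetical pH-type model: 20 corrosives (8 detected) and
    # 80 noncorrosives (1 wrongly flagged as corrosive).
    stats = cooper_statistics(true_pos=8, false_pos=1, true_neg=79, false_neg=12)
    for name, value in stats.items():
        print(name, round(value, 3))
```

With these invented counts the sensitivity is below 50%, yet almost nine of ten positive predictions are correct, which is the pattern described for a model used as one component of a battery.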

Concluding Remarks

To increase regulatory acceptance and use of QSARs, and to enhance confidence of QSAR predictions, it is necessary to develop guidance and acceptability criteria that are not only reliable but also easy to understand and apply. This has been the intention of this article. At first sight this may seem an overly ambitious goal, particularly because many of the criteria put forward are originally statistical in nature and may therefore have a discouraging effect on the user. However, because we believe so strongly in the future use of QSARs for chemicals management, we have tried to compose a basic set of acceptability criteria so "user-friendly" in nature that each and every QSAR analyst may benefit from them.

In summary, we emphasize the value of predictive QSAR models in future chemicals management, including priority setting, risk assessment, and classification and labeling. These models can be seen as simplifications (approximations) of complicated functional relationships that often prevail between chemical and biological properties of compounds. Provided that QSARs are applied with care and common sense and are developed by fulfilling the basic acceptability criteria outlined here, they constitute an important and powerful tool definitely deserving a slot in the risk assessor's toolbox.
Table 1. Bayesian revision of diagnostic test result.

                  Prior            Likelihood Pr([T.sub.j]|[S.sub.i])
[S.sub.i]         Pr([S.sub.i])    T+            T-

S+                p                sens          1 - sens
S-                1 - p            1 - spec      spec
Sums              1.0

                  Joint probabilities: prior x likelihood, Pr([S.sub.i]) x Pr([T.sub.j]|[S.sub.i])
[S.sub.i]         T+                                     T-

S+                p x sens                               p x (1 - sens)
S-                (1 - p) x (1 - spec)                   (1 - p) x spec
Sums              p x sens + (1 - p) x (1 - spec)        p x (1 - sens) + (1 - p) x spec

                  Posterior probabilities Pr([S.sub.i]|[P.sub.j])
[S.sub.i]         P+                                                            P-

S+                p x sens / [p x sens + (1 - p) x (1 - spec)]                  p x (1 - sens) / [p x (1 - sens) + (1 - p) x spec]
S-                (1 - p) x (1 - spec) / [p x sens + (1 - p) x (1 - spec)]      (1 - p) x spec / [p x (1 - sens) + (1 - p) x spec]
Sums              1.0                                                           1.0

Table 2. Hypothetical application of sequential screening tests and associated misclassification rates.

                          Prior   Likelihood      Prior x likelihood  Posterior                 Proportion misclassified
[S.sub.i]                         T+      T-      T+        T-        T+      T-      Category  Individual test  In population

Test 1
  S+                      0.100   0.950   0.050   0.095     0.005     0.413   0.006   FN        0.005            NA
  S-                      0.900   0.150   0.850   0.135     0.765     0.587   0.994   FP        0.135            NA
    Totals                1.000                   0.230     0.770     1.000   1.000   Total     0.140            NA
Test 2: T+ from test 1
  S+                      0.413   0.850   0.150   0.351     0.062     0.923   0.100   FN        0.062            0.014
  S-                      0.587   0.050   0.950   0.029     0.558     0.077   0.900   FP        0.029            0.007
    Totals                1.000                   0.380     0.620     1.000   1.000   Total     0.091            0.021
Test 2: T- from test 1
  S+                      0.006   0.850   0.150   0.006     0.001     0.100   0.001   FN        0.001            0.00075
  S-                      0.994   0.050   0.950   0.050     0.944     0.900   0.999   FP        0.050            0.03825
    Totals                1.000                   0.055     0.945     1.000   1.000   Total     0.051            0.039

Test 2 totals, in population: FN 0.015; FP 0.045; total 0.060.

Prior, Pr([S.sub.i]); likelihood, Pr([T.sub.j]|[S.sub.i]); prior x likelihood, Pr([S.sub.i]) x Pr([T.sub.j]|[S.sub.i]); posterior, Pr([S.sub.i]|[T.sub.j]). NA, not applicable. Misclassification categories: FN, false negative, T-|S+; FP, false positive, T+|S-; S+, positive for attribute; S-, negative for attribute; T-, test negative for attribute; T+, test positive for attribute.


REFERENCES

Albano C, Dunn WG III, Edlund U, Johansson E, Norden B, Sjostrom M, et al. 1978. Four levels of pattern recognition. Anal Chim Acta 103:429-443.

Andersson PM, Sjostrom M, Wold S, Lundstedt T. 2000. Comparison between physicochemical and calculated molecular descriptors. J Chemometr 14:629-642.

[Anonymous]. 1995. QSAR for prediction of fate and effects of chemicals in the environment. Environmental Technologies RTD Programme (DGXII/D-1). Contract EV5V-CT92-0211. Brussels: Commission of the European Union.

[Anonymous]. 1999. Fate and activity modeling of environmental pollutants using structure-activity relationships (FAME). Contract number ENV4-CT98-0221. Brussels: Environment and Climate Programme of the European Union.

Atkinson AC. 1985. Plots, Transformations and Regression. Oxford:Clarendon Press.

Balls M, Blaauboer BJ, Fentem JH, Bruner L, Combes Combes may refer to:
  • Combes, Texas
  • Émile Combes, French statesman and one of the originator's of the concept of Separation of Church and State
  • Laura Combes, a female bodybuilder
  • Combes, a commune of the Hérault département, in France
 RD, Ekwall B, et al. 1995. Practical aspects of the validation of toxicity test procedures. The report and recommendations of ECVAM workshop 5. Altern Lab Anim 23:129-147.

Baroni M, Costatino G, Cruciani G, Riganelli D, Valigi R, Clementi S. 1993. Generating optimal linear PLS estimations (GOLPE): an advanced chemometric tool for handling 3D-QSAR problems. Quant Quant

A person with numerical and computer skills who carries out quantitative analyses of companies.


quant

A person who has strong skills in mathematics, engineering, or computer science, and who applies those skills to the securities
 Struct-Act Rel 12:9-20.

Berglund A, De Rosa De Rosa may refer to:
  • De Rosa (band), a band from Scotland
  • De Rosa (bicycles), a bicycle manufacturing company.
People with the name De Rosa include:
  • Alberto Fernández de Rosa, an Argentine actor
 MC, Wold S. 1997. Alignment of flexible molecules at their receptor site using 3D descriptors and Hi-PCA. J Comput Aid Mol Des 11:601-612.

Blum DJW DJW Dank Je Wel (Dutch) , Speece RE. 1990. Determining chemical toxicity to aquatic species. Environ Sci Technol 24:284-293.

Box GEP GEP

gastroenteropancreatic.
, Hunter WG, Hunter JS. 1978. Statistics for Experimenter. New York New York, state, United States
New York, Middle Atlantic state of the United States. It is bordered by Vermont, Massachusetts, Connecticut, and the Atlantic Ocean (E), New Jersey and Pennsylvania (S), Lakes Erie and Ontario and the Canadian province of
: Wiley.

Burden FR, Rosewarne BR, Winkler Winkler may refer to:
  • Winkler, Manitoba, a Canadian city
  • Winkler (novel), by Giles Coren
  • Winkler (crater), a crater on the Moon
  • Winkler (surname), people with the surname Winkler or Winckler
See also
 DA. 1997. Predicting maximum bioactivity bi·o·ac·tiv·i·ty
n.
The effect of a given agent, such as a vaccine, upon a living organism or on living tissue.
 by effective inversion inversion /in·ver·sion/ (in-ver´zhun)
1. a turning inward, inside out, or other reversal of the normal relation of a part.

2. a term used by Freud for homosexuality.

3.
 of neural networks using genetic algorithms. Chemomr Intell Lab 38:127-137.

Burnham AJ, MacGregor JF, Viveros R. 1999. A statistical framework for multivariate latent variable regression methods based on maximum likelihood. J Chemomr 13: 49-65.

--. 2001. Interpretation of regression coefficients under a latent variable regression model. J Chemomr 15:265-284.

Burnham AJ, Viveros R, MacGregor JF. 1996. Frameworks for latent variable multivariate regression. J Chemomr 10:31-45.

Cooper JA, Saracci R, Cole P. 1979. Describing the validity of carcinogen carcinogen: see cancer.
carcinogen

Agent that can cause cancer. Exposure to one or more carcinogens, including certain chemicals, radiation, and certain viruses, can initiate cancer under conditions not completely understood.
 screening tests. Br J Cancer 39:87-89.

Cronin MTD MTD Mounted
MTD Maximum Tolerated Dose
MTD Memory Technology Device
MTD Month To-Date
MTD Methadone (drug screening)
MTD motion to dismiss (legal)
MtD Mountain Dew
MTD Memory Technology Driver
, Jaworska J, Walker JD, Comber comb·er  
n.
1. One, such as a machine or a worker, that combs something, such as wool.

2. A long wave that has reached its peak or broken into foam; a breaker.
 M, Watts CD, Worth AP. 2003b. Use of QSARs in international decision-making frameworks to predict ecologic effects and environmental fate of chemical substances. Environ Health Perspect 111:1391-1401.

Cronin MTD, Schultz TW. 2003. Pitfalls in QSAR. J Mol Struct 633:39-51.

Cronin MTD, Sinks CD, Schultz TW. 2000. Modeling of toxicity to the ciliate ciliate /cil·i·ate/ (sil´e-at)
1. having cilia.

2. any individual of the Ciliophora.


cil·i·ate
n.
Any of various protozoans of the class Ciliata.

adj.
 tetrahymena pyriformis: the aliphatic aliphatic /al·i·phat·ic/ (al?i-fat´ik) pertaining to any member of one of the two major groups of organic compounds, those with a straight or branched chain structure.

al·i·phat·ic
adj.
 carbonyl carbonyl /car·bon·yl/ (kahr´bah-nil) the bivalent organic radical, C:O, characteristic of aldehydes, ketones, carboxylic acid, and esters.

car·bon·yl
n.
The bivalent radical CO.
 domain. In: Forecasting the Environmental Fate and Effects of Chemicals (Rainbow PS, Hopkins SP, Crane M, eds). Chichester, UK:Wiley, 113-124.

Cronin MTD, Walker JD, Jaworska J, Comber M, Watts CD, Worth AP. 2003a. Use of QSAR relationships in international decision-making frameworks to predict health effects of chemical substances. Environ Health Perspect 111:1376-1390. Cruciani G, Watson KA. 1994. Comparative molecular field analysis using GRID force-field and GOLPE variable selection methods in a study of inhibitors of glycogen phosphorylase glycogen phosphorylase /gly·co·gen phos·phor·y·lase/ (gli´ko-jen fos-for´i-las) see phosphorylase.  b. J Med Chem 37:2589-2601.

Davis L. 1991. Handbook of Genetic Algorithms. New York: Van Nostrand Reinhold.

Deneer JW, Sinnige TL, Seinen W, Hermens JLM JLM Jesus Loves Me
JLM Just Like Me
JLM Junior League of Memphis
JLM Junior League of Minneapolis
JLM Junior League of Mobile
JLM Junior League of Madison
JLM Junior League of Montgomery
JLM Junior League of Miami, Inc.
JLM Junior League of McAllen, Inc.
. 1987. Quantitative structure-activity relationships for the toxicity and bioconcentration factor Bioconcentration factor is the concentration of a particular chemical in a tissue per concentration of chemical in water (reported as L/kg). This physical property characterizes the accumulation of pollutants through chemical partitioning from the aqueous phase into an organic  of nitrobenzene derivatives towards the guppy. Aquat Toxicol 10:115-129.

Deneer JW, van Leeuwen CJ, Seinen W, Maas-Diepeveen JL, Hermens JLM. 1989. QSAR study of the toxicity to nitrobenzene derivatives towards Daphnia magna, Chlorella pyrenoidosa Chlorella pyrenoidosa is a species of fresh water green algae. It has been used medicinally as a chelatory agent, for example to extract mercury from the body. However there is no scientific proof of this.  and Photobacterium phosphoreum. Aquat Toxicol 15:83-97.

Denham MC. 1997. Prediction intervals This article or section may be confusing or unclear for some readers.
Please [improve the article] or discuss this issue on the talk page.
 in partial least squares. J Chemomr 11:39-52.

Devillers J. 1996. Genetic algorithms in computer-aided molecular design. In: Genetic Algorithms in Molecular Modeling (Devillers J, ed). London: Academic Press, 131-157.

Devillers J, Balaban AT. 1999. Topological to·pol·o·gy  
n. pl. to·pol·o·gies
1. Topographic study of a given place, especially the history of a region as indicated by its topography.

2.
 Indices and Related Descriptors in QSAR and QSPR QSPR Quantitative Structure-Property Relationship
QSPR Quarterly Statistical Performance Report
. Amsterdam: Gordon Breach Scientific Publishers, 811.

Draper NR, Smith H. 1981. Applied Regression Analysis. New York: Wiley and Sons.

Dunn WJ III. 1989. Quantitative structural-activity relationships. Chemomr Intell Lab 6:181-189.

Efron B. 1982. The Jackknife jack·knife  
n.
1. A large clasp knife.

2. Sports A dive in the pike position, in which the diver straightens out to enter the water hands first.

v.
, the Bootstrap and Other Resampling Planes. Philadelphia: Society for Industrial and Applied Mathematics
For the country formerly called Siam see Thailand


The Society for Industrial and Applied Mathematics (SIAM) was founded by a small group of mathematicians from academia and industry who met in Philadelphia in 1951 to start an organization
.

Efron B, Gong G. 1983. A leisurely look at the bootstrap, the jackknife, and cross-validation. Am Stat 37:36-48.

Efron B, Tibshirani RJ. 1993. An Introduction to the Bootstrap. London: Chapman & Hall.

Eriksson L, Hermens JLM, Johansson E, Verhaar HJM HJM Heath-Jarrow-Morton (model) , Wold S. 1995. Multivariate analysis of aquatic toxicity data with PLS. Aquat Sci 57:217-241.

Eriksson L, Johansson E. 1998. Multivariate design and modeling in QSAR. Chemomr Intell Lab 34:1-19.

Eriksson L, Johansson E, Kettaneh-Wold N, Wikstrom C, Wold S. 200Ob. Design of Experiments--Principles and Applications. Umea, Sweden: Umetrics AB.

Eriksson L, Johansson E, Kettaneh-Wold N, Wold S. 2001. Multi-and Megavariate Data Analysis--Principles and Applications. Umea, Sweden: Umetrics AB.

Eriksson L, Johansson E, Lindgren F, Sjostrom M, Wold S. 2002. Megavariate analysis of hierarchical QSAR data. J Comp Aid Mol Des 16:711-726.

Eriksson L, Johansson E, Lindgren F, Wold S. 2000. GIFI-PLS: modeling of non-linearities and discontinuities in QSAR. Quant Struct-Act Rel 19:345-355.

Eriksson L, Johansson E, Muller M, Wold S. 2000a. On the selection of training set in environmental QSAR when compounds are clustered. J Chemomr 14:599-616.

Eriksson L, Johansson E, Wold S. 1997. QSAR model validation. In: Quantitative Structure-Activity Relationships in Environmental Sciences--VII (Chen F, Schuurmann G, eds). Proceedings of the 7th International Workshop on QSAR in Environmental Sciences, 24-28 June 1996, Elsinore, Denmark. Pensacola, FL: SETAC SETAC Society of Environmental Toxicology And Chemistry
SETAC Systems Engineering & Technical Assistance Contract
SETAC Shipboard Electronic Thermoacoustic Chiller
SETAC Shipboard Electronics Thermo-Acoustic Cooler
SETAC Shipboard Electronics Thermoacoustic Chiller
 Press, 381-397.

Feinstein AR. 1975. Clinical biostatistics biostatistics /bio·sta·tis·tics/ (-stah-tis´tiks) biometry.

bi·o·sta·tis·tics
n.
The science of statistics applied to the analysis of biological or medical data.
 XXXI. On the sensitivity, specificity, and discrimination of diagnostic tests. Clin Pharmacol Ther 17:104-116.

Fentem JH, Archer GEB Geb
 or Keb

In ancient Egyptian religion, the god of the earth and the physical support of the world. Geb and his sister Nut belonged to the second generation of deities at Heliopolis.
, Balls M, Botham PA, Curren RD, Earl LK, et al. 1996. The ECVAM international validation study on in vitro tests for skin corrosivity. 2. Results and evaluation by the management team. Toxicol In Vitro 12:403-524.

Giraud E, Luttman C, Lavelle F, Riou JF, Mailliet P, Laoui A. 2000. Multivariate data analysis using D-optimal designs, PLS and RSM RSM (in Britain) regimental sergeant major . A directional approach for the analysis of farnesyl-transferase inhibitors. J Med Chem 43:1807-1816.

Golbraikh A, Tropsha A. 2002. Beware of q2! J Mol Graph Model 20:269-276.

Goldberg D.E. 1989. Genetic Algorithms in Search, Optimization & Machine Learning. New York: Addison-Wesley.

Gombar V. 1996. U.S. Patent 6 036 349. Method and apparatus for validation of model-based predictions.

Gramatica P, Consonni V, Todeschini R. 1999. QSAR study of the tropospheric degradation of organic compounds. Chemosphere chemosphere: see atmosphere.  38:1371-1378.

Gramatica P, Corradi M, Consonni V. 2000. Modeling and prediction of soil sorption sorption /sorp·tion/ (sorp´shun) the process or state of being sorbed; absorption or adsorption.

sorp·tion
n.
Adsorption or absorption.
 coefficients of non-ionic organic pesticides by different sets of molecular descriptors. Chemosphere 41:763-777.

Gramatica P, Navas N, Todeschini R. 1998. 3D-modelling and prediction by WHIM descriptors. Part 9. Chromatographic chro·mat·o·graph  
n.
An instrument that produces a chromatogram.

tr.v. chro·mat·o·graphed, chro·mat·o·graph·ing, chro·mat·o·graphs
To separate and analyze by chromatography.
 relative retention time and physico-chemical properties of polychlorinated biphenyls polychlorinated biphenyls, (pol´ēklôr´nā´tid bīfē´n  (PCBs). Chemomr Intell Lab 40:53-63.

Gramatica P, Papa E. 2003. QSAR modeling of bioconcentration factor by theoretical molecular descriptors. Quant Struct-Act Rel 22:374-385.

Hibbert DB. 1993. Genetic algorithms in chemistry. Chemomr Intell Lab 19:277-293.

Hong H, Tong W, Fang H, Shi L, Qian X, Wu J, et al. 2002. Prediction of estrogen receptor estrogen receptor A protein of a superfamily of nuclear receptors for small hydrophilic ligands–eg, steroid hormones, thyroid hormone, vitamin D, retinoids; the presence of ERs in breast CA generally is associated with a better prognosis, as they respond to  binding for 58,030 chemicals using an integrated computational approach. Environ Health Perspect 110:29-36.

Hoskuldsson A. 1996. Prediction Methods in Science and Technology. Copenhagen: Thor Publishing.

Jackson JE 1991. A User's Guide to Principal Components. New York: John Wiley John Wiley may refer to:
  • John Wiley & Sons, publishing company
  • John C. Wiley, American ambassador
  • John D. Wiley, Chancellor of the University of Wisconsin-Madison
  • John M. Wiley (1846–1912), U.S.
.

Jongman RGH RGH Rochester General Hospital (New York)
RGH Rawalpindi General Hospital (Rawalpindi, Pakistan) 
, ter Braak CJF CJF Council of Jewish Federations
CJF Coherent Joint Fires
CJF Channel Journal File
CJF Clearjet Filter
CJF Central Java Fault
CJF Client J Framework
CJF Calculation Job File
, van Tongeren OFR Ofr Oberfranken (German)
OFR Operating and Financial Review
OFR Office of the Federal Register (US NARA)
OFr Old French (linguistics)
OFR Optics for Research
. 1987. Data Analysis in Community and Landscape Ecology Landscape ecology

The study of the distribution and abundance of elements within landscapes, the origins of these elements, and their impacts on organisms and processes.
. Wageningen, the Netherlands: Pudoc.

Jouan-Rimbaud D, Massart DL, de Noord OE. 1996. Random correlation in variable selection for multivariate calibration with a genetic algorithm. Chemomr Intell Lab 35:213-220.

Jurs PC. 2002. ADAPT--Automated Data Analysis and Pattern Recognition Toolkit. University Park, PA: Pennsylvania State University Pennsylvania State University, main campus at University Park, State College; land-grant and state supported; coeducational; chartered 1855, opened 1859 as Farmers' High School. . Available: http://research.chem.psu.edu/pcjgroup/ ADAPT.html [accessed 23 April 2002].

Karelson M. 2000. Molecular Descriptors in QSAR/QSPR. New York: Wiley-InterScience.

Katritzky AR, Lobanov VS, Karelson M. 1994. CODESSA, Reference Manual. Gainesville, FLUniversity of Florida. Available: http://www.semichem.com/codessa refs.html [accessed 19 April 2002].

Kubinyi H. 1994a. Variable selection in QSAR studies. I. An evolutionary algorithm. Quant Struct-Act Rel 13:285-294.

--. 1994b. Variable selection in QSAR Studies. II. A highly efficient combination of systematic search and evolution. Quant Struct-Act Rel 13:393-401.

--. 1996. Evolutionary variable selection in regression and PLS analyses. J Chemomr 10:119-133.

Kulkarni A, Hopfinger AJ, Osborne R, Bruner LH, Thompson ED Thompson, city, Canada
Thompson, city (1991 pop. 14,977), central Man., Canada, on the Burntwood River. A mining town, it developed after large nickel deposits were discovered in the area in 1956.
. 2001. Prediction of eye irritation of organic chemicals using membrane-interaction QSAR analysis. Toxicol Sci 59:335-345.

Langer T. 1994. Molecular similarity determination of heteroaromatics using CoMFA and multivariate data analysis. Quant Struct-Act Rel 13:402 405.

Leardi R. 1994. Application of a genetic algorithm to feature selection under full validation conditions and to outlier outlier /out·li·er/ (out´li-er) an observation so distant from the central mass of the data that it noticeably influences results.

outlier

an extremely high or low value lying beyond the range of the bulk of the data.
 detection. J Chemomr 8:65-79.

Leardi R, Boggia R, Terrile M. 1992. Genetic algorithms as a strategy for feature selection. J Chemomr 6:267-281.

Lindgren F. 1994. Third Generation PLS--Some Elements and Applications [PhD Thesis]. Umea, Sweden: Umea University.

Linusson A, Gottfries J, Lindgren F, Wold S. 2000. Statistical molecular design of building blocks for combinatorial chemistry Combinatorial chemistry involves the rapid synthesis or the computer simulation of a large number of different but structurally related molecules. Introduction
Synthesis of molecules in a combinatorial fashion can quickly lead to large numbers of molecules.
. J Med Chem 43:1320-1328.

Livingstone DJ. 2000. The characterization of chemical structures using molecular properties. A survey. J Chem Inf Comput Sci 40:195-209.

Marengo E, Todeschini, R. 1992. A new algorithm for optimal, distance--based experimental design. Chemomr Intell Lab 16:37-44.

Martens H, Martens M. 2000. Modified jack-knife estimation of parameter uncertainty in bilinear bi·lin·e·ar  
adj.
Linear with respect to each of two variables or positions. Used of functions or equations.

Adj. 1. bilinear - linear with respect to each of two variables or positions
 modeling (PLSR PLSR Partial Least-Squares Regression
PLSR Precision Landing System Receiver
PLSR Private Local SONET Ring
). Food Qual Prefer 11:5-16.

Martin YC, Lin CT, Hetti C, DeLazzer J. 1995. PLS analysis of distance matrices to detect nonlinear relationships between biological potency and molecular properties. J Med Chem 38:3009-3015.

McDowell RM, Jaworska J. 2002. Bayesian analysis Bayesian analysis A decision-making analysis that '…permits the calculation of the probability that one treatment is superior based on the observed data and prior beliefs…subjectivity of beliefs is not a liability, but rather explicitly allows  and inference of QSAR predictive model results. SAR (Segmentation And Reassembly) The protocol that converts data to cells for transmission over an ATM network. It is the lower part of the ATM Adaption Layer (AAL), which is responsible for the entire operation. See AAL.

SAR - segmentation and reassembly
 QSAR Environ Res 13:111-125.

Mekenyan O, Bonchev D. 1986. OASIS method for predicting biological activity of chemical compounds. Acta Pharm Jugosl 36:225-237.

Mullet GM. 1976. Why regression coefficients have the wrong sign. J Qual Technol 8:121-126.

Nendza M, Muller M. 2000. Discriminating toxicant toxicant /tox·i·cant/ (tok´si-kant)
1. poisonous.

2. poison.


tox·i·cant
n.
1. A poison or poisonous agent.

2. An intoxicant.

adj.
 classes by mode of action: 2. Physico-chemical descriptors. Quant Struct-Act Rel 19:581-598.

OECD. 1998. Harmonized har·mo·nize  
v. har·mo·nized, har·mo·niz·ing, har·mo·niz·es

v.tr.
1. To bring or come into agreement or harmony. See Synonyms at agree.

2. Music To provide harmony for (a melody).
 Integrated Hazard Classification System for Human Health and Environmental Effects of Chemical Substances. Paris: Organisation for Economic Cooperation and Development.

Pet-Edwards J, Haimes Y, Chankong V, Rosenkranz H, Ennever F. 1989. Risk Assessment and Decision Making Using Tests Results--The Carcinogenicity Prediction and Battery Selection Approach. New York: Plenum In a building, the space between the real ceiling and the dropped ceiling, which is often used as an air duct for heating and air conditioning. It is also filled with electrical, telephone and network wires. See plenum cable.  Press, 211.

Rogers D, Hopfinger AJ. 1994. Application of genetic function approximation The need for function approximations arises in many branches of applied mathematics, and computer science in particular. In general, a function approximation problem asks us to select a function among a well-defined class that closely matches ("approximates") a target function in a  to quantitative structure-activity relationships and quantitative structure-property relationships. J Chem Inf Comput Sci 34:854-866.

Shag J. 1993. Linear model selection by cross-validation. J Am Stat Assoc 88:486-494.

Shi LM, Tong W, Fang H, Perkins R, Wu J, Tu M, et al. 2002. An integrated "four-phase" approach for priority setting of endocrine endocrine /en·do·crine/ (en´do-krin, en´do-krin)
1. secreting internally.

2. pertaining to internal secretions; hormonal. See also under system.


en·do·crine
adj.
 disruptors--phase I and II for prediction of potential estrogenic endocrine disruptor. SAR/QSAR Environ Res 13(1):69-88.

Sjoblom J, Svensson O, Josefson M, Kullberg H, Wold S. 1998. An evaluation of orthogonal signal correction applied to calibration transfer of near infrared spectra. Chemomr Intell Lab 44: 229-244.

Sjostrom M, Lindgren A, Uppgard L. 1997. Joint multivariate quantitative structure-property and structure-activity relationships for a series of technical nonionic surfactants. In: Quantitative Structure-Activity Relationships in Environmental Sciences--VII (Schuurmann G, Chen F, eds). Proceedings of the 7th International Workshop on QSAR in Environmental Sciences, 24-28 June 1996, Elsinore, Denmark. Pensacola, FL:SETAC Press, 435-449.

Stuper AJ, Jurs PC. 1976. ADAPT: A computer system for automated data analysis using pattern recognition techniques. J Chem Inf Comput Sci 16:99-105.

Todeschini R. 1997. MOBY (jargon) moby - /moh'bee/ (From MIT, seems to have been in use among model railroad fans years ago. Derived from Melville's "Moby Dick", some say from "Moby Pickle") 1. Large, immense, complex, impressive. "A Saturn V rocket is a truly moby frob.  DIGS--Software for Multilinear Regression Analysis and Variable Subset Selection by Genetic Algorithm. Release 2.1 for Windows. Milan: Talete Srl.

Todeschini R, Consonni V. 2000. Handbook of Molecular Descriptors. Weinheim: Wiley-VCH.

Todeschini R, Consonni V, Pavan pa·vane also pa·van  
n.
1. A slow, stately court dance of the 16th and 17th centuries, usually in duple meter.

2. A piece of music for this dance.
 M. 2001. DRAGON--Software for the Calculation of Molecular Descriptors. Release 1.12 for Windows. Available: http://www.disat.unimib/chm [accessed 25 March 2002].

Todeschini R, Gramatica P. 1997. 3D-modelling and prediction by WHIM descriptors. Part 6. Applications of WHIM descriptors in QSAR studies. Quant Struct-Act Rel 16:120-125.

Todeschini R, Maiocchi A, Consonni V. 1999. The K correlation index: theory development and its application in chemometrics. Chemomr Intell Lab 46:13-29.

Tong W, Perkins R, Fang H, Hong H, Xie Q, Branham W, et al. 2002. Development of quantitative structure-activity relationships (QSARs) and their use for priority setting in testing strategy of endocrine disruptors. Regul Res Perspect 1(3):1-16.

Topliss JG, Edwards RP. 1979. Chance factors in studies of quantitative structure-activity relationships. J Med them 22:1238-1244.

Tosato ML Piazza R, Chiorboli C, Passerini L, Ping A, Cruciani G, et al. 1992. Application of chemometrics to the screening of hazardous chemicals. Chemomr Intell Lab 16:155-167.

Tropsha A, Gramatica P, Gombar VJ. 2003. The importance of being earnest: validation is the absolute essential for successful application and interpretation of QSPR models. Quant Struct-Act Rel 22:69-77.

Tysklind M, Andersson P, Haglund P, Van Bavel B, Rappe C. 1995. Selection of polychlorinated biphenyls for use in quantitative structure-activity relationships. SAR QSAR Environ Res 4:11-19.

Van der Voet H. 1994. Comparing the predictive accuracy of models using a simple randomization randomization (ranˈ·d·m  test. Chemomr Intell Lab 25:313-323.

Vankeerberghen P, Smeyers-Verbeke J, Leardi R, Karr CL, Massart DL. 1995. Robust regression and outlier detection for non-linear models using genetic algorithms. Chemomr Intell Lab 28:73-87.

Verhaar HJM, Eriksson L, Sjostrom M, Schuurmann G, Seinen W, Hermens JLM. 1994. Modeling the toxicity of organophosphates: a comparison of the multiple linear regression and PLS regression methods. Quant Struct-Act Rel 13:133-143.

Wahlstrom B. 1988. The need for new strategies--the OECD existing chemicals program. In: The Use of QSAR for Chemicals Screening--Limitations and Possibilities. Report 8/1988. Stockholm: National Chemicals Inspectorate in·spec·tor·ate  
n.
1. The office or duties of an inspector.

2. A staff of inspectors.

3. An inspector's district.


inspectorate
Noun

1.
, 1-17.

Wehrens R, Buydens, LMC LMC Large Magellanic Cloud (also see SMC)
LMC Library Media Center
LMC Lees-McRae College (Banner Elk, NC)
LMC Lutheran Medical Center
LMC League of Minnesota Cities
LMC Local Medical Committee
. 1996. Evolutionary optimization: a tutorial. Trends Analyt Chem 17:193-203.

Wehrens R, Putter H, Buydens LMC. 2000. The bootstrap: a tutorial. Chemomr Intell Lab 54:35-52.

Wold H. 1982. Soft modeling. The basic design and some extensions. In: Systems under Indirect Observation, Vols I, II (Joreskog KG, Wold H, eds). Amsterdam: North-Holland, 18-34.

Wold S. 1992. Nonlinear partial least squares modeling. II. Spline In computer graphics, a smooth curve that runs through a series of given points. The term is often used to refer to any curve, because long before computers, a spline was a flat, pliable strip of wood or metal that was bent into a desired shape for drawing curves on paper. See Bezier and B-spline.  inner relation. Chemomr Intell Lab 14:71-84.

Wold S, Albano C, Dunn WJ III, Esbensen K, Hellberg S, Johansson E, et al. 1983. Pattern recognition: finding and using regularities in multivariate data. In: Food Research and Data Analysis (Martens H, Russwurm H, eds). Essex, UK: Applied Science Publishers, 147-188.

Wold S, Dunn WJ III. 1983. Multivariate quantitative structure-activity relationships (QSAR): conditions for their applicability. J Chem Inf Comput Sci 23:6-13.

Wold S, Johansson E, Cocchi M 1993. PLS--partial least squares projections to latent structures. In: 3D-QSAR in Drug Design, Theory, Methods, and Applications (Kubinyi H, ed). Leiden: ESCOM ESCOM European Society for the Cognitive Sciences of Music
ESCOM Electricity Supply Commission (South Africa)
ESCOM Electricity Supply Corporation of Malawi
ESCOM Enterprise Systems Connectivity (IBM) 
 Science Publishers, 523-550.

Wold S, Josefson M. 2000. Multivariate calibration of analytical data. In: Encyclopedia of Analytical Chemistry The Encyclopedia of Analytical Chemistry is an English-language multivolume encyclopedia published by John Wiley & Sons Ltd.

It is a comprehensive analytical chemistry reference, covering all aspects from theory and instrumentation through applications and techniques.
 (Meyers RA, ed). Chichester, UK: John Wiley & Sons, 9710-9736.

Wold S, Sjostrom M, Carlson R, Lundstedt T, Hellberg S, Skagerberg B, et al. 1986. Multivariate design. Anal Chem Acta 191:17-32.

Worth AP, Balls M. 2001. The importance of the prediction model in the development and validation of alternative tests. Altern Lab Anim 29:135-143.

Worth AP, Cronin MTD. 2000. Embedded Inserted into. See embedded system.  cluster modeling: a novel QSAR method for generating elliptic models of biological activity, in: Progress in the Reduction, Refinement and Replacement of Animal Experimentation (Balls M, van Zeller A-M A-M Alternating Maximization (algorithm) , Halder ME, eds). Amsterdam: Elsevier Science, 479-491.

--. 2001a. The use of bootstrap resampling to assess the uncertainty of Cooper statistics. Altern Lab Anim 29: 447-459.

--. 2001b. The use of pH measurements to predict the potential of chemicals to cause acute dermal dermal /der·mal/ (der´mal) pertaining to the dermis or to the skin.

der·mal or der·mic
adj.
Of or relating to the skin or dermis.
 and ocular ocular /oc·u·lar/ (ok´u-lar)
1. of, pertaining to, or affecting the eye.

2. eyepiece.


oc·u·lar
adj.
1. Of or relating to the eye or the sense of sight.
 toxicity. Toxicology toxicology, study of poisons, or toxins, from the standpoint of detection, isolation, identification, and determination of their effects on the human body. Toxicology may be considered the branch of pharmacology devoted to the study of the poisonous effects of drugs.  169:119-131.

--. In press. The use of discriminant analysis, logistic regression and classification tree analysis in the devlopment of classification models for human health effects. J Mol Struct.

Xu L, Zhang WJ. 2001. Comparison of different methods for variable selection. Anal Chim Acta 446:477-483.

Lennart Eriksson, (1) Joanna Jaworska, (2) Andrew P. Worth, (3) Mark T.D. Cronin, (4) Robert M. McDowell, (5) and Paola Gramatica (6)

(1) Umetrics, Umea, Sweden; (2) Procter & Gamble Eurocor, Central Product Safety, Strombeek-Bever, Belgium; (3) European Chemicals Bureau, Institute for Health & Consumer Protection, Joint Research Centre, European Commission, Ispra, Italy; (4) School of Pharmacy and Chemistry, Liverpool John Moores University, Liverpool, United Kingdom; (5) U.S. Department of Agriculture, Animal and Plant Health Inspection Service, Risk Analysis Systems, Riverdale, Maryland, USA; (6) QSAR and Environmental Chemistry Research Unit, Department of Structural and Functional Biology, Insubria University, Varese, Italy

This article is part of the mini-monograph "Regulatory Acceptance of (Q)SARs for Human Health and Environmental Endpoints."

Address correspondence to L. Eriksson, Umetrics AB, POB 7960, 907 19 Umea, Sweden. Telephone: 46-90-184852. Fax: 46-90-184899. E-mail: lennart.eriksson@umetrics.com

The authors declare they have no conflict of interest.

Received 2 May 2002; accepted 3 February 2003.
COPYRIGHT 2003 National Institute of Environmental Health Sciences
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Mini-Monograph
Author:McDowell, Robert M.
Publication:Environmental Health Perspectives
Date:Aug 1, 2003
Words:17929