
QSAR Study of Anthra[1,9-cd]pyrazol-6(2H)-one Derivatives as Potential Anticancer Agents Using Statistical Methods.

1. Introduction

The heterocycles and their derivatives constitute a class of cyclic compounds in which one or more carbon atoms of a reference carbocycle (e.g., cyclohexane, benzene, cyclopentane, and cyclopentadiene) are replaced by a heteroatom. The rapid development of heterochemistry comes from the study of living organisms (several bioactive heterocyclic compounds are extracted from animal and plant organisms).

Heterocyclic compounds find wide practical application in animal and human medicine (various drugs), in improving crops in agriculture (herbicides, fungicides, and insecticides), or are used as detergents, dyes, and explosives. They are also present in polymers, semiconductors, and photovoltaic cells [1-4].

The chemistry of heterocycles is a very broad field, given the number of heterocyclic compounds listed, which continues to expand. Among the different classes of heterocyclic compounds, mainly nitrogenous structures are present in many natural compounds of plant, animal, or synthetic origin. These structures are sometimes associated with each other, but in most cases they are linked to very diverse structural patterns. A number of hybrid compounds comprising mainly heterocycles containing nitrogen, sulfur, and/ or oxygen atoms have shown remarkable pharmacological activity [5-8].

Pyrazoles are chemical compounds of synthetic origin that have a five-membered heterocycle with two adjacent nitrogen atoms and three carbon atoms. This structure is particularly rare in nature. Several pyrazole derivatives have shown good pharmacological effects or potential biological activities, such as anti-inflammatory [9], antiviral [10], antimicrobial [11], anticonvulsant [12], antitumor [13], fungicidal [14], and antihistaminic [15] activities.

The pyrazole ring is a structural isomer of imidazole; the name pyrazole comes from the pyrrole ring to which a nitrogen atom was added ("azole"). The two nitrogen atoms have different properties: one behaves like the pyridine nitrogen and can be protonated in an acid medium; the other behaves like the pyrrole nitrogen, whose lone pair participates in the aromaticity of the ring [16].

Pyrazoles variously substituted with aromatic and heteroaromatic groups have many biological activities, making them particularly interesting [17].

With regard to anticancer activity, pyrazole derivatives act on a variety of pharmacological targets. Drug discovery is a long and complex process. It is recognized that, on average, for one molecule that comes onto the market as an innovative drug, 10,000 molecules are synthesized and tested. In addition, the development of a drug usually requires between 10 and 15 years of research. The aim is to find a molecule that both has particular therapeutic properties and has as few undesirable side effects as possible. The cost of a drug is mainly due to the long and expensive syntheses involved, many of which ultimately prove useless. The development of reliable computer tools coupled with the growth of computing power has enabled the implementation of molecular modeling techniques, which have become indispensable tools in the field of drug design. Among the techniques of chemoinformatics are QSAR techniques, which seek a correlation between the biological activity measured for a panel of compounds and a set of molecular descriptors. Quantitative structure-activity relationship (QSAR) methodology is an essential tool in medicinal chemistry [18, 19].

Two disciplines of "computational chemistry" have been developed in response to this need: quantitative structure-activity relationships (QSARs) and quantitative structure-property relationships (QSPRs). They essentially consist of the search for similarities between molecules in large databases of existing molecules whose properties are known [20, 21]. The discovery of such a relationship makes it possible to predict the physical, chemical, and biological properties of compounds, to develop new theories, or to understand the phenomena observed. Our main objective in this work was to develop a novel model for studying the relationship between the structure and the anticancer activity of pyrazole derivatives [22, 23].

To establish the relationship between the structural characteristics of a molecule and its properties, mathematical methods can be used. Multiple linear regression (MLR), partial least squares (PLS), multiple nonlinear regression (MNLR), and cross-validation analyses were applied to a series of pyrazole inhibitors in order to develop a QSAR model that reliably predicts anticancer activity.

2. Material and Methods

2.1. Experimental Data. In the present study, we chose 32 substituted anthra[1,9-cd]pyrazol-6(2H)-one derivatives whose anticancer activities were reported by Chen et al. [24]. For the QSAR study, the reported IC50 values were converted into PIC50 by taking the negative logarithm (PIC50 = -log10 IC50), and PIC50 was subsequently used as the dependent variable for the QSAR model development. Figure 1 represents the basic structure of the pyrazole, and Table 1 shows the studied substituents of the compounds and the corresponding experimental PIC50 activities.
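The conversion from IC50 to PIC50 can be sketched as follows. This is a minimal illustration, not the authors' script: the function name `pic50_from_ic50` is hypothetical, and the concentration unit is an assumption, since this section does not state the unit in which the IC50 values were expressed (the resulting PIC50 depends on it).

```python
import math


def pic50_from_ic50(ic50: float) -> float:
    """Return PIC50 = -log10(IC50).

    The unit of `ic50` is an assumption left to the caller (for example
    mol/L); changing the unit shifts every PIC50 by a constant.
    """
    if ic50 <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50)


# Hypothetical concentrations, not taken from Table 1.
for value in (1e-6, 0.1):
    print(value, pic50_from_ic50(value))
```

A lower IC50 (a more potent compound) gives a higher PIC50, which is why the negative sign matters for interpreting the regression coefficients.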

2.2. Validation of QSAR Models. The stability and robustness of the developed model are evaluated using the correlation coefficient ([R.sup.2]), the adjusted [R.sup.2], the mean squared error (MSE), the standard deviation (SD), and Fisher's F criterion. In addition, the choice of descriptors was supported by Student's t-test at a 95% confidence level. All the models were validated by cross-validation according to a leave-one-out (LOO) procedure, and a Y-randomization procedure was used to check that the results obtained are not due to chance. The model was also evaluated by external validation on data that are not part of the training set; the predictive power is then characterized by the correlation coefficient for the validation set ([R.sup.2] test) [22, 23, 25].
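As an illustration of these criteria, the sketch below computes [R.sup.2], adjusted [R.sup.2], MSE, and F from observed and predicted values. The function name `regression_stats` and the toy numbers are hypothetical. Two conventions are assumed here: the model contains an intercept, and MSE is the residual sum of squares divided by n (some packages divide by n - p - 1 instead).

```python
def regression_stats(y_obs, y_pred, n_descriptors):
    """Goodness-of-fit statistics for a linear model with an intercept.

    n is the number of compounds and p the number of descriptors.
    MSE is taken here as SS_res / n (an assumed convention).
    """
    n = len(y_obs)
    p = n_descriptors
    mean_y = sum(y_obs) / n
    ss_tot = sum((y - mean_y) ** 2 for y in y_obs)
    ss_res = sum((yo - yp) ** 2 for yo, yp in zip(y_obs, y_pred))
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    mse = ss_res / n
    # F compares explained and unexplained variance per degree of freedom;
    # it is undefined for a perfect fit (r2 == 1).
    f_stat = (r2 / p) / ((1.0 - r2) / (n - p - 1))
    return {"R2": r2, "R2_adj": r2_adj, "MSE": mse, "F": f_stat}


# Toy values for illustration only (not data from this study).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
print(regression_stats(observed, predicted, n_descriptors=1))
```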

2.3. Calculation of the Molecular Descriptors. Before any modeling, it is necessary to calculate a certain number of descriptors because the parameters that govern the anticancer activity of the pyrazoles are poorly known. Part of the success of any QSAR model lies in the choice of the molecular descriptors used. In general, the standard descriptors used for such an analysis are constitutional, topological, or geometric descriptors. However, it is often difficult to link these parameters to the reactivity of the inhibitors with the target cells. Descriptors derived from quantum chemistry are used less frequently in QSAR, although they have the advantage of being directly related to the reactivity properties of molecular systems [26, 27]. The thirty-two molecules were optimized at the DFT level using the B3LYP functional with the 6-31G basis set in the Gaussian 03 software. A number of electronic descriptors were then computed from the optimized structures, including the dipole moment (DM), the energies of the frontier orbitals (EHOMO, ELUMO), the total energy (Etotal), and the repulsion energy (RE) [28, 29].

ChemBio Office (2015) was used to calculate the following parameters: molecular weight (MW), lipophilicity (logP), hydrogen bond acceptors (HA), and hydrogen bond donors (HD). The ChemSketch program was used to calculate the following parameters: molar volume (MV, [cm.sup.3]), molar refractivity (MR, [cm.sup.3]), parachor (Pc, [cm.sup.3]), density (g/[cm.sup.3]), refractive index, surface tension (dyne/cm), and polarizability ([cm.sup.3]) [30, 31].

2.4. Statistical Analysis. Structure-activity models were generated using XLSTAT version 14 software, starting with principal component analysis to reduce the dimensionality of the descriptor matrix and then applying multiple linear regression (MLR) to study the relationship between a dependent variable and several independent variables. MLR is a mathematical technique that minimizes the difference between the observed and predicted values. It is also used to select the descriptors used as input parameters in multiple nonlinear regression and in the neural network to account for the nonlinear correlation between activity and structure [32].

Cross-validation based on the "leave-one-out" criterion is one of the most widely used ways of selecting regression models. The leave-one-out procedure successively removes one molecule from the training set containing 32 molecules. A QSAR model is built on the remaining 31 compounds, and the activity of the removed molecule is predicted by that model. This procedure is repeated 32 times to predict the properties of all molecules [33-36].
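The leave-one-out loop described above can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions, not the XLSTAT procedure: the helper names (`solve_linear`, `fit_ols`, `predict`, `loo_q2`) are hypothetical, the model is an ordinary least-squares fit with an intercept, and the cross-validated coefficient is taken as 1 - PRESS / SS_tot.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Return [intercept, b1, ..., bp] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def loo_q2(rows, y):
    """Leave-one-out cross-validated coefficient, 1 - PRESS / SS_tot."""
    press = 0.0
    for i in range(len(y)):
        # Refit the model without compound i, then predict compound i.
        coef = fit_ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, rows[i])) ** 2
    mean_y = sum(y) / len(y)
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_tot


# Toy data for illustration only: one descriptor, roughly linear response.
descriptors = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
activity = [2.1, 4.9, 8.2, 10.8, 14.1, 17.0]
print(loo_q2(descriptors, activity))
```

Because each compound is predicted by a model that never saw it, the cross-validated coefficient is generally lower than the fitted [R.sup.2]; a large gap between the two is a sign of overfitting.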

Y-randomization tests are among the most widely used techniques for ensuring that a QSAR model is reliable. Indeed, it is not uncommon to obtain fortuitous correlations ("chance correlations"), that is to say, a model displaying good statistical results ([R.sup.2], MAE) on the training set but involving descriptors that in reality are not related to the modeled property. These random models can be detected by the Y-randomization procedure. It consists of randomly shuffling the experimental properties of the training set and, using the same descriptors, training the learning algorithm again to try to obtain a model. Normally, the models obtained must have very low performance. The distribution of the obtained models makes it possible to fix a heuristic significance threshold for the models. Thus, one can choose models that have at most a 1% chance of being confused with a fortuitous model [34, 37].
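The shuffling procedure can be sketched as below. The names `y_randomization` and `r_squared_single` are hypothetical, and for brevity the score is the [R.sup.2] of a one-descriptor linear fit (the squared Pearson correlation); in practice the same scoring function as the full model would be passed in. The data are toy values, not results from this study.

```python
import random


def r_squared_single(x, y):
    """R2 of a one-descriptor linear fit, i.e. the squared Pearson correlation."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def y_randomization(x, y, score, n_perm=200, seed=0):
    """Compare the score on the real responses with scores on shuffled ones."""
    rng = random.Random(seed)
    original = score(x, y)
    permuted_scores = []
    for _ in range(n_perm):
        shuffled = y[:]          # copy so the real responses stay untouched
        rng.shuffle(shuffled)    # break the descriptor-activity pairing
        permuted_scores.append(score(x, shuffled))
    # Fraction of scrambled models scoring at least as well as the real one.
    p_chance = sum(s >= original for s in permuted_scores) / n_perm
    return original, permuted_scores, p_chance


# Toy data for illustration only.
x = [float(i) for i in range(12)]
y = [0.5 + 0.1 * i + 0.02 * (-1) ** i for i in range(12)]
original, scores, p_chance = y_randomization(x, y, r_squared_single)
print(original, max(scores), p_chance)
```

A model is considered free of chance correlation when the scrambled scores are clearly lower than the original one, so that `p_chance` falls below the chosen threshold (for example 1%).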

3. Results and Discussion

3.1. Dataset for Analysis. A QSAR study was carried out for a series of 32 substituted anthra[1,9-cd]pyrazol-6(2H)-one derivatives in order to determine a quantitative relationship between the structure and the anticancer activity. The values of the 16 descriptors are shown in Table 2. The results obtained for the QSAR study using PCA, MLR, MNLR, ANN, CV, and Y-randomization are presented in Tables 3 and 4.

3.2. Principal Component Analysis. All 16 descriptors coding the 32 molecules were submitted to a principal component analysis (PCA) [37]. Sixteen principal components were obtained (Figure 1).

The first three principal axes are sufficient to describe the information provided by the data matrix. Indeed, the percentages of variance are 52.6%, 16.62%, and 15.25% for the axes F1, F2, and F3, respectively, so that together they account for 84.47% of the total information. The Pearson correlation matrix obtained between the different descriptors is shown in Table 5.

The Pearson correlation coefficients between the 16 descriptors are summarized as a correlation matrix in Table 5, which shows whether each pair of variables is positively or negatively correlated. PCA was conducted to identify the links between the different variables; in Figure 2 the descriptors are represented in a correlation circle.
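The two quantities discussed in this subsection can be illustrated with a short sketch: the cumulative variance of the first three axes and a Pearson correlation matrix between descriptors. The function names are hypothetical. The descriptor values are the ELUMO and EHOMO entries of the first six compounds of Table 2 only, so the resulting coefficient is illustrative and is not the value reported in Table 5.

```python
from itertools import accumulate
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Return {name_a: {name_b: r}} for every pair of descriptor columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Percentages of variance reported for axes F1, F2, and F3.
explained = [52.6, 16.62, 15.25]
cumulative = list(accumulate(explained))
print([round(c, 2) for c in cumulative])

# First six rows of Table 2 (compounds 1 to 6), two descriptors only.
descriptors = {
    "E_LUMO": [-2.596, -2.083, -2.187, -2.482, -2.274, -2.230],
    "E_HOMO": [-6.265, -5.261, -5.318, -5.362, -5.242, -5.187],
}
matrix = correlation_matrix(descriptors)
print(round(matrix["E_LUMO"]["E_HOMO"], 3))
```

Pairs of descriptors with a coefficient close to +1 or -1 carry largely redundant information, which is one reason a reduced set of descriptors is retained for the regression.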

3.3. Multiple Linear Regression (MLR). In order to select the predominant descriptors that affect the inhibitory activities of these compounds, correlation analysis was performed with the statistical software XLSTAT 2014, taking every calculated descriptor as an independent variable and PIC50 as the dependent variable. Based on the correlation analysis, the stepwise multiple linear regression technique was used to establish the QSAR model. Several statistical parameters, such as the regression coefficient (R), the squared correlation coefficient ([R.sup.2]), the adjusted squared correlation coefficient ([R.sup.2.sub.adj]), the mean squared error (MSE), Fisher's F value, and the significance level (p < 0.05), are used to verify the credibility of the developed models. A large F value, a small MSE, a very small p value, and R and [R.sup.2] values close to one indicate a good QSAR model. In this study, all developed QSAR models are statistically significant, with a significance level of p < [10.sup.-3]. Given that the p value is much smaller than 0.05, we are taking less than a 0.1% risk in assuming that the null hypothesis is wrong. The values of the multiple correlation coefficient (R) and of the squared correlation coefficient ([R.sup.2]), which are greater than 0.87 and 0.75, respectively, support the predictive capacity of the QSAR models.

[E.sub.LUMO], the molar volume (MV), the density, and the molecular weight (MW) were the descriptors on which the anticancer activity of the pyrazole derivatives depends.

The QSAR model built using multiple linear regression (MLR) method is represented by the following equation:

[mathematical expression not reproducible] (1)

Statistical characteristics of the obtained equation:

[mathematical expression not reproducible]. (2)

The correlation between the experimental data and the values predicted by the multiple regression QSAR model, given in Table 3, shows that the predicted values are close to the experimental ones. It shows that the developed models can be successfully applied to predict the inhibition for other derivatives.

A negative coefficient for a descriptor means that an increase in its value leads to a decrease in the value of PIC50. The dependence of PIC50 on the descriptor values in (1) shows that MV, ELUMO, and the density vary in the same direction as the activity, whereas MW varies in the opposite direction.

PIC50 activity was linked with the frontier orbital energies and especially with the LUMO energy, which is the energy of the lowest unoccupied molecular orbital and reflects the electrophilic reactivity. This parameter is widely used to explain biological activity. The LUMO energy suggests that highly electrophilic compounds show high cell penetration. ELUMO is directly related to the electron affinity and characterizes the susceptibility of the molecule to attack by nucleophiles.

3.4. Partial Least Squares (PLS). PLS has two objectives: to approximate the matrix X of molecular structure descriptors and the matrix Y of dependent variables, and to maximize the correlation between them.

The data matrix built from the descriptors selected by MLR (Figure 3) for the 32 molecules was submitted to partial least squares (PLS) regression (Figure 4). This method used the coefficients R and [R.sup.2] and the F-values to select the best regression performance.

The molecular descriptors used were ELUMO, MV, MW, and density. To correlate these descriptors linearly with PIC50, the following equations were used:

[mathematical expression not reproducible], (3)

[mathematical expression not reproducible]. (4)

The correlation coefficient obtained in (2) is quite interesting (0.69). To improve the quantitative prediction of the anticancer activity while taking several parameters into account, we used a nonlinear regression model.

3.5. Multiple Nonlinear Regression (MNLR). The descriptors selected by MLR for the 32 compounds were used as the data matrix for the nonlinear regression (Figure 5). The coefficients R and [R.sup.2] and the mean squared error are used to select the best performance of the regression.

The resulting equations:

[mathematical expression not reproducible], (5)

[mathematical expression not reproducible]. (6)

The PIC50 values predicted from (3) are reported in Table 3 alongside the observed values. The correlation between the predicted and observed activities is shown in Figure 6.

The correlation coefficient obtained with this equation is very interesting (0.91). The values obtained from nonlinear regression are highly correlated with the observed activities, more so than the results obtained by the MLR method.

The MNLR model was validated by dividing the dataset into a training set and a test set; the external validation gives a correlation coefficient of 0.7 for MNLR over the whole test set.

3.6. Validation. We use the "leave-one-out" procedure, which successively removes one molecule from the training set containing 24 molecules. This procedure is repeated 24 times in order to predict the properties of all the molecules.

The consistency and reliability of the MLR, MNLR, and PLS models were validated using the cross-validation technique; a good correlation was obtained, with a cross-validation coefficient Rcv = 0.86. The predictive power of this model is therefore significant.

[mathematical expression not reproducible]. (7)

3.7. Scrambling or Y-Randomization. Y-randomization is broadly used in QSAR studies to ensure the robustness of the obtained models. This method is applied after the "best" regression model is selected, to make sure that it is not the result of a chance correlation. Scrambling validates the QSAR model by comparing the performance of the original model to that of models built for permuted (randomly shuffled) responses, based on the original descriptor pool and the original procedure used to build the model. If the correlation coefficient of models built for permuted responses is close to that obtained by applying the full model, this result indicates that there is independence between the molecules, as the measurement points nearest the target point do not obscure other experimental data and are not almost exclusively involved in the estimate, and the data used in this validation are evenly distributed in space. Therefore, the resulting model can be extrapolated to the entire series (Table 3 and Figure 7).

[mathematical expression not reproducible]. (8)

The correlation coefficient value of the mixture of molecules was close to that obtained by applying the full model. This result demonstrates the absence of dependence between descriptors included in the model. Additionally, the closest measurement point of the target point does not hide other experimental data and is not involved exclusively in the estimate, and the data used in this validation are regularly distributed in space so the resulting model can be extrapolated for the entire series.

3.8. Proposed Novel Compounds. The values of the parameters obtained by DFT calculations for the proposed compounds are based on the information derived from (1), (2), and (3) (Table 3). It has been observed that the PIC50 values predicted by PLS for the designed compounds are higher than those predicted by MLR and MNLR (Table 1).

Additionally, compounds X1 and X15 have higher PIC50 values than any of the 32 existing compounds studied.

3.9. Lipinski's Rule of Five. The empirical principles enunciated by Christopher Lipinski and grouped under the name "rule of five" [38] are the most widely used criteria for identifying "drug-like" compounds. According to this rule, a substance is more likely to be well absorbed or to permeate membranes when:

(1) the molecular weight is less than or equal to 500 Da,

(2) it has 5 or fewer hydrogen bond donors (sum of OH and NH),

(3) it has 10 or fewer hydrogen bond acceptors (sum of O and N),

(4) its log P value is less than or equal to 5.
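The four criteria listed above translate directly into a small check. The sketch below is illustrative: the function name `lipinski_violations` and the example descriptor values are hypothetical and are not taken from Table 7.

```python
def lipinski_violations(mw, logp, h_donors, h_acceptors):
    """Return the list of rule-of-five criteria that a compound violates."""
    violations = []
    if mw > 500:
        violations.append("MW > 500 Da")
    if h_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if h_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    if logp > 5:
        violations.append("log P > 5")
    return violations


# Hypothetical descriptor values for two illustrative compounds.
print(lipinski_violations(mw=350.4, logp=3.2, h_donors=2, h_acceptors=4))
print(lipinski_violations(mw=350.4, logp=5.6, h_donors=2, h_acceptors=4))
```

Returning the list of violated criteria, rather than a single pass/fail flag, makes it easy to see which property would need to be adjusted for a given compound.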

The empirical conditions to satisfy the Lipinski rule and demonstrate good oral bioavailability involve a balance between the aqueous solubility of a compound and its ability to passively diffuse through various biological barriers.

Oral absorption or membrane permeability is therefore expected when the evaluated molecule follows Lipinski's rule [39, 40].

Molecules that violate many of these rules can have problems with bioavailability. Therefore, this rule establishes some relevant structural parameters for the theoretical prediction of the oral bioavailability profile and is widely used in the design of new drugs [41].

The results of the calculation (Table 7) show that all compounds satisfy Lipinski's rules, suggesting that these compounds should theoretically not have problems with oral bioavailability, except molecules 5, 8, 13, and 29, which have a log P value of 5.

4. Conclusion

A quantitative structure-activity relationship (QSAR) analysis was performed on 32 pyrazole derivatives. QSAR models were established using multiple linear regression (MLR), partial least squares (PLS), and multiple nonlinear regression (MNLR). Assessing the quality of the MLR, PLS, and MNLR models showed that the predictive capability of MNLR was substantially better than that of the other methods. The predictive power of the model obtained was confirmed by LOO cross-validation. A strong correlation was observed between the experimental and predicted values of the biological activities, which indicates the validity and quality of the QSAR model developed in this work. The most important finding of this research is that we were able to design new compounds and predict whether their activity values are higher or lower than those of existing compounds (Table 6) by adding suitable substituents and calculating their properties using the MLR, MNLR, and PLS equations. Thus, the proposed models will reduce the time, the cost, and the human effort required.

https://doi.org/10.1155/2018/3121802

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] M. Sameiro and T. Goncalves, "Fluorescent labeling of biomolecules with organic probes," Chemical Reviews, vol. 109, no. 1, pp. 190-212, 2009.

[2] L. D. Lavis and R. T. Raines, "Bright ideas for chemical biology," ACS Chemical Biology, vol. 3, no. 3, pp. 142-155, 2008.

[3] L. S. Hegedus, Transition Metals in the Synthesis of Complex Organic Molecules, University Science Books, Sausalito, Calif, USA, 2nd edition, 1999.

[4] M. Beller and C. Bolm, Transition Metals for Organic Chemistry, Wiley-VCH, Weinheim, Germany, 2nd edition, 2004.

[5] I. Rezzano, G. Buldain, and B. Frydman, "The Total Synthesis of Natural Products," The Journal of Organic Chemistry, vol. 47, no. 16, pp. 3059-3063, 1982.

[6] A. G. Montalban, Heterocycles in Natural Product Synthesis, Wiley-VCH, New York, NY, USA, 2011, 299-339.

[7] D. Lednicer, "The Organic Chemistry of Drug Synthesis," in The Organic Chemistry of Drug Synthesis, vol. 7, pp. 84-216, John Wiley & Sons, Hoboken, NJ, USA, 2007.

[8] T. Eicher, The Chemistry of Heterocycles: Structures, Reactions, Synthesis and Applications, Wiley-VCH Verlag, Germany, 2nd edition, 2003.

[9] R. K. F. Marra, A. M. R. Bernardino, T. A. Proux et al., "4-(1H-Pyrazol-1-yl) Benzenesulfonamide Derivatives: Identifying New Active Antileishmanial Structures for Use against a Neglected Disease," Molecules, vol. 17, no. 11, pp. 12961-12973, 2012.

[10] V. K. Aggarwal, J. De Vicente, and R. V. Bonnert, "A novel one-pot method for the preparation of pyrazoles by 1,3-dipolar cycloadditions of diazo compounds generated in situ," The Journal of Organic Chemistry, vol. 68, no. 13, pp. 5381-5383, 2003.

[11] A. Jamwal, A. Javed, and V. Bhardwaj, "A review on Pyrazole derivatives of pharmacological potential," Journal of Pharmaceutical and BioScience, vol. 25th, pp. 2321-0125, 2013.

[12] G. M. Nitulescu, C. Draghici, and O. T. Olaru, "New potential antitumor pyrazole derivatives: Synthesis and cytotoxic evaluation," International Journal of Molecular Sciences, vol. 14, no. 11, pp. 21805-21818, 2013.

[13] A. M. Shamsuzzaman, "A concise review on the synthesis of pyrazole heterocycles," Journal of Nuclear Medicine & Radiation Therapy, vol. 06, no. 05, 2015.

[14] D. Pal, S. Saha, and S. Singh, "Importance of pyrazole moiety in the field of cancer," International Journal of Pharmacy and Pharmaceutical Sciences, vol. 4, no. 2, pp. 98-104, 2012.

[15] R. Aggarwal, V. Kumar, R. Kumar, and S. P. Singh, "Approaches towards the synthesis of 5-aminopyrazoles," Beilstein Journal of Organic Chemistry, vol. 7, pp. 179-197, 2011.

[16] K. Du, Y. Mei, X. Cao, P. Zhang, and H. Zheng, "The synthesis of pyrazole derivatives based on glucose," International Journal of Chemical Engineering and Applications, vol. 4, no. 4, 2015.

[17] K. Ajay Kumar and P. Jayaroopa, "Synthetic strategies and their pharmaceutical applications-an overview," International Journal of PharmTech Research, vol. 5, no. 4, pp. 1473-1486, 2013.

[18] K. S. Bhadoriya, M. C. Sharma, and S. V. Jain, "Pharmacophore modeling and atom-based 3D-QSAR studies on amino derivatives of indole as potent isoprenylcysteine carboxyl methyltransferase (Icmt) inhibitors," Journal of Molecular Structure, vol. 1081, pp. 466-476, 2015.

[19] N. Hernandez, R. Kiralj, M. M. C. Ferreira, and I. Talavera, "Critical comparative analysis, validation and interpretation of SVM and PLS regression models in a QSAR study on HIV1 protease inhibitors," Chemometrics and Intelligent Laboratory Systems, vol. 98, no. 1, pp. 65-77, 2009.

[20] K. Roy, I. Mitra, P. K. Ojha, S. Kar, R. N. Das, and H. Kabir, "Introduction of r m2(rank) metric incorporating rank-order predictions as an additional tool for validation of QSAR/QSPR models," Chemometrics and Intelligent Laboratory Systems, vol. 118, pp. 200-210, 2012.

[21] R. Sabet, M. Mohammadpour, A. Sadeghi, and A. Fassihi, "QSAR study of isatin analogues as in vitro anti-cancer agents," European Journal of Medicinal Chemistry, vol. 45, no. 3, pp. 1113-1118, 2010.

[22] L. M. A. Mullen, P. R. Duchowicz, and E. A. Castro, "QSAR treatment on a new class of triphenylmethyl-containing compounds as potent anticancer agents," Chemometrics and Intelligent Laboratory Systems, vol. 107, no. 2, pp. 269-275, 2011.

[23] B. Chen, T. Zhang, T. Bond, and Y. Gan, "Development of quantitative structure activity relationship (QSAR) model for disinfection byproduct (DBP) research: A review of methods and resources," Journal of Hazardous Materials, vol. 299, pp. 260-279, 2015.

[24] T.-C. Chen, J.-H. Guh, H.-W. Hsu et al., "Synthesis and biological evaluation of anthra[1,9-cd]pyrazol-6(2H)-one scaffold derivatives as potential anticancer agents," Arabian Journal of Chemistry, 2014.

[25] A. Worachartcheewan, P. Mandi, V. Prachayasittikul, A. P. Toropova, A. A. Toropov, and C. Nantasenamat, "Large-scale QSAR study of aromatase inhibitors using SMILES-based descriptors," Chemometrics and Intelligent Laboratory Systems, vol. 138, pp. 120-126, 2014.

[26] S. Chtita, R. Hmamouchi, M. Larif, M. Ghamali, M. Bouachrine, and T. Lakhlifi, "QSPR studies of 9-aniliioacridine derivatives for their DNA drug binding properties based on density functional theory using statistical methods: model, validation and influencing factors," Journal of Taibah University for Science, vol. 10, no. 6, pp. 868-876, 2016.

[27] E. G. Hadaji, M. Bourass, A. Ouammou, and M. Bouachrine, "3D-QSAR models to predict anti-cancer activity on a series of protein P38 MAP kinase inhibitors," Journal of Taibah University for Science, vol. 11, no. 3, pp. 392-407, 2017.

[28] S. Chtita, M. Ghamali, M. Larif et al., "Prediction of biological activity of imidazo[1,2-a]pyrazine derivatives by combining DFT and QSAR results," International Journal of Innovative Research in Science, Engineering and Technology, vol. 2, no. 12, 2013.

[29] E. Hadaji, M. Bourass, A. Ouammou, and M. Bouachrine, "3D-QSAR Models to Predict the Antiviral Activities of a series of novel N-phenylbenzamide and N-phenylacetophenone Compounds based on density functional theory using statistical methods," Moroccan Journal of Chemistry, vol. 4, no. 1, pp. 204-214, 2016.

[30] S. Chtita, M. Larif, M. Ghamali, M. Bouachrine, and T. Lakhlifi, "DFT-based QSAR Studies of MK801 derivatives for non competitive antagonists of NMDA using electronic and topological descriptors," Journal of Taibah University for Science, vol. 9, no. 2, pp. 143-154, 2014.

[31] Advanced Chemistry Development Inc, "Toronto, Canada (2009)," http://www.acdlabs.com/resources/freeware/chemsketch/.

[32] XLSTAT, "software (XLSTAT Company)," 2015, http://www.xlstat.com.

[33] R. Improta, V. Barone, G. Scalmani, and M. J. Frisch, "A state-specific polarizable continuum model time dependent density functional theory method for excited state calculations in solution," The Journal of Chemical Physics, vol. 125, Article ID 054103, 2006.

[34] K. Roy, R. N. Das, P. Ambure, and R. B. Aher, "Be aware of error measures. Further studies on validation of predictive QSAR models," Chemometrics and Intelligent Laboratory Systems, vol. 152, pp. 18-33, 2016.

[35] M. Parac and S. Grimme, "Comparison of multireference Møller-Plesset theory and time-dependent methods for the calculation of vertical excitation energies of molecules," The Journal of Physical Chemistry A, vol. 106, no. 29, pp. 6844-6850, 2003.

[36] F. Senn and Y. C. Park, "Constricted variational density functional theory for spatially clearly separated charge-transfer excitations," The Journal of Chemical Physics, vol. 145, no. 24, Article ID 244108, 2016.

[37] D. Voet and J. G. Voet, Biochimie, De Boeck & Larcier, Bruxelles, Belgium, 2nd edition, 2005, pp. 532.

[38] A. Chikhi, "Calcule et modelisation des interactions peptide defomylase-substances antibacteriennes a l'aide de technique "docking" (arrimage) moleculaire," Doctorat d'etat en Microbiologie, 2007.

[39] C. Hansch, A. Leo, S. B. Mekapati, and A. Kurup, "Drug-like properties: concepts, structure design and methods," Bioorganic and Medicinal Chemistry, vol. 12, pp. 3391-3400, 2004.

[40] C. Hansch, A. Leo, and D. Hoekman, "Exploring QSAR, Vol. I and Vol. II, ACS Professional Reference Book," Oxford University, New York, NY, USA, 1995.

[41] M. H. Abraham, J. M. R. Gola, R. Kumarsingh, J. E. Cometto-Muniz, and W. S. Cain, "Drug-like properties: concepts, structure design and methods," Journal of Chromatography B: Biomedical Sciences and Applications, vol. 745, no. 1, pp. 103-115, 2000.

El Ghalia Hadaji, (1) Abdelkarim Ouammou, (1) and Mohammed Bouachrine (2)

(1) Faculty of Sciences Dhar El Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco

(2) Equipe Materiaux, Environnement & Modelisation, ESTM, University Moulay Ismail, Meknes, Morocco

Correspondence should be addressed to Mohammed Bouachrine; bouachrine@gmail.com

Received 1 August 2017; Accepted 20 December 2017; Published 1 February 2018

Academic Editor: Viktor O. Iaroshenko

Caption: FIGURE 1: The principal components and their variances.

Caption: FIGURE 2: Correlation circle of the descriptors.

Caption: FIGURE 3: Predicted anticancer activities of PIC50 by MLR in comparison with experimental values (training set in blue and test set in red).

Caption: FIGURE 4: Graphical representation of calculated and observed activity by PLS.

Caption: FIGURE 5: Graphical representation of calculated and observed PIC50 with MNLR (training set in blue and test set in red).

Caption: FIGURE 6: Correlation between observed PIC50 and predicted PIC50 (cross-validation).

Caption: FIGURE 7: Correlation between observed PIC50 and predicted PIC50 (Y-randomization).
TABLE 1: Studied compounds and substituents.

Compounds                       R1                        PIC50

1                               Cl                        1,258
2               NHC[H.sub.2]CH[(C[H.sub.3]).sub.2]        1,251
3                NHC[H.sub.2]C[H.sub.2]C[H.sub.3]         1,491
4       NHC[H.sub.2]C[H.sub.2]C[H.sub.2]C[H.sub.2]C[H.sub.2]OH  1,000
5            NHC[H.sub.2]C[H.sub.2][C.sub.6][H.sub.5]     1,340
6                         NH-Cyclohexane                  1,109
7                         NH-Cyclopentane                 1,200
8                    NHC[H.sub.2]-Cyclohexane             1,302
9               NHC[H.sub.2]C[H.sub.2]C[H.sub.2]OH        1,160
10          NHC[H.sub.2]C[H.sub.2]C[H.sub.2]C[H.sub.3]    1,386
11                         NHC[H.sub.3]                   1,556
12                NHC[H.sub.2][C.sub.6][H.sub.5]          0,946
13              NHC[H.sub.2]C[H.sub.2]-Cyclohexane        1,505
14                                                        1,532
15                                                        0,926
16                                                        0,740
17                              Cl                        1,137
18              NHC[H.sub.2]CH[(C[H.sub.3]).sub.2]        1,455
19               NHC[H.sub.2]C[H.sub.2]C[H.sub.3]         1,484
20               NHC[H.sub.2]C[H.sub.2]C[H.sub.2]         1,486
                      C[H.sub.2]C[H.sub.2]OH
21           NHC[H.sub.2]C[H.sub.2][C.sub.6][H.sub.5]     1,489
22                        NH-Cyclohexane                  1,490
23                        NH-Cyclopentane                 1,346
24                   NHC[H.sub.2]-Cyclohexane             1,254
25              NHC[H.sub.2]C[H.sub.2]C[H.sub.2]OH        1,272
26          NHC[H.sub.2]C[H.sub.2]C[H.sub.2]C[H.sub.3]    1,519
27                         NH C[H.sub.3]                  1,487
28                NHC[H.sub.2][C.sub.6][H.sub.5]          1,493
29              NHC[H.sub.2]C[H.sub.2]-Cyclohexane        1,324
30                                                        1,505
31                                                        1,504
32                                                        1,497

TABLE 2: Descriptor values of the dataset used for QSAR analysis of a
series of pyrazole inhibitors.

     [E.sub.LUMO]   [E.sub.HOMO]   [E.sub.total]    RE (eV)     MD

1       -2,596         -6,265        -32173,30      35590,20   5,70
2       -2,083         -5,261        -25451,80      46521,92   1,76
3       -2,187         -5,318        -24382,28      43034,30   1,87
4       -2,482         -5,362        -28567,65      54475,02   3,49
5       -2,274         -5,242        -29598,90      57352,68   1,55
6       -2,230         -5,187        -27558,57      52684,48   1,02
7       -2,169         -5,224        -25508,68      49729,89   1,74
8       -2,183         -5,265        -28627,08      56732,18   1,79
9       -2,014         -5,197        -26427,90      46287,76   2,01
10      -2,078         -5,259        -25451,76      46211,30   1,83
11      -2,090         -5,293        -22243,32      36510,24   1,82
12      -2,260         -5,374        -28528,83      53453,30   2,21
13      -2,178         -5,268        -29696,76      59558,41   1,71
14      -2,076         -5,266        -32713,39      63731,66   1,16
15      -2,231         -5,230        -31644,55      61133,18   0,78
16      -2,304         -5,315        -33657,19      63325,51   1,68
17      -2,396         -5,967        -36357,90      46089,73   7,72
18      -1,914         -5,125        -29669,05      57802,61   4,09
19      -1,911         -5,128        -28598,39      54046,80   4,14
20      -1,940         -5,159        -32787,47      63911,53   4,50
21      -2,039         -5,199        -33820,13      70238,20   4,30
22      -2,008         -5,116        -31777,54      65294,20   4,18
23      -1,999         -5,098        -30706,64      61164,80   4,11
24      -2,017         -5,142        -32847,84      68582,30   4,17
25      -2,047         -5,185        -30646,17      57276,36   5,66
26      -2,010         -5,137        -29669,01      57252,92   4,14
27      -1,917         -5,151        -26457,03      47005,42   4,17
28      -1,950         -5,167        -32749,50      65020,66   4,18
29      -2,091         -5,096        -33919,37      71855,38   3,36
30      -2,080         -5,081        -36939,25      76895,79   2,90
31      -1,932         -4,998        -35868,03      73878,45   3,17
32      -2,152         -5,215        -37883,52      76645,08   3,86

       MR      MV     Parac   IndiR    ST    Desty   Polaris   log P

1    69,18    162,4   488,5   1,80    81,8   1,57     27,42    4,18
2    87,75    219,8   636,1   1,73     70    1,33     34,79    4,30
3    83,16    202,9    598    1,76    75,4   1,37     32,97    3,98
4    93,96    233,4   695,2   1,74    78,6   1,38     37,25    3,53
5    103,02   247,1   731,8   1,77    76,8   1,37     40,48    5,10
6    94,93    232,4   686,7   1,75    76,1   1,37     37,63    4,98
7    90,32    214,7   646,7   1,78    82,3   1,41     35,8     4,42
8    99,58    252,4   726,8   1,72    68,7   1,31     39,47    5,59
9     84,7    200,4   615,1   1,79    88,6   1,46     33,57    3,28
10    87,8    219,4   638,1   1,73    71,4   1,33     34,8     4,54
11    73,9    169,9   517,9   1,82    86,2   1,47     29,29    3,22
12   98,39    230,6   691,7   1,80    80,9   1,41      39      4,94
13   104,21   268,9   766,9   1,70    66,1   1,28     41,31    6,35
14   109,7    271,1   790,4   1,74    72,2   1,36     43,48    5,00
15   105,06   254,6   750,4   1,76    75,4   1,40     41,65    5,00
16   104,55   241,8   739,4   1,81    87,4   1,53     41,44    4,85
17    79,4    195,5   546,9   1,75    61,2   1,52     31,47    3,24
18   95,91    253,4   677,3   1,68    50,9   1,32     38,02    3,36
19   91,49    238,2   646,2   1,69    54,1   1,34     32,27    3,04
20   101,75   268,5   736,6   1,68    56,6   1,36     40,33    2,59
21   112,17   290,9   791,9   1,70    54,9   1,31     44,47    4,16
22   102,95   258,8   714,7   1,73    58,1   1,39     40,81    4,04
23   98,35    242,7   676,1   1,74    60,1   1,43     38,98    3,48
24   107,56   274,8   753,3   1,71    56,4   1,36     42,64    4,65
25   92,53    236,4   659,4   1,71    60,5   1,42     36,68    2,34
26    96,1    254,3   684,8   1,70    52,5   1,31     38,09    3,60
27   82,27    206,1    569    1,73     58    1,42     32,61    2,28
28   107,56   274,8   753,3   1,71    56,4   1,34     42,64    4,00
29   112,17   290,9   791,9   1,70    54,9   1,33     44,47    5,41
30   117,99   312,5   842,2   1,70    52,7   1,32     46,77    4,06
31   113,38   296,5   803,6   1,69    53,9   1,34     44,94    3,90
32   112,4    275,3   775,5   1,75    62,9    1,5     44,56    3,75

     DH   AH     MW

1    1    3    254,67
2    2    4    291,35
3    2    4    277,33
4    3    5    321,38
5    2    4    339,40
6    2    4    317,39
7    2    4    303,37
8    2    4    331,42
9    3    5    293,33
10   2    4    291,35
11   2    4    249,27
12   2    4    325,37
13   2    4    345,45
14   2    5    369,42
15   2    5    369,42
16   2    6    383,41
17   1    4    298,73
18   2    5    335,41
19   2    5    321,38
20   3    6    365,43
21   2    5    383,45
22   2    5    361,45
23   2    5    347,42
24   2    5    375,47
25   3    6    337,38
26   2    5    335,41
27   2    5    293,33
28   2    5    369,42
29   2    5    389,50
30   2    6    413,48
31   2    6    399,45
32   2    7    413,43

TABLE 3: Comparison between observed and predicted activities for
the statistically significant 2D models (training sets).

Comp.   PIC50   Pred. (PIC50) MLR   Resid.     CV

1       1,258         1,183          0,075    1,36
2       1,251         1,318         -0,067    1,32
4       1,000         1,017         -0,018    0,98
6       1,109         1,215         -0,106    1,11
7       1,200         1,247         -0,047    1,22
9       1,160         1,398         -0,238    1,31
10      1,386         1,318          0,068    1,39
11      1,556         1,395          0,161    1,23
12      0,946         1,173         -0,227     1
13      1,505         1,411          0,094    1,41
14      1,531         1,379          0,153    1,37
16      0,740         0,660          0,079     1
18      1,455         1,503         -0,048    1,44
19      1,484         1,422          0,062    1,43
20      1,486         1,480          0,005    1,44
21      1,489         1,475          0,013    1,46
22      1,490         1,341          0,149    1,43
24      1,254         1,387         -0,133    1,41
25      1,272         1,289         -0,017    1,3
26      1,519         1,404          0,115    1,47
28      1,493         1,486          0,006    1,43
29      1,324         1,387         -0,063    1,31
30      1,505         1,514         -0,009    1,43
31      1,504         1,512         -0,008    1,48

Comp.   PIC50   Pred. (PIC50) MNLR   Resid.

1       1,258         1,192           0,066
3       1,491         1,363           0,128
4       1,000         1,089          -0,089
5       1,340         1,175           0,165
6       1,109         1,220          -0,111
8       1,302         1,352          -0,050
9       1,160         1,298          -0,137
10      1,386         1,478          -0,092
12      0,946         1,116          -0,170
13      1,505         1,422           0,083
14      1,531         1,387           0,145
15      0,926         0,858           0,068
16      0,740         0,805          -0,066
18      1,455         1,557          -0,102
19      1,484         1,484           0,001
20      1,486         1,508          -0,023
21      1,489         1,496          -0,007
22      1,490         1,345           0,144
23      1,346         1,305           0,040
26      1,519         1,441           0,077
27      1,487         1,432           0,055
28      1,493         1,518          -0,025
30      1,505         1,615          -0,110
32      1,497         1,488           0,009

Comp.   PIC50   Pred. (PIC50) PLS   Resid.

1       1,258         1,242          0,016
2       1,251         1,390         -0,139
3       1,491         1,305          0,186
5       1,340         1,228          0,112
6       1,109         1,268         -0,160
7       1,200         1,313         -0,113
8       1,302         1,350         -0,048
10      1,386         1,390         -0,005
11      1,556         1,485          0,072
14      1,531         1,384          0,147
15      0,926         0,895          0,031
17      1,137         1,176         -0,039
19      1,484         1,476          0,008
20      1,486         1,488         -0,002
21      1,489         1,457          0,031
22      1,490         1,368          0,122
23      1,346         1,390         -0,044
24      1,254         1,394         -0,140
25      1,272         1,338         -0,067
26      1,519         1,437          0,081
27      1,487         1,505         -0,017
28      1,493         1,488          0,004
29      1,324         1,374         -0,050
31      1,504         1,490          0,014

Comp.   PIC50   Y-rd

20      1,49    1,39
2       1,25    1,34
4        1      1,19
30      1,51    1,38
7       1,2     1,26
10      1,39    1,4
6       1,11    1,11
12      0,95     1
31      1,5     1,48
25      1,27    1,24
16      0,74    0,89
18      1,46    1,47
19      1,48    1,47
11      1,56    1,28
26      1,52    1,46
22      1,49    1,43
24      1,25    1,42
29      1,32    1,33
13      1,51    1,41
1       1,26    1,25
9       1,16    1,29
14      1,53    1,38
21      1,49    1,3
28      1,49    1,45

TABLE 4: Comparison between observed and predicted
activities for the statistically significant 2D models
(test sets).

Comp.   PIC50   PIC50 MLR test   Comp.   PIC50   PIC50 MNLR test

3       1,491       1,217          2     1,251        1,483
5       1,340       1,192          7     1,200        1,201
8       1,302       1,321         11     1,556        1,468
15      0,926       0,805         17     1,137        0,945
17      1,137       1,110         24     1,254        1,426
23      1,346       1,353         25     1,272        1,220
27      1,487       1,432         29     1,324        1,439
32      1,497       1,093         31     1,504        1,639

Comp.   PIC50   PIC50 PLS test

4       1,000       1,071
9       1,160       1,465
12      0,946       1,225
13      1,505       1,414
16      0,740       0,759
18      1,455       1,531
30      1,505       1,462
32      1,497       1,110
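
The test-set columns of Table 4 lend themselves to a quick numerical summary. The sketch below is an illustration only, not the authors' procedure: it takes the MLR test-set values from Table 4, computes residuals as observed minus predicted, and reports the root-mean-square error and the squared Pearson correlation. The helper names (`residuals`, `rmse`, `pearson_r2`) are hypothetical.

```python
from math import sqrt

# MLR test set copied from Table 4: compound -> (observed PIC50, predicted PIC50)
mlr_test = {
    3: (1.491, 1.217),
    5: (1.340, 1.192),
    8: (1.302, 1.321),
    15: (0.926, 0.805),
    17: (1.137, 1.110),
    23: (1.346, 1.353),
    27: (1.487, 1.432),
    32: (1.497, 1.093),
}


def residuals(data):
    """Residual per compound, defined here as observed minus predicted."""
    return {comp: obs - pred for comp, (obs, pred) in data.items()}


def rmse(data):
    """Root-mean-square error of the predictions."""
    res = residuals(data)
    return sqrt(sum(r * r for r in res.values()) / len(res))


def pearson_r2(data):
    """Squared Pearson correlation between observed and predicted values."""
    obs = [o for o, _ in data.values()]
    pred = [p for _, p in data.values()]
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)


res = residuals(mlr_test)
# Compound with the largest absolute residual in the test set
worst = max(res, key=lambda comp: abs(res[comp]))

print(f"RMSE = {rmse(mlr_test):.3f}")
print(f"r^2  = {pearson_r2(mlr_test):.3f}")
print(f"largest |residual|: compound {worst} ({res[worst]:+.3f})")
```

The same functions can be applied to the MNLR and PLS columns by swapping in the corresponding compound, observed, and predicted triples.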

TABLE 5: The correlation matrix (Pearson (n)) between
different obtained descriptors.

Desc.          [E.sub.LUMO]   [E.sub.HOMO]   [E.sub.TOT]     RE

[E.sub.LUMO]        1
[E.sub.HOMO]      0,777            1
[E.sub.TOT]       0,037          0,065            1
RE                0,357          0,555         -0,761        1
MD                0,029          -0,367        -0,452      0,047
MR                0,311          0,590         -0,649      0,963
MV                0,421          0,614         -0,648      0,958
Parac             0,278          0,593         -0,590      0,929
IndiR             -0,574         -0,454         0,352      -0,540
ST                -0,614         -0,386         0,494      -0,579
Desty             -0,503         -0,623        -0,114      -0,354
Polrs             0,273          0,563         -0,648      0,950
log P             -0,348         -0,023        -0,058      0,199
DH                0,334          0,563          0,210      0,173
AH                0,450          0,535         -0,589      0,744
MW                0,296          0,504         -0,791      0,989

Desc.            MD       MR       MV     Parac    IndiR      ST

[E.sub.LUMO]
[E.sub.HOMO]
[E.sub.TOT]
RE
MD               1
MR             -0,172     1
MV             -0,045   0,968      1
Parac          -0,248   0,989    0,962      1
IndiR          -0,340   -0,474   -0,675   -0,486     1
ST             -0,548   -0,454   -0,632   -0,401   0,904      1
Desty          0,285    -0,468   -0,593   -0,532   0,734    0,504
Polrs          -0,182   0,990    0,951    0,980    -0,439   -0,417
log P          -0,598   0,388    0,284    0,437    0,137    0,268
DH             -0,187   0,182    0,195    0,247    -0,173   0,046
AH             0,190    0,616    0,620    0,575    -0,379   -0,444
MW             0,020    0,962    0,937    0,929    -0,474   -0,515

Desc.          Desty    Polaris    logP     DH      AH     MW

[E.sub.LUMO]
[E.sub.HOMO]
[E.sub.TOT]
RE
MD
MR
MV
Parac
IndiR
ST
Desty            1
Polrs          -0,440      1
log P          -0,289    0,406      1
DH             -0,249    0,182    -0,301     1
AH             -0,007    0,599    -0,363   0,464     1
MW             -0,288    0,950    0,225    0,152   0,754   1

MW and RE are highly correlated (r = 0,989); MR and RE are
highly correlated (r = 0,963); MR and MW are highly
correlated (r = 0,962); MR and polarisability are highly
correlated (r = 0,990); MR and parachor are highly correlated
(r = 0,989); MR and MV are highly correlated (r = 0,968);
parachor and MV are highly correlated (r = 0,962);
polarisability and parachor are highly correlated (r = 0,980).
The variables Parac, RE, and MR were therefore removed.
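
The screening described in this note can be reproduced from the coefficients in Table 5. The sketch below is a minimal illustration under one assumption: the cutoff |r| > 0,96 is not stated in the source and was chosen here only because it singles out the eight pairs listed in the note. The function name `high_pairs` is hypothetical, and only the six mutually correlated descriptors are included.

```python
# Assumed cutoff (not given in the source); it reproduces the eight pairs in the note.
THRESHOLD = 0.96

# Pearson coefficients copied from Table 5 for the six size-related descriptors
pearson = {
    ("RE", "MR"): 0.963, ("RE", "MV"): 0.958, ("RE", "Parac"): 0.929,
    ("RE", "Polrs"): 0.950, ("RE", "MW"): 0.989,
    ("MR", "MV"): 0.968, ("MR", "Parac"): 0.989, ("MR", "Polrs"): 0.990,
    ("MR", "MW"): 0.962,
    ("MV", "Parac"): 0.962, ("MV", "Polrs"): 0.951, ("MV", "MW"): 0.937,
    ("Parac", "Polrs"): 0.980, ("Parac", "MW"): 0.929,
    ("Polrs", "MW"): 0.950,
}


def high_pairs(corr, threshold, dropped=()):
    """Pairs with |r| above the threshold, ignoring any dropped descriptor."""
    dropped = set(dropped)
    return sorted(
        pair
        for pair, r in corr.items()
        if abs(r) > threshold and not dropped & set(pair)
    )


# Descriptors removed according to the note under Table 5
removed = ("Parac", "RE", "MR")

before = high_pairs(pearson, THRESHOLD)
after = high_pairs(pearson, THRESHOLD, dropped=removed)

print(f"{len(before)} pairs above the cutoff before removal")
print(f"{len(after)} pairs above the cutoff after removing {removed}")
```

With this cutoff, dropping Parac, RE, and MR leaves no pair above the threshold among the remaining descriptors MV, Polrs, and MW.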

TABLE 6: The proposed novel compounds.

Comp.                 r1                           r2

X1                    H                            H
X2                    OH                           H
X3                   COOH                          H
X4                    CN                           H
X5               C[H.sub.2]OH                      H
X6               C[H.sub.2]Cl                      H
X7               CC[l.sub.3]                       H
X8                    H                  C[H.sub.2]C[H.sub.2]OH
X9                    OH                 C[H.sub.2]C[H.sub.2]OH
X10                  COOH                C[H.sub.2]C[H.sub.2]OH
X11                   CN                 C[H.sub.2]C[H.sub.2]OH
X12              C[H.sub.2]OH            C[H.sub.2]C[H.sub.2]OH
X13              C[H.sub.2]Cl            C[H.sub.2]C[H.sub.2]OH
X14              CC[l.sub.3]             C[H.sub.2]C[H.sub.2]OH
X15     [([C.sub.6][H.sub.10]).sub.2]    C[H.sub.2]C[H.sub.2]OH

Comp.   [E.sub.LUMO]     MW      MV     Density   PIC50 MLR

X1         -2.277      237.26   147.4     1.6       1.52
X2         -2.421      253.26   152.6    1.65       1.48
X3         -2.555      281.27    166     1.69       1.36
X4         -2.901      262.27   168.2    1.55       0.85
X5         -2.238      267.28   168.7    1.58       1.42
X6         -2.741      285.73   180.7    1.58         1
X7         -3.01       354.62   204.7    1.73       0.71
X8         -2.096      281.31   182.9    1.53       1.38
X9         -2.235      297.31   188.1    1.57       1.29
X10        -2.266      325.32   201.5    1.61       1.25
X11        -2.365      306.32   203.7     1.5        1.2
X12        -2.059      311.34   204.2    1.52       1.33
X13        -2.554      329.78   216.2    1.52       0.93
X14        -2.817      398.67   240.2    1.65       0.53
X15        -2.065      445.6    332.6    1.33       1.52

Comp.   PIC50 MNLR   PIC50 PLS

X1         1.44        1.58
X2         1.3         1.51
X3         1.14        1.38
X4         1.17        0.93
X5         1.23        1.49
X6         1.05        1.05
X7         1.26        0.75
X8         1.23        1.47
X9         1.06        1.35
X10        1.04        1.29
X11        1.01        1.19
X12        1.19        1.39
X13        0.94        0.99
X14        1.23        0.59
X15        1.91        1.42

TABLE 7: Violations of Lipinski rule.

Comp.   log (P)   DH   AH     MW      Number of violations

1        4,18     1    3    254,67             0
2        4,30     2    4    291,35             0
3        3,98     2    4    277,33             0
4        3,53     3    5    321,38             0
5        5,10     2    4    339,40             1
6        4,98     2    4    317,39             0
7        4,42     2    4    303,37             0
8        5,59     2    4    331,42             1
9        3,28     3    5    293,33             0
10       4,54     2    4    291,35             0
11       3,22     2    4    249,27             0
12       4,94     2    4    325,37             0
13       6,35     2    4    345,45             1
14       5,00     2    5    369,42             0
15       5,00     2    5    369,42             0
16       4,85     2    6    383,41             0
17       3,24     1    4    298,73             0
18       3,36     2    5    335,41             0
19       3,04     2    5    321,38             0
20       2,59     3    6    365,43             0
21       4,16     2    5    383,45             0
22       4,04     2    5    361,45             0
23       3,48     2    5    347,42             0
24       4,65     2    5    375,47             0
25       2,34     3    6    337,38             0
26       3,60     2    5    335,41             0
27       2,28     2    5    293,33             0
28       4,00     2    5    369,42             0
29       5,41     2    5    389,50             1
30       4,06     2    6    413,48             0
31       3,90     2    6    399,45             0
32       3,75     2    7    413,43             0
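
The violation counts in Table 7 can be checked with a few lines of code. The sketch below applies the standard Lipinski rule-of-five thresholds (log P ≤ 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors, molecular weight ≤ 500). It assumes that DH and AH denote donor and acceptor counts and that the last numeric column of Table 7 is the molecular weight, as in Table 2. The `Compound` type and `lipinski_violations` function are hypothetical names, and only a few rows are shown.

```python
from typing import NamedTuple


class Compound(NamedTuple):
    log_p: float
    h_donors: int      # DH column (assumed: hydrogen-bond donors)
    h_acceptors: int   # AH column (assumed: hydrogen-bond acceptors)
    mol_weight: float  # MW column


def lipinski_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five limits are exceeded."""
    return sum([
        c.log_p > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
        c.mol_weight > 500,
    ])


# A few rows copied from Table 7 (decimal commas written as points)
table7 = {
    1: Compound(4.18, 1, 3, 254.67),
    5: Compound(5.10, 2, 4, 339.40),
    13: Compound(6.35, 2, 4, 345.45),
    14: Compound(5.00, 2, 5, 369.42),
    29: Compound(5.41, 2, 5, 389.50),
}

violations = {comp: lipinski_violations(c) for comp, c in table7.items()}
for comp, count in violations.items():
    print(f"compound {comp}: {count} violation(s)")
```

For these rows the only limit exceeded is log P, which matches the single violations reported for compounds 5, 13, and 29.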
COPYRIGHT 2018 Hindawi Limited
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2018 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation: Research Article
Author: Hadaji, El Ghalia; Ouammou, Abdelkarim; Bouachrine, Mohammed
Publication: Advances in Chemistry
Article Type: Report
Date: Jan 1, 2018
Words: 8179