Using a combination of theoretical descriptors and a neural network to predict the activity of a set of N-alkyl-N-acyl-α-aminoamide derivatives


Journal of Chemistry, Vol. 38, No. 3, P. 91 - 96, 2000

Received 21-02-2000

Pham Van Tat, Pham Nu Ngoc Han
Department of Chemistry, University of Dalat

Summary

A back-propagation artificial neural net has been trained to estimate the activity values of a set of 18 N-alkyl-N-acyl-α-aminoamide derivatives from the results of molecular mechanics and RHF/PM3/SCF MO semi-empirical calculations. The input descriptors include molecular properties such as the partition coefficient P, 3D structure dependent parameters, charge dependent parameters, and topological descriptors.

I - Introduction

In general, the skeleton of N-alkyl-N-acyl-α-aminoamide derivatives is given in figure 1 [1].

Quantitative structure-activity relationships (QSAR) have been used extensively to correlate molecular structure features of compounds with their biological, chemical, and physical properties. The advantage of QSAR is that it provides a quantitative connection between the microscopic (molecular structure) and the macroscopic (empirical) properties, particularly the biological activity, of a molecule. Furthermore, this connection can be used to predict the empirical properties of a compound from its molecular structure.

The N-alkyl-N-acyl-α-aminoamides were synthesized and screened against several protein tyrosine phosphatases (PTPases), and several classes of potent and selective inhibitors of various PTPases were found. The compounds of the general formula in figure 1 bearing a cinnamate group were shown to exhibit low micromolar inhibitory activity against HePTP, which is a phosphatase specific to hematopoietic cells and implicated in acute leukemia.

Figure 1: The skeleton of N-alkyl-N-acyl-α-aminoamide. A) general structure, B) the R1 group is replaced by a cinnamic group (atoms numbered 1 to 16)
The electrostatic interaction, the bulk or steric effect, and the transfer property (transferability) of the molecules are considered as microscopic properties. The theoretical descriptors used here are shown in table 1, and the atom numbering needed to indicate the net charges is given in figure 1.

Table 1: Theoretical descriptors used in the calculations [4]

3D structure dependent parameters
  Volume        Van der Waals volume
  Mol. Weight   Molecular weight of a molecule
  Polar         Molecular polarizability
  Sp. Polar     Specific polarizability of a molecule
  LogP          The logarithm of the partition coefficient P

Charge dependent parameters
  Dipole        Dipole moment of the molecule
  MaxQpos       Largest positive charge over the atoms
  MaxQneg       Largest negative charge over the atoms
  ABSQ          Sum of absolute values of the charges on each atom
  ABSQon        Sum of absolute values of the charges on the nitrogen and oxygen atoms in the molecule

Net charge of the atoms
  C1            Net charge of atom 1
  C2            Net charge of atom 2
  C3            Net charge of atom 3
  C4            Net charge of atom 4
  C5            Net charge of atom 5
  C6            Net charge of atom 6
  C7            Net charge of atom 7
  O8            Net charge of atom 8
  N9            Net charge of atom 9
  C10           Net charge of atom 10
  C11           Net charge of atom 11
  O12           Net charge of atom 12
  N13           Net charge of atom 13
  C14           Net charge of atom 14
  C15           Net charge of atom 15
  C16           Net charge of atom 16
  HOMO          Highest occupied MO
  LUMO          Lowest unoccupied MO

Topological descriptors (description and calculation)
  X1            First-order molecular connectivity index computed over all single bonds of a hydrogen-suppressed graph of the molecule (no hydrogen atoms present). It has been found to be one of the most useful 2-D descriptors when computing a QSAR/QSPR expression. The quantitative form of X1 is $X1 = \sum (1/(\delta_i \delta_j))^{0.5}$, where the sum is over all ij bonds connecting atom i to atom j, and $\delta_i$ is the number of skeletal neighbors of atom i: $\delta_i = \sigma_i - h$, where $\sigma_i$ is the number of electrons of atom i in sigma orbitals and h is the number of hydrogen atoms bonded to skeletal atom i.
  X3            Third-order molecular connectivity index.
  VX1           First-order valence connectivity index over all bonds for the entire molecule. $VX1 = \sum (1/(\delta_i^v \delta_j^v))^{0.5}$, where $\delta^v$ is defined under VX0. $VX1 = {}^1\chi^v$ in the Kier and Hall notation.
  Ka3           (Kappa Alpha3) Third-order shape index for molecules. It encodes the identity of the atoms involved in assessing the shape of a molecule and can therefore discern isomers of the same molecule.
                $Ka3 = (A + \alpha - 1)(A + \alpha - 3)^2 / (P_i + \alpha)$ for A odd,
                $Ka3 = (A + \alpha - 2)(A + \alpha - 3)^2 / (P_i + \alpha)$ for A even,
                where A is the number of atoms in the molecule and $\alpha = \sum_i ((R_i / R_{Csp3}) - 1)$, with the summation over all atoms in the molecule; $R_i$ and $R_{Csp3}$ are the radii of the i-th atom and of an sp3 carbon atom.
  WienI         Wiener Index, a topological parameter W as formulated by H. Wiener. The Wiener Index is based on the graph of the molecule (skeletal system without hydrogen). The path number W is defined as the sum of the distances between any two carbon atoms in the molecule, in terms of carbon-carbon bonds. A brief method of calculation is as follows: multiply the number of carbon atoms on one side of any bond by the number on the other side; W is the sum of these values for all bonds. The Wiener Index is generally higher for larger molecules and provides some measure of the branching of the molecule. In particular, it is larger for extended molecules and smaller for more compact ones. It correlates with ovality and volume and, in some cases, can be used in place of one or both of these molecular descriptors. It is a unitless parameter.
  VX0           Zero-order valence connectivity index computed over all atoms in the entire molecule. $VX0 = \sum_i (1/\delta_i^v)^{0.5}$, where the summation is over all atoms in the molecule and $\delta_i^v = (Z^v - h)/(Z - Z^v - 1)$, with $Z^v$ the number of valence electrons in skeletal atom i, Z the atomic number, and h the number of hydrogen atoms bonded to atom i.
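The definitions of X1 and WienI in table 1 can be illustrated with a minimal Python sketch. It is not the SciQSAR implementation: the function names and the two example bond lists are hypothetical, every bond in the list is treated as one edge of the hydrogen-suppressed graph, and $\delta_i$ is taken as the number of skeletal neighbors of atom i. The Wiener index is computed as the sum of shortest-path distances over all atom pairs, which for acyclic skeletons gives the same value as the bond-by-bond product rule quoted in the table.

```python
from collections import deque
from math import sqrt


def connectivity_index_x1(bonds):
    """First-order connectivity index: sum over bonds of (delta_i * delta_j)^-0.5.

    `bonds` is a list of (i, j) atom-label pairs of the hydrogen-suppressed graph.
    delta_i is taken as the number of skeletal neighbours of atom i (assumption).
    """
    degree = {}
    for i, j in bonds:
        degree[i] = degree.get(i, 0) + 1
        degree[j] = degree.get(j, 0) + 1
    return sum(1.0 / sqrt(degree[i] * degree[j]) for i, j in bonds)


def wiener_index(bonds):
    """Wiener index: sum of shortest-path distances over all pairs of skeletal atoms."""
    neighbours = {}
    for i, j in bonds:
        neighbours.setdefault(i, []).append(j)
        neighbours.setdefault(j, []).append(i)
    total = 0
    for start in neighbours:
        # Breadth-first search gives the bond-count distance from `start`.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            atom = queue.popleft()
            for nxt in neighbours[atom]:
                if nxt not in dist:
                    dist[nxt] = dist[atom] + 1
                    queue.append(nxt)
        total += sum(dist.values())
    return total // 2  # each pair was counted from both ends


# Hypothetical example skeletons (carbon atoms only, hydrogens suppressed).
N_BUTANE = [(1, 2), (2, 3), (3, 4)]
ISOBUTANE = [(1, 2), (1, 3), (1, 4)]

if __name__ == "__main__":
    for name, bonds in (("n-butane", N_BUTANE), ("isobutane", ISOBUTANE)):
        print(name, round(connectivity_index_x1(bonds), 4), wiener_index(bonds))
```

The two skeletons show the trend stated in the table: the extended chain has the larger Wiener index, the branched, more compact isomer the smaller one.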
In this work, we carried out molecular mechanics and RHF/PM3/SCF MO semi-empirical calculations from which the molecular properties were evaluated, and the results were analysed by multiple linear regression and a neural network.

II - Computational method

1. The data and related software

The 18 N-alkyl-N-acyl-α-aminoamides and their activity values IC50 are taken from [1] and shown in table 2. The structures were optimized using molecular mechanics and RHF/PM3/SCF MO semi-empirical quantum chemical approaches with the help of the programs HyperChem 5.11 [6], Gaussian 98 [7], Alchemy 2000, SciQSAR 3.0 [4], the statistical program Essential Regression 2.218 (3/1999), which is a compiled MS Excel Macro (Add-in) [8], and the NeuroSolution 3.0 program [2]. All calculations were carried out on a Pentium II 350 MHz computer with 128 MB RAM at the Faculty of Chemistry, University of Dalat.

2. Multivariate linear regression analysis

Table 2: Inhibitory activity of substituted N-alkyl-N-acyl-α-aminoamide molecules [1]

No   R2         R3            R4    IC50, µM
1    n-hexyl    n-butyl       -H    9.0
2    n-hexyl    tert-butyl    -H    7.5
3    n-hexyl    Cyclohexyl    -H    9.0
4    n-hexyl    Benzyl        -H    6.0
5    n-hexyl    -CH2COOH      -H    7.5
6    n-hexyl    -CH2CO2Me     -H    7.2
7    n-hexyl    -CH2CO2Et     -H    10
8    Phenyl     n-butyl       -H    6.7
9    Phenyl     tert-butyl    -H    4.0
10   Phenyl     Cyclohexyl    -H    6.20
11   Phenyl     Benzyl        -H    3.90
12   Phenyl     -CH2COOH      -H    10.4
13   Phenyl     -CH2CO2Me     -H    20.2
14   Phenyl     -CH2CO2Et     -H    9.60
15   Methyl     Benzyl        -H    7.20
16   Ethyl      Benzyl        -H    15.0
17   n-propyl   Benzyl        -H    6.10
18   n-butyl    Benzyl        -H    6.30

The regression equation used here is as follows [3, 4]:

$A = \sum_i P_i X_i + C$    (1)

where $X_i$ is the i-th independent descriptor, $P_i$ is the fitting parameter for that descriptor, A is the biological activity of the drug, and C is a constant.

3. Neural network

A neural net is a tool that can be used to predict the value of a parameter. It is a computational system made up of a number of simple, yet highly connected processing elements called nodes, which process information by their dynamic state response to external inputs. A recent article describing the use of neural nets to correlate physical properties of compounds is that of Soman et al. [2, 5].

III - Results and discussion

1. Multivariate linear regression analysis

Multivariate linear regression analysis is a fast method to identify the calculated properties that are important for the prediction of experimental quantities. The magnitude of the fitting parameter Pi indicates the contribution of the descriptor to the activity: the larger the magnitude of Pi, the more important the descriptor is to the activity. For a detailed examination of the data, the descriptors were selected in the linear regression analysis by the leave-one-out method on the basis of the change in multiple R. The principal descriptors are the net charges of atoms located in the benzene ring, i.e. C2, C3, C5, C6, and at other sites, i.e. O12, O8, C10, C11. The descriptors which represent net charges are thus the principal ones for describing the activity of N-alkyl-N-acyl-α-aminoamides. Besides, there are also topological descriptors, 3D structure dependent parameters and charge dependent parameters, i.e. X1, VX0, VX1, logP, WienI, Polar, ABSQ, ABSQneg. The most important parameters seem to be X1, VX0, VX1, WienI, Polar and ABSQneg. This means that for N-alkyl-N-acyl-α-aminoamides, the electrostatic interaction, the steric effect and the transferability are important in determining the activity. These properties should therefore also be useful as descriptors for a neural net.

All investigations were performed with the program Essential Regression 2.218 (3/1999), which is a compiled MS Excel Macro (Add-in). The parameters multiple R, R2, Standard Error, PRESS, Significance F and the t-values were used to select the best regression model. The best regression model has 10 significant variables, listed in table 3:

$DivIC50 = 1/IC50 = -4.1674 + 0.4393\,X1 - 0.3211\,VX0 - 0.4123\,VX1 + 0.0254\,Volume - 0.0011\,WienI + 0.0056\,MolWeight - 0.0129\,Dipole - 0.3144\,ABSQ + 0.4165\,O8 - 1.48\,O12$    (2)

Table 3: The regression statistics, Pi-values and t-values of the best regression model

No   Regression statistics       Parameter   Pi        t-value   Parameter     Pi        t-value
1    Multiple R       0.9900     PX1          0.4393    13.75    Pmol.Weight    0.0056     8.901
2    R Square (R2)    0.9801     PVX0        -0.3211    -7.371   PDipole       -0.0129    -4.010
3    Standard Error   0.0122     PVX1        -0.4123   -12.47    PABSQ         -0.3144   -11.38
4    PRESS            0.0101     Pvolume      0.0254    10.47    PO8            0.4165     4.007
5    Significance F   0.00005    PWienI      -0.0011   -13.94    PO12          -1.4800    -5.419

The descriptors found in equation (2) were used for the back-propagation neural net. Nine N-alkyl-N-acyl-α-aminoamides were taken from an SDF file of the Cambridge databases. The DivIC50 values of these 9 derivatives predicted by multiple linear regression are given in table 4 and the regression plot in figure 2.

2. Results of the back-propagation neural network

The architecture of the neural net comprises an input layer of 10 descriptors, equal to the number of variables in equation (2), one hidden layer with 20 nodes, and an output layer with 1 node (the DivIC50 value). We trained the neural net with the set of 18 N-alkyl-N-acyl-α-aminoamides under the following training conditions: a momentum of 0.7, the TanhAxon transfer function, and a maximum of 2000 epochs. NeuroSolution 3.0 was used in this work.
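A 10 x 20 x 1 back-propagation network of the kind described above can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not a reproduction of the NeuroSolution 3.0 run: the learning rate, the weight initialisation, the linear output node and the mean-squared-error loss are assumptions, the function names are hypothetical, and the demonstration data are random placeholders rather than the descriptor values of the paper. Only the layer sizes, the tanh hidden transfer function, the momentum of 0.7 and the 2000 epochs come from the text.

```python
import numpy as np


def train_mlp(X, y, n_hidden=20, epochs=2000, lr=0.01, momentum=0.7, seed=0):
    """Train a one-hidden-layer tanh network by batch back-propagation with momentum.

    X has shape (n_samples, n_inputs); y has shape (n_samples, 1).
    lr, the initial weight scale and the linear output are assumptions.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, 1))
    b2 = np.zeros(1)
    params = [W1, b1, W2, b2]
    velocity = [np.zeros_like(p) for p in params]
    losses = []
    for _ in range(epochs):
        # Forward pass: tanh hidden layer, linear output node.
        H = np.tanh(X @ W1 + b1)
        out = H @ W2 + b2
        err = out - y
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared error.
        g_out = 2.0 * err / len(X)
        g_W2 = H.T @ g_out
        g_b2 = g_out.sum(axis=0)
        g_H = (g_out @ W2.T) * (1.0 - H ** 2)
        g_W1 = X.T @ g_H
        g_b1 = g_H.sum(axis=0)
        # Momentum update, applied in place so W1, b1, W2, b2 stay current.
        for p, v, g in zip(params, velocity, (g_W1, g_b1, g_W2, g_b2)):
            v *= momentum
            v -= lr * g
            p += v
    return params, losses


def predict(params, X):
    """Evaluate the trained network on descriptor rows X."""
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2


if __name__ == "__main__":
    # Placeholder data: 18 "molecules" x 10 standardised "descriptors" (synthetic).
    rng = np.random.default_rng(1)
    X_demo = rng.normal(size=(18, 10))
    y_demo = 0.05 + 0.02 * np.tanh(X_demo[:, :1])
    params, losses = train_mlp(X_demo, y_demo)
    print("first and last training loss:", losses[0], losses[-1])
```

With only 18 training molecules and 20 hidden nodes, such a network has far more weights than data points, so an external set of compounds that the network has never seen is the meaningful check of predictive power.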
Table 4: The DivIC50 values predicted by multiple linear regression and by a 10 x 20 x 1 neural net

No   R2        R3           R4    R5        DivIC50 (exp.)   Predicted DivIC50 (multiple linear regression)   Predicted DivIC50 (neural net)
1    n-hexyl   Benzyl       -H    3-Br      0.0507           0.05248                                          0.04971
2    n-hexyl   Benzyl       -H    3-Cl      0.0506           0.04768                                          0.05364
3    n-hexyl   Benzyl       -H    3-F       0.0519           0.05724                                          0.05082
4    n-hexyl   Benzyl       -H    3-OCH3    0.0615           0.06011                                          0.06301
5    n-hexyl   Benzyl       -H    3-OH      0.1104           0.10134                                          0.10115
6    n-hexyl   Benzyl       -H    3-NH3     0.0551           0.05636                                          0.05157
7    n-hexyl   tert-butyl   -H    3-Br      0.0535           0.05666                                          0.05102
8    n-hexyl   Cyclohexyl   -H    3-Cl      0.0562           0.05824                                          0.05258
9    n-hexyl   -CH2COOH     -H    3-F       0.0546           0.04765                                          0.05052

We used the 10 x 20 x 1 neural net to predict the DivIC50 values of the 9 N-alkyl-N-acyl-α-aminoamide derivatives in table 4, on which the neural net was not trained. The correlation is illustrated in figure 3. The correlation coefficient R2 is 0.97866 with a standard deviation of 0.00259. These initial investigations are thus very promising for predicting the inhibitory activity of new drugs.

Figure 2: The activity values predicted by multiple linear regression analysis

Figure 3: The activity values predicted by a 10 x 20 x 1 neural net

Conclusion

We have used molecular mechanics and RHF/PM3/SCF MO semi-empirical calculations to evaluate the molecular properties, and combined a multivariate linear regression analysis with a back-propagation neural net to predict the DivIC50 values of 9 N-alkyl-N-acyl-α-aminoamide derivatives. The approach is a promising technique that is applicable to the development of new drugs. The predictions of our neural network show very good agreement with the experimental values when the neural net is trained with a maximum of 2000 epochs. The danger of overtraining the neural net was checked with the standard deviation. The correlation coefficients and standard deviations are appropriate.

References

1. X. Cao, E. J. Moran, D. Siev, A. Lio, C. Ohashi, A. M. M. Mjalli. Bioorg. Med. Chem. Lett. 24, 2953-2958 (1995).
2. A. G. Soman, J. A. Darsey, D. W. Noid and B. G. Sumpter. Chimicaoggi/Chemistry Today, March (1995).
3. N. R. Draper, H. Smith. Applied regression analysis, 2nd Edition, John Wiley & Sons, New York (1998).
4. Scivision. SciQSAR 3.0 User's Guide, Burlington USA, Copyright (1999).
5. Scivision. SciLogP 3.0 User's Guide, Burlington USA, Copyright (1999).
6. Hypercube, Inc. Hyperchem Release 5.1 for Windows, October (1996).
7. J. Michael Frisch. Gaussian 98 User's Reference, Gaussian, Inc, 1994-1998.
8. D. David Steppan, Joachim Werner, P. Robert Yeater. Essential Regression and Experimental Design for Chemists and Engineers, Copyright, June (1998).