TY - JOUR
T1 - A novel Automated Lazy Learning QSAR (ALL-QSAR) approach
T2 - Method development, applications, and virtual screening of chemical databases using validated ALL-QSAR models
AU - Zhang, Shuxing
AU - Golbraikh, Alexander
AU - Oloff, Scott
AU - Kohn, Harold
AU - Tropsha, Alexander
PY - 2006
Y1 - 2006
AB - A novel automated lazy learning quantitative structure-activity relationship (ALL-QSAR) modeling approach has been developed on the basis of the lazy learning theory. The activity of a test compound is predicted from a locally weighted linear regression model using chemical descriptors and the biological activity of the training set compounds most chemically similar to this test compound. The weights with which training set compounds are included in the regression depend on the similarity of those compounds to the test compound. We have applied the ALL-QSAR method to several experimental chemical data sets including 48 anticonvulsant agents with known ED50 values, 48 dopamine D1-receptor antagonists with known competitive binding affinities (Ki), and a Tetrahymena pyriformis data set containing 250 phenolic compounds with toxicity IGC50 values. When applied to database screening, models developed for anticonvulsant agents identified several known anticonvulsant compounds that were not only absent from the training set but also highly chemically dissimilar to the training set compounds. This initial success indicates that ALL-QSAR can be further exploited as a general tool for accurate bioactivity prediction and database screening in drug design and discovery. Because of its local nature, the ALL-QSAR approach appears to be especially well-suited for the development of highly predictive models for sparse or unevenly distributed data sets.
UR - http://www.scopus.com/inward/record.url?scp=33750321978&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33750321978&partnerID=8YFLogxK
U2 - 10.1021/ci060132x
DO - 10.1021/ci060132x
M3 - Article
C2 - 16995729
AN - SCOPUS:33750321978
SN - 1549-9596
VL - 46
SP - 1984
EP - 1995
JO - Journal of Chemical Information and Modeling
JF - Journal of Chemical Information and Modeling
IS - 5
ER -