Prediction by Nonparametric Posterior Estimation in Virtual Screening
The ability to rank molecules according to their effectiveness in some domain (e.g. as a pesticide or a drug) is important owing to the cost of synthesising and testing chemical compounds. Virtual screening seeks to do this computationally, with potential savings of millions of pounds and large profits associated with reduced time to market. Binary kernel discrimination (BKD) was recently introduced and is becoming popular in the chemoinformatics domain. It produces scores based on the estimated likelihood ratio of active to inactive compounds, and the compounds are then ranked by these scores. The likelihoods are estimated through a Parzen windows approach using the binomial distribution function (to accommodate binary descriptor or "fingerprint" vectors representing the presence, or not, of certain sub-structural arrangements of atoms) in place of the usual Gaussian choice. This research aims to compute the likelihood ratio via a direct estimate of the posterior probability, using a non-parametric generalisation of logistic regression, the so-called "Kernel Logistic Regression". Complexity is controlled by penalising the likelihood function with an Lq-norm. The compounds are then ranked in descending order of posterior probability. The 11 activity classes from the MDL Drug Data Report (MDDR) database are used. The results are found to be less accurate than a currently leading approach but are still comparable in a number of cases.
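The two scoring schemes described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the poster's implementation: the kernel form `lam**(n - d) * (1 - lam)**d` (with `d` the Hamming distance between fingerprints of length `n`), the function names `binomial_kernel`, `bkd_scores`, `fit_klr` and `klr_posterior`, the parameters `lam`, `gamma`, `q`, `lr` and `steps`, the choice to apply the Lq penalty to the kernel expansion coefficients, the use of plain gradient descent, and the toy 4-bit fingerprints.

```python
import numpy as np

def binomial_kernel(X, Y, lam=0.9):
    """Assumed kernel: K[i, j] = lam**(n - d_ij) * (1 - lam)**d_ij,
    where d_ij is the Hamming distance between X[i] and Y[j]."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[1]
    # For 0/1 vectors, the Hamming distance counts bits set in exactly one vector.
    d = X @ (1.0 - Y).T + (1.0 - X) @ Y.T
    return lam ** (n - d) * (1.0 - lam) ** d

def bkd_scores(train_X, train_y, test_X, lam=0.9):
    """BKD-style score: ratio of Parzen-window likelihood estimates
    (mean kernel value to actives over mean kernel value to inactives)."""
    train_y = np.asarray(train_y)
    K = binomial_kernel(test_X, train_X, lam)   # shape (n_test, n_train)
    p_active = K[:, train_y == 1].mean(axis=1)
    p_inactive = K[:, train_y == 0].mean(axis=1)
    return p_active / p_inactive

def fit_klr(K, y, q=2.0, gamma=0.01, lr=0.5, steps=2000):
    """Kernel logistic regression by gradient descent (an assumed optimiser).
    Minimises mean negative log-likelihood + gamma * sum(|alpha|**q), q >= 1."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    alpha = np.zeros(n)
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(K @ alpha + b)))
        err = p - y
        # (Sub)gradient of the Lq penalty on the coefficients.
        penalty_grad = gamma * q * np.abs(alpha) ** (q - 1.0) * np.sign(alpha)
        alpha -= lr * (K.T @ err / n + penalty_grad)
        b -= lr * err.mean()
    return alpha, b

def klr_posterior(train_X, test_X, alpha, b, lam=0.9):
    """Estimated posterior probability of activity for each test compound."""
    K_test = binomial_kernel(test_X, train_X, lam)
    return 1.0 / (1.0 + np.exp(-(K_test @ alpha + b)))

# Toy data (hypothetical): two actives, two inactives, 4-bit fingerprints.
train_X = np.array([[1, 1, 0, 0],
                    [1, 1, 1, 0],
                    [0, 0, 1, 1],
                    [0, 0, 0, 1]])
train_y = np.array([1, 1, 0, 0])
test_X = np.array([[1, 1, 0, 0],
                   [0, 0, 1, 1]])

scores = bkd_scores(train_X, train_y, test_X)

alpha, b = fit_klr(binomial_kernel(train_X, train_X), train_y)
posterior = klr_posterior(train_X, test_X, alpha, b)

# Rank test compounds in descending order of estimated posterior probability.
ranking = np.argsort(-posterior)
```

On this toy set both schemes place the first test compound, which resembles the actives, ahead of the second. The sketch is not numerically hardened: with fingerprints of realistic length the kernel values can underflow, so a practical version would likely work with log-kernel values.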
Pasupa, Kitsuchart
952ededb-8c97-41b7-a65b-6aba31de2669
27 March 2007
Pasupa, Kitsuchart (2007) Prediction by Nonparametric Posterior Estimation in Virtual Screening. The 2007 University of Sheffield Symposium On Data Modelling for New Researchers, Sheffield, United Kingdom.
Record type: Conference or Workshop Item (Poster)
Text: DMNR2007_Poster.pdf - Version of Record
More information
Published date: 27 March 2007
Additional Information:
Event Dates: 27 March 2007
Venue - Dates:
The 2007 University of Sheffield Symposium On Data Modelling for New Researchers, Sheffield, United Kingdom, 2007-03-27
Organisations:
Electronics & Computer Science
Identifiers
Local EPrints ID: 266587
URI: http://eprints.soton.ac.uk/id/eprint/266587
PURE UUID: 20efc371-cd4c-47a8-9d77-a84bf4c215e2
Catalogue record
Date deposited: 20 Aug 2008 16:41
Last modified: 14 Mar 2024 08:30
Contributors
Author: Kitsuchart Pasupa