Using varying negative examples to improve computational predictions of transcription factor binding sites
234-243
Springer Berlin, Heidelberg
Rezwan, Faisal
203f8f38-1f5d-485b-ab11-c546b4276338
Sun, Yi
52b4df91-6eec-4c04-8106-7cd195f1d0a6
Davey, Neil
45038a2a-60fa-475b-be2b-72b23c97bb0c
Adams, Rod
aba52023-234f-464a-b86f-504b200dc950
Rust, Alistair G.
27e6975d-abef-4037-a8ff-74b2a18cb687
Robinson, Mark
0191ef40-12cc-4b4d-9bcd-5547087add95
2012
Rezwan, Faisal, Sun, Yi, Davey, Neil, Adams, Rod, Rust, Alistair G. and Robinson, Mark (2012) Using varying negative examples to improve computational predictions of transcription factor binding sites. In Engineering Applications of Neural Networks - 13th International Conference, EANN 2012, Proceedings. vol. 311, Springer Berlin, Heidelberg. pp. 234-243. (doi:10.1007/978-3-642-32909-8_24).
Record type: Conference or Workshop Item (Paper)
Abstract
Identifying transcription factor binding sites (TFBSs) is a non-trivial problem because existing computational predictors produce many false predictions. Although combining these predictions with a meta-classifier, such as a Support Vector Machine (SVM), has been shown to improve the overall results, the improvement is not as significant as expected. The reason is that the predictors are unreliable for negative examples drawn from non-binding sites in the promoter region. Using negative examples from different sources when training an SVM is therefore one possible solution to this problem. In this study, we trained the classifier with different types of negative examples: examples taken from regions far away from the promoter, examples produced by randomisation, and examples taken from the intronic regions of genes. We observed the effect of these negative examples on the prediction of TFBSs in yeast, and we also used a cross-validation method modified for this type of problem. We observed a substantial improvement in classifier performance, which could constitute a model for predicting TFBSs. The major contribution of the analysis is that, for the yeast genome, the positions of binding sites can be predicted with high confidence using our technique, and the predictions are of much higher quality than those of the original prediction algorithms.
This record has no associated files available for download.
More information
Published date: 2012
Venue - Dates: 2012 International Conference on Artificial Intelligence and Computational Intelligence, AICI 2012, Chengdu, China, 2012-10-26 - 2012-10-28
Identifiers
Local EPrints ID: 413747
URI: http://eprints.soton.ac.uk/id/eprint/413747
ISSN: 1865-0929
PURE UUID: 0c1114be-307d-441f-aac4-29fd91628eb8
Catalogue record
Date deposited: 04 Sep 2017 16:30
Last modified: 06 Jun 2024 01:51
Contributors
Author: Faisal Rezwan
Author: Yi Sun
Author: Neil Davey
Author: Rod Adams
Author: Alistair G. Rust
Author: Mark Robinson