Remote homology detection using a kernel method that combines sequence and secondary-structure similarity scores
Wieser, Daniela and Niranjan, Mahesan (2009) Remote homology detection using a kernel method that combines sequence and secondary-structure similarity scores. In Silico Biology, 9.
Abstract
Distant evolutionary relationships between proteins with low sequence similarity are difficult to recognise by computational methods. Consequently, many sequences obtained from large-scale sequencing projects cannot be assigned to any known protein or family despite being evolutionarily related to one. To boost sensitivity, various sequence-based methods have been modified to make use of secondary structure, which is better conserved. Most of these methods are instance-based or generative. Here, we introduce a kernel-based remote homology detection method that combines sequence and secondary-structure similarity scores in a discriminative approach. We studied the ability of the method to predict superfamily membership as defined by the SCOP database. A kernel method that combined sequence similarity scores with predicted secondary-structure similarity scores performed similarly to a classifier that used scores calculated from sequences and true secondary structures. It performed better than a sequence-only classifier and achieved a better mean than recently published results on the same dataset. We conclude that SVM classifiers trained to predict homology between distantly related proteins become more accurate when a joint sequence/secondary-structure similarity score approach is used.
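The abstract does not spell out how the two kinds of similarity score are combined. As a rough illustration only, and not the authors' implementation, one common way to build a joint kernel is to represent each protein by its vector of similarity scores against a reference set, concatenate the sequence-based and secondary-structure-based vectors, and take dot products. The function names, the `weight` parameter, and the toy score values below are all hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def joint_features(seq_scores: Sequence[float],
                   ss_scores: Sequence[float],
                   weight: float = 0.5) -> Vector:
    """Concatenate weighted sequence and secondary-structure score vectors.

    `weight` is an assumed mixing parameter, not something taken from the paper.
    """
    return ([weight * s for s in seq_scores]
            + [(1.0 - weight) * s for s in ss_scores])


def linear_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain dot product between two feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def gram_matrix(features: Sequence[Vector]) -> List[List[float]]:
    """Pairwise kernel values for all proteins."""
    return [[linear_kernel(x, y) for y in features] for x in features]


# Toy, made-up similarity scores: row i holds the scores of protein i
# against three reference proteins.
seq_scores = [[10.0, 2.0, 1.0], [2.0, 9.0, 1.5], [1.0, 1.5, 8.0]]
ss_scores = [[5.0, 4.0, 0.5], [4.0, 5.0, 0.5], [0.5, 0.5, 6.0]]

features = [joint_features(s, t) for s, t in zip(seq_scores, ss_scores)]
K = gram_matrix(features)
```

Because the vectors are concatenated, each joint kernel value is a weighted sum of a sequence-only kernel and a secondary-structure-only kernel. A matrix like `K` could then be passed to any SVM implementation that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`.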
More information
Published date: 2009
Organisations:
Southampton Wireless Group
Identifiers
Local EPrints ID: 268189
URI: http://eprints.soton.ac.uk/id/eprint/268189
PURE UUID: 4ea32249-2842-47d7-835c-9a4c80cac48d
Catalogue record
Date deposited: 11 Nov 2009 14:32
Last modified: 08 Jan 2022 03:06
Contributors
Author: Daniela Wieser
Author: Mahesan Niranjan