University of Southampton Institutional Repository

Enhancing Automatic Construction of Gene Subnetworks by Integrating Multiple Sources of Information

Suwannaroj, Sujimarn and Niranjan, Mahesan (2008) Enhancing Automatic Construction of Gene Subnetworks by Integrating Multiple Sources of Information. Journal of Signal Processing Systems, 50 (3), 331-340.

Record type: Article

Abstract

We present an approach to extracting information from textual documents of biological knowledge and demonstrate how cellular gene pathways may be inferred from it. Natural language processing techniques are used to represent the title and abstract fields of publications and to derive gene similarity vectors, which are then subjected to cluster analysis. Gene interactions are derived by parsing sentences in the abstracts to infer causal relationships. We show how high-throughput transcriptome data may then be used to enhance the construction of gene pathways from information derived from text. Subnetworks constructed by integrating information automatically derived from the literature with gene expression data are validated by comparison with biological processes defined in the Gene Ontology (GO) database. We find that precision increases in 58% of the clusters when enhanced in this manner, while a decrease in precision is observed in a relatively small number of clusters. These results are compared with similar attempts at the same problem and appear to be better in terms of precision of network construction. We also show an example of a subnetwork found by this analysis that overlaps a known gene pathway in the KEGG and MIPS databases.
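
As a rough illustration of two steps the abstract mentions (text-derived gene similarity vectors, and validating clusters against GO biological processes), the following minimal Python sketch may help. It is not the authors' implementation: the TF-IDF weighting, the cosine similarity, the pairwise definition of cluster precision, and all gene names and process labels (`geneA`, `P1`, and so on) are assumptions made purely for illustration. Sentence parsing and the integration of expression data are not covered.

```python
import itertools
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on runs of non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def gene_vectors(docs_by_gene):
    """Build one TF-IDF vector (term -> weight) per gene from its pooled texts.

    Assumption: each gene is described by the titles/abstracts that mention it.
    """
    counts = {
        gene: Counter(tok for doc in docs for tok in tokenize(doc))
        for gene, docs in docs_by_gene.items()
    }
    n_genes = len(counts)
    # Document frequency: number of genes whose text contains the term.
    df = Counter(term for c in counts.values() for term in c)
    return {
        gene: {t: tf * math.log(n_genes / df[t]) for t, tf in c.items()}
        for gene, c in counts.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_precision(cluster, go_processes):
    """Fraction of gene pairs in a cluster sharing at least one GO process.

    Assumption: this pairwise definition is illustrative; the paper's exact
    precision measure is not stated in the abstract.
    """
    pairs = list(itertools.combinations(sorted(cluster), 2))
    if not pairs:
        return 0.0
    hits = sum(
        1 for a, b in pairs
        if go_processes.get(a, set()) & go_processes.get(b, set())
    )
    return hits / len(pairs)


# Hypothetical toy data, not taken from the paper.
toy_docs = {
    "geneA": ["Kinase signalling cascade"],
    "geneB": ["Kinase cascade activation"],
    "geneC": ["Ribosome assembly"],
}
toy_go = {"geneA": {"P1"}, "geneB": {"P1"}, "geneC": {"P2"}}

vectors = gene_vectors(toy_docs)
sim_ab = cosine(vectors["geneA"], vectors["geneB"])
sim_ac = cosine(vectors["geneA"], vectors["geneC"])
precision = cluster_precision({"geneA", "geneB", "geneC"}, toy_go)
```

In this toy setting `geneA` and `geneB` share vocabulary and so receive a positive similarity, while `geneC` shares none with `geneA`. A clustering step (for example, hierarchical clustering on the pairwise similarities) would sit between the two functions; it is left out here to keep the sketch short.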

More information

Published date: 6 February 2008
Organisations: Southampton Wireless Group

Identifiers

Local EPrints ID: 266705
URI: http://eprints.soton.ac.uk/id/eprint/266705
ISSN: 1939-8018
PURE UUID: 4eeec8ab-4582-4b7a-b146-eb85aeab5842
ORCID for Mahesan Niranjan: orcid.org/0000-0001-7021-140X

Catalogue record

Date deposited: 24 Sep 2008 07:41
Last modified: 15 Mar 2024 03:29

Contributors

Author: Sujimarn Suwannaroj
Author: Mahesan Niranjan
