University of Southampton Institutional Repository

TransformerGO: predicting protein–protein interactions by modelling the attention between sets of gene ontology terms


Ieremie, Ioan, Ewing, Rob M. and Niranjan, Mahesan, Martelli, Pier Luigi (ed.) (2022) TransformerGO: predicting protein–protein interactions by modelling the attention between sets of gene ontology terms. Bioinformatics, 38 (8), 2269-2277. (doi:10.1093/bioinformatics/btac104).

Record type: Article

Abstract

Motivation: Protein–protein interactions (PPIs) play a key role in diverse biological processes, but only a small subset of these interactions has been experimentally identified. Additionally, high-throughput experimental techniques that detect PPIs are known to suffer from various limitations, such as high false-positive and false-negative rates. Semantic similarity derived from Gene Ontology (GO) annotations is regarded as one of the most powerful indicators of protein interaction. However, while computational approaches for PPI prediction have gained popularity in recent years, most methods fail to capture the specificity of GO terms.

Results: We propose TransformerGO, a model that captures the semantic similarity between GO term sets dynamically using an attention mechanism. We generate dense graph embeddings for GO terms using node2vec, an algorithmic framework for learning continuous representations of nodes in networks. TransformerGO learns deep semantic relations between annotated terms and distinguishes positive from negative interactions with high accuracy. TransformerGO outperforms classic semantic similarity measures on gold-standard PPI datasets, and state-of-the-art machine-learning-based approaches on large datasets from Saccharomyces cerevisiae and Homo sapiens. We show how the neural attention mechanism embedded in the transformer architecture detects relevant functional terms when predicting interactions.
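The node2vec step described above learns term embeddings from biased random walks over the GO graph. The following is a minimal sketch of that walk-sampling idea only, not the authors' implementation: the toy graph, term IDs, walk length, and the `p`/`q` values are all illustrative assumptions.

```python
import random

# Toy GO-style graph as an adjacency list (term -> neighbours).
# Term IDs and edges are illustrative, not real GO structure.
GO_GRAPH = {
    "GO:0008150": ["GO:0009987", "GO:0065007"],
    "GO:0009987": ["GO:0008150", "GO:0065007"],
    "GO:0065007": ["GO:0008150", "GO:0009987"],
}

def node2vec_walk(graph, start, length, p=1.0, q=0.5, rng=random):
    """One node2vec-style biased random walk (2nd-order sampling).

    p weights returning to the previous node; q biases the walk
    toward (q < 1) or away from (q > 1) unexplored nodes.
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = graph[cur]
        if len(walk) == 1:
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for n in nbrs:
            if n == prev:            # step back to the previous node
                weights.append(1.0 / p)
            elif n in graph[prev]:   # stays at distance 1 from prev
                weights.append(1.0)
            else:                    # moves outward in the graph
                weights.append(1.0 / q)
        walk.append(rng.choices(nbrs, weights=weights)[0])
    return walk

# Sample a small corpus of walks, one batch per term.
walks = [node2vec_walk(GO_GRAPH, t, 5) for t in GO_GRAPH for _ in range(10)]
```

In the full pipeline, walks like these would be treated as "sentences" and fed to a skip-gram model (e.g. word2vec) to produce the dense GO term embeddings the paper describes.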

Availability and implementation: https://github.com/Ieremie/TransformerGO.

Supplementary information: supplementary data are available at Bioinformatics online.
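The attention mechanism the abstract refers to scores how strongly each GO term annotating one protein relates to the terms annotating its candidate partner. Below is a minimal scaled dot-product cross-attention sketch under that reading; the embeddings, dimensionality, and function names are illustrative assumptions, not the published architecture.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query (a GO-term embedding
    of protein A) attends over the keys/values (GO-term embeddings of
    protein B). Returns one context vector per query."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        attn = softmax(scores)  # weights over protein B's terms, sum to 1
        ctx = [sum(a * v[j] for a, v in zip(attn, values))
               for j in range(len(values[0]))]
        out.append(ctx)
    return out

# Two proteins, each annotated with toy 4-d GO-term embeddings.
protein_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
protein_b = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
contexts = cross_attention(protein_a, protein_b, protein_b)
```

The attention weights themselves are what make the model interpretable: inspecting them shows which functional terms of one protein the model considers relevant to the other when scoring an interaction.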

Text: btac104 - Version of Record (2MB), available under a Creative Commons Attribution license.

More information

Accepted/In Press date: 15 February 2022
e-pub ahead of print date: 17 February 2022
Published date: 4 March 2022

Identifiers

Local EPrints ID: 495956
URI: http://eprints.soton.ac.uk/id/eprint/495956
ISSN: 1367-4803
PURE UUID: cccf1c39-635b-4340-92e0-1e1c21de7238
ORCID for Rob M. Ewing: orcid.org/0000-0001-6510-4001
ORCID for Mahesan Niranjan: orcid.org/0000-0001-7021-140X

Catalogue record

Date deposited: 28 Nov 2024 17:32
Last modified: 30 Nov 2024 02:48


Contributors

Author: Ioan Ieremie
Author: Rob M. Ewing
Author: Mahesan Niranjan
Editor: Pier Luigi Martelli

