TransformerGO: predicting protein–protein interactions by modelling the attention between sets of gene ontology terms
Motivation: Protein–protein interactions (PPIs) play a key role in diverse biological processes, but only a small subset of these interactions has been experimentally identified. Additionally, high-throughput experimental techniques that detect PPIs are known to suffer from various limitations, such as high false-positive and false-negative rates. The semantic similarity derived from Gene Ontology (GO) annotation is regarded as one of the most powerful indicators of protein interaction. However, while computational approaches for the prediction of PPIs have gained popularity in recent years, most methods fail to capture the specificity of GO terms.
Results: We propose TransformerGO, a model that captures the semantic similarity between GO sets dynamically using an attention mechanism. We generate dense graph embeddings for GO terms using node2vec, an algorithmic framework for learning continuous representations of nodes in networks. TransformerGO learns deep semantic relations between annotated terms and can distinguish between negative and positive interactions with high accuracy. TransformerGO outperforms classic semantic similarity measures on gold-standard PPI datasets, and outperforms state-of-the-art machine-learning-based approaches on large datasets from Saccharomyces cerevisiae and Homo sapiens. We show how the neural attention mechanism embedded in the transformer architecture detects relevant functional terms when predicting interactions.
Availability and implementation: https://github.com/Ieremie/TransformerGO.
Supplementary information: Supplementary data are available at Bioinformatics online.
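The abstract describes attention computed between two sets of GO-term embeddings, one set per protein. As a rough illustration only (not the authors' implementation, and with toy 2-D vectors standing in for node2vec embeddings), the core scaled dot-product attention step between the GO sets of two hypothetical proteins could be sketched as:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross_attention(queries, keys):
    """Scaled dot-product attention weights between two sets of term vectors.

    Returns a matrix w where w[i][j] is how strongly the i-th GO term of
    protein A attends to the j-th GO term of protein B.
    """
    d = len(queries[0])  # embedding dimension
    weights = []
    for q in queries:
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        weights.append(softmax(scores))
    return weights

# Toy 2-D "embeddings" for the GO terms of two proteins (illustrative only;
# the paper uses node2vec embeddings learned from the GO graph).
go_terms_a = [[1.0, 0.0], [0.0, 1.0]]
go_terms_b = [[1.0, 0.0], [0.5, 0.5]]
attn = cross_attention(go_terms_a, go_terms_b)
```

Each row of `attn` sums to 1, so a term with a large weight dominates that row; in the full model, these learned weights are what allow inspection of which functional terms drive a predicted interaction.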
Ieremie, Ioan
Ewing, Rob M.
Niranjan, Mahesan
Martelli, Pier Luigi
4 March 2022
Ieremie, Ioan, Ewing, Rob M. and Niranjan, Mahesan, Martelli, Pier Luigi (ed.) (2022) TransformerGO: predicting protein–protein interactions by modelling the attention between sets of gene ontology terms. Bioinformatics, 38(8), 2269–2277. (doi:10.1093/bioinformatics/btac104).
Text: btac104 (Version of Record)
More information
Accepted/In Press date: 15 February 2022
e-pub ahead of print date: 17 February 2022
Published date: 4 March 2022
Identifiers
Local EPrints ID: 495956
URI: http://eprints.soton.ac.uk/id/eprint/495956
ISSN: 1367-4803
PURE UUID: cccf1c39-635b-4340-92e0-1e1c21de7238
Catalogue record
Date deposited: 28 Nov 2024 17:32
Last modified: 30 Nov 2024 02:48
Contributors
Author: Ioan Ieremie
Author: Rob M. Ewing
Author: Mahesan Niranjan
Editor: Pier Luigi Martelli