The University of Southampton
University of Southampton Institutional Repository

Discovering microproteins: making the most of ribosome profiling data


Chothani, Sonia, Ho, Lena, Schafer, Sebastian and Rackham, Owen (2023) Discovering microproteins: making the most of ribosome profiling data. RNA Biology, 20 (1), 943-954. (doi:10.1080/15476286.2023.2279845).

Record type: Article

Abstract

Building a reference set of protein-coding open reading frames (ORFs) has revolutionized the discovery and understanding of biological processes. Traditionally, gene models have been confirmed using cDNA sequencing, and the translated regions they encode have been inferred by sequence-based detection of start and stop codon combinations longer than 100 amino acids to prevent false positives. As a result, small ORFs (smORFs) and the proteins they encode have been left unannotated. Ribo-seq distinguishes translated from untranslated regions irrespective of their length. In this review, we describe the power of Ribo-seq data for detecting smORFs and discuss the major challenges that data quality, depth and sparseness pose for identifying the start and end of smORF translation. In particular, we outline smORF cataloguing efforts in humans and the large differences that have arisen from variation in data, methods and assumptions. Although current smORF reference sets can already serve as a powerful tool for hypothesis generation, we recommend that future editions account for these data limitations and adopt unified processing so that the community can establish a canonical catalogue of translated smORFs.

Text
Discovering microproteins making the most of ribosome profiling data - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 30 October 2023
e-pub ahead of print date: 27 November 2023

Identifiers

Local EPrints ID: 502688
URI: http://eprints.soton.ac.uk/id/eprint/502688
ISSN: 1547-6286
PURE UUID: 1afaef7d-f444-4a5e-8b0a-eb3c9adee940
ORCID for Owen Rackham: orcid.org/0000-0002-4390-0872

Catalogue record

Date deposited: 04 Jul 2025 16:42
Last modified: 22 Aug 2025 02:30

Contributors

Author: Sonia Chothani
Author: Lena Ho
Author: Sebastian Schafer
Author: Owen Rackham
