The University of Southampton
University of Southampton Institutional Repository

Development and application of methods for resolving molecular diagnoses from patient sequence data for monogenic diseases


Alyousfi, Dareen Mohammed (2022) Development and application of methods for resolving molecular diagnoses from patient sequence data for monogenic diseases. University of Southampton, Doctoral Thesis, 210pp.

Record type: Thesis (Doctoral)

Abstract

Identifying the molecular causes of disease from sequenced genomes can be extremely challenging, and usually requires tiered filtering, with the possibility that causal variant(s) are missed. The first stage of this study focused on understanding specific properties and features of genes, including essentiality, haploinsufficiency and selection, and on linking these properties to facilitate the prediction of disease-causal genes. Gene essentiality refers to genes that are required for cell survival. The study identified 20 gene-specific scores in the literature, each measuring different genetic features, and showed that, to date, no single score has reliably predicted genic deleteriousness. This systematic review helped to identify gaps and challenges in the prediction of disease genes that might affect the diagnosis of monogenic diseases. Considering information on genes rather than variants broadens the scope for assessing gene pathogenicity. The second stage gathered this information to build a model that filters clinical sequence data and reduces the number of potential disease-causing genes to follow up. Further, essentiality specific pathogenicity prioritisation (ESPP) was constructed to prioritise disease-causing genes. Compared with any single score, ESPP showed improved performance in identifying disease genes, which score high, and thereby helped to exclude non-disease genes, which score low. The third stage evaluated the proposed gene-level score as a guide to the prioritisation of disease genes by testing it against multiple databases and integrating it with alternative scores. This contributes significantly to improving data interpretation. The results were encouraging: two genes, CNOT1 and RYR3, that ESPP prioritised as strong candidates for Mendelian diseases were subsequently confirmed to be causal.
In addition, the sum of ranks across alternative scores (ESPP, LOEUF and CoNeS) highlighted four genes (SETD1A, SMARCC2, KDM3B, MED12L) that ranked highly and are now known to contain disease-causing variants. Ultimately, applying such models to sequence data from patients with monogenic disease will help identify the molecular causes of these conditions.
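
The sum-of-ranks idea mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the abstract does not say how ESPP is computed or how ranks were combined, so the function names, the gene labels (`GENE_A` etc.), the placeholder values, and the direction assumed for each score (in particular for CoNeS) are all assumptions made for illustration only.

```python
from typing import Dict, List

def rank_genes(scores: Dict[str, float], higher_is_worse: bool) -> Dict[str, int]:
    """Rank genes 1..n on one score; rank 1 = most likely deleterious."""
    ordered = sorted(scores, key=lambda g: scores[g], reverse=higher_is_worse)
    return {gene: i + 1 for i, gene in enumerate(ordered)}

def sum_of_ranks(score_tables: Dict[str, Dict[str, float]],
                 higher_is_worse: Dict[str, bool]) -> List[str]:
    """Order genes by the sum of their per-score ranks (lowest sum first)."""
    # Only genes present in every score table can be compared fairly.
    shared = set.intersection(*(set(t) for t in score_tables.values()))
    totals = {gene: 0 for gene in shared}
    for name, table in score_tables.items():
        ranks = rank_genes({g: table[g] for g in shared}, higher_is_worse[name])
        for gene, rank in ranks.items():
            totals[gene] += rank
    # Break ties on the rank sum alphabetically so the output is deterministic.
    return sorted(totals, key=lambda g: (totals[g], g))

# Placeholder values for hypothetical genes; NOT real ESPP/LOEUF/CoNeS data.
TOY_SCORES = {
    "ESPP":  {"GENE_A": 0.9, "GENE_B": 0.2, "GENE_C": 0.6},
    "LOEUF": {"GENE_A": 0.1, "GENE_B": 1.2, "GENE_C": 0.5},
    "CoNeS": {"GENE_A": 0.3, "GENE_B": 0.8, "GENE_C": -1.5},
}
# Assumed directions: high ESPP flags disease genes (per the abstract);
# low LOEUF indicates constraint; the CoNeS direction is an assumption.
HIGHER_IS_WORSE = {"ESPP": True, "LOEUF": False, "CoNeS": False}

if __name__ == "__main__":
    print(sum_of_ranks(TOY_SCORES, HIGHER_IS_WORSE))
```

Summing ranks rather than raw values sidesteps the different scales and directions of the individual scores, at the cost of discarding how far apart two genes are on any one score.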

Text
Development and application of methods for resolving molecular diagnoses from patient sequence data for monogenic diseases - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Submitted date: July 2021
Published date: 30 June 2022

Identifiers

Local EPrints ID: 474722
URI: http://eprints.soton.ac.uk/id/eprint/474722
PURE UUID: c1bf715a-74ea-4436-a683-2416a6d78389
ORCID for Andrew Collins: orcid.org/0000-0001-7108-0771

Catalogue record

Date deposited: 02 Mar 2023 17:30
Last modified: 17 Mar 2024 02:38

Contributors

Author: Dareen Mohammed Alyousfi
Thesis advisor: Andrew Collins
