The University of Southampton
University of Southampton Institutional Repository

An Information-Theoretic Definition of Cell Type


Casey, Michael John (2021) An Information-Theoretic Definition of Cell Type. University of Southampton, Doctoral Thesis, 166pp.

Record type: Thesis (Doctoral)

Abstract

Individual cells are often classified into cell ‘types’ based on the expression of so-called marker genes. Such marker-based classification assumes that cells of a given type are (at least approximately) interchangeable with respect to the expression of their associated markers. This traditional approach to cellular classification has been disrupted by single-cell RNA-sequencing technologies, which are able to measure genome-wide gene expression across thousands of individual cells. While potentially providing a wealth of data for cellular classification, these technologies have revealed that cells ostensibly of the same type are often highly heterogeneous (i.e. not interchangeable) with respect to the expression of established marker genes.
A myriad of single-cell clustering methods has recently been developed to overcome the issue of heterogeneity with respect to marker gene expression and identify cell types directly from single-cell expression data. These methods typically proceed via: (1) unsupervised identification of clusters from single-cell expression data sets; (2) mapping of identified clusters to known cell types based on the expression of previously established marker genes. However, this two-step cluster-based approach to cellular classification is less biologically intuitive than the traditional marker-based approach, involving substantial mathematical and biological assumptions regarding the nature of cell type.
In this thesis, I formalise the traditional marker gene approach to cellular classification using notions from information theory, and show how this formalism can be applied to identifying cell types from single-cell RNA-sequencing data. Specifically, I develop a novel clustering method based on the assumption that cells of the same type should be minimally heterogeneous – i.e. approximately interchangeable – with respect to the measured expression of a set of genes. Thus, this work offers an intuitive, formal definition of cell type that unites the traditional and current approaches to cellular classification through the mathematics of information theory.

Text
Thesis - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Submitted date: October 2021

Identifiers

Local EPrints ID: 456818
URI: http://eprints.soton.ac.uk/id/eprint/456818
PURE UUID: 1e08cf57-7535-4abf-b5e9-b324f8fdb3bb
ORCID for Benjamin Macarthur: orcid.org/0000-0002-5396-9750

Catalogue record

Date deposited: 12 May 2022 16:33
Last modified: 17 Mar 2024 02:51

Contributors

Author: Michael John Casey
Thesis advisor: Benjamin Macarthur
