The University of Southampton
University of Southampton Institutional Repository

A Morse-theoretical clustering algorithm for annotated networks and spectral bounds for fuzzy clustering


Strazzeri, Fabio (2018) A Morse-theoretical clustering algorithm for annotated networks and spectral bounds for fuzzy clustering. University of Southampton, Doctoral Thesis, 145pp.

Record type: Thesis (Doctoral)

Abstract

Given a set of objects X, a clustering algorithm is a formal procedure that groups together objects that are similar and separates those that are not, thus mimicking the human ability to categorise objects. The number of clustering algorithms has been growing for decades, and clustering has become a standard data-analytic technique in many fields. Standard clustering methods, however, fail to integrate object metadata, which is often readily available to the user, into the analysis.

We present in this thesis a novel clustering algorithm, called Morse, which combines metadata information with Morse theory, a well-known topological theory, to reveal the "basins of attraction" induced by the metadata. The algorithm is described in its general form, together with a study of its performance on the LFR benchmark model. We tested Morse in a real-world scenario and showed that it helped to identify phenotypes of asthma based on blood gene expression profiles. We also examined Morse in the axiomatic setting proposed by Kleinberg and introduced a novel axiom, Monotonic Consistency, which avoids the widely reported problematic behaviour of Kleinberg's Consistency, together with a possibility result for Monotonic Consistency given by Morse. Furthermore, we extended Kleinberg's axiomatic setting to graph clustering and proved an impossibility result for Consistency and a possibility result for Monotonic Consistency, given again by Morse.

Lastly, we explored how a general clustering algorithm affects the structure of a graph, using a graph spectral distance. In this direction, we proved two different bounds for this distance with respect to a graph and its quotient graph induced by a hard partition, and generalised these results to fuzzy partitions.

Text
Final thesis - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Published date: November 2018

Identifiers

Local EPrints ID: 435291
URI: http://eprints.soton.ac.uk/id/eprint/435291
PURE UUID: 24ee60a2-63f9-4f94-9a87-845a4aa72a87
ORCID for Ruben Sanchez Garcia: orcid.org/0000-0001-6479-3028

Catalogue record

Date deposited: 30 Oct 2019 17:30
Last modified: 17 Mar 2024 03:21

Contributors

Author: Fabio Strazzeri
Thesis advisor: Ruben Sanchez Garcia
