A Morse-theoretical clustering algorithm for annotated networks and spectral bounds for fuzzy clustering
Given a set of objects X, a clustering algorithm is a formal procedure that groups together objects which are similar and separates those which are not, thus mimicking the human ability to categorise objects. Clustering algorithms have been developed for decades, and clustering has become a standard data-analytic technique in many fields. Standard clustering methods, however, fail to integrate object metadata, which is often readily available to the user, into the analysis.
We present in this thesis a novel clustering algorithm, called Morse, which combines metadata information with Morse theory, a well-known topological theory, to reveal the "basins of attraction" induced by the metadata. The algorithm is described in its general form, together with a study of its performance on the LFR benchmark model. We tested Morse in a real-world scenario and showed that it helped to identify phenotypes of asthma based on blood gene expression profiles. We also studied Morse in the axiomatic setting proposed by Kleinberg and introduced a novel axiom, Monotonic Consistency, that avoids the widely reported problematic behaviour of Kleinberg's Consistency. Furthermore, we extended Kleinberg's axiomatic setting to graph clustering and proved an impossibility result for Consistency and a possibility result for Monotonic Consistency, the latter given again by Morse.
Lastly, we explored how a general clustering algorithm affects the structure of a graph, using a graph spectral distance. In this direction, we proved two different bounds for this distance between a graph and its quotient graph induced by a hard partition, and generalised these results to fuzzy partitions.
University of Southampton
Strazzeri, Fabio
2fa6d25b-1ab5-43b9-a21c-c1e1454d0cb1
November 2018
Sanchez Garcia, Ruben
8246cea2-ae1c-44f2-94e9-bacc9371c3ed
Strazzeri, Fabio (2018) A Morse-theoretical clustering algorithm for annotated networks and spectral bounds for fuzzy clustering. University of Southampton, Doctoral Thesis, 145pp.
Record type: Thesis (Doctoral)
Text
Final thesis
- Version of Record
More information
Published date: November 2018
Identifiers
Local EPrints ID: 435291
URI: http://eprints.soton.ac.uk/id/eprint/435291
PURE UUID: 24ee60a2-63f9-4f94-9a87-845a4aa72a87
Catalogue record
Date deposited: 30 Oct 2019 17:30
Last modified: 17 Mar 2024 03:21
Contributors
Author: Fabio Strazzeri