A numerical measure of the instability of Mapper-type algorithms
Mapper is an unsupervised machine learning algorithm generalising the notion of clustering to obtain a geometric description of a dataset. The procedure splits the data into possibly overlapping bins which are then clustered. The output of the algorithm is a graph where nodes represent clusters and edges represent the sharing of data points between two clusters. However, several parameters must be selected before applying Mapper and the resulting graph may vary dramatically with the choice of parameters.
We define an intrinsic notion of Mapper instability that measures how much the output varies with the choice of parameters required to construct it. Our results and discussion are general and apply to all Mapper-type algorithms. We derive theoretical results that provide estimates for the instability and suggest practical ways to control it. We also provide experiments to illustrate our results and, in particular, demonstrate that a reliable candidate Mapper output can be identified as a local minimum of the instability, regarded as a function of the Mapper input parameters.
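The procedure described in the abstract (overlapping bins, clustering within each bin, edges between clusters that share data points) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names `interval_cover`, `cluster` and `mapper`, the one-dimensional filter, the parameters `n_bins`, `overlap` and `eps`, and the use of single-linkage clustering at a fixed scale. The sketch does not implement the paper's instability measure; it only shows that the output graph changes with the parameters.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def interval_cover(lo, hi, n_bins, overlap):
    """Split [lo, hi] into n_bins equal intervals, each widened by the
    fraction `overlap` of its width (hypothetical parametrisation)."""
    width = (hi - lo) / n_bins
    pad = width * overlap / 2
    cover = []
    for i in range(n_bins):
        left = lo + i * width
        # Use hi itself for the last interval to avoid rounding gaps.
        right = hi if i == n_bins - 1 else lo + (i + 1) * width
        cover.append((left - pad, right + pad))
    return cover


def cluster(indices, points, eps):
    """Single-linkage clustering at scale eps: connected components of
    the graph joining points at distance <= eps."""
    unvisited = set(indices)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        component, stack = {seed}, [seed]
        while stack:
            i = stack.pop()
            near = {j for j in unvisited if dist(points[i], points[j]) <= eps}
            unvisited -= near
            component |= near
            stack.extend(near)
        clusters.append(frozenset(component))
    return clusters


def mapper(points, filter_values, n_bins, overlap, eps):
    """Return (nodes, edges): nodes are clusters of point indices, and an
    edge joins two nodes whenever they share at least one data point."""
    lo, hi = min(filter_values), max(filter_values)
    nodes = []
    for a, b in interval_cover(lo, hi, n_bins, overlap):
        in_bin = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(cluster(in_bin, points, eps))
    edges = {
        (u, v)
        for u, v in combinations(range(len(nodes)), 2)
        if nodes[u] & nodes[v]
    }
    return nodes, edges


# Toy data: six equally spaced points on a line, filtered by their coordinate.
points = [(float(x),) for x in range(6)]
values = [p[0] for p in points]

nodes, edges = mapper(points, values, n_bins=2, overlap=0.5, eps=1.0)
print(len(nodes), len(edges))  # 2 nodes joined by 1 edge

nodes, edges = mapper(points, values, n_bins=2, overlap=0.0, eps=1.0)
print(len(nodes), len(edges))  # 2 nodes, no edge
```

Even on this toy data, removing the overlap disconnects the graph, which is the kind of parameter sensitivity the abstract refers to.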
1-45
Belchi Guillamon, Francisco
41c7c5e5-b259-45d8-89f9-7b7937517c53
Brodzki, Jacek
b1fe25fd-5451-4fd0-b24b-c59b75710543
Burfitt, Matthew
5a79a4c4-38d9-4f48-988b-0ea03fe2b75c
Niranjan, Mahesan
5cbaeea8-7288-4b55-a89c-c43d212ddd4f
12 August 2020
Belchi Guillamon, Francisco, Brodzki, Jacek, Burfitt, Matthew and Niranjan, Mahesan
(2020)
A numerical measure of the instability of Mapper-type algorithms.
Journal of Machine Learning Research, 21, [202].
Text: A Numerical Measure of the Instability of Mapper-Type Algorithms (Version of Record)
More information
Accepted/In Press date: 2 August 2020
e-pub ahead of print date: 12 August 2020
Published date: 12 August 2020
Identifiers
Local EPrints ID: 444403
URI: http://eprints.soton.ac.uk/id/eprint/444403
PURE UUID: e94ab90d-6720-4853-81a9-78507f9aee57
Catalogue record
Date deposited: 16 Oct 2020 16:32
Last modified: 17 Mar 2024 05:57
Contributors
Author: Francisco Belchi Guillamon
Author: Jacek Brodzki
Author: Matthew Burfitt
Author: Mahesan Niranjan