Tree congruence: quantifying similarity between dendrogram topologies
Tree congruence: quantifying similarity between dendrogram topologies
Tree congruence metrics are typically global indices that describe the similarity or dissimilarity between dendrograms. This study principally focuses on topological congruence metrics that quantify similarity between two dendrograms and can give a normalised score between 0 and 1. Specifically, this article describes and tests two metrics the Clade Retention Index (CRI) and the MASTxCF which is derived from the combined information available from a maximum agreement subtree and a strict consensus. The two metrics were developed to study differences between evolutionary trees, but their applications are multidisciplinary and can be used on hierarchical cluster diagrams derived from analyses in science, technology, maths or social sciences disciplines. A comprehensive, but non-exhaustive review of other tree congruence metrics is provided and nine metrics are further analysed. 1,620 pairwise analyses of simulated dendrograms (which could be derived from any type of analysis) were conducted and are compared in Pac-man piechart matrices. Kendalls tau-b is used to demonstrate the concordance of the different metrics and Spearmans rho ranked correlations are used to support these findings. The results support the use of the CRI and MASTxCF as part of a suite of metrics, but it is recommended that permutation metrics such as SPR distances and weighted metrics are disregarded for the specific purpose of measuring similarity.
Vidovic, Steven
abba74b7-4c91-4f08-b8f1-4d9e836a43e4
Vidovic, Steven
abba74b7-4c91-4f08-b8f1-4d9e836a43e4
Vidovic, Steven
(2019)
Tree congruence: quantifying similarity between dendrogram topologies.
bioRxiv.
(doi:10.1101/766840).
Abstract
Tree congruence metrics are typically global indices that describe the similarity or dissimilarity between dendrograms. This study principally focuses on topological congruence metrics that quantify similarity between two dendrograms and can give a normalised score between 0 and 1. Specifically, this article describes and tests two metrics the Clade Retention Index (CRI) and the MASTxCF which is derived from the combined information available from a maximum agreement subtree and a strict consensus. The two metrics were developed to study differences between evolutionary trees, but their applications are multidisciplinary and can be used on hierarchical cluster diagrams derived from analyses in science, technology, maths or social sciences disciplines. A comprehensive, but non-exhaustive review of other tree congruence metrics is provided and nine metrics are further analysed. 1,620 pairwise analyses of simulated dendrograms (which could be derived from any type of analysis) were conducted and are compared in Pac-man piechart matrices. Kendalls tau-b is used to demonstrate the concordance of the different metrics and Spearmans rho ranked correlations are used to support these findings. The results support the use of the CRI and MASTxCF as part of a suite of metrics, but it is recommended that permutation metrics such as SPR distances and weighted metrics are disregarded for the specific purpose of measuring similarity.
This record has no associated files available for download.
More information
In preparation date: 13 September 2019
Identifiers
Local EPrints ID: 434204
URI: http://eprints.soton.ac.uk/id/eprint/434204
PURE UUID: 899406f6-02e9-49b6-91ad-05282dcc5b23
Catalogue record
Date deposited: 16 Sep 2019 16:30
Last modified: 16 Mar 2024 04:37
Export record
Altmetrics
Contributors
Author:
Steven Vidovic
Download statistics
Downloads from ePrints over the past year. Other digital versions may also be available to download e.g. from the publisher's website.
View more statistics