A review and evaluation of elastic distance functions for time series clustering
Pages: 765-809
Holder, Christopher
Middlehurst, Matthew
Bagnall, Anthony
Holder, Christopher, Middlehurst, Matthew and Bagnall, Anthony (2024) A review and evaluation of elastic distance functions for time series clustering. Knowledge and Information Systems, 66 (2), 765-809. (doi:10.1007/s10115-023-01952-0).
Abstract
Time series clustering is the act of grouping time series data without recourse to a label. Algorithms that cluster time series can be classified into two groups: those that employ a time-series-specific distance measure and those that derive features from time series. Both approaches usually rely on traditional clustering algorithms such as k-means. Our focus is on partitional clustering algorithms that employ elastic distance measures, i.e. distances that perform some kind of realignment whilst measuring distance. We describe nine commonly used elastic distance measures and compare their performance with k-means and k-medoids clusterers. Our findings, based on experiments using the UCR time series archive, are surprising. We find that, generally, clustering with DTW distance is not better than using Euclidean distance and that distance measures that employ editing in conjunction with warping are significantly better than other approaches. We further observe that using k-medoids rather than k-means improves the clusterings for all nine elastic distance measures. One function, the move–split–merge (MSM) distance, is the best performing distance measure of this study, with time warp edit (TWE) distance a close second. Our conclusion is that MSM or TWE with k-medoids should be considered a good alternative to DTW for clustering time series with elastic distance measures. We provide implementations, extensive results and guidance on reproducing results in the associated GitHub repository.
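To illustrate what "realignment whilst measuring distance" means, the following is a minimal sketch and not the authors' implementation (their code is in the associated repository). It contrasts lock-step Euclidean distance with an unconstrained dynamic time warping (DTW) distance, and shows the medoid selection step that a k-medoids clusterer could apply with any such distance. The function names `euclidean`, `dtw` and `medoid`, the squared pointwise cost, the final square root and the toy series are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Lock-step distance: a[i] is only ever compared with b[i]."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Unconstrained DTW (assumed variant: squared pointwise cost, no window)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # Extend the cheapest of the three permitted predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def medoid(series, dist):
    """Index of the series with the smallest total distance to all others."""
    totals = [sum(dist(s, t) for t in series) for s in series]
    return totals.index(min(totals))


# Hypothetical toy data: b is a shifted by one time step, c is a flat line.
a = [0, 0, 1, 2, 1, 0, 0]
b = [0, 1, 2, 1, 0, 0, 0]
c = [2, 2, 2, 2, 2, 2, 2]

print(euclidean(a, b))  # penalises the shift
print(dtw(a, b))        # warping absorbs the shift
print(medoid([a, b, c], dtw))
```

The medoid is always one of the observed series, so the update step needs only pairwise distances. Averaging time series under an elastic distance, as a k-means centroid update requires, is a harder problem.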
Text
s10115-023-01952-0 - Version of Record
More information
Accepted/In Press date: 25 July 2023
e-pub ahead of print date: 7 September 2023
Keywords:
Derivative dynamic time warping, Dynamic barycentre averaging, Dynamic time warping, Edit distance on real sequences, Edit distance with real penalty, Longest common subsequence, Move–split–merge, Time series clustering, Time warp edit distance, Weighted derivative dynamic time warping, Weighted dynamic time warping, k-Means, k-medoids clusterer
Identifiers
Local EPrints ID: 489950
URI: http://eprints.soton.ac.uk/id/eprint/489950
ISSN: 0219-1377
PURE UUID: c86b1075-d03e-4a81-b89f-b6d2dd3f2d52
Catalogue record
Date deposited: 08 May 2024 16:30
Last modified: 11 May 2024 02:12
Contributors
Author: Christopher Holder
Author: Matthew Middlehurst
Author: Anthony Bagnall