University of Southampton Institutional Repository

Determining window sizes using species estimation for accurate process mining over streams



Imenkamp, Christian, Kabierski, Martin, Reiter, Hendrik, Weidlich, Matthias, Hasselbring, Wilhelm and Koschmider, Agnes (2025) Determining window sizes using species estimation for accurate process mining over streams. Krogstie, John, Rinderle-Ma, Stefanie, Kappel, Gerti and Proper, Henderik A. (eds.) In Advanced Information Systems Engineering - 37th International Conference, CAiSE 2025, Proceedings. vol. 15701 LNCS, Springer Science and Business Media B.V. pp. 109-124. (doi:10.1007/978-3-031-94569-4_7).

Record type: Conference or Workshop Item (Paper)

Abstract

Streaming process mining deals with the real-time analysis of event streams. A common approach is to adopt windowing mechanisms that select event data from a stream for subsequent analysis. The size of these windows is a crucial parameter, however, as it influences the representativeness of the window content and, by extension, of the analysis results. Because process dynamics are subject to change and potential concept drift, a static window size leads to inaccurate representations that introduce bias into the analysis. In this work, we present a novel approach for streaming process mining that addresses these limitations by adjusting window sizes. Specifically, we dynamically determine suitable window sizes based on estimators for the representativeness of samples, as developed for species estimation in biodiversity research. Evaluation results on real-world data sets show improvements in accuracy and robustness to concept drift over existing approaches that adopt static window sizes.
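
To illustrate the general idea of sizing a window by sample representativeness, the following is a minimal sketch and not the authors' method. It assumes the bias-corrected Chao1 richness estimator from biodiversity research as the completeness measure, and it treats each stream item as one observed "species" (for example an activity or a directly-follows pair; that choice is an assumption). The function names, the threshold and the size bounds are hypothetical.

```python
from collections import Counter
from typing import Hashable, Iterable, List


def chao1_richness(counts: Counter) -> float:
    """Bias-corrected Chao1 estimate of the total number of distinct species."""
    s_obs = len(counts)                              # species seen so far
    f1 = sum(1 for c in counts.values() if c == 1)   # species seen exactly once
    f2 = sum(1 for c in counts.values() if c == 2)   # species seen exactly twice
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def completeness(counts: Counter) -> float:
    """Ratio of observed species to estimated total species, in (0, 1]."""
    if not counts:
        return 0.0
    return len(counts) / chao1_richness(counts)


def smallest_representative_window(
    stream: Iterable[Hashable],
    threshold: float = 0.9,   # hypothetical completeness target
    min_size: int = 10,       # guards against trivially "complete" tiny samples
    max_size: int = 1000,     # hard upper bound on the window
) -> List[Hashable]:
    """Grow a window until its estimated completeness reaches the threshold."""
    window: List[Hashable] = []
    counts: Counter = Counter()
    for item in stream:
        window.append(item)
        counts[item] += 1
        if len(window) >= min_size and completeness(counts) >= threshold:
            break
        if len(window) >= max_size:
            break
    return window
```

A lower bound such as `min_size` matters in this sketch because a single observation has no unseen-species evidence under Chao1 and would otherwise count as complete. How the paper's approach shrinks windows or reacts to drift is not described in this record and is not modelled here.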

This record has no associated files available for download.

More information

Published date: 15 June 2025
Additional Information: Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
Venue - Dates: 37th International Conference on Advanced Information Systems Engineering, CAiSE 2025, Vienna, Austria, 2025-06-16 - 2025-06-20
Keywords: Data Representativeness, Log completeness, streaming process mining, Window size

Identifiers

Local EPrints ID: 506639
URI: http://eprints.soton.ac.uk/id/eprint/506639
ISSN: 0302-9743
PURE UUID: b9220e3d-708b-4215-8735-8e7c62db0fe8
ORCID for Wilhelm Hasselbring: orcid.org/0000-0001-6625-4335

Catalogue record

Date deposited: 12 Nov 2025 17:48
Last modified: 13 Nov 2025 03:10

Contributors

Author: Christian Imenkamp
Author: Martin Kabierski
Author: Hendrik Reiter
Author: Matthias Weidlich
Author: Wilhelm Hasselbring
Author: Agnes Koschmider
Editor: John Krogstie
Editor: Stefanie Rinderle-Ma
Editor: Gerti Kappel
Editor: Henderik A. Proper
