An algorithmic approach to identification of gray areas: analysis of sleep scoring expert ensemble non agreement areas using a multinomial mixture model
Machine learning (ML) models have become a key component in modern world services. In decision-making domains where human expertise is crucial, for example, for manually scoring biological signal data, human uncertainties undermine experts’ trust in the outcomes of these models. The field of sleep staging in particular, which requires experts to score complex biological signals, is notably impacted by scoring uncertainties. Data consisting of an ensemble of independent scorers are collected to estimate inter-scorer agreement and the uncertainty associated with manual scoring. However, scorers’ uncertainty lacks statistical modeling, which poses difficulties in validating ML algorithms and leads to issues of reliability and explainability. From the ensemble of scorers, uncertainty zones, called gray areas, are highlighted by samples where scorers disagree. The objective of our work is to provide a framework introducing and inferring gray areas. We present a flexible and easy-to-use multi-objective method based on multinomial mixture models that clusters the different levels of scorer agreement and summarizes the results into two sets of high-agreement and gray area clusters, which are called supra-clusters. The threshold is selected by maximizing the distance between two distributions of the scorer agreement measure. Effective results were obtained by the method after it was fitted on simulated data. Additionally, the method was applied to a real case of uncertainty analysis in the sleep staging domain. A series of actual sleep stages scored by an ensemble of 10 independent scorers for a dataset of 50 participants was used.
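The clustering idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several assumptions: each sample is represented by a vector counting how many scorers chose each class, agreement is measured as the share of votes for the most popular class, and the two supra-clusters are separated by the largest difference in mean agreement. That last criterion is only a stand-in, since the abstract does not specify which distance between distributions is maximized. The function names `agreement`, `fit_multinomial_mixture` and `split_supra_clusters` are hypothetical.

```python
import math
import random


def agreement(votes):
    """Share of scorers who chose the most popular class (assumed measure)."""
    return max(votes) / sum(votes)


def fit_multinomial_mixture(counts, n_clusters, n_iter=100, seed=0, eps=1e-9):
    """Fit a mixture of multinomials to vote-count vectors with plain EM."""
    rng = random.Random(seed)
    n_classes = len(counts[0])
    weights = [1.0 / n_clusters] * n_clusters
    probs = []
    for _ in range(n_clusters):
        raw = [rng.random() + eps for _ in range(n_classes)]
        probs.append([r / sum(raw) for r in raw])
    resp = []
    for _ in range(n_iter):
        # E-step: posterior cluster probabilities for each sample. The
        # multinomial coefficient is identical across clusters and cancels.
        resp = []
        for x in counts:
            logp = [
                math.log(weights[k])
                + sum(c * math.log(probs[k][j]) for j, c in enumerate(x))
                for k in range(n_clusters)
            ]
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            resp.append([u / sum(unnorm) for u in unnorm])
        # M-step: re-estimate mixing weights and class probabilities.
        # eps keeps every probability strictly positive so logs stay finite.
        for k in range(n_clusters):
            nk = sum(r[k] for r in resp)
            weights[k] = max(nk / len(counts), eps)
            raw = [
                sum(r[k] * x[j] for r, x in zip(resp, counts)) + eps
                for j in range(n_classes)
            ]
            probs[k] = [v / sum(raw) for v in raw]
    return weights, probs, resp


def split_supra_clusters(counts, resp):
    """Label each sample 'high' or 'gray' via its cluster (stand-in rule).

    Clusters are ordered by mean agreement; every cut of that ordering is
    tried and the cut with the largest difference in mean agreement between
    the two groups of samples is kept.
    """
    n_clusters = len(resp[0])
    hard = [max(range(n_clusters), key=lambda k: r[k]) for r in resp]
    scores = [agreement(x) for x in counts]
    used = sorted(set(hard))
    mean_of = {
        k: sum(s for s, h in zip(scores, hard) if h == k) / hard.count(k)
        for k in used
    }
    order = sorted(used, key=lambda k: mean_of[k])
    best_gap, best_gray = -1.0, set()
    for cut in range(1, len(order)):
        gray = set(order[:cut])
        low = [s for s, h in zip(scores, hard) if h in gray]
        high = [s for s, h in zip(scores, hard) if h not in gray]
        gap = sum(high) / len(high) - sum(low) / len(low)
        if gap > best_gap:
            best_gap, best_gray = gap, gray
    return ["gray" if h in best_gray else "high" for h in hard]
```

With 10 scorers and five sleep stages, a sample such as `[10, 0, 0, 0, 0]` has agreement 1.0, while `[5, 5, 0, 0, 0]` has agreement 0.5 and would be expected to fall in a gray area cluster. EM is sensitive to initialization, so in practice one would likely run several seeds and compare fits.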
Ensemble of manual scorers, Gray area, Multinomial mixture model, Sleep stages, Uncertainty quantification
352-365
Jouan, Gabriel
41a11a8b-6b6f-45e9-a34c-e653458ab262
Arnardottir, Erna Sif
9bfbbe32-8214-47a9-86ba-43be85458830
Islind, Anna Sigridur
46e6353f-a1b6-4628-916c-18e817695d03
Óskarsdóttir, María
d159ed8f-9dd3-4ff3-8b00-d43579ab71be
1 September 2024
Jouan, Gabriel, Arnardottir, Erna Sif, Islind, Anna Sigridur and Óskarsdóttir, María
(2024)
An algorithmic approach to identification of gray areas: analysis of sleep scoring expert ensemble non agreement areas using a multinomial mixture model.
European Journal of Operational Research, 317 (2), 352-365.
(doi:10.1016/j.ejor.2023.09.039).
More information
Accepted/In Press date: 30 September 2023
Published date: 1 September 2024
Additional Information:
Publisher Copyright:
© 2023 Elsevier B.V.
Identifiers
Local EPrints ID: 508390
URI: http://eprints.soton.ac.uk/id/eprint/508390
ISSN: 0377-2217
PURE UUID: fc459cf9-2460-4ec0-88f4-797c31394e62
Catalogue record
Date deposited: 20 Jan 2026 17:51
Last modified: 21 Jan 2026 03:11
Contributors
Author:
Gabriel Jouan
Author:
Erna Sif Arnardottir
Author:
Anna Sigridur Islind
Author:
María Óskarsdóttir