The University of Southampton
University of Southampton Institutional Repository

World of ScoreCraft: novel multi-scorer experiment on the impact of a decision support system in sleep staging

Holm, Benedikt, Óskarsson, Arnar, Þorleifsson, Björn Elvar, Hafsteinsson, Hörður Þór, Sigurðardóttir, Sigríður, Grétarsdóttir, Heiður, Hoelke, Kenan, Jouan, Gabriel Marc Marie, Penzel, Thomas, Arnardottir, Erna Sif and Óskarsdóttir, María (2025) World of ScoreCraft: novel multi-scorer experiment on the impact of a decision support system in sleep staging. Journal of Sleep Research, [e70113]. (doi:10.1111/jsr.70113).

Record type: Article

Abstract

Manual scoring of polysomnography (PSG) is a time-intensive task, prone to inter-scorer variability that can impact diagnostic reliability. This study investigates the integration of decision support systems (DSSs) into PSG scoring workflows, focusing on their effects on accuracy and scoring time, and on potential bias toward recommendations attributed to artificial intelligence (AI) compared with human-generated recommendations. Using a novel online scoring platform, we conducted a repeated-measures study with sleep technologists, who scored traditional and self-applied PSGs. Participants were occasionally presented with recommendations labelled as either human- or AI-generated. As the goal of this study was to isolate the effect of the perceived recommendation source on scorer behaviour, all recommendations were in fact human-generated. We found that traditional PSGs tended to be scored slightly more accurately than self-applied PSGs, but this difference was not statistically significant. Correct recommendations significantly improved scoring accuracy for both PSG types, while incorrect recommendations reduced accuracy. No significant bias was observed toward or against recommendations labelled as AI-generated compared with those labelled as human-generated. These findings highlight the potential of DSSs to enhance PSG scoring reliability. However, ensuring the accuracy of the suggestions is critical to maximising their benefits. Future research should explore the long-term impacts of DSSs on scoring workflows and strategies for integrating AI in clinical practice.

Text
Journal of Sleep Research - 2025 - Holm - World of ScoreCraft Novel Multi‐Scorer Experiment on the Impact of a Decision - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 27 May 2025
e-pub ahead of print date: 19 June 2025
Published date: 19 June 2025
Keywords: artificial intelligence, decision support system, scoring accuracy, sleep staging

Identifiers

Local EPrints ID: 503586
URI: http://eprints.soton.ac.uk/id/eprint/503586
ISSN: 0962-1105
PURE UUID: 07069041-371a-4b76-83f3-f46c3f00b256
ORCID for María Óskarsdóttir: orcid.org/0000-0001-5095-5356

Catalogue record

Date deposited: 05 Aug 2025 16:58
Last modified: 01 Oct 2025 02:19

Contributors

Author: Benedikt Holm
Author: Arnar Óskarsson
Author: Björn Elvar Þorleifsson
Author: Hörður Þór Hafsteinsson
Author: Sigríður Sigurðardóttir
Author: Heiður Grétarsdóttir
Author: Kenan Hoelke
Author: Gabriel Marc Marie Jouan
Author: Thomas Penzel
Author: Erna Sif Arnardottir
Author: María Óskarsdóttir
