The University of Southampton
University of Southampton Institutional Repository

How data drift impacts the safety and interpretability of machine learning models predicting risk from blood glucose control

Objective: predictive algorithms trained on historical data and deployed in dynamic environments are at risk from data drift. Machine learning models that use data from sensors making continuous measurements could be affected by changes in both the device itself and its users, driving drift and impacting safety. To maintain predictive performance, algorithms must be continuously monitored and tuned to overcome fundamental changes to both the input data (covariate shift) and the relationship between input and output (concept drift). Here, we aim to understand how changes to user behaviour, physiology and sensors could impact the safety of models using automated sensor readings from continuous glucose monitors (CGM).

Methods and analysis: in this paper, we investigate how data drift impacts a machine learning model trained to predict short-term risk from blood glucose control for individuals with type 1 diabetes. We simulate how changes in both user behaviour and sensor accuracy could lead to covariate shift and concept drift. For each scenario, we quantify the changes to input data (Jensen-Shannon divergence), the impact on model performance metrics and the explainability of the model (ie, shift in feature importance).

Results: we demonstrate that combining covariate shift detection, multiple performance metrics and feature importance offers a powerful method for identifying different types of drift in sensor data. For blood glucose management, our scenarios focused on user behaviour (ie, changes to blood glucose dynamics and CGM use) and device/sensor noise and variability. We found that more simplistic approaches to drift detection could incorrectly identify risk to model safety.

Conclusion: machine learning and AI can enhance clinical decision-making, but often lack the transparency required to ensure ongoing safety. Combining complementary monitoring techniques enables clearer identification of changes in data or model behaviour, helping determine when retraining or intervention is needed.
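
The abstract names Jensen-Shannon divergence as the measure of change in the input data. The following is a minimal sketch of how such a comparison could be computed between a reference window and a later window of glucose readings. It is not the authors' implementation: the function names, the 1 mmol/L binning, the synthetic Gaussian readings and the +1.5 mmol/L sensor offset are all assumptions made for illustration.

```python
import math
import random


def histogram(values, edges):
    """Normalised histogram of `values` over shared bin `edges` (hypothetical helper)."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            # Bins are half-open, except the last bin which includes its upper edge.
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside the bin edges")
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete distributions.

    With base-2 logarithms the result lies in [0, 1]: 0 for identical
    distributions and 1 for distributions with no overlap.
    """
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing; y_i > 0 whenever x_i > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def clamp(v, lo=2.0, hi=22.0):
    """Keep synthetic readings inside an assumed plausible range (mmol/L)."""
    return max(lo, min(hi, v))


# Synthetic data only: assumed reference readings and a drifted copy
# with a constant sensor offset. None of these numbers come from the paper.
rng = random.Random(0)
reference = [clamp(rng.gauss(7.0, 2.0)) for _ in range(5000)]
drifted = [clamp(rng.gauss(7.0, 2.0) + 1.5) for _ in range(5000)]

edges = [float(e) for e in range(2, 23)]  # 1 mmol/L bins from 2 to 22
p_ref = histogram(reference, edges)
p_new = histogram(drifted, edges)

score = js_divergence(p_ref, p_new)
print(f"JS divergence between reference and drifted window: {score:.3f}")
```

Both windows must share the same bin edges, otherwise the two histograms are not comparable. Any threshold for raising a drift alert would have to be chosen for the specific model and data; as the abstract notes, a single measure of this kind is best read alongside performance metrics and feature importance.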
Leung, Ho-Hin
4a71b64d-1a17-45e8-90c6-121b41dd793b
Duckworth, Christopher
992c216c-8f66-48a8-8de6-2f04b4f736e6
Burns, Dan
40b9dc88-a54a-4365-b747-4456d9203146
Guy, Matthew
1a40b2ed-3aec-4fce-9954-396840471c28
Boniface, Michael
f30bfd7d-20ed-451b-b405-34e3e22fdfba

Leung, Ho-Hin, Duckworth, Christopher, Burns, Dan, Guy, Matthew and Boniface, Michael (2026) How data drift impacts the safety and interpretability of machine learning models predicting risk from blood glucose control. BMJ Digital Health & AI, 2 (1). (doi:10.1136/bmjdhai-2025-000269).

Record type: Article

Text
e000269.full - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 13 January 2026
e-pub ahead of print date: 9 February 2026
Published date: 1 March 2026

Identifiers

Local EPrints ID: 510082
URI: http://eprints.soton.ac.uk/id/eprint/510082
PURE UUID: bc9df7cd-e111-4d3b-b40b-2cfdf01f8c16
ORCID for Christopher Duckworth: orcid.org/0000-0003-0659-2177
ORCID for Dan Burns: orcid.org/0000-0001-6976-1068
ORCID for Matthew Guy: orcid.org/0000-0002-6818-2010
ORCID for Michael Boniface: orcid.org/0000-0002-9281-6095

Catalogue record

Date deposited: 17 Mar 2026 17:35
Last modified: 21 Mar 2026 03:19

Contributors

Author: Ho-Hin Leung
Author: Christopher Duckworth
Author: Dan Burns
Author: Matthew Guy
