The University of Southampton
University of Southampton Institutional Repository

Predicting cognitive performance from three-electrode auditory brainstem responses using convolutional neural networks


Malekifar, Mobina, Hamza, Yasmeen, Yang, Ye, Cao, Hung and Zeng, Fan-Gang (2026) Predicting cognitive performance from three-electrode auditory brainstem responses using convolutional neural networks. IEEE Sensors Journal, 26 (8). (doi:10.1109/JSEN.2026.3671708).

Record type: Article

Abstract

Age-related cognitive decline is a growing global concern, motivating the search for noninvasive, accessible biomarkers to support early detection and monitoring. Click-evoked auditory brainstem responses (ABRs), collected in routine clinical settings, offer a promising signal source. Building on prior evidence that ABRs relate to cognitive function, this study investigates whether raw human ABR waveforms can predict cognitive performance using deep learning without manual peak measurements. We used a dataset from 118 adults spanning a broad range of cognitive abilities, pairing each click-evoked ABR with a cognitive score. A 1-D convolutional neural network (CNN) was trained to learn time-series patterns directly from the raw signal. Model performance was evaluated using two-, five-, and ten-fold cross-validation and compared against traditional wave V metrics and a randomized-input baseline. After adjusting scores for age, the CNN achieved a mean area under the receiver operating characteristic curve of 0.77 ± 0.06, outperforming all benchmarks. To interpret model decisions, gradient-weighted class activation mapping (Grad-CAM) was applied. Three key latency windows were identified: 1.8–2.3, 3.2–4.0, and 4.9–6.5 ms. These correspond to canonical ABR waves I, III, and V, supporting the physiological relevance of the learned features and highlighting wave III as a previously underutilized marker. Although limited to click stimuli from a single recording system, this work demonstrates that a CNN can extract meaningful features from raw ABRs collected using only three electrodes and predict cognitive status more accurately than traditional methods using the same dataset.
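The core idea described above — a 1-D CNN that learns features directly from a raw ABR waveform rather than from hand-measured peak latencies — can be illustrated with a minimal numpy sketch. This is not the authors' architecture; the filter count, kernel width, sampling rate, and the simulated waveform are all placeholder assumptions chosen only to show the conv → ReLU → pooling → logistic-output pipeline on a signal of ABR-like length.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels, stride=1):
    """Valid-mode 1-D convolution: apply (n_filters, k) kernels to a 1-D signal."""
    k = kernels.shape[1]
    n_out = (len(x) - k) // stride + 1
    out = np.empty((kernels.shape[0], n_out))
    for i in range(n_out):
        window = x[i * stride : i * stride + k]
        out[:, i] = kernels @ window  # one dot product per filter
    return out

def forward(x, kernels, w, b):
    """Conv -> ReLU -> global average pooling -> logistic output."""
    feat = np.maximum(conv1d(x, kernels), 0.0)      # ReLU feature maps
    pooled = feat.mean(axis=1)                      # one scalar per filter
    return 1.0 / (1.0 + np.exp(-(w @ pooled + b)))  # probability of the target class

# Simulated 10 ms click-evoked ABR sampled at 20 kHz (200 samples) --
# placeholder noise, not data from the paper.
abr = rng.normal(scale=0.1, size=200)
kernels = rng.normal(scale=0.5, size=(8, 15))  # 8 learned filters, 0.75 ms wide
w, b = rng.normal(size=8), 0.0

p = forward(abr, kernels, w, b)
print(f"predicted probability: {p:.3f}")
```

In a trained model the kernels and output weights would be fit by gradient descent, and Grad-CAM-style relevance maps would be computed from the gradients of the output with respect to the convolutional feature maps, which is how latency windows such as 1.8–2.3 ms can be attributed to the prediction.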

Text
IEEE_Accepted_26 - Accepted Manuscript
Available under License Creative Commons Attribution.
Download (2MB)

More information

Accepted/In Press date: 4 March 2026
e-pub ahead of print date: 12 March 2026
Published date: 15 April 2026

Identifiers

Local EPrints ID: 510775
URI: http://eprints.soton.ac.uk/id/eprint/510775
ISSN: 1930-0395
PURE UUID: 18029bad-d4ec-4411-809e-b95d85448815
ORCID for Yasmeen Hamza: orcid.org/0000-0003-3294-6629

Catalogue record

Date deposited: 21 Apr 2026 16:56
Last modified: 22 Apr 2026 02:11


Contributors

Author: Mobina Malekifar
Author: Yasmeen Hamza (orcid.org/0000-0003-3294-6629)
Author: Ye Yang
Author: Hung Cao
Author: Fan-Gang Zeng



Contact ePrints Soton: eprints@soton.ac.uk

ePrints Soton supports OAI 2.0 with a base URL of http://eprints.soton.ac.uk/cgi/oai2

This repository has been built using EPrints software, developed at the University of Southampton, but available to everyone to use.
