Using machine-assisted topic analysis to expedite thematic analysis of free-text data: Exemplar investigation of factors influencing health behaviours and wellbeing during the COVID-19 pandemic
Ward, Emma, Naughton, Felix, Belderson, Pippa, Papakonstantinou, Trisevgeni, Ainsworth, Ben, Hanson, Sarah, Notley, Caitlin and Bondaronek, Paulina (2025) Using machine-assisted topic analysis to expedite thematic analysis of free-text data: Exemplar investigation of factors influencing health behaviours and wellbeing during the COVID-19 pandemic. British Journal of Health Psychology, 30 (3), [e70017]. (doi:10.1111/bjhp.70017).
Abstract
Objectives: investigate the use of machine learning to expedite thematic analysis of qualitative data concerning factors that influenced health behaviours and wellbeing during the COVID-19 pandemic.
Design: qualitative investigation using Machine-Assisted Topic Analysis (MATA) of free-text data collected from a prospective cohort.
Methods: free-text survey data (2177 responses from 762 participants) of influences on health behaviours and wellbeing were collected among UK participants recruited online, using Qualtrics at 3, 6, 12 and 24 months after the COVID-19 pandemic started. MATA, which employs structural topic modelling (STM), was used (in R) to discern latent topics within the responses. Two researchers independently labelled topics and collaboratively organized them into themes, with ‘sense checking’ from two additional researchers. Plots and rankings were generated, showing change in topic prevalence by time. Total researcher time to complete analysis was collated.
Results: fifteen STM-generated topics were labelled and integrated into six themes: the influences of and impacts on (1) health behaviours, (2) physical health, (3) mood and (4) how these interacted, partly moderated by (5) external influences of control and (6) reflections on wellbeing and personal growth. Topic prevalence varied meaningfully over time, aligning with changes in the pandemic context. Themes were generated (excluding write-up) with 20 h of combined researcher time.
Conclusions: MATA shows promise as a resource-saving method for thematic analysis of large qualitative datasets whilst maintaining researcher control and insight. Findings show the interconnection between health behaviours, physical health and wellbeing over the pandemic, and the influence of control and reflective processes.
More information
Accepted/In Press date: 7 August 2025
e-pub ahead of print date: 11 September 2025
Published date: 11 September 2025
Keywords:
COVID-19, artificial intelligence, health, machine learning, qualitative, thematic analysis, wellbeing
Identifiers
Local EPrints ID: 505045
URI: http://eprints.soton.ac.uk/id/eprint/505045
ISSN: 1359-107X
PURE UUID: 3a2d0c89-ce61-4957-a800-c2b5019c2e01
Catalogue record
Date deposited: 24 Sep 2025 16:55
Last modified: 25 Sep 2025 01:44
Contributors
Author: Emma Ward
Author: Felix Naughton
Author: Pippa Belderson
Author: Trisevgeni Papakonstantinou
Author: Ben Ainsworth
Author: Sarah Hanson
Author: Caitlin Notley
Author: Paulina Bondaronek