The University of Southampton
University of Southampton Institutional Repository

Applying machine-learning to rapidly analyze large qualitative text datasets to inform the COVID-19 pandemic response: comparing human and machine-assisted topic analysis techniques

Towler, Lauren, Bondaronek, Paulina, Papakonstantinou, Trisevgeni, Amlôt, Richard, Chadborn, Tim, Ainsworth, Ben and Yardley, Lucy (2023) Applying machine-learning to rapidly analyze large qualitative text datasets to inform the COVID-19 pandemic response: comparing human and machine-assisted topic analysis techniques. Frontiers in Public Health, 11, [1268223]. (doi:10.3389/fpubh.2023.1268223).

Record type: Article

Abstract

Introduction: machine-assisted topic analysis (MATA) uses artificial intelligence methods to help qualitative researchers analyze large datasets. This is useful for researchers to rapidly update healthcare interventions during changing healthcare contexts, such as a pandemic. We examined the potential to support healthcare interventions by comparing MATA with “human-only” thematic analysis techniques on the same dataset (1,472 user responses from a COVID-19 behavioral intervention).

Methods: in MATA, an unsupervised topic-modeling approach identified latent topics in the text, from which researchers identified broad themes. In human-only codebook analysis, researchers developed an initial codebook based on previous research that was applied to the dataset by the team, who met regularly to discuss and refine the codes. Formal triangulation using a “convergence coding matrix” compared findings between methods, categorizing them as “agreement”, “complementary”, “dissonant”, or “silent”.

Results: human analysis took much longer than MATA (147.5 vs. 40 h). Both methods identified key themes about what users found helpful and unhelpful. Formal triangulation showed that the two sets of findings were highly similar: all MATA codes were classified as in agreement with or complementary to the human themes. When findings differed slightly, this was due to human researcher interpretations or nuance from human-only analysis.

Discussion: results produced by MATA were similar to human-only thematic analysis, with substantial time savings. For simple analyses that do not require an in-depth or subtle understanding of the data, MATA is a useful tool that can support qualitative researchers to interpret and analyze large datasets quickly. This approach can support intervention development and implementation, such as enabling rapid optimization during public health emergencies.
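
As an illustration of the triangulation step described under Methods, the sketch below tallies a convergence coding matrix and computes the time saving reported under Results. It is not the authors' code: the row labels are placeholders rather than findings from the study, the names `MatrixRow`, `summarize`, `all_convergent` and `time_saving` are hypothetical, and the category assigned to each row is assumed to be a researcher's judgement entered by hand.

```python
from collections import Counter
from dataclasses import dataclass

# The four categories named in the abstract's Methods paragraph.
CATEGORIES = ("agreement", "complementary", "dissonant", "silent")


@dataclass(frozen=True)
class MatrixRow:
    """One row of a convergence coding matrix (hypothetical structure)."""
    finding: str   # placeholder label for a finding being compared
    category: str  # researcher's judgement, one of CATEGORIES


def summarize(rows):
    """Count rows per category, rejecting labels outside CATEGORIES."""
    counts = Counter({c: 0 for c in CATEGORIES})
    for row in rows:
        if row.category not in CATEGORIES:
            raise ValueError(f"unknown category: {row.category!r}")
        counts[row.category] += 1
    return dict(counts)


def all_convergent(rows):
    """True if every row is 'agreement' or 'complementary'."""
    return all(r.category in ("agreement", "complementary") for r in rows)


def time_saving(human_hours, machine_hours):
    """Return (hours saved, fraction of human time saved)."""
    saved = human_hours - machine_hours
    return saved, saved / human_hours


# Placeholder rows for illustration only; not data from the study.
rows = [
    MatrixRow("placeholder finding A", "agreement"),
    MatrixRow("placeholder finding B", "agreement"),
    MatrixRow("placeholder finding C", "complementary"),
]

print(summarize(rows))
print(all_convergent(rows))
print(time_saving(147.5, 40))
```

With the hours reported in the abstract (147.5 vs. 40 h), `time_saving` gives 107.5 hours saved, or roughly 73% of the human-only analysis time.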

Text
TOWLER 2023 - FRONT PUB HEALTH - Applying machine learning - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 16 October 2023
e-pub ahead of print date: 31 October 2023
Published date: 2023
Additional Information: Funding Information: The author(s) declare financial support was received for the research, authorship, and/or publication of this article. The study was funded by United Kingdom Research and Innovation Medical Research Council (UKRI MRC) Rapid Response Call: UKRI CV220-009. The Germ Defence intervention was hosted by the Lifeguide Team, supported by the NIHR Biomedical Research Centre, University of Southampton. LY is a National Institute for Health Research (NIHR) Senior Investigator and team lead for University of Southampton Biomedical Research Centre. LY is affiliated to the National Institute for Health Research Health Protection Research Unit (NIHR HPRU) in Behavioural Science and Evaluation of Interventions at the University of Bristol in partnership with Public Health England (PHE). Publisher Copyright: Copyright © 2023 Towler, Bondaronek, Papakonstantinou, Amlôt, Chadborn, Ainsworth and Yardley.
Keywords: machine learning techniques, interventions, public health, qualitative analysis, triangulation

Identifiers

Local EPrints ID: 483677
URI: http://eprints.soton.ac.uk/id/eprint/483677
ISSN: 2296-2565
PURE UUID: 7e7ded4c-3e6f-4c54-9a48-0ea7f31fd766
ORCID for Lauren Towler: orcid.org/0000-0002-6597-0927
ORCID for Ben Ainsworth: orcid.org/0000-0002-5098-1092
ORCID for Lucy Yardley: orcid.org/0000-0002-3853-883X

Catalogue record

Date deposited: 03 Nov 2023 17:50
Last modified: 30 Nov 2024 02:56

Contributors

Author: Lauren Towler
Author: Paulina Bondaronek
Author: Trisevgeni Papakonstantinou
Author: Richard Amlôt
Author: Tim Chadborn
Author: Ben Ainsworth
Author: Lucy Yardley
