University of Southampton Institutional Repository

An AI-powered research assistant in the lab: a practical guide for text analysis through iterative collaboration with LLMs

Analyzing texts such as open-ended responses, headlines, or social media posts is a time- and labor-intensive process highly susceptible to bias. However, large language models (LLMs) are promising tools for text analysis, using either a predefined (top-down) or a data-driven (bottom-up) taxonomy, without sacrificing quality. Here, we present a step-by-step tutorial to efficiently develop, test, and apply taxonomies for analyzing unstructured data through an iterative and collaborative process between researchers and an LLM. Using personal goals provided by participants as an example, we demonstrate how we used this method to write prompts to review datasets and generate a taxonomy of life domains, evaluate and refine the taxonomy through prompt and direct modifications, and apply the taxonomy to categorize an entire dataset with high intercoder reliability. The process achieved high levels of human–LLM intercoder agreement and reduced analysis time by approximately 87.5%. This test offers a proof of concept, suggesting that with the right procedures LLMs can be used to generate reliable bottom-up categorizations. We discuss the possibilities and limitations of using LLMs for text analysis.
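
The abstract reports high human–LLM intercoder agreement, but this record does not say which statistic was used. As a minimal sketch, assuming Cohen's kappa as the agreement measure and using made-up labels, agreement between a human coder and an LLM on the same items could be computed as follows. The function name `cohens_kappa` and the example life-domain labels are illustrative assumptions, not taken from the article.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both coders.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: derived from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical life-domain labels for six personal goals (illustrative only).
human = ["health", "career", "health", "family", "career", "health"]
llm = ["health", "career", "family", "family", "career", "health"]
kappa = cohens_kappa(human, llm)
print(f"kappa = {kappa:.2f}")
```

In this toy example the coders agree on five of six items, but kappa is lower than the raw agreement rate because it discounts the agreement expected by chance. Other statistics (for instance Krippendorff's alpha) may be more appropriate with more than two coders or with missing labels.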

Carmona-Díaz, Gino, Jiménez-Leal, William, Grisales, María Alejandra, Sripada, Chandra, Amaya, Santiago, Inzlicht, Michael and Bermúdez, Juan Pablo (2026) An AI-powered research assistant in the lab: a practical guide for text analysis through iterative collaboration with LLMs. Behavior Research Methods, 58 (4), [99]. (doi:10.3758/s13428-026-02966-6).

Record type: Article

Text: s13428-026-02966-6 - Version of Record
Available under License Creative Commons Attribution.

More information

Accepted/In Press date: 9 February 2026
Published date: 30 March 2026
Additional Information: © 2026. The Author(s).
Keywords: Artificial intelligence, GPT, Large language models, Qualitative analysis, Taxonomies, Text analysis, Tutorial

Identifiers

Local EPrints ID: 510773
URI: http://eprints.soton.ac.uk/id/eprint/510773
ISSN: 1554-351X
PURE UUID: 3e906b2b-46c5-4235-976c-82a40bdfbb24
ORCID for Juan Pablo Bermúdez: orcid.org/0000-0001-5239-2980

Catalogue record

Date deposited: 21 Apr 2026 16:53
Last modified: 22 Apr 2026 02:12

Contributors

Author: Gino Carmona-Díaz
Author: William Jiménez-Leal
Author: María Alejandra Grisales
Author: Chandra Sripada
Author: Santiago Amaya
Author: Michael Inzlicht
Author: Juan Pablo Bermúdez
