University of Southampton Institutional Repository

Artificial intelligence-generated and human expert-designed vocabulary tests: a comparative study

Luo, Yunjiu, Wei, Wei and Zheng, Ying (2022) Artificial intelligence-generated and human expert-designed vocabulary tests: a comparative study. SAGE Open, 12 (1), 1-12. (doi:10.1177/21582440221082130).

Record type: Article

Abstract

Artificial intelligence (AI) technologies have the potential to reduce the workload of second language (L2) teachers and test developers. We propose two AI distractor-generating methods for creating Chinese vocabulary items: semantic similarity and visual similarity. Semantic similarity refers to antonyms and synonyms, while visual similarity refers to two phrases sharing one or more characters. This study explores the construct validity of two types of selected-response vocabulary tests (AI-generated items and human expert-designed items) and compares their item difficulty and item discrimination. Both quantitative and qualitative data were collected. Seventy-eight students from Beijing Language and Culture University responded to both the AI-generated and the human expert-designed items. Students’ scores were analyzed using the two-parameter item response theory (2PL-IRT) model. Thirteen students were then invited to report their test-taking strategies in a think-aloud session. The students’ item responses revealed that the human expert-designed items were easier but more discriminating than the AI-generated items. The think-aloud data indicated that the AI-generated and expert-designed items might assess different constructs: the former elicited test takers’ bottom-up test-taking strategies, while the latter seemed more likely to trigger test takers’ rote memorization.
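
For context, the 2PL-IRT model named in the abstract is the standard two-parameter logistic model: the probability that a test taker with latent ability \theta answers item i correctly is

P_i(\theta) = \frac{1}{1 + e^{-a_i(\theta - b_i)}}

where a_i is the item’s discrimination and b_i its difficulty, the two parameters the study compares across AI-generated and expert-designed items.

The visual-similarity method (distractors that share characters with the target phrase) can be sketched in a few lines of Python. This is a minimal illustration under assumed names; the function, the candidate lexicon, and the ranking rule are hypothetical, not the authors’ implementation:

def visual_distractors(target, lexicon, k=3):
    """Rank candidate phrases by the number of characters shared with the
    target. In the paper's sense, a candidate sharing one or more
    characters is 'visually similar'. Illustrative sketch only."""
    scored = [(len(set(target) & set(w)), w) for w in lexicon if w != target]
    scored.sort(reverse=True)  # most shared characters first
    return [w for score, w in scored[:k] if score > 0]

# Hypothetical usage: candidate distractors for the target 学习 ("to study")
print(visual_distractors("学习", ["学生", "练习", "老师", "学校"]))
# Returns the three phrases that each share a character with 学习;
# 老师 shares none and is excluded.

A semantic-similarity generator would instead draw candidates from a synonym/antonym resource; the abstract does not specify which lexicon the authors used.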

Text
Accepted Manuscript_Sage open 2022
Available under License Creative Commons Attribution.
Download (529kB)

More information

Published date: 1 January 2022
Additional Information: Publisher Copyright: © The Author(s) 2022.
Keywords: Artificial intelligence, Computerised test, construct validity, vocabulary test

Identifiers

Local EPrints ID: 454476
URI: http://eprints.soton.ac.uk/id/eprint/454476
ISSN: 2158-2440
PURE UUID: c4d38f2f-436e-4688-88ca-cd1a28a12614
ORCID for Ying Zheng: orcid.org/0000-0003-2574-0358

Catalogue record

Date deposited: 10 Feb 2022 17:44
Last modified: 17 Mar 2024 07:07

Contributors

Author: Yunjiu Luo
Author: Wei Wei
Author: Ying Zheng

