The University of Southampton
University of Southampton Institutional Repository

Scoring games fairly: Biases and interference in games based assessment

University of Southampton
Walsh, Clare Elizabeth
3972b47c-5ce7-45fc-b843-7dcbde9504de

Walsh, Clare Elizabeth (2020) Scoring games fairly: Biases and interference in games based assessment. Doctoral Thesis, 308pp.

Record type: Thesis (Doctoral)

Abstract

Gaming is an interactive medium that has much in common with education. Both games and good classroom practice are learning environments, with overall objectives, scaffolded progression, checks along the way, and regular, purposeful feedback. Games also provide a space to practise complex skills such as collaboration or managing a system. These skills are rarely assessed directly in compulsory education because they are difficult to evidence efficiently. Games are fun learning environments for many children, and they could provide a means to resolve this problem. However, the structure of gaming data is not aligned with many assessment analysis methods. Gaming data is conditionally dependent; there are continuous variables as well as categorical and dichotomous responses; there is often more than one possible proxy for ability; and very large amounts of data are missing. Aspects that assessors traditionally force to be constants, such as the number of attempts or the response time, become variables in games, and it is important to know their limitations and worth as variables. This interdisciplinary study looks at these problems in scoring performance in games. It uses a quantitative methodology, with a secondary data set from MangaHigh as a case study. MangaHigh is a website with a range of dynamic maths games for primary and secondary aged learners, and over a million children were using the site at the time of data extraction. Using a sample data set, chosen by criterion sampling, the impact of missing data, response times and additional attempts was explored through insights and methods from Item Response Theory (IRT) and other quantitative analysis techniques. Demographic data also helped to contextualize the findings and inform decision-making. In the analysis, the choice of game mechanics was found to affect the extent and nature of missing data, which in turn had a complex relationship with the target variable, ability.
The choice of measure, such as mean, recency-weighted mean, high score or most recent score, was found to be central to determining the grade. Several issues were identified when the child played with a human or bot competitor or collaborator. Response time functioned as a context variable to define valid attempts, helping to identify non-targeted behaviours such as browsing, conceding or wandering off. As gamers have suggested, response time also appeared to function as a proxy for ability, but there does not seem to be a linear relationship between ability and time. Instead, ‘speed’ seems to be the proxy, and this was found to be a function of the response time, the child, the game, the band score and the game mechanics. Outside an optimal range, short response times could act as a confounding variable. There was evidence that some stability of performance may also act as a proxy for ability. Finally, adding a familiarity weighting when a child comes back for a second attempt proved problematic, but a novelty weighting for early attempts can work. That said, although games became easier with each subsequent attempt, evidence from the first attempt appears unreliable, and the data has features that are characteristic of guessing behaviour. Although a large number of problems were identified, this analysis also found some clear ways forward: adjusting the assessment and game design, and the collection of data, to make scores from games more meaningful and reduce bias in the scoring process. On the basis of this study, there are many design choices that could improve or degrade the quality of data gathered in gaming environments.
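
To illustrate why the choice of measure matters, the following is a minimal sketch of the four measures named in the abstract, applied to one invented sequence of attempts. It is not the thesis's implementation: the function names, the `decay` parameter and the geometric weighting scheme are assumptions made for illustration, and the scores are made up.

```python
from statistics import mean


def recency_weighted_mean(scores, decay=0.5):
    """Weighted mean that favours later attempts.

    Assumption (not from the thesis): geometric weights, so each attempt
    counts `decay` times as much as the attempt that follows it.
    """
    n = len(scores)
    weights = [decay ** (n - 1 - i) for i in range(n)]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def summarise(scores, decay=0.5):
    """Return the four candidate measures for one child's attempts at one game."""
    return {
        "mean": mean(scores),
        "recency_weighted_mean": recency_weighted_mean(scores, decay),
        "high_score": max(scores),
        "most_recent": scores[-1],
    }


# Hypothetical scores for four attempts, in the order they were played.
attempts = [40, 70, 90, 60]
summary = summarise(attempts)
for name, value in summary.items():
    print(f"{name}: {value}")
```

For these invented attempts the four measures give 65, 68, 90 and 60, so the same play history could support quite different grades depending on which measure is used.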

Text
Thesis Clare Walsh for depositing
Available under License University of Southampton Thesis Licence.

More information

Published date: January 2020

Identifiers

Local EPrints ID: 448273
URI: http://eprints.soton.ac.uk/id/eprint/448273
PURE UUID: 35278fc9-06b8-4c31-b715-f9b29d230b21
ORCID for Clare Elizabeth Walsh: orcid.org/0000-0002-7757-2301

Catalogue record

Date deposited: 19 Apr 2021 16:30
Last modified: 12 Dec 2021 11:09

Contributors

Author: Clare Elizabeth Walsh
