READ ME File For Data in support of the Southampton doctoral thesis 'The effect of immediate feedback and difficulty level on learning and engagement in a Spanish learning mobile application'.

Dataset DOI: 10.5258/SOTON/D2673

ReadMe Author: Simon Kurt Erik Paul Jonsson, University of Southampton

This dataset supports the thesis entitled: The effect of immediate feedback and difficulty level on learning and engagement in a Spanish learning mobile application
AWARDED BY: University of Southampton
DATE OF AWARD: 2023

DESCRIPTION OF THE DATA

The data was collected by the Lengua Spanish app, and was in the form of timestamped events, such as the app being started or the user pressing a button. The app automatically transferred the data to a server at the University of Southampton, where it was stored in a MySQL database. The database was then queried and the raw data was processed into the four CSV files described below.

Short background: The app that collected this data is a Spanish learning app where users completed study sessions. Several study sessions completed one after another, with no more than 10 minutes passing between each session, were called a cluster. The user studied Spanish phrases, and each phrase would be learnt and then revised at regular intervals. revisionData.csv contains data for each such revision. See thesis for further detail and context of these four csv files.
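
The cluster rule described above can be illustrated with a minimal Python sketch. It is not the processing code used for the thesis. It assumes each session is a (start, end) pair of timestamps and that the 10 minute gap is measured from the end of one session to the start of the next; the function name and the example timestamps are made up for illustration.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(minutes=10)  # threshold stated in this README


def group_into_clusters(sessions, max_gap=MAX_GAP):
    """Group (start, end) session tuples into clusters.

    A session joins the current cluster when it starts no more than
    `max_gap` after the previous session ended (assumed gap definition).
    """
    clusters = []
    for start, end in sorted(sessions):
        if clusters and start - clusters[-1][-1][1] <= max_gap:
            clusters[-1].append((start, end))
        else:
            clusters.append([(start, end)])
    return clusters


# Made-up timestamps, for illustration only.
t0 = datetime(2021, 1, 1, 9, 0)
example_sessions = [
    (t0, t0 + timedelta(minutes=5)),
    (t0 + timedelta(minutes=12), t0 + timedelta(minutes=18)),  # 7 min gap
    (t0 + timedelta(minutes=40), t0 + timedelta(minutes=45)),  # 22 min gap
]
example_clusters = group_into_clusters(example_sessions)
print([len(c) for c in example_clusters])  # prints [2, 1]
```

In this example the first two sessions form one cluster and the third session forms a cluster of its own. See the thesis for the exact definition used when the data was processed.
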

This dataset contains:

userData.CSV
------------
userid                                  Unique user identifier
date                                    Date of first use of app
cond.feedback                           Feedback condition
cond.difficulty                         Difficulty condition
duration.of.use                         Time between first and last usage in milliseconds
completed.sessions                      Count of completed sessions
tot.correct.presses                     Count of 'lifetime' correct presses
tot.incorrect.presses                   Count of 'lifetime' incorrect presses
tot.correct.sentences                   Count of 'lifetime' correct sentences
tot.incorrect.sentences                 Count of 'lifetime' incorrect sentences
prop.corr.press.lifetime                Proportion 'lifetime' correct presses
prop.corr.sentences.lifetime            Proportion 'lifetime' correct sentences
sentences.progress                      Count of unique phrases studied
sentences.testcount                     Count of completed tests
clusters                                Count of 'lifetime' completed clusters
revision.corr                           Count of correct revisions
revision.incorr                         Count of incorrect revisions
time.used.str                           Duration of use as a formatted string
individual.days.used                    Count of days the user has used the app
run.streak.tol.2                        Longest running streak in days with 2 day tolerance (see thesis for context)
run.streak.tol.3                        Longest running streak in days with 3 day tolerance (see thesis for context)
run.streak.tol.5                        Longest running streak in days with 5 day tolerance (see thesis for context)
run.streak.tol.7                        Longest running streak in days with 7 day tolerance (see thesis for context)

clusterData.CSV
---------------
date                                    Date of cluster completion
user.id                                 Unique user identifier
feedback                                Feedback condition
cluster.nr                              Sequential number identifying cluster for the user
cluster.id                              Unique cluster identifier in whole dataset
prop.correct.presses                    Proportion correct presses
prop.correct.sentences                  Proportion correct sentences
nr.sessions.in.cluster                  Count of sessions in the cluster
has.following.cluster                   Boolean of whether this was the last cluster the user completed or not.
duration.min                            Duration of the cluster in minutes
nr.correct.presses                      Count of correct presses in cluster
nr.incorrect.presses                    Count of incorrect presses in cluster

revisionData.CSV
----------------
date                                    Date of cluster completion
userid                                  Unique user identifier
feedback                                Feedback condition
difficulty                              Difficulty condition
cluster.nr                              Variable from parent cluster. See clusterData.csv
cluster.id                              Variable from parent cluster. See clusterData.csv
cluster.prop.correct.presses            Variable from parent cluster. See clusterData.csv
cluster.prop.correct.sentences          Variable from parent cluster. See clusterData.csv
session.nr.lifetime                     Variable from parent session. See sessionData.csv
session.id                              Variable from parent session. See sessionData.csv
sentence.id                             Variable from parent session. See sessionData.csv
session.prop.correct.presses            Variable from parent session. See sessionData.csv
session.prop.correct.sentences          Variable from parent session. See sessionData.csv
test.nr                                 Sequential number for revision and user.
first.test.successful                   Whether the user got the sentence right on the first try
same.cluster                            Same cluster
hours.between                           Hours between the current and previous revision
mins.into.cluster                       How many minutes into the cluster the revision was completed
prev.cluster.prop.correct.presses       Variable from the parent cluster of the revision before the current revision.
                                        See clusterData.csv
prev.cluster.tot.presses                Variable from the parent cluster of the revision before the current revision.
                                        See clusterData.csv
prev.cluster.prop.correct.sentences     Variable from the parent cluster of the revision before the current revision.
                                        See clusterData.csv
prev.cluster.tot.sentences              Variable from the parent cluster of the revision before the current revision.
                                        See clusterData.csv
prev.session.prop.correct.presses       Variable from the parent session of the revision before the current revision.
                                        See sessionData.csv
prev.session.tot.presses                Variable from the parent session of the revision before the current revision.
                                        See sessionData.csv
prev.session.prop.correct.sentences     Variable from the parent session of the revision before the current revision.
                                        See sessionData.csv
prev.session.tot.sentences              Variable from the parent session of the revision before the current revision.
                                        See sessionData.csv
rev.exposure.incorrect                  Count of how many incorrect tests were completed before the user got the sentence right.
prev.mins.into.cluster                  Minutes into cluster the previous revision was completed

sessionData.CSV
---------------
user.id                                 Unique user identifier
feedback                                Feedback condition
difficulty                              Difficulty condition
cluster.nr                              Variable from parent cluster. See clusterData.csv
cluster.id                              Variable from parent cluster. See clusterData.csv
session.nr                              Sequential number identifying session for the user
session.id                              Unique session identifier in whole dataset
prop.correct.presses                    Proportion correct presses in session
prop.correct.sentences                  Proportion correct sentences in session
nr.sessions.in.cluster                  Count of sessions in the parent cluster
has.following.session                   Boolean of whether the session was the last session in the cluster or not
nr.correct.sentences                    Count of the correct sentences in the session
nr.incorrect.sentences                  Count of the incorrect sentences in the session
nr.correct.presses                      Count of the correct presses in the session
nr.incorrect.presses                    Count of the incorrect presses in the session

Date of data collection: 2020-09-21 - 2021-03-31

Information about geographic location of data collection: Collected from mobile devices using an app and transmitted to a server at the University of Southampton. No location data was recorded on the mobile devices.

Date that the file was created: June, 2023
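
As an illustration of how the files relate, the following Python sketch counts the rows of sessionData per cluster.id and compares the result with the stored nr.sessions.in.cluster value. It is only a sketch under assumptions: it assumes comma-separated files with a header row using the column names listed above, and that every session of a cluster appears as one row in sessionData. The sample rows and the function name are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Made-up rows in the documented sessionData layout (subset of columns).
SAMPLE = """user.id,cluster.id,session.id,nr.sessions.in.cluster
u1,1,1,2
u1,1,2,2
u1,2,3,1
"""


def sessions_per_cluster(csv_file):
    """Count sessionData rows per cluster.id and collect the stored counts."""
    counted = Counter()
    stored = {}
    for row in csv.DictReader(csv_file):
        counted[row["cluster.id"]] += 1
        stored[row["cluster.id"]] = int(row["nr.sessions.in.cluster"])
    return dict(counted), stored


counted, stored = sessions_per_cluster(io.StringIO(SAMPLE))
mismatches = {cid for cid in counted if counted[cid] != stored[cid]}
print(mismatches or "all cluster session counts agree")
```

To run the same check on the real data, the io.StringIO(SAMPLE) argument could be replaced with an opened file, for example open("sessionData.csv", newline=""); the exact file name and its capitalisation on disk are assumptions.
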