READ ME File For Data in support of the Southampton doctoral thesis 
'The effect of immediate feedback and difficulty level on learning and engagement in a Spanish learning mobile application'.

Dataset DOI: 10.5258/SOTON/D2673

ReadMe Author: Simon Kurt Erik Paul Jonsson, University of Southampton 

This dataset supports the thesis entitled: The effect of immediate feedback and difficulty level on learning and engagement in a Spanish learning mobile application
AWARDED BY: University of Southampton
DATE OF AWARD: 2023


DESCRIPTION OF THE DATA
The data was collected by the Lengua Spanish app in the form of timestamped
events, such as the app being started or the user pressing a button. The app automatically
transferred the data to a server at the University of Southampton, where it was stored
in a MySQL database. The database was then queried and the raw data was processed
into the four CSV files described below.

Short background:
The app that collected this data is a Spanish learning app in which users completed
study sessions. Several study sessions completed one after another, with no more than
10 minutes passing between consecutive sessions, were called a cluster. The user studied
Spanish phrases, and each phrase was learnt and then revised at regular
intervals. revisionData.csv contains data for each such revision.
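
The cluster rule above can be sketched in Python as follows. This is an
illustration only, not the code used to produce the dataset. It assumes the gap
is measured between consecutive session timestamps (the thesis defines exactly
which timestamps are compared). The function name assign_clusters and the
example timestamps are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: sessions belong to the same cluster when no more than
# 10 minutes pass between one session timestamp and the next.
MAX_GAP = timedelta(minutes=10)


def assign_clusters(session_times):
    """Return a 1-based cluster number for each time in an ascending list."""
    cluster_numbers = []
    current = 0
    previous = None
    for t in session_times:
        # Start a new cluster on the first session, or when the gap
        # since the previous session is longer than 10 minutes.
        if previous is None or t - previous > MAX_GAP:
            current += 1
        cluster_numbers.append(current)
        previous = t
    return cluster_numbers


# Hypothetical timestamps, for illustration only (not taken from the dataset).
times = [
    datetime(2020, 9, 21, 9, 0),
    datetime(2020, 9, 21, 9, 8),   # 8 minutes later: same cluster
    datetime(2020, 9, 21, 9, 18),  # 10 minutes later: still the same cluster
    datetime(2020, 9, 21, 9, 40),  # 22 minutes later: new cluster
]
clusters = assign_clusters(times)
```

With these example times, the first three sessions fall into one cluster and the
fourth starts a second one.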

See the thesis for further detail and context on these four CSV files.



This dataset contains:


userData.CSV
------------
userid                          Unique user identifier
date                            Date of first use of app
cond.feedback                   Feedback condition
cond.difficulty                 Difficulty condition
duration.of.use                 Time between first and last usage in milliseconds
completed.sessions              Count of completed sessions
tot.correct.presses             Count of 'lifetime' correct presses
tot.incorrect.presses           Count of 'lifetime' incorrect presses
tot.correct.sentences           Count of 'lifetime' correct sentences
tot.incorrect.sentences         Count of 'lifetime' incorrect sentences
prop.corr.press.lifetime        Proportion 'lifetime' correct presses
prop.corr.sentences.lifetime    Proportion 'lifetime' correct sentences
sentences.progress              Count of unique phrases studied
sentences.testcount             Count of completed tests
clusters                        Count of 'lifetime' completed clusters
revision.corr                   Count of correct revisions
revision.incorr                 Count of incorrect revisions
time.used.str                   Duration of use as a formatted string
individual.days.used            Count of days the user has used the app
run.streak.tol.2                Longest running streak in days with 2 day tolerance (see thesis for context)
run.streak.tol.3                Longest running streak in days with 3 day tolerance (see thesis for context)
run.streak.tol.5                Longest running streak in days with 5 day tolerance (see thesis for context)
run.streak.tol.7                Longest running streak in days with 7 day tolerance (see thesis for context)
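
As a reading aid, the Python sketch below shows one way to work with two kinds
of userData values: the millisecond duration in duration.of.use and the
proportion columns. It assumes that a proportion is the correct count divided by
the sum of the correct and incorrect counts. This README does not state that
formula, so check it against the thesis. The function names are hypothetical.

```python
from datetime import timedelta


def duration_from_ms(duration_ms):
    """Convert a duration.of.use value (milliseconds) to a timedelta."""
    # float() accepts both numbers and the strings read from a CSV file.
    return timedelta(milliseconds=float(duration_ms))


def proportion_correct(n_correct, n_incorrect):
    """Assumed formula: correct / (correct + incorrect); None if no attempts."""
    total = n_correct + n_incorrect
    if total == 0:
        return None
    return n_correct / total
```

For example, duration_from_ms("5400000") gives a timedelta of 1 hour 30 minutes.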



clusterData.CSV
---------------
date                            Date of cluster completion
user.id                         Unique user identifier
feedback                        Feedback condition
cluster.nr                      Sequential number identifying cluster for the user
cluster.id                      Unique cluster identifier in whole dataset
prop.correct.presses            Proportion correct presses
prop.correct.sentences          Proportion correct sentences
nr.sessions.in.cluster          Count of sessions in the cluster
has.following.cluster           Boolean of whether this was the last cluster the user completed or not.
duration.min                    Duration of the cluster in minutes
nr.correct.presses              Count of correct presses in cluster
nr.incorrect.presses            Count of incorrect presses in cluster



revisionData.CSV
----------------
date                                  Date of cluster completion
userid                                Unique user identifier
feedback                              Feedback condition
difficulty                            Difficulty condition
cluster.nr                            Variable from parent cluster. See clusterData.csv
cluster.id                            Variable from parent cluster. See clusterData.csv
cluster.prop.correct.presses          Variable from parent cluster. See clusterData.csv
cluster.prop.correct.sentences        Variable from parent cluster. See clusterData.csv
session.nr.lifetime                   Variable from parent session. See sessionData.csv
session.id                            Variable from parent session. See sessionData.csv
sentence.id                           Variable from parent session. See sessionData.csv
session.prop.correct.presses          Variable from parent session. See sessionData.csv
session.prop.correct.sentences        Variable from parent session. See sessionData.csv
test.nr                               Sequential number for revision and user.
first.test.successful                 Whether the user got the sentence right on the first try
same.cluster                          Same cluster
hours.between                         Hours between the current and previous revision
mins.into.cluster                     How many minutes into the cluster the revision was completed
prev.cluster.prop.correct.presses     Variable from the parent cluster of the revision before the current revision. See clusterData.csv
prev.cluster.tot.presses              Variable from the parent cluster of the revision before the current revision. See clusterData.csv
prev.cluster.prop.correct.sentences   Variable from the parent cluster of the revision before the current revision. See clusterData.csv
prev.cluster.tot.sentences            Variable from the parent cluster of the revision before the current revision. See clusterData.csv
prev.session.prop.correct.presses     Variable from the parent session of the revision before the current revision. See sessionData.csv
prev.session.tot.presses              Variable from the parent session of the revision before the current revision. See sessionData.csv
prev.session.prop.correct.sentences   Variable from the parent session of the revision before the current revision. See sessionData.csv
prev.session.tot.sentences            Variable from the parent session of the revision before the current revision. See sessionData.csv
rev.exposure.incorrect                Count of how many incorrect tests were completed before the user got the sentence right.
prev.mins.into.cluster                Minutes into the cluster at which the previous revision was completed


sessionData.CSV
---------------
user.id                     Unique user identifier
feedback                    Feedback condition
difficulty                  Difficulty condition
cluster.nr                  Variable from parent cluster. See clusterData.csv
cluster.id                  Variable from parent cluster. See clusterData.csv
session.nr                  Sequential number identifying session for the user
session.id                  Unique session identifier in whole dataset
prop.correct.presses        Proportion correct presses in session
prop.correct.sentences      Proportion correct sentences in session
nr.sessions.in.cluster      Count of sessions in the parent cluster
has.following.session       Boolean of whether the session was the last session in the cluster or not
nr.correct.sentences        Count of the correct sentences in the session
nr.incorrect.sentences      Count of the incorrect sentences in the session
nr.correct.presses          Count of the correct presses in the session
nr.incorrect.presses        Count of the incorrect presses in the session
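
Because sessionData and clusterData share the cluster.id column, session rows
can be linked to their parent cluster. The Python sketch below shows one
possible way to do this with the standard library. The sample rows, the
identifier values and the output column name cluster.duration.min are made up
for illustration and do not come from the dataset.

```python
import csv
import io

# Made-up rows containing only a few of the documented columns.
CLUSTER_CSV = """cluster.id,user.id,duration.min
c1,u1,12.5
c2,u1,4.0
"""
SESSION_CSV = """session.id,cluster.id,prop.correct.presses
s1,c1,0.80
s2,c1,0.90
s3,c2,0.75
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_cluster(sessions, clusters):
    """Add the parent cluster's duration.min to every session row."""
    by_id = {c["cluster.id"]: c for c in clusters}
    joined = []
    for s in sessions:
        parent = by_id.get(s["cluster.id"], {})
        row = dict(s)
        # Hypothetical output column; None if the parent cluster is missing.
        row["cluster.duration.min"] = parent.get("duration.min")
        joined.append(row)
    return joined


joined = attach_cluster(read_rows(SESSION_CSV), read_rows(CLUSTER_CSV))
```

To use the real files, pass an opened file to csv.DictReader in place of the
io.StringIO object.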



Date of data collection: 2020-09-21 - 2021-03-31

Information about geographic location of data collection:
Collected from mobile devices using an app and transmitted to a server at the
University of Southampton. No location data was recorded on the mobile devices.


Date that the file was created: June, 2023