READ ME File For 'Data for Thesis: Cognitive Processes in Writing-to-Learn'

Dataset DOI: https://doi.org/10.5258/SOTON/D3851

ReadMe Author: Amy Peters, University of Southampton, https://orcid.org/0000-0001-9833-7709

This dataset supports the thesis entitled 'Cognitive Processes in Writing-to-Learn'
AWARDED BY: University of Southampton
DATE OF AWARD: 2026

Data collection: 2020 - 2024

Licence: CC BY-NC-SA

Funder: ESRC South Coast Doctoral Training Partnership

--------------------
DATA & FILE OVERVIEW
--------------------

PAPER 1

- Paper1_Data.csv
Data for Paper 1. Data was analysed using IBM SPSS (Version 27.0.1.0 for Macintosh).

Variables
- ParticipantNumber
- PlanningCondition: Planning condition participant was randomly allocated to (1 = Outline; 2 = Synthetic)
- WritingCondition: Writing condition participant was randomly allocated to (1 = Essay; 2 = Free Recall)
- MCQScore_Total: Total score on multiple-choice questions (MCQs)
- MCQScore_Direct: Total score on direct MCQs
- MCQScore_Inferential: Total score on inferential MCQs
- SAQScore: Total score on short-answer questions (SAQs)
- Knowledge1_Average: Mean of subjective knowledge rating 1 (pre-writing)
- Knowledge2_Average: Mean of subjective knowledge rating 2 (post-writing)
- Knowledge3_Average: Mean of subjective knowledge rating 3 (two days later)
- Understanding1_Average: Mean of understanding sub-scale subjective knowledge rating 1 (pre-writing)
- Organisation1_Average: Mean of organisation sub-scale subjective knowledge rating 1 (pre-writing)
- Understanding2_Average: Mean of understanding sub-scale subjective knowledge rating 2 (post-writing)
- Organisation2_Average: Mean of organisation sub-scale subjective knowledge rating 2 (post-writing)
- Understanding3_Average: Mean of understanding sub-scale subjective knowledge rating 3 (two days later)
- Organisation3_Average: Mean of organisation sub-scale subjective knowledge rating 3 (two days later)

PAPER 2

- Paper2_Data.zip (Paper2_Experiment1_Data.csv and Paper2_Experiment2_Data.csv)
Data for Paper 2, Experiments 1 and 2. Data was analysed using IBM SPSS (Version 27.0.1.0 for Macintosh).

Paper 2, Experiment 1
Variables
- ParticipantNumber
- TextCondition: Text topic participant was randomly allocated to (0 = Solar Activity; 1 = Deserts)
- QuestionCondition: Test condition participant was randomly allocated to (1 = No Text; 2 = Text Absent; 3 = Text Present)
- Condition: Combined condition (1 = Solar, No Text; 2 = Solar, Text Absent; 3 = Solar, Text Present; 4 = Deserts, No Text; 5 = Deserts, Text Absent; 6 = Deserts, Text Present)
- Include_in_MCQ_Analysis: Whether participant was included in analysis of MCQ scores (0 = No; 1 = Yes)
- SolarActivity_MCQ_Total: Total score on the solar activity multiple-choice questions (MCQs)
- SolarActivity_MCQ_DirectTotal: Total score on the solar activity direct MCQs
- SolarActivity_MCQ_InferentialTotal: Total score on the solar activity inferential MCQs
- Deserts_MCQ_Total: Total score on the deserts MCQs
- Deserts_MCQ_DirectTotal: Total score on the deserts direct MCQs
- Deserts_MCQ_InferentialTotal: Total score on the deserts inferential MCQs
- MCQTotalScore: Total score on MCQs
- MCQDirectTotalScore: Total score on direct MCQs
- MCQInferentialTotalScore: Total score on inferential MCQs
- Include_in_SAQ_Analysis: Whether participant was included in analysis of SAQ scores (0 = No; 1 = Yes)
- SolarActivity_SAQTotal: Total score on the solar activity short-answer questions (SAQs)
- Deserts_SAQTotal: Total score on the deserts SAQs
- SAQTotal: Total score on SAQs

Note: Cells containing '#NULL!' indicate missing data.
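
When Paper2_Experiment1_Data.csv is read outside SPSS, the '#NULL!' marker has to be converted to a missing value before any arithmetic is attempted. The following is a minimal Python sketch using only the standard library. It assumes the file has a header row containing the variable names listed above and that all other cells are numeric. The helper names (parse_cell, load_rows) and the sample values are hypothetical and are not taken from the dataset. Treating empty cells as missing is also an assumption.

```python
import csv
import io

MISSING = "#NULL!"  # missing-data marker described in the note above


def parse_cell(value):
    """Return None for missing cells, otherwise the cell as a float."""
    # Assumption: empty cells are treated the same way as '#NULL!'.
    if value is None or value.strip() in ("", MISSING):
        return None
    return float(value)


def load_rows(handle):
    """Read CSV rows as dicts keyed by the header names."""
    return [
        {name: parse_cell(value) for name, value in raw.items()}
        for raw in csv.DictReader(handle)
    ]


# Made-up sample rows for illustration only (not real data).
sample = io.StringIO(
    "ParticipantNumber,Include_in_MCQ_Analysis,MCQTotalScore\n"
    "1,1,12\n"
    "2,0,#NULL!\n"
)
rows = load_rows(sample)

# Keep only participants flagged for inclusion in the MCQ analysis.
mcq_rows = [r for r in rows if r["Include_in_MCQ_Analysis"] == 1]
```

To use this with the real file, replace the sample with a file handle such as open("Paper2_Experiment1_Data.csv", newline="").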

Paper 2, Experiment 2
Variables
- ParticipantNumber
- TextCondition: Text topic participant was randomly allocated to (1 = Solar Activity; 2 = Deserts)
- WritingCondition: Writing condition participant was randomly allocated to (1 = Written Essay; 2 = Mental Essay; 3 = Written Recall; 4 = Mental Recall)
- TextWritingCondition: Combined condition (1 = Solar Activity/Written Essay; 2 = Solar Activity/Mental Essay; 3 = Solar Activity/Written Recall; 4 = Solar Activity/Mental Recall; 5 = Deserts/Written Essay; 6 = Deserts/Mental Essay; 7 = Deserts/Written Recall; 8 = Deserts/Mental Recall)
- MCQTotalScore: Total score on multiple-choice questions (MCQs)
- MCQDirectScore: Total score on direct MCQs
- MCQInferentialScore: Total score on inferential MCQs
- SAQTotalScore: Total score on short-answer questions (SAQs)
- Knowledge1Average: Mean of subjective knowledge rating 1 (pre-writing)
- Knowledge2Average: Mean of subjective knowledge rating 2 (post-writing)
- Knowledge3Average: Mean of subjective knowledge rating 3 (two days later)
- MCQ_PercentageCorrect: Total score on MCQs as a percentage
- MCQDirect_PercentageCorrect: Total score on direct MCQs as a percentage
- MCQInferential_PercentageCorrect: Total score on inferential MCQs as a percentage
- SAQ_PercentageCorrect: Total score on SAQs as a percentage
- ModalityCondition: Writing modality condition (0 = Mental; 1 = Written)
- ActivityCondition: Writing activity condition (0 = Recall; 1 = Essay)
- prepostknowledgechange: Change in subjective knowledge rating from pre-writing to post-writing
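
The coded condition variables for Experiment 2 are related to one another, which allows a simple consistency check on the file. The Python sketch below shows how TextWritingCondition, ModalityCondition and ActivityCondition could be reconstructed from TextCondition and WritingCondition. The arithmetic is inferred from the code lists above and is not necessarily how the variables were originally computed. The function names are hypothetical.

```python
# Code labels copied from the variable descriptions above.
WRITING_LABELS = {
    1: "Written Essay",
    2: "Mental Essay",
    3: "Written Recall",
    4: "Mental Recall",
}


def combined_condition(text_condition, writing_condition):
    """Map TextCondition (1-2) and WritingCondition (1-4) to codes 1-8."""
    # Inferred rule: Solar Activity covers codes 1-4, Deserts covers 5-8.
    return (text_condition - 1) * 4 + writing_condition


def modality_and_activity(writing_condition):
    """Return (ModalityCondition, ActivityCondition) for a WritingCondition code."""
    label = WRITING_LABELS[writing_condition]
    modality = 1 if label.startswith("Written") else 0  # 0 = Mental; 1 = Written
    activity = 1 if label.endswith("Essay") else 0      # 0 = Recall; 1 = Essay
    return modality, activity
```

For example, a participant with TextCondition 2 (Deserts) and WritingCondition 3 (Written Recall) should have TextWritingCondition 7, ModalityCondition 1 and ActivityCondition 0.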

PAPER 3

- Paper3_Data.zip (Paper3_Experiment1_Data.csv and Paper3_Experiment2_Data.csv)
Data for Paper 3, Experiments 1 and 2. Analysis was conducted in Python (Version 2.7.18) using the HDDM library (Wiecki et al., 2022). 

Variables
- Participant: Participant number
- Block: Experimental block number
- Condition: Experimental block production condition (spoken or written)
- Text: Text topic for experimental block
- Word: Target word (stimulus)
- Response: Response given by participant ('k' if they believed it was an in-text word; 'l' if they believed it was a non-text word)
- CorrectResponse: Response that should have been provided ('K' = in-text word; 'L' = non-text word)
- Correct: 1 = participant provided correct answer; 0 = participant provided incorrect answer
- ReactionTime: Reaction time for response in milliseconds

PAPER 4

- Paper4_Data.csv
Data for Paper 4. Data was analysed using IBM SPSS (Version 29.0.2.0 for Macintosh), Bayesian hierarchical drift diffusion modelling (HDDM) in Python (Version 2.7.18), and R (Version 4.4.2) with the psych package (Revelle, 2024).

Variables
- ParticipantNumber
- Writing_Condition: Condition participant was randomly allocated to (0 = Recall; 1 = Synthesis)
- Knowledge1Average: Mean of subjective knowledge rating 1 (prior knowledge), items 1-12
- Knowledge2Average: Mean of subjective knowledge rating 2 (pre-writing), items 1-12
- Knowledge3Average: Mean of subjective knowledge rating 3 (post-writing), items 1-12
- BehaviourTherapyMCQTotalScore: Total score of eight multiple-choice questions (MCQs) for behaviour therapy topic
- BehaviourTherapyMCQDirectTotalScore: Total score of four direct MCQs for behaviour therapy topic
- BehaviourTherapyMCQInferentialTotalScore: Total score of four inferential MCQs for behaviour therapy topic
- PsychoanalysisMCQTotalScore: Total score of eight MCQs for psychoanalysis topic
- PsychoanalysisMCQDirectTotalScore: Total score of four direct MCQs for psychoanalysis topic
- PsychoanalysisMCQInferentialTotalScore: Total score of four inferential MCQs for psychoanalysis topic
- MCQTotalScore: Total score of sixteen MCQs
- MCQDirectTotalScore: Total score of eight direct MCQs
- MCQInferentialTotalScore: Total score of eight inferential MCQs
- BehaviourTherapySAQTotalScore: Total score of four short-answer questions (SAQs) for behaviour therapy topic
- PsychoanalysisSAQTotalScore: Total score of four SAQs for psychoanalysis topic
- SAQTotalScore: Total score of eight SAQs
- Global_Linearity: Factor score for global linearity writing process measure
- Sentence_Production: Factor score for sentence production writing process measure
- Change_in_understanding_2_3: Calculated by subtracting Knowledge2Average from Knowledge3Average
- Global_Linearity_without_TMI: Factor score for global linearity writing process measure excluding text modification index (TMI)
- Sentence_Production_without_TMI: Factor score for sentence production writing process measure excluding TMI
- MCQTotalScore_PercentageCorrect: Total MCQ score as a percentage
- MCQTotalScore_Standardised: Total MCQ score as a decimal
- MCQTotalScore_BehaviourTherapy_PercentageCorrect: Total MCQ score on behaviour therapy topic as a percentage
- MCQTotalScore_BehaviourTherapy_Standardised: Total MCQ score on behaviour therapy topic as a decimal
- MCQTotalScore_Psychoanalysis_PercentageCorrect: Total MCQ score on psychoanalysis topic as a percentage
- MCQTotalScore_Psychoanalysis_Standardised: Total MCQ score on psychoanalysis topic as a decimal
- DirectMCQTotalScore_PercentageCorrect: Total direct MCQ score as a percentage
- DirectMCQTotalScore_Standardised: Total direct MCQ score as a decimal
- DirectMCQTotalScore_BehaviourTherapy_PercentageCorrect: Total direct MCQ score on behaviour therapy topic as a percentage
- DirectMCQTotalScore_BehaviourTherapy_Standardised: Total direct MCQ score on behaviour therapy topic as a decimal
- DirectMCQTotalScore_Psychoanalysis_PercentageCorrect: Total direct MCQ score on psychoanalysis topic as a percentage
- DirectMCQTotalScore_Psychoanalysis_Standardised: Total direct MCQ score on psychoanalysis topic as a decimal
- InferentialMCQTotalScore_PercentageCorrect: Total inferential MCQ score as a percentage 
- InferentialMCQTotalScore_Standardised: Total inferential MCQ score as a decimal
- InferentialMCQTotalScore_BehaviourTherapy_PercentageCorrect: Total inferential MCQ score on behaviour therapy topic as a percentage
- InferentialMCQTotalScore_BehaviourTherapy_Standardised: Total inferential MCQ score on behaviour therapy topic as a decimal
- InferentialMCQTotalScore_Psychoanalysis_PercentageCorrect: Total inferential MCQ score on psychoanalysis topic as a percentage
- InferentialMCQTotalScore_Psychoanalysis_Standardised: Total inferential MCQ score on psychoanalysis topic as a decimal
- SAQTotalScore_PercentageScore: Total SAQ score as a percentage
- SAQTotalScore_Standardised: Total SAQ score as a decimal 
- SAQTotalScore_BehaviourTherapy_PercentageScore: Total SAQ score on behaviour therapy topic as a percentage
- SAQTotalScore_BehaviourTherapy_Standardised: Total SAQ score on behaviour therapy topic as a decimal
- SAQTotalScore_Psychoanalysis_PercentageScore: Total SAQ score on psychoanalysis topic as a percentage 
- SAQTotalScore_Psychoanalysis_Standardised: Total SAQ score on psychoanalysis topic as a decimal
- RT_t_parameter_mean: Mean estimate of the t (non-decision time) parameter from the winning HDDM model
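
Several of the Paper 4 variables are derived from others in the same file. The Python sketch below shows how the MCQ percentage and decimal scores and Change_in_understanding_2_3 could be recomputed as a check. It assumes that each MCQ is worth one point, so that the maximum MCQTotalScore is 16, and that the 'Standardised' variables are proportions of the maximum score. Neither assumption is confirmed by this file, and the function names are hypothetical.

```python
MCQ_MAX_TOTAL = 16  # assumption: one point for each of the sixteen MCQs


def percentage_correct(score, max_score):
    """Score as a percentage of the maximum possible score."""
    return 100.0 * float(score) / max_score


def proportion_correct(score, max_score):
    """Score as a decimal (proportion) of the maximum possible score."""
    return float(score) / max_score


def change_in_understanding(knowledge2_average, knowledge3_average):
    """Knowledge3Average minus Knowledge2Average, as described above."""
    return knowledge3_average - knowledge2_average
```

For example, under these assumptions an MCQTotalScore of 12 corresponds to 75 as a percentage and 0.75 as a decimal.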
