READ ME File For 'L3 ITALIAN GENERIC SUBJECTS ACQUISITION DATA'

Dataset DOI: https://doi.org/10.5258/SOTON/D2844

ReadMe Author: ELEONORA BOGLIONI, University of Southampton, ORCID ID: 0000-0002-3165-5371

This dataset supports the thesis entitled Acquisition of Noun Phrases with Kind Reference in L3 Italian
DATE OF AWARD: 2024

DESCRIPTION OF THE DATA

Participants:
30 adult L1 Spanish–L2 English–L3 Italian learners, from Spain.
30 adult L1 English–L2 Spanish–L3 Italian learners, from England.
21 Italian native speakers, from Italy.
10 Spanish native speakers, from Spain.
10 English native speakers, from England.

The main participant IDs consist of six letters followed by a number, e.g. engspa1 or spaeng1.
- The first three letters refer to the participant's first language (e.g. eng for English in engspa1)
- The second three letters refer to the participant's second language (e.g. spa for Spanish in engspa1)
- The number refers to the order in which the individual was tested (i.e. 1 to 30)

The baseline participant IDs consist of three letters followed by a number, e.g. ita1 or eng1.
- The letters refer to the participant's first language (e.g. ita for Italian)
- The number refers to the order in which the individual was tested (i.e. 1 to 21 for the Italian baseline, 1 to 10 for the English and Spanish baselines)

This dataset consists of:
1. Sets of transcripts of video recordings (the video recordings themselves are not included, as they have not been anonymised)
2. Collated results of an Acceptability Judgments Task, a Form–to–Meaning Task and an Elicited Oral Production Task
3. Collected scores of L2-L3 proficiency and immersion
4. Summaries of biographical information about the main participants (L3 learners) and the native baselines

Details about the collection and format of each of these constituent parts are given below:

1. Video recording transcripts

Each of the main participants completed two Form–to–Meaning Tasks and two Elicited Oral Production Tasks. For each task, one version was in the L3 (Italian) and the other was in the L2 (Spanish or English).
Each of the baseline participants completed one Form–to–Meaning Task and one Elicited Oral Production Task, in the L1 (Italian, English or Spanish).

- Form–to–Meaning Task
Participants answered yes/no questions, using the language in which they were being tested. The purpose of this task was to assess the interpretation of generic subjects. A binary coding was applied to the yes/no answers for the statistical analysis.

- Elicited Oral Production Task
Participants had to complete an unfinished answer by uttering the sentence subject. This task tested the oral use of generic subjects. A binary coding was applied to the answers for the statistical analysis.

Video recordings were transcribed into .xlsx files using Microsoft Excel.
The transcripts are saved into two folders within the dataset:
- Form–to–Meaning Task transcripts (7 files)
- Elicited Oral Production Task transcripts (7 files)

2. Results of the Acceptability Judgments Task, Form–to–Meaning Task and Elicited Oral Production Task

- Acceptability Judgments Task
This task looked into the knowledge of subjects and objects with various generic meanings. The main participants took two versions of the task, one in the L3 (Italian) and one in the L2 (Spanish or English). The baseline participants took one version of the task only, in their L1 (Italian, English or Spanish).
Participants were presented with a series of written context stories, and had to rate two or three sentences (in isolation) as (un)acceptable continuations of the story. The sentences differed in the type of noun phrase deployed as subject. Participants expressed their judgments on a 1-to-4 point scale.
The task was administered via a web platform after the Elicited Oral Production and Form–to–Meaning Task recordings, in this order.
Results (ratings) for all participants were entered together into a csv file. Missing values were removed, so they were not included in the statistical analysis.

- Form–to–Meaning Task and Elicited Oral Production Task
Results (coded answers) for all participants were entered together into a csv file, for each of these two tasks.

The results are saved into 3 folders within the dataset:
- Acceptability Judgment Task csv files (8 files)
- Form–to–Meaning Task csv files (6 files)
- Elicited Oral Production Task (6 files)

3. Results of L2-L3 proficiency and immersion

Within the dataset, these documents are saved as .csv files, for a total of four files. They present the following information about the main (L3) participants:
- L3 proficiency raw scores (1 file)
- L2 proficiency raw scores (1 file)
- L3 immersion aggregate scores (1 file)
- L2 immersion aggregate scores (1 file)

In the .csv files, missing data are reported as NA values.
Proficiency data were collected online via GoogleDoc. Immersion data were also gathered through the web (Language History Questionnaire 3.0 platform), as explained below (point 4).

4. Summary of biographical information

The document about the main participants (L3 learners) summarises the relevant biographical facts (e.g., age) and the variables of interest in this study, such as L2 and L3 proficiency (obtained by means of C-Tests) and L2 and L3 immersion scores (obtained by means of the Language History Questionnaire, LHQ 3.0; Li, Zhang, Yu & Zhao, 2020).
The LHQ questionnaire was completed by all participants online. The information provided was downloaded as .xlsx files. Aggregate scores for language immersion were automatically calculated by the web platform, and entered into csv files for the statistical analysis.
The LHQ materials are available online: https://lhq-blclab.org/

The document about the baseline participants summarises the relevant biographical information and the L1 proficiency scores (obtained by means of C-Tests).
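
Because the proficiency and immersion .csv files report missing data as NA, users loading them outside a statistics package may want to convert NA to an explicit missing value before computing summaries. The following is a minimal sketch only, using Python's standard library. The column names (participant, l2_proficiency) and the sample values are hypothetical placeholders, not the actual headers or scores in the dataset.

```python
import csv
import io

# Hypothetical sample standing in for one of the proficiency csv files.
# The real column names and values in the dataset may differ.
SAMPLE = """participant,l2_proficiency
engspa1,31
engspa2,NA
spaeng1,27
"""


def read_scores(handle, id_col="participant", score_col="l2_proficiency"):
    """Return {participant ID: score or None}, treating 'NA' as missing."""
    scores = {}
    for row in csv.DictReader(handle):
        raw = row[score_col].strip()
        scores[row[id_col]] = None if raw in ("", "NA") else float(raw)
    return scores


scores = read_scores(io.StringIO(SAMPLE))

# Summaries should skip missing entries rather than count them as zero.
available = [s for s in scores.values() if s is not None]
mean_score = sum(available) / len(available)
```

To read a real file, pass an open file handle instead of the in-memory sample and adjust the two column names to match the file's header row.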
Date of data collection: August 2021 through July 2022

Information about geographic location of data collection: All the data were collected online via a web platform due to the COVID-19 pandemic.

Licence: Creative Commons Attribution 4.0 International

Related projects/Funders: Data collection was funded by the Language Learning Dissertation Grant Program, awarded in 2020. https://onlinelibrary.wiley.com/page/journal/14679922/homepage/grant_programs.htm

Related publication: None
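
The participant ID scheme described under DESCRIPTION OF THE DATA can be split into its parts programmatically, which may help when grouping rows by L1 or L2. The sketch below is illustrative only: the function name and the returned field names are assumptions made for this example, and it assumes IDs are written in lowercase exactly as in the examples (engspa1, spaeng1, ita1, eng1).

```python
import re

# Main (L3 learner) IDs: L1 code + L2 code + testing order, e.g. engspa1.
MAIN_ID = re.compile(r"^(eng|spa)(eng|spa)(\d+)$")
# Baseline IDs: L1 code + testing order, e.g. ita1 or eng1.
BASELINE_ID = re.compile(r"^(ita|eng|spa)(\d+)$")


def parse_participant_id(pid):
    """Split a participant ID into group, L1, L2 and testing order."""
    match = MAIN_ID.match(pid)
    if match:
        l1, l2, number = match.groups()
        if l1 == l2:
            raise ValueError(f"L1 and L2 cannot be the same: {pid!r}")
        return {"group": "main", "l1": l1, "l2": l2, "number": int(number)}
    match = BASELINE_ID.match(pid)
    if match:
        l1, number = match.groups()
        return {"group": "baseline", "l1": l1, "l2": None, "number": int(number)}
    raise ValueError(f"Unrecognised participant ID: {pid!r}")


main_example = parse_participant_id("engspa1")
baseline_example = parse_participant_id("ita21")
```

Here main_example describes an L1 English, L2 Spanish learner tested first, and baseline_example describes the twenty-first Italian native speaker.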