README file for 'SPANISH-ENGLISH BILINGUALS VIEWPOINT ASPECT DEVELOPMENT DATA'

Dataset DOI: 10.5258/SOTON/D2024

ReadMe Author: JAMES CORBET, University of Southampton, ORCID ID: 0000-0003-4660-9551

This dataset supports the thesis entitled Heritage Language Acquisition under Feature Reassembly: The development of Spanish viewpoint aspect morphology by heritage speakers in the United Kingdom.

AWARDED BY: University of Southampton
DATE OF AWARD: 2021

DESCRIPTION OF THE DATA

Participants:
- 25 Spanish-English bilingual children in the UK, from Argentina, Colombia, Mexico and Spain.
- 16 Spanish L1 English L2 adults in the UK, parents of the children.

Participant IDs consist of two letters and two numbers, e.g. MC01 or SP15:
- The first letter reflects the country of origin of the Spanish-speaking parent (i.e. A, C, M, or S for Argentina, Colombia, Mexico, or Spain)
- The second letter reflects the group (i.e. C for Child, or P for Parent)
- The numbers reflect the order of recording of the individuals in the data (i.e. 01 to 25 for the children; 01 to 16 for their parents)

This dataset consists of:
1. Sets of transcripts of audio recordings (as the audio recordings themselves have not been anonymised)
2. Collated results of a semantic interpretation task
3. A summary of biographic information about the children in the study.

Details about the collection and format of each of these constituent parts are covered below:

1. Audio recording transcripts

Each child completed two narrative retelling tasks (one in both English and Spanish, and one in Spanish only), and the parents also completed one of the narrative retelling tasks. The tasks were:

- Loch Ness Story
  Children narrated this story in both Spanish (at the start of the session) and English (at the end of the session). The sole purpose of this narrative retelling was to extract a speech rate measure for each language.
  Calculations of these speech rates are included on the transcripts.

- Cat Story
  Both the children and their parents narrated this story in Spanish only. The narrative sets up a wide range of contexts in which the use of past tense perfective and imperfective morphology is expected.

Audio recordings were transcribed into .txt files using Microsoft Notepad and should be accessible using any text editor.

The transcripts are saved into four folders within the dataset:
- Loch Ness story transcripts English (24 files, lacking MC01)
- Loch Ness story transcripts Spanish (25 files)
- Cat story transcripts children (25 files)
- Cat story transcripts parents (15 files, lacking MP01)

2. Results of the semantic interpretation task

The semantic interpretation task was based on a similar task used by Domínguez et al. 2013 (https://doi.org/10.1017/S1366728912000363) and Domínguez, Arche and Myles 2017 (https://doi.org/10.1177/0267658317701991).

In this task, parents and children were presented with a series of contexts in audio and written form, and were then asked to rate two summaries or continuations of the context which differed in verbal morphology. The task was delivered using the iSurvey platform between the Spanish and English recordings, and the results for each participant were entered together into a CSV file.

Each rating is coded as either 1 or -1 (accept or reject) and is then recoded as 0 or 1 (accurate or inaccurate), depending on the form-context pair. The letter 'm' indicates missing data in this CSV file.

3. Summary of biographical information

This document summarises the variables of interest taken from Sharon Unsworth's Bilingual Language Experience Calculator (BiLEC), and other information such as speech rate derived from the children's narrative retellings.
The BiLEC questionnaire was completed by parents, whose answers were entered into an Excel spreadsheet (.xlsx) that automatically calculated a series of measures of input and output in each language for each child. The BiLEC task materials and manual are available online: https://www.iris-database.org/iris/app/home/detail?id=york%3a928327

Speech rates in Spanish and English were calculated using the recorded samples, and summaries of the calculations are provided on the relevant transcripts.

The summary document is a spreadsheet (.csv).

Date of data collection: November 2019 to November 2020

Information about geographic location of data collection: Most families lived in Southampton, Winchester and surrounding villages, but two families lived in London. Approximately three quarters of the data was collected in person, with the remaining quarter collected online via Microsoft Teams due to the COVID-19 pandemic.

Licence: Creative Commons Attribution 4.0 International

Related projects/Funders: This project was funded by the ESRC South Coast Doctoral Training Partnership, grant number 1947508. www.southcoastdtp.ac.uk

Related publication: Corbet, J., and Domínguez, L. 2020. The Comprehension of Spanish Tense-Aspect Morphology by Spanish Heritage Speakers in the United Kingdom. Languages 5(4), no. 46. https://doi.org/10.3390/languages5040046
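
Example of decoding participant IDs and rating codes: The sketch below is not part of the dataset. It is a minimal illustration, in Python, of how the participant ID scheme (e.g. MC01, SP15) and the raw rating codes (1, -1, 'm') described in this README might be decoded when working with the files. The function and class names (parse_participant_id, decode_rating, Participant) are hypothetical, and the sketch assumes that IDs and ratings appear in the files exactly as described above. It makes no assumption about the column layout of the CSV files.

```python
import re
from typing import NamedTuple, Optional

# Code tables taken from the participant ID description in this README.
COUNTRIES = {"A": "Argentina", "C": "Colombia", "M": "Mexico", "S": "Spain"}
GROUPS = {"C": "Child", "P": "Parent"}
# Highest recording number per group: 01-25 for children, 01-16 for parents.
MAX_NUMBER = {"C": 25, "P": 16}

# Country letter, group letter, then exactly two digits.
_ID_PATTERN = re.compile(r"([ACMS])([CP])([0-9]{2})")


class Participant(NamedTuple):
    """Hypothetical container for a decoded participant ID."""
    country: str
    group: str
    number: int


def parse_participant_id(pid: str) -> Participant:
    """Decode an ID such as 'MC01' into country, group and recording number."""
    match = _ID_PATTERN.fullmatch(pid.strip().upper())
    if match is None:
        raise ValueError(f"Not a valid participant ID: {pid!r}")
    country_code, group_code, digits = match.groups()
    number = int(digits)
    if not 1 <= number <= MAX_NUMBER[group_code]:
        raise ValueError(f"Recording number out of range in ID: {pid!r}")
    return Participant(COUNTRIES[country_code], GROUPS[group_code], number)


def decode_rating(cell: str) -> Optional[bool]:
    """Map a raw rating cell to True (accept), False (reject) or None (missing)."""
    value = cell.strip()
    if value == "m":  # 'm' marks missing data
        return None
    if value == "1":
        return True
    if value == "-1":
        return False
    raise ValueError(f"Unexpected rating code: {cell!r}")
```

For instance, parse_participant_id("SP15") describes the fifteenth recorded parent from a family whose Spanish-speaking parent is from Spain. The accuracy recoding (0 or 1) is deliberately left out of the sketch, because it depends on the form-context pair and that mapping is not specified in this README.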