Dataset - Understanding the software and data used in the social sciences
Dataset - Understanding the software and data used in the social sciences
This is a repository for a UKRI Economic and Social Research Council (ESRC) funded project to understand the software used to analyse social sciences data. Any software produced has been made available under a BSD 2-Clause license and any data and other non-software derivative is made available under a CC-BY 4.0 International License. Note that the software that analysed the survey is provided for illustrative purposes - it will not work on the decoupled anonymised data set. Exceptions to this are: Data from the UKRI ESRC is mostly made available under a CC BY-NC-SA 4.0 Licence. Data from Gateway to Research is made available under an Open Government Licence (Version 3.0). Contents Survey data & analysis: esrc_data-survey-analysis-data.zip Other data: esrc_data-other-data.zip Transcripts: esrc_data-transcripts.zip Data Management Plan: esrc_data-dmp.zip Survey data & analysis The survey ran from 3rd February 2022 to 6th March 2023 during which 168 responses were received. Of these responses, three were removed because they were supplied by people from outside the UK without a clear indication of involvement with the UK or associated infrastructure. A fourth response was removed as both came from the same person which leaves us with 164 responses in the data. The survey responses, Question (Q) Q1-Q16, have been decoupled from the demographic data, Q17-Q23. Questions Q24-Q28 are for follow-up and have been removed from the data. The institutions (Q17) and funding sources (Q18) have been provided in a separate file as this could be used to identify respondents. Q17, Q18 and Q19-Q23 have all been independently shuffled. The data has been made available as Comma Separated Values (CSV) with the question number as the header of each column and the encoded responses in the column below. To see what the question and the responses correspond to you will have to consult the survey-results-key.csv which decodes the question and responses accordingly. A pdf copy of the survey questions is available on GitHub. The survey data has been decoupled into: survey-results-key.csv - maps a question number and the responses to the actual question values. q1-16-survey-results.csv- the non-demographic component of the survey responses (Q1-Q16). q19-23-demographics.csv - the demographic part of the survey (Q19-Q21, Q23). q17-institutions.csv - the institution/location of the respondent (Q17). q18-funding.csv - funding sources within the last 5 years (Q18). Please note the code that has been used to do the analysis will not run with the decoupled survey data. Other data files included CleanedLocations.csv - normalised version of the institutions that the survey respondents volunteered. DTPs.csv - information on the UKRI Doctoral Training Partnerships (DTPs) scaped from the UKRI DTP contacts web page in October 2021. projectsearch-1646403729132.csv.gz - data snapshot from the UKRI Gateway to Research released on the 24th February 2022 made available under an Open Government Licence. locations.csv - latitude and longitude for the institutions in the cleaned locations. subjects.csv - research classifications for the ESRC projects for the 24th February data snapshot. topics.csv - topic classification for the ESRC projects for the 24th February data snapshot. Interview transcripts The interview transcripts have been anonymised and converted to markdown so that it's easier to process in general. List of interview transcripts: 1269794877.md 1578450175.md 1792505583.md 2964377624.md 3270614512.md 40983347262.md 4288358080.md 4561769548.md 4938919540.md 5037840428.md 5766299900.md 5996360861.md 6422621713.md 6776362537.md 7183719943.md 7227322280.md 7336263536.md 75909371872.md 7869268779.md 8031500357.md 9253010492.md Data Management Plan The study's Data Management Plan is provided in PDF format and shows the different data sets used throughout the duration of the study and where they have been deposited, as well as how long the SSI will keep these records.
Chue Hong, Neil
0e045883-072e-41f6-ad01-9cf2fe5b00dc
Aragon, Selina
340b831b-220a-422b-b916-15f45644afb8
Antonioletti, Mario
0444d546-55ba-4131-95c5-e1eb00262f77
Walker, Johanna
b642943f-9aa9-4652-b12f-2a0c2beae1c9
Chue Hong, Neil
0e045883-072e-41f6-ad01-9cf2fe5b00dc
Aragon, Selina
340b831b-220a-422b-b916-15f45644afb8
Antonioletti, Mario
0444d546-55ba-4131-95c5-e1eb00262f77
Walker, Johanna
b642943f-9aa9-4652-b12f-2a0c2beae1c9
Abstract
This is a repository for a UKRI Economic and Social Research Council (ESRC) funded project to understand the software used to analyse social sciences data. Any software produced has been made available under a BSD 2-Clause license and any data and other non-software derivative is made available under a CC-BY 4.0 International License. Note that the software that analysed the survey is provided for illustrative purposes - it will not work on the decoupled anonymised data set. Exceptions to this are: Data from the UKRI ESRC is mostly made available under a CC BY-NC-SA 4.0 Licence. Data from Gateway to Research is made available under an Open Government Licence (Version 3.0). Contents Survey data & analysis: esrc_data-survey-analysis-data.zip Other data: esrc_data-other-data.zip Transcripts: esrc_data-transcripts.zip Data Management Plan: esrc_data-dmp.zip Survey data & analysis The survey ran from 3rd February 2022 to 6th March 2023 during which 168 responses were received. Of these responses, three were removed because they were supplied by people from outside the UK without a clear indication of involvement with the UK or associated infrastructure. A fourth response was removed as both came from the same person which leaves us with 164 responses in the data. The survey responses, Question (Q) Q1-Q16, have been decoupled from the demographic data, Q17-Q23. Questions Q24-Q28 are for follow-up and have been removed from the data. The institutions (Q17) and funding sources (Q18) have been provided in a separate file as this could be used to identify respondents. Q17, Q18 and Q19-Q23 have all been independently shuffled. The data has been made available as Comma Separated Values (CSV) with the question number as the header of each column and the encoded responses in the column below. To see what the question and the responses correspond to you will have to consult the survey-results-key.csv which decodes the question and responses accordingly. A pdf copy of the survey questions is available on GitHub. The survey data has been decoupled into: survey-results-key.csv - maps a question number and the responses to the actual question values. q1-16-survey-results.csv- the non-demographic component of the survey responses (Q1-Q16). q19-23-demographics.csv - the demographic part of the survey (Q19-Q21, Q23). q17-institutions.csv - the institution/location of the respondent (Q17). q18-funding.csv - funding sources within the last 5 years (Q18). Please note the code that has been used to do the analysis will not run with the decoupled survey data. Other data files included CleanedLocations.csv - normalised version of the institutions that the survey respondents volunteered. DTPs.csv - information on the UKRI Doctoral Training Partnerships (DTPs) scaped from the UKRI DTP contacts web page in October 2021. projectsearch-1646403729132.csv.gz - data snapshot from the UKRI Gateway to Research released on the 24th February 2022 made available under an Open Government Licence. locations.csv - latitude and longitude for the institutions in the cleaned locations. subjects.csv - research classifications for the ESRC projects for the 24th February data snapshot. topics.csv - topic classification for the ESRC projects for the 24th February data snapshot. Interview transcripts The interview transcripts have been anonymised and converted to markdown so that it's easier to process in general. List of interview transcripts: 1269794877.md 1578450175.md 1792505583.md 2964377624.md 3270614512.md 40983347262.md 4288358080.md 4561769548.md 4938919540.md 5037840428.md 5766299900.md 5996360861.md 6422621713.md 6776362537.md 7183719943.md 7227322280.md 7336263536.md 75909371872.md 7869268779.md 8031500357.md 9253010492.md Data Management Plan The study's Data Management Plan is provided in PDF format and shows the different data sets used throughout the duration of the study and where they have been deposited, as well as how long the SSI will keep these records.
This record has no associated files available for download.
More information
Published date: 30 March 2023
Identifiers
Local EPrints ID: 484714
URI: http://eprints.soton.ac.uk/id/eprint/484714
PURE UUID: 49fbabdd-105c-43ec-b196-947cd561a288
Catalogue record
Date deposited: 20 Nov 2023 17:45
Last modified: 20 Nov 2023 17:45
Export record
Altmetrics
Contributors
Contributor:
Neil Chue Hong
Contributor:
Selina Aragon
Contributor:
Mario Antonioletti
Contributor:
Johanna Walker
Download statistics
Downloads from ePrints over the past year. Other digital versions may also be available to download e.g. from the publisher's website.
View more statistics