READ ME File For 'Hand-set reward function values' Dataset

DOI: 10.5258/SOTON/D3628

Date that the file was created: April 2024

-------------------
GENERAL INFORMATION
-------------------

ReadMe Author: Eryn Rigley, University of Southampton

Date of data collection: 23/2/2024 - 25/4/2024

Information about geographic location of data collection: Collected online, based in Southampton

Related projects: Doctoral Thesis - Post-anthropocentric value alignment: beyond the human-set and human-centred

--------------------------
SHARING/ACCESS INFORMATION
--------------------------

Licenses/restrictions placed on the data, or limitations of reuse: CC-BY

Recommended citation for the data: Rigley, E., Chapman, A., Evers, C., McNeill, W. (2024) 'Hand-set reward function values'.

Links to other publicly accessible locations of the data: Post-anthropocentric value alignment: beyond the human-set and human-centred (Thesis to be published on Pure)

--------------------
DATA & FILE OVERVIEW
--------------------

This dataset contains:

Ethical_Machines_1-13_.xlsx (Participants' hand-set values for an autonomous system's actions in two case studies: scaring wildlife; delaying the delivery of medicine. These case studies are described in full in the Methodology Section of the Thesis.)

90430.A1_Participant_Info_Sheet.docx (Information Sheet given to participants.)

--------------------------
METHODOLOGICAL INFORMATION
--------------------------

Description of methods used for collection/generation of data:

The full methodological description is available in the Thesis. Thirteen PhD-level computer scientists and engineers were recruited to hand-set values to reward or punish an autonomous system (AS) for a series of actions. No training was provided, since the purpose of the study was to collect data on how real human programmers would assign moral values to state-action pairs based on their existing moral intuitions. The number of recruits is appropriate for this study: 13 sits at the higher end of sample sizes in existing manual-coding methodologies for moral problems, increasing the reliability of the results and the likelihood of capturing a wide range of moral intuitions across the target group. Computer science and engineering PhDs were targeted rather than philosophers, who are more commonly recruited in value-alignment data collection, because computer scientists and engineers are more likely to be the kinds of people who actually design AS.

The participants were first asked to self-identify their environmental moral valuations by selecting whether they cared about the environment 1) only in terms of what humans gain from it, 2) mostly in terms of what humans gain from it, 3) whether or not humans gain anything from it, or 4) not at all. Initial self-identification of moral salience or political views is common in research that extracts moral information via manual coders (e.g., Weber et al., 2018; Feinberg and Willer, 2015; Graham et al., 2009). This information can help to inform or explain any trends or outliers in the data.

Each participant then designed a reward function for an AS for the two case studies. For Case Study 1, the coders were asked to assign values to cases in which a UAV delayed delivery of life-saving medicine to a human, in 20-second intervals, with varying degrees of risk to the patient (e.g., delaying delivery by 60-80 seconds with a high risk of brain damage to the patient).
For Case Study 2, the coders were asked to assign values to cases in which a UAV scares away various percentages of seven species from the local area (e.g., 65% of elephants, 15% of wildebeests). The possible values to assign ranged from -50 to +50 on an 11-point scale (i.e., -50, -40, ..., +40, +50). N-point scales are commonly used in moral coding to provide a framework for participants (Boukes et al., 2020; Weber et al., 2018; Feinberg and Willer, 2015; Bowman et al., 2014; Graham et al., 2009; Reed et al., 2015).

This study was granted ethics approval by the University of Southampton Ethics Committee (ERGO II No. 90430.A1), and informed consent was obtained from all participants to collect these responses and to use them to build a reward function for an AS. The survey questions are included in Appendix C of the Thesis.

Methods for processing the data:

Raw data were collected using an online form (Microsoft Forms), then downloaded as an Excel spreadsheet and a CSV file and analysed using Python. An illustrative loading sketch is given at the end of this file.

--------------------------
DATA-SPECIFIC INFORMATION for Ethical_Machines_1-13_.xlsx
--------------------------

Number of cases/rows: 13 participant rows, with Mode and Standard Deviation values in the rows below.

Columns A-F: anonymised data about participants (e.g., time taken to complete the survey).
Column G: consent granted.
Columns H-AR: one column per question asked of participants.
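The Python sketch below (not part of the dataset) illustrates one way to load the spreadsheet and recompute the per-question mode and standard deviation from the participant rows. It assumes the first worksheet, a single header row, and that the 13 responses occupy the first 13 data rows with the summary rows beneath them, following the column description above; adjust the slicing if the actual file layout differs.

# Minimal, illustrative sketch only; layout assumptions noted above.
import pandas as pd

df = pd.read_excel("Ethical_Machines_1-13_.xlsx")   # or pd.read_csv(...) for the CSV export

responses = df.iloc[:13]             # 13 participant rows, excluding the Mode/SD summary rows
questions = responses.iloc[:, 7:44]  # columns H-AR: one column per survey question

# Recompute the summary statistics reported beneath the data.
modes = questions.mode().iloc[0]     # first modal value per question
std_devs = questions.std()           # standard deviation per question

summary = pd.DataFrame({"mode": modes, "standard_deviation": std_devs})
print(summary)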