University of Southampton Institutional Repository

Labeling post‐storm coastal imagery for machine learning: measurement of inter‐rater agreement


Goldstein, Evan B.
25029a23-b5b1-4e08-9273-a62343cc8ec3
Buscombe, Daniel
7b779813-a764-4955-b86c-6aa4774a1cfe
Lazarus, Eli
642a3cdb-0d25-48b1-8ab8-8d1d72daca6e
Mohanty, Somya D.
0036fffa-3bfa-455d-9277-28b649ca0893
Rafique, Shah Nafis
4b8da1aa-61a6-4de4-9e2a-aa020679366b
Anarde, Katherine A.
1bb673c3-0eeb-49d3-9ca9-58c3a3ff996d
Ashton, Andrew D.
b03a9aa4-bfe5-4bff-8edd-f8c4b7ef06c3
Beuzen, Thomas
6d57240b-5a7e-4b06-a557-a6235e2e1701
Castagno, Katherine A.
350f6063-9135-4cd2-8584-baecb22b15a7
Cohn, Nicholas
284516bd-8368-4aa0-ad3b-1008ea808c00
Conlin, Matthew P.
15c6f5d7-4fee-42c6-92a5-adf5ab50c450
Ellenson, Ashley
2d7e57ef-770d-4194-af27-f61128f13168
Gillen, Megan
3cd432df-e0c7-43bf-a69e-ed60a5409a5c
Hovenga, Paige A.
fe2beb52-c45f-4791-9625-bbe3fb3c4fd5
Over, Jin-Si R.
76dc30ad-f151-4c1d-abc6-b2e00a242ecd
Palermo, Rose V.
e09ba6fc-947e-478d-ad3b-a9f12445fc14
Ratliff, Katherine M.
fa87eb55-a433-4506-aedd-b52e8abf4efd
Reeves, Ian R. B.
e94e9f98-87a3-4eeb-b881-f205038d0624
Sanborn, Lily H.
a84404f0-5f89-4c90-b41e-5187f9772421
Straub, Jessamin A.
decbcc2e-2eb9-41eb-9a59-e31193692501
Taylor, Luke Alexander
d7c429f9-d964-4f6f-bb61-45c9b680f375
Wallace, Elizabeth J.
6be70419-fb54-45cb-933c-cc6c5113587f
Warrick, Jonathan
7f636283-3071-4f36-917c-9293c71348e7
Wernette, Phillipe
90f3ce5f-06d4-4420-b801-e2245e0cc536
Williams, Hannah
879e08ff-fc7e-4699-a27a-06be53e52732

Goldstein, Evan B., Buscombe, Daniel, Lazarus, Eli, Mohanty, Somya D., Rafique, Shah Nafis, Anarde, Katherine A., Ashton, Andrew D., Beuzen, Thomas, Castagno, Katherine A., Cohn, Nicholas, Conlin, Matthew P., Ellenson, Ashley, Gillen, Megan, Hovenga, Paige A., Over, Jin-Si R., Palermo, Rose V., Ratliff, Katherine M., Reeves, Ian R. B., Sanborn, Lily H., Straub, Jessamin A., Taylor, Luke Alexander, Wallace, Elizabeth J., Warrick, Jonathan, Wernette, Phillipe and Williams, Hannah (2021) Labeling post‐storm coastal imagery for machine learning: measurement of inter‐rater agreement. Earth and Space Science, 8 (9), [e2021EA001896]. (doi:10.1029/2021EA001896).

Record type: Article

Abstract

Classifying images using supervised machine learning (ML) relies on labeled training data—classes or text descriptions, for example, associated with each image. Data-driven models are only as good as the data used for training, and this points to the importance of high-quality labeled data for developing a ML model that has predictive skill. Labeling data is typically a time-consuming, manual process. Here, we investigate the process of labeling data, with a specific focus on coastal aerial imagery captured in the wake of hurricanes that affected the Atlantic and Gulf Coasts of the United States. The imagery data set is a rich observational record of storm impacts and coastal change, but the imagery requires labeling to render that information accessible. We created an online interface that served labelers a stream of images and a fixed set of questions. A total of 1,600 images were labeled by at least two or as many as seven coastal scientists. We used the resulting data set to investigate interrater agreement: the extent to which labelers labeled each image similarly. Interrater agreement scores, assessed with percent agreement and Krippendorff's alpha, are higher when the questions posed to labelers are relatively simple, when the labelers are provided with a user manual, and when images are smaller. Experiments in interrater agreement point toward the benefit of multiple labelers for understanding the uncertainty in labeling data for machine learning research.
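As a hypothetical illustration (not the authors' code), the two agreement measures named in the abstract can be sketched in Python. Each unit is one image's list of labels from its raters; the class names "water" and "sand" are invented for the example, and the alpha implementation is the standard coincidence-matrix formulation for nominal data, tolerating units with missing ratings.

```python
from collections import Counter
from itertools import permutations

def percent_agreement(units):
    """Mean pairwise agreement across units (each unit: list of labels)."""
    agree = pairs = 0
    for labels in units:
        labels = [v for v in labels if v is not None]
        for a, b in permutations(labels, 2):  # ordered pairs; ratio is unaffected
            pairs += 1
            agree += (a == b)
    return agree / pairs

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels, via the coincidence matrix."""
    o = Counter()  # coincidence counts o[(c, k)]
    for labels in units:
        labels = [v for v in labels if v is not None]
        m = len(labels)
        if m < 2:
            continue  # units with <2 ratings carry no reliability information
        for a, b in permutations(labels, 2):
            o[(a, b)] += 1 / (m - 1)
    n_c = Counter()  # marginal totals per class
    for (a, _), w in o.items():
        n_c[a] += w
    n = sum(n_c.values())
    observed = sum(w for (a, b), w in o.items() if a != b)
    expected = sum(n_c[a] * n_c[b] for a in n_c for b in n_c if a != b) / (n - 1)
    return 1.0 - observed / expected

# Toy reliability data: 4 images, 2 raters, one disagreement.
labels = [["water", "water"], ["water", "sand"], ["sand", "sand"], ["sand", "sand"]]
print(percent_agreement(labels))                      # 0.75
print(round(krippendorff_alpha_nominal(labels), 3))   # 0.533
```

Note that alpha (0.533) sits well below raw percent agreement (0.75) on the same labels, because alpha discounts the agreement expected by chance; this is why the paper reports both.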

Text
2021EA001896 - Version of Record
Available under License Creative Commons Attribution.
Download (1MB)

More information

Accepted/In Press date: 26 August 2021
Published date: 3 September 2021
Additional Information: Funding Information: We thank the editor, two reviewers, and Chris Sherwood for feedback on this work. The authors gratefully acknowledge support from the U.S. Geological Survey (G20AC00403 to EBG and SDM), NSF (1953412 to EBG and SDM; 1939954 to EBG), Microsoft AI for Earth (to EBG and SDM), The Leverhulme Trust (RPG‐2018‐282 to EDL and EBG), and an Early Career Research Fellowship from the Gulf Research Program of the National Academies of Sciences, Engineering, and Medicine (to EBG). U.S. Geological Survey researchers (DB, J‐SRO, JW, and PW) were supported by the U.S. Geological Survey Coastal and Marine Hazards and Resources Program as part of the response and recovery efforts under congressional appropriations through the Additional Supplemental Appropriations for Disaster Relief Act, 2019 (Public Law 116‐20; 133 Stat. 871). Publisher Copyright: © 2021 The Authors. Earth and Space Science published by Wiley Periodicals LLC on behalf of American Geophysical Union. Copyright: Copyright 2021 Elsevier B.V., All rights reserved.
Keywords: classification, data annotation, data labeling, hurricane impacts, imagery, machine learning

Identifiers

Local EPrints ID: 452071
URI: http://eprints.soton.ac.uk/id/eprint/452071
ISSN: 2333-5084
PURE UUID: 38d7ea7a-f6a1-4f12-a668-69cccb478b4a
ORCID for Eli Lazarus: orcid.org/0000-0003-2404-9661

Catalogue record

Date deposited: 10 Nov 2021 17:37
Last modified: 17 Mar 2024 03:44


Contributors

Author: Evan B. Goldstein
Author: Daniel Buscombe
Author: Eli Lazarus
Author: Somya D. Mohanty
Author: Shah Nafis Rafique
Author: Katherine A. Anarde
Author: Andrew D. Ashton
Author: Thomas Beuzen
Author: Katherine A. Castagno
Author: Nicholas Cohn
Author: Matthew P. Conlin
Author: Ashley Ellenson
Author: Megan Gillen
Author: Paige A. Hovenga
Author: Jin-Si R. Over
Author: Rose V. Palermo
Author: Katherine M. Ratliff
Author: Ian R. B. Reeves
Author: Lily H. Sanborn
Author: Jessamin A. Straub
Author: Luke Alexander Taylor
Author: Elizabeth J. Wallace
Author: Jonathan Warrick
Author: Phillipe Wernette
Author: Hannah Williams



