Labeling post‐storm coastal imagery for machine learning: measurement of inter‐rater agreement
Goldstein, Evan B., Buscombe, Daniel, Lazarus, Eli, Mohanty, Somya D., Rafique, Shah Nafis, Anarde, Katherine A., Ashton, Andrew D., Beuzen, Thomas, Castagno, Katherine A., Cohn, Nicholas, Conlin, Matthew P., Ellenson, Ashley, Gillen, Megan, Hovenga, Paige A., Over, Jin-Si R., Palermo, Rose V., Ratliff, Katherine M., Reeves, Ian R. B., Sanborn, Lily H., Straub, Jessamin A., Taylor, Luke Alexander, Wallace, Elizabeth J., Warrick, Jonathan, Wernette, Phillipe and Williams, Hannah
(2021)
Labeling post‐storm coastal imagery for machine learning: measurement of inter‐rater agreement.
Earth and Space Science, 8 (9), [e2021EA001896].
(doi:10.1029/2021EA001896).
Abstract
Classifying images using supervised machine learning (ML) relies on labeled training data—classes or text descriptions, for example, associated with each image. Data-driven models are only as good as the data used for training, and this points to the importance of high-quality labeled data for developing an ML model that has predictive skill. Labeling data is typically a time-consuming, manual process. Here, we investigate the process of labeling data, with a specific focus on coastal aerial imagery captured in the wake of hurricanes that affected the Atlantic and Gulf Coasts of the United States. The imagery data set is a rich observational record of storm impacts and coastal change, but the imagery requires labeling to render that information accessible. We created an online interface that served labelers a stream of images and a fixed set of questions. A total of 1,600 images were labeled by at least two or as many as seven coastal scientists. We used the resulting data set to investigate interrater agreement: the extent to which labelers labeled each image similarly. Interrater agreement scores, assessed with percent agreement and Krippendorff's alpha, are higher when the questions posed to labelers are relatively simple, when the labelers are provided with a user manual, and when images are smaller. Experiments in interrater agreement point toward the benefit of multiple labelers for understanding the uncertainty in labeling data for machine learning research.
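The two agreement statistics named in the abstract can be illustrated with a short, self-contained sketch. This is not the authors' code, and the label values, function names, and toy data below are hypothetical: percent agreement counts the images on which all raters concur, while Krippendorff's alpha (nominal form, built here from the standard coincidence-matrix formulation) discounts agreement expected by chance.

```python
from collections import Counter
from itertools import permutations

def percent_agreement(units):
    """Fraction of units (images) on which every rater chose the same label."""
    return sum(len(set(ratings)) == 1 for ratings in units) / len(units)

def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels with no missing ratings.

    `units` is a list of per-image rating lists (one label per rater).
    Uses the coincidence-matrix formulation: alpha = 1 - D_o / D_e.
    """
    o = Counter()  # coincidence matrix over ordered label pairs within a unit
    for ratings in units:
        m = len(ratings)
        if m < 2:
            continue  # a singly rated unit contributes no pairable values
        for c, k in permutations(ratings, 2):
            o[(c, k)] += 1 / (m - 1)
    marginals = Counter()
    for (c, _k), weight in o.items():
        marginals[c] += weight
    n = sum(marginals.values())
    d_o = sum(w for (c, k), w in o.items() if c != k)   # observed disagreement
    d_e = sum(marginals[c] * marginals[k]               # chance disagreement
              for c in marginals for k in marginals if c != k) / (n - 1)
    return 1.0 - d_o / d_e

# Hypothetical labels: 4 images, each rated by 3 coastal scientists
labels = [
    ["washover", "washover", "washover"],
    ["no-impact", "washover", "no-impact"],
    ["no-impact", "no-impact", "no-impact"],
    ["washover", "washover", "no-impact"],
]
print(percent_agreement(labels))                     # 0.5
print(round(krippendorff_alpha_nominal(labels), 3))  # 0.389
```

Note how the two measures diverge: half the images were labeled unanimously, yet alpha is well below 0.5 because with only two balanced labels a substantial share of the observed agreement is expected by chance.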
Text: 2021EA001896 (Version of Record)
More information
Accepted/In Press date: 26 August 2021
Published date: 3 September 2021
Additional Information:
Funding Information:
We thank the editor, two reviewers, and Chris Sherwood for feedback on this work. The authors gratefully acknowledge support from the U.S. Geological Survey (G20AC00403 to EBG and SDM), NSF (1953412 to EBG and SDM; 1939954 to EBG), Microsoft AI for Earth (to EBG and SDM), The Leverhulme Trust (RPG‐2018‐282 to EDL and EBG), and an Early Career Research Fellowship from the Gulf Research Program of the National Academies of Sciences, Engineering, and Medicine (to EBG). U.S. Geological Survey researchers (DB, J‐SRO, JW, and PW) were supported by the U.S. Geological Survey Coastal and Marine Hazards and Resources Program as part of the response and recovery efforts under congressional appropriations through the Additional Supplemental Appropriations for Disaster Relief Act, 2019 (Public Law 116‐20; 133 Stat. 871).
Publisher Copyright:
© 2021 The Authors. Earth and Space Science published by Wiley Periodicals LLC on behalf of American Geophysical Union.
Keywords:
classification, data annotation, data labeling, hurricane impacts, imagery, machine learning
Identifiers
Local EPrints ID: 452071
URI: http://eprints.soton.ac.uk/id/eprint/452071
ISSN: 2333-5084
PURE UUID: 38d7ea7a-f6a1-4f12-a668-69cccb478b4a
Catalogue record
Date deposited: 10 Nov 2021 17:37
Last modified: 06 Jun 2024 01:58
Contributors
Author:
Evan B. Goldstein
Author:
Daniel Buscombe
Author:
Eli Lazarus
Author:
Somya D. Mohanty
Author:
Shah Nafis Rafique
Author:
Katherine A. Anarde
Author:
Andrew D. Ashton
Author:
Thomas Beuzen
Author:
Katherine A. Castagno
Author:
Nicholas Cohn
Author:
Matthew P. Conlin
Author:
Ashley Ellenson
Author:
Megan Gillen
Author:
Paige A. Hovenga
Author:
Jin-Si R. Over
Author:
Rose V. Palermo
Author:
Katherine M. Ratliff
Author:
Ian R. B. Reeves
Author:
Lily H. Sanborn
Author:
Jessamin A. Straub
Author:
Luke Alexander Taylor
Author:
Elizabeth J. Wallace
Author:
Jonathan Warrick
Author:
Phillipe Wernette
Author:
Hannah Williams