The University of Southampton
University of Southampton Institutional Repository

Endless Forams: >34,000 modern planktonic foraminiferal images for taxonomic training and automated species recognition using convolutional neural networks


Hsiang, Allison Y., Brombacher, Anieke, Rillo, Marina C., Mleneck‐Vautravers, Maryline J., Conn, Stephen, Lordsmith, Sian, Jentzen, Anna, Henehan, Michael J., Metcalfe, Brett, Fenton, Isabel S., Wade, Bridget S., Fox, Lyndsey, Meilland, Julie, Davis, Catherine V., Baranowski, Ulrike, Groeneveld, Jeroen, Edgar, Kirsty M., Movellan, Aurore, Aze, Tracy, Dowsett, Harry J., Miller, C. Giles, Rios, Nelson and Hull, Pincelli M. (2019) Endless Forams: >34,000 modern planktonic foraminiferal images for taxonomic training and automated species recognition using convolutional neural networks. Paleoceanography and Paleoclimatology, 34 (7), 1157-1177. (doi:10.1029/2019PA003612).

Record type: Article

Abstract

Planktonic foraminiferal species identification is central to many paleoceanographic studies, from selecting species for geochemical research to elucidating the biotic dynamics of microfossil communities relevant to physical oceanographic processes and interconnected phenomena such as climate change. However, few resources exist to train students in the difficult task of discerning amongst closely related species, resulting in diverging taxonomic schools that differ in species concepts and boundaries. This problem is exacerbated by the limited number of taxonomic experts. Here we document our initial progress toward removing these confounding and/or rate‐limiting factors by generating the first extensive image library of modern planktonic foraminifera, providing digital taxonomic training tools and resources, and automating species‐level taxonomic identification of planktonic foraminifera via machine learning using convolutional neural networks. Experts identified 34,640 images of modern (extant) planktonic foraminifera to the species level. These images are served as species exemplars through the online portal Endless Forams (endlessforams.org) and a taxonomic training portal hosted on the citizen science platform Zooniverse (zooniverse.org/projects/ahsiang/endless‐forams/). A supervised machine learning classifier was then trained with ~27,000 images of these identified planktonic foraminifera. The best‐performing model provided the correct species name for an image in the validation set 87.4% of the time and included the correct name in its top three guesses 97.7% of the time. Together, these resources provide a rigorous set of training tools in modern planktonic foraminiferal taxonomy and a means of rapidly generating assemblage data via machine learning in future studies for applications such as paleotemperature reconstruction.
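
The two accuracy figures quoted in the abstract correspond to what are commonly called top-1 and top-3 accuracy. As a minimal sketch of how such metrics are typically computed (not the authors' code), the Python below scores a classifier's per-class outputs against true labels. The function name `top_k_accuracy`, the four placeholder classes, and all score values are hypothetical and chosen only for illustration; they are not data from the study.

```python
from typing import Sequence


def top_k_accuracy(
    scores: Sequence[Sequence[float]],
    labels: Sequence[int],
    k: int,
) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Made-up scores for 4 images over 4 placeholder classes (illustration only).
toy_scores = [
    [0.70, 0.20, 0.05, 0.05],  # true class 0 is ranked first
    [0.10, 0.30, 0.60, 0.00],  # true class 1 is ranked second
    [0.40, 0.35, 0.05, 0.20],  # true class 2 is ranked last
    [0.05, 0.05, 0.10, 0.80],  # true class 3 is ranked first
]
toy_labels = [0, 1, 2, 3]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top3 = top_k_accuracy(toy_scores, toy_labels, k=3)
print(f"top-1: {top1:.2f}, top-3: {top3:.2f}")
```

With k=1 only the single highest-scoring class counts as the model's answer; with k=3 a sample counts as correct if the true species appears anywhere in the three highest-scoring classes, which is why the top-three figure is never lower than the top-one figure.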

Text
Hsiang_et_al-2019-Paleoceanography_and_Paleoclimatology - Version of Record

More information

Accepted/In Press date: 23 June 2019
Published date: 13 August 2019

Identifiers

Local EPrints ID: 433625
URI: http://eprints.soton.ac.uk/id/eprint/433625
ISSN: 2572-4517
PURE UUID: 58d4d29a-5d99-4195-ab6d-976f86c9be51
ORCID for Anieke Brombacher: orcid.org/0000-0003-2310-047X

Catalogue record

Date deposited: 28 Aug 2019 16:30
Last modified: 16 Mar 2024 04:32

Contributors

Author: Allison Y. Hsiang
Author: Anieke Brombacher
Author: Marina C. Rillo
Author: Maryline J. Mleneck‐Vautravers
Author: Stephen Conn
Author: Sian Lordsmith
Author: Anna Jentzen
Author: Michael J. Henehan
Author: Brett Metcalfe
Author: Isabel S. Fenton
Author: Bridget S. Wade
Author: Lyndsey Fox
Author: Julie Meilland
Author: Catherine V. Davis
Author: Ulrike Baranowski
Author: Jeroen Groeneveld
Author: Kirsty M. Edgar
Author: Aurore Movellan
Author: Tracy Aze
Author: Harry J. Dowsett
Author: C. Giles Miller
Author: Nelson Rios
Author: Pincelli M. Hull
