The University of Southampton
University of Southampton Institutional Repository

GloSAT Historical Measurement Table Dataset

Middleton, Stuart and Ziomek, Juliusz (2021) GloSAT Historical Measurement Table Dataset. University of Southampton doi:10.5281/zenodo.5363457 [Dataset]

Record type: Dataset

Abstract

This dataset contains scanned historical measurement table documents from ship logs and land measurement stations. The annotations are designed to allow finer-grained table detection and table structure recognition models to be trained and tested. Annotations are region boundaries for tables, cells, headings, headers and captions. This dataset release includes code to train models on a training split, to use trained model checkpoints for inference, and to evaluate inferred results on a test split. Pretrained models used in the published HIP-2021 paper are included in the dataset so results can be easily reproduced without training the model checkpoints yourself. Instructions and code can be found on the linked GitHub repository (https://github.com/stuartemiddleton/glosat_table_dataset). A pre-print of the HIP-2021 paper can be found on the author's website (https://www.southampton.ac.uk/~sem03/HIP_2021.pdf). Original images were sourced with permission from the UK Met Office, US NOAA and weatherrescue.org (University of Reading). This work is part of the GloSAT project (https://www.glosat.org/) and is supported by the Natural Environment Research Council (NE/S015604/1). The authors acknowledge the use of the IRIDIS High Performance Computing Facility, and associated support services at the University of Southampton, in the completion of this work.

This record has no associated files available for download.

More information

Published date: 3 September 2021
Keywords: Table Detection, Table Structure Recognition, Document Layout Analysis, Historical Measurements, GloSAT

Identifiers

Local EPrints ID: 451059
URI: http://eprints.soton.ac.uk/id/eprint/451059
PURE UUID: 40094e71-d6ec-4817-a5a6-8ade9b07ce5d
ORCID for Stuart Middleton: orcid.org/0000-0001-8305-8176

Catalogue record

Date deposited: 06 Sep 2021 16:30
Last modified: 06 May 2023 01:37

Contributors

Creator: Stuart Middleton
Creator: Juliusz Ziomek