GloSAT Historical Measurement Table Dataset
Dataset containing scanned historical measurement table documents from ship logs and land measurement stations. Annotations provided in this dataset are designed to allow finer-grained table detection and table structure recognition models to be trained and tested. Annotations are region boundaries for tables, cells, headings, headers and captions.
This dataset release includes code to train models on a training split, to use trained model checkpoints for inference and to evaluate inferred results on a test split. Pretrained models used in the published HIP-2021 paper are included in the dataset, so results can be reproduced without training the model checkpoints yourself.
Instructions and code can be found in the linked GitHub repository: https://github.com/stuartemiddleton/glosat_table_dataset
A pre-print of the HIP-2021 paper can be found on the author's website: https://www.southampton.ac.uk/~sem03/HIP_2021.pdf
Original images were sourced with permission from the UK Met Office, US NOAA and weatherrescue.org (University of Reading).
This work is part of the GloSAT project https://www.glosat.org/ and is supported by the Natural Environment Research Council (NE/S015604/1). The authors acknowledge the use of the IRIDIS High Performance Computing Facility, and associated support services at the University of Southampton, in the completion of this work.
University of Southampton
Middleton, Stuart
404b62ba-d77e-476b-9775-32645b04473f
Ziomek, Juliusz
b05e7f21-70db-497c-be74-b0b54d2a4579
Middleton, Stuart and Ziomek, Juliusz (2021) GloSAT Historical Measurement Table Dataset. University of Southampton. doi:10.5281/zenodo.5363457 [Dataset]
This record has no associated files available for download.
More information
Published date: 3 September 2021
Keywords:
Table Detection, Table Structure Recognition, Document Layout Analysis, Historical Measurements, GloSAT
Identifiers
Local EPrints ID: 451059
URI: http://eprints.soton.ac.uk/id/eprint/451059
PURE UUID: 40094e71-d6ec-4817-a5a6-8ade9b07ce5d
Catalogue record
Date deposited: 06 Sep 2021 16:30
Last modified: 06 May 2023 01:37
Contributors
Creator:
Juliusz Ziomek