The University of Southampton
University of Southampton Institutional Repository

Improving the quality of astronomical survey data


Johnson, Michael (2020) Improving the quality of astronomical survey data. Doctoral Thesis, 193pp.

Record type: Thesis (Doctoral)

Abstract

Astronomical survey telescopes are becoming increasingly capable of generating large datasets. The quantities of data being produced necessitate the automation of data processing, which is commonly accomplished via astronomical workflows. The large scale of the data also means that small improvements in the quality of the data processing can have large implications for the value of the science gained. However, deciding which workflow configuration is best is usually a qualitative process, achieved through trial and improvement, which lacks a quantitative measure of the quality of the results produced by each workflow version. Consequently, the best workflow cannot be reliably chosen. Thorough analysis is typically applied to find specific outputs from astronomical workflows, such as the magnitude of an object. However, this targeted analysis focuses on specific components and does not utilise the wider workflow space or the provenance of the workflows. This thesis therefore outlines an approach that can be applied to workflows to assess different workflow versions and measure the quality of the data that they produce. To test the approach, it was applied to three separate use cases. The first application used the approach to predict the completeness of period recovery of transient and variable astronomical sources for several candidate observing strategies from upcoming front-line astronomical surveys. It was found that observing strategies which did not reduce the observations within the Galactic Plane increased the completeness by a factor of ∼3. The second was an investigation into the use of provenance to improve the timeliness of a differential photometry workflow. It was found that this method offered improvements of at least 96% in computational efficiency when analysing the outlined use cases.
The third application was to improve the accuracy and completeness of a workflow designed to search for transients within a set of archival calibration data from an astronomical survey telescope. Workflow configurations were generated both using the manual method and via the approach. The best performing workflow found through the approach outperformed the workflow generated through the manual method and consequently found an additional ∼2,500 transient events. However, full evaluation of the approach could be a computationally expensive process; therefore, the hill climbing algorithm was also investigated as a means to quickly find a verifiably good workflow configuration. The quality of the results produced by the workflow generated through this method was found to be within 0.2% of those produced by the highest quality workflow found.
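As an illustration of the hill climbing idea mentioned in the abstract, the following is a minimal sketch of greedy hill climbing over a discrete grid of workflow parameters. It is not the thesis's implementation: the parameter names (`detection_threshold`, `aperture_radius`), the candidate values, and the toy `quality` function are all hypothetical stand-ins, and the abstract does not say which hill climbing variant was used. In practice the score would come from running a workflow configuration and measuring the quality of its output.

```python
from typing import Callable, Dict, List, Tuple

Config = Dict[str, float]


def hill_climb(
    options: Dict[str, List[float]],
    score: Callable[[Config], float],
    start: Config,
    max_steps: int = 100,
) -> Tuple[Config, float]:
    """Greedy (steepest-ascent) hill climbing on a grid of parameter values.

    At each step, every configuration that differs from the current one by a
    single step in a single parameter is scored, and the best improving
    neighbour is taken. The search stops at a local optimum.
    """
    current = dict(start)
    current_score = score(current)
    for _ in range(max_steps):
        best_neighbour, best_neighbour_score = None, current_score
        for name, values in options.items():
            idx = values.index(current[name])
            for j in (idx - 1, idx + 1):
                if 0 <= j < len(values):
                    candidate = dict(current)
                    candidate[name] = values[j]
                    candidate_score = score(candidate)
                    if candidate_score > best_neighbour_score:
                        best_neighbour = candidate
                        best_neighbour_score = candidate_score
        if best_neighbour is None:
            break  # no neighbour improves the score: local optimum reached
        current, current_score = best_neighbour, best_neighbour_score
    return current, current_score


# Hypothetical parameter grid for an imagined transient-search workflow.
options = {
    "detection_threshold": [1.0, 2.0, 3.0, 4.0, 5.0],
    "aperture_radius": [3, 5, 7, 9],
}


def quality(config: Config) -> float:
    """Toy quality score (assumption): peaks at threshold 3.0, radius 5."""
    return -(config["detection_threshold"] - 3.0) ** 2 - (
        config["aperture_radius"] - 5
    ) ** 2


start = {"detection_threshold": 1.0, "aperture_radius": 9}
best, best_score = hill_climb(options, quality, start)
```

Because each step scores only the immediate neighbours, far fewer configurations are evaluated than in an exhaustive search of the grid. The trade-off is that the search can stop at a local optimum, which is why the result still needs to be checked against a known good configuration.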

Text
Final thesis unsigned
Available under License University of Southampton Thesis Licence.
Download (10MB)
Text
PTD signed
Restricted to Repository staff only

More information

Published date: March 2020

Identifiers

Local EPrints ID: 447677
URI: http://eprints.soton.ac.uk/id/eprint/447677
PURE UUID: e0f36c4c-2ef9-4d15-becb-0867228bb7b0
ORCID for Michael Johnson: orcid.org/0000-0002-5566-6147
ORCID for Adriane Chapman: orcid.org/0000-0002-3814-2587

Catalogue record

Date deposited: 18 Mar 2021 17:42
Last modified: 19 Mar 2021 02:51

Contributors

Author: Michael Johnson
Thesis advisor: Adriane Chapman
