Report on methods for complex linked data
The UK's longitudinal study resources have been largely survey-based, but there is potential for increasing the range of variables and coverage of the information through linkage and harmonisation with other datasets. Combining multiple sources in this way creates data with complex structures which require appropriate methodologies for analysis. This report describes the nature of complexities in linked datasets for analysis, and summarises the methodological requirements for:
analysis of partially overlapping repeated measures;
analysis of networks within longitudinal data;
secondary analysis of linked data;
longitudinal population size estimation.
These approaches all share the feature that they have to deal with the potential for errors in the data linkage process, particularly where automated solutions are needed to control costs. A summary of the challenges in providing a scalable linkage methodology which can deal with multiple datasets is included. Secondary analysis of data that cannot be linked without errors is a central topic area in the landscape created by longitudinal data linkage.
Key areas where methodological development seems possible and useful are the use of structural equation models and related approaches to make the best use of all the available data, and the deployment of the entity resolution approach to data linkage to deal with conflicting information in multiple sources.
University of Southampton
Zhang, Li-Chun
Dawber, James
July 2019
Zhang, Li-Chun and Dawber, James (2019) Report on methods for complex linked data. University of Southampton, 16pp.
Record type: Monograph (Project Report)
Text: Report on Methods for Complex Linked Data (Accepted Manuscript)
More information
Published date: July 2019
Identifiers
Local EPrints ID: 436033
URI: http://eprints.soton.ac.uk/id/eprint/436033
PURE UUID: 909b466b-acd0-4fd0-9b1e-9ed50c25b68c
Catalogue record
Date deposited: 26 Nov 2019 17:30
Last modified: 18 Mar 2024 03:24
Contributors
Author: James Dawber