Topics of statistical theory for register-based statistics and data integration
Pages: 41-63
Author: Zhang, Li-Chun
Abstract
Official statistics are increasingly produced from a combination of data sources, including sample surveys, censuses and administrative registers. Reduced response burden, gains in production cost efficiency and the potential for detailed spatial-demographic and longitudinal statistics are among the major advantages of using integrated statistical data. Data integration has always been an essential feature of the use of administrative register data, but survey and census data should also be integrated, so as to widen their scope and improve their quality. This raises many new and difficult challenges that go beyond the traditional topics of survey sampling and data integration. In this article we consider statistical theory for data integration on a conceptual level. In particular, we present a two-phase life-cycle model for integrated statistical microdata, which provides a framework for the various potential error sources, and outline some concepts and topics for quality assessment beyond the ideal of error-free data. A shared understanding of these issues will hopefully help us to bring together and coordinate efforts in future research and development.
More information
e-pub ahead of print date: 15 November 2011
Published date: February 2012
Keywords:
combination of sources, data life cycle, representation, measurement, validity, equivalence, record linkage, statistical matching, micro integration, micro calibration
Organisations:
Statistical Sciences Research Institute
Identifiers
Local EPrints ID: 345157
URI: http://eprints.soton.ac.uk/id/eprint/345157
ISSN: 0039-0402
PURE UUID: 87fdb72c-f802-4287-bf46-948df4481944
Catalogue record
Date deposited: 09 Nov 2012 17:04
Last modified: 15 Mar 2024 03:45