Design effects in the analysis of longitudinal survey data
The design effect measures how much the sampling variance of an estimator is inflated by the use of a complex sampling scheme. It is usually measured relative to the variance of the estimator under simple random sampling. Many social survey designs employ multi-stage sampling, which clusters the sample and tends to produce design effects greater than unity. There is some empirical evidence that design effects from clustering tend to decrease as the analysis becomes more complex. For example, design effects for regression coefficients are often found to be smaller than the design effect for the mean of the dependent variable in the regression. Some analysts of survey data may use evidence of design effects close to unity for such analyses to justify ignoring the sampling design in complex analyses. In this paper we present some evidence of an opposite tendency: design effects can be higher for complex longitudinal analyses than for the corresponding cross-sectional analyses.
Our empirical evidence is based upon data from the British Household Panel Survey. This survey follows longitudinally a sample of individuals selected in 1991 by two-stage sampling, with clustering by area. Data are collected in annual waves. Our analyses are based upon a subsample of women aged 16-39. The dependent variable is a gender role attitude score, derived from responses to six five-point questions and treated as a continuous variable. Covariates include age group, economic activity and educational qualifications. Longitudinal regression models include random effects for women. Data are analysed for the five waves of the survey in which the gender role attitude questions were asked.
The design effects for the regression coefficients are found to increase as more waves are included in the analysis. A similar tendency is observed for estimates of the time-averaged mean of the dependent variable. A possible theoretical explanation is provided. The implication of these findings is that standard errors in analyses of longitudinal survey data may be very misleading if the initial sample was clustered and this clustering is ignored in the analysis.
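As a rough illustration of the design effect defined in the abstract, the sketch below compares the variance of a sample mean estimated from between-cluster variation with the variance that would be reported if clustering were ignored. It is not taken from the working paper: the function names (kish_deff, empirical_deff, simulate_clusters) are hypothetical, the simulated data are synthetic rather than BHPS data, and the formula 1 + (m - 1) * rho is the familiar textbook approximation for equal-sized clusters, used here only as a point of comparison.

```python
import random
import statistics


def kish_deff(cluster_size: float, icc: float) -> float:
    """Textbook approximation to the design effect of a mean.

    Assumes equal-sized clusters of `cluster_size` units and an
    intra-cluster correlation `icc` (an assumption of this sketch).
    """
    return 1.0 + (cluster_size - 1.0) * icc


def empirical_deff(clusters: list[list[float]]) -> float:
    """Ratio of a cluster-based variance estimate of the mean to the
    variance estimate that ignores clustering.

    Assumes equal-sized clusters and no finite population correction.
    """
    values = [y for cluster in clusters for y in cluster]
    n = len(values)
    k = len(clusters)
    # Variance of the overall mean, estimated from between-cluster variation.
    cluster_means = [statistics.fmean(c) for c in clusters]
    var_clustered = statistics.variance(cluster_means) / k
    # Variance of the mean as if the sample were a simple random sample.
    var_srs = statistics.variance(values) / n
    return var_clustered / var_srs


def simulate_clusters(k: int, m: int, icc: float, seed: int = 0) -> list[list[float]]:
    """Synthetic data: k clusters of m units with total variance 1."""
    rng = random.Random(seed)
    clusters = []
    for _ in range(k):
        area_effect = rng.gauss(0.0, icc ** 0.5)  # shared within a cluster
        clusters.append(
            [area_effect + rng.gauss(0.0, (1.0 - icc) ** 0.5) for _ in range(m)]
        )
    return clusters


if __name__ == "__main__":
    data = simulate_clusters(k=500, m=10, icc=0.2)
    print("approximate deff:", kish_deff(10, 0.2))
    print("empirical deff:  ", empirical_deff(data))
```

With 500 synthetic clusters of 10 units and an intra-cluster correlation of 0.2, the two figures should be broadly similar, which shows why a standard error computed as if the sample were a simple random sample can understate the true sampling variability.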
Keywords: clustering, design effect, longitudinal analysis, random effects model
Southampton Statistical Sciences Research Institute, University of Southampton
Skinner, Chris
Vieira, Marcel de Toledo
10 March 2005
Skinner, Chris and Vieira, Marcel de Toledo (2005) Design effects in the analysis of longitudinal survey data (S3RI Methodology Working Papers, M05/13). Southampton, UK: Southampton Statistical Sciences Research Institute, University of Southampton, 23pp.
Record type: Monograph (Working Paper)
Text: 15012-01.pdf (Other)
More information
Published date: 10 March 2005
Identifiers
Local EPrints ID: 15012
URI: http://eprints.soton.ac.uk/id/eprint/15012
PURE UUID: 5ed436ca-09c6-4383-a604-53b142e542ca
Catalogue record
Date deposited: 10 Mar 2005
Last modified: 20 Feb 2024 03:20
Contributors
Author: Chris Skinner
Author: Marcel de Toledo Vieira