Tzavidis, Nikos and Lin, Yan-Xia
Estimating from Cross-sectional Categorical Data Subject to Misclassification and Double Sampling: Moment-based, Maximum Likelihood and Quasi-Likelihood Approaches. Southampton, UK, Southampton Statistical Sciences Research Institute, 34pp.
(S3RI Methodology Working Papers, M04/03).
We discuss the analysis of cross-sectional categorical data in the presence of misclassification and double sampling. In a double sampling context, we assume that, along with the main measurement device, which is subject to misclassification, a secondary measurement device is available that is free of error but more expensive to apply. Because of its higher cost, the secondary device (the validation survey) is applied to only a subset of units. Inference under double sampling combines information from both measurement devices. We review previously proposed parameterisations of the misclassification model that use either the calibration probabilities or the misclassification probabilities. We then show that the misclassification model can alternatively be formulated as a missing data problem using the misclassification probabilities. In this context, the model parameters are estimated by maximum likelihood via the EM algorithm. We suggest that formulating the misclassification model as a missing data problem using the misclassification probabilities, as opposed to maximum likelihood estimation using the calibration probabilities, offers a robust basis for extending the model to handle more complex situations. We further illustrate that the likelihood-based approaches offer some practical advantages over the moment-based approaches. As an alternative, we also present a quasi-likelihood parameterisation of the misclassification model. In this framework, an explicit definition of the likelihood function is avoided and a different way of resolving the missing data problem is provided. The quasi-likelihood method offers further practical advantages to the data analyst over both the likelihood-based and the moment-based approaches. Variance estimation under the alternative parameterisations is discussed. The different methods are illustrated using two numerical examples and a Monte-Carlo simulation study.
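To make the two likelihood-based parameterisations concrete, the following is a minimal sketch and is not taken from the paper. It assumes a simple saturated model with no covariates, a validation sample cross-classified by true and observed category, and a main sample classified by the error-prone device only. The function names, the data layout and the counts are hypothetical. The first function uses the calibration probabilities P(true | observed) in closed form; the second treats the unobserved true categories of the main sample as missing data and runs EM on the misclassification probabilities P(observed | true).

```python
def calibration_estimate(validation, main):
    """Closed-form estimate of the true-category proportions.

    validation[i][j]: validation-sample count, true category i, observed category j.
    main[j]: main-sample count with observed category j (true category unknown).
    Assumes every observed category occurs at least once in the validation sample.
    """
    k = len(main)
    col_totals = [sum(validation[i][j] for i in range(k)) for j in range(k)]
    total = sum(col_totals) + sum(main)
    # Observed-category proportions, pooled over both samples.
    p_obs = [(col_totals[j] + main[j]) / total for j in range(k)]
    # pi_i = sum_j P(true i | observed j) * P(observed j)
    return [
        sum(validation[i][j] / col_totals[j] * p_obs[j] for j in range(k))
        for i in range(k)
    ]


def em_estimate(validation, main, n_iter=500):
    """EM estimate with the main-sample true categories treated as missing.

    Assumes every true category occurs at least once in the validation sample.
    """
    k = len(main)
    total = sum(sum(row) for row in validation) + sum(main)
    # Start from the validation sample alone.
    row_totals = [sum(row) for row in validation]
    pi = [r / sum(row_totals) for r in row_totals]
    theta = [[validation[i][j] / row_totals[i] for j in range(k)] for i in range(k)]
    for _ in range(n_iter):
        # E-step: split each main-sample count m_j across the true categories
        # in proportion to pi_i * theta[i][j].
        full = [list(map(float, row)) for row in validation]
        for j in range(k):
            denom = sum(pi[i] * theta[i][j] for i in range(k))
            for i in range(k):
                full[i][j] += main[j] * pi[i] * theta[i][j] / denom
        # M-step: re-estimate pi and theta from the completed table.
        rows = [sum(full[i]) for i in range(k)]
        pi = [rows[i] / total for i in range(k)]
        theta = [[full[i][j] / rows[i] for j in range(k)] for i in range(k)]
    return pi


# Hypothetical counts for a two-category variable.
validation = [[80, 20], [10, 90]]   # rows: true category, columns: observed category
main = [300, 500]                   # observed categories only

pi_cal = calibration_estimate(validation, main)
pi_em = em_estimate(validation, main)
```

In this saturated setting the two parameterisations describe the same likelihood, so the EM iterations should settle on the closed-form calibration estimate. The missing data formulation becomes more useful once the model is extended beyond this simple case.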