Imputation by Neural Networks and Related Methods
Neural network imputation and regression imputation are compared theoretically and numerically. In the theoretical comparison, we introduce the concept of predictive bias (pbias), which measures the difference between the estimator based on full observations and the estimator based on imputed values. Let y be a continuous response variable, x the covariate of y, and θ the parameter of interest. The estimator of θ based on the full observations is denoted θ̂. The estimator based on the observed and the imputed values is denoted θ̂_I. Here the imputation is single imputation. The imputed data set has the same size as the full data set. Then pbias is defined as E(θ̂_I - θ̂). Due to mathematical difficulty, the theoretical study considers only imputation based on the radial basis function (RBF) neural network. We show that the performance of an imputation method depends on how well the corresponding model fits the underlying model of y. We also show that the RBF model can be equivalent to a regression model in terms of pbias if the RBF model is properly defined and the underlying model is a linear regression model.
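To make the definition of pbias concrete, the following is a minimal Monte Carlo sketch, not the thesis's own derivation. It assumes θ is the mean of y, a linear underlying model, values missing completely at random, and single imputation by simple linear regression; the function names `fit_line` and `estimate_pbias`, the coefficients and the missingness rate are all illustrative assumptions.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def estimate_pbias(n=200, miss_rate=0.3, reps=500, seed=0):
    """Monte Carlo estimate of pbias = E(theta_hat_I - theta_hat).

    Assumptions (illustrative only): theta is the mean of y, the
    underlying model is y = 1 + 2x + noise, and each y is missing
    independently with probability miss_rate.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [1.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
        missing = [rng.random() < miss_rate for _ in range(n)]

        # Fit the imputation model on the observed cases only.
        obs_x = [x for x, m in zip(xs, missing) if not m]
        obs_y = [y for y, m in zip(ys, missing) if not m]
        a, b = fit_line(obs_x, obs_y)

        # Single imputation: each missing y is replaced by its prediction,
        # so the completed data set has the same size as the full one.
        completed = [a + b * x if m else y
                     for x, y, m in zip(xs, ys, missing)]

        theta_full = sum(ys) / n        # estimator from full observations
        theta_imp = sum(completed) / n  # estimator from observed + imputed
        total += theta_imp - theta_full
    return total / reps


if __name__ == "__main__":
    print(estimate_pbias())
```

Because the imputation model here matches the underlying model, the estimated pbias should be close to zero; replacing the linear fit with a poorly fitting model is one way to see the dependence on model fit that the abstract describes.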
A variant of nearest neighbour imputation (NNI) based on weighted distance is also proposed. This method can represent a wide range of NNIs, such as Euclidean-based NNI and Mahalanobis-based NNI. The asymptotic form of this method and the circumstances in which it outperforms other imputation methods are investigated.
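One way a weighted distance can cover both Euclidean and Mahalanobis NNI is a quadratic form d²(u, v) = (u - v)ᵀ W (u - v): W equal to the identity gives squared Euclidean distance, and W equal to the inverse covariance matrix gives squared Mahalanobis distance. The sketch below assumes that form; the abstract does not state the thesis's exact weighting scheme, and the names `weighted_sq_distance` and `nn_impute` are hypothetical.

```python
def weighted_sq_distance(u, v, W):
    """Squared weighted distance (u - v)^T W (u - v) for a weight matrix W."""
    d = [a - b for a, b in zip(u, v)]
    k = len(d)
    return sum(d[i] * W[i][j] * d[j] for i in range(k) for j in range(k))


def nn_impute(donors_x, donors_y, recipient_x, W):
    """Impute y for a recipient by copying y from the nearest donor.

    donors_x / donors_y are the covariates and responses of the observed
    cases; W is an assumed, user-supplied weight matrix.
    """
    nearest = min(
        range(len(donors_x)),
        key=lambda i: weighted_sq_distance(donors_x[i], recipient_x, W),
    )
    return donors_y[nearest]


donors_x = [[0.0, 0.0], [1.0, 10.0]]
donors_y = [5.0, 7.0]
recipient = [0.9, 2.0]

euclidean = [[1.0, 0.0], [0.0, 1.0]]      # identity weights
downweighted = [[1.0, 0.0], [0.0, 0.01]]  # second covariate counts less

print(nn_impute(donors_x, donors_y, recipient, euclidean))     # 5.0
print(nn_impute(donors_x, donors_y, recipient, downweighted))  # 7.0
```

The same recipient receives a different donor value once the second covariate is downweighted, which illustrates how the choice of weights changes which observed case counts as the nearest neighbour.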
In the simulation study, we construct several scenarios to compare neural network imputation with regression imputation and with other imputation methods such as tree-based imputation and NNI. The results show the conditions under which each competing imputation method outperforms the others.
In the numerical study, we use a subset of the 1991 household census data to compare the performance of neural network imputation with that of logistic regression imputation, nearest neighbour imputation, weighted distance-based nearest neighbour imputation and classification tree imputation.
University of Southampton
Zhao, Xinqiang
624c169e-a7fa-4a9a-bc78-a9c1b9d93992
2002
Zhao, Xinqiang (2002) Imputation by Neural Networks and Related Methods. University of Southampton, Doctoral Thesis.
Record type: Thesis (Doctoral)
Text: 884591.pdf - Version of Record
More information
Published date: 2002
Identifiers
Local EPrints ID: 464833
URI: http://eprints.soton.ac.uk/id/eprint/464833
PURE UUID: f5571901-791b-4ecf-bd20-e5d28cf2d593
Catalogue record
Date deposited: 05 Jul 2022 00:04
Last modified: 16 Mar 2024 19:46
Contributors
Author: Xinqiang Zhao