The University of Southampton
University of Southampton Institutional Repository

Imputation by Neural Networks and Related Methods

Neural network imputation and regression imputation are compared theoretically and numerically. In the theoretical comparison, we introduce the concept of predictive bias (pbias), which measures the difference between the estimator based on full observations and the estimator based on imputed values. Let y be a continuous response variable, x the covariate of y, and θ the parameter of interest. The estimator of θ based on the full observations is denoted θ̂. The estimator based on the observed and the imputed values is denoted θ̂_I. Here the imputation is single imputation. The imputed data set has the same size as the full data set. Then pbias is defined as E(θ̂_I - θ̂). Due to mathematical difficulty, the theoretical study considers only imputation based on the RBF neural network. We show that the performance of an imputation method depends on how well the corresponding model fits the underlying model of y. We also show that the RBF model can be equivalent to a regression model in terms of pbias if the RBF model is properly defined and the underlying model is a linear regression model.
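
The definition pbias = E(θ̂_I - θ̂) can be illustrated with a small Monte Carlo sketch. This is not the thesis's own procedure: it assumes θ is the mean of y, a linear underlying model, values missing completely at random, and regression imputation; the function name and all parameter values are hypothetical.

```python
import numpy as np

def pbias_regression_imputation(n=200, miss_rate=0.3, reps=500, seed=0):
    """Monte Carlo estimate of pbias = E(theta_hat_I - theta_hat), theta = mean of y."""
    rng = np.random.default_rng(seed)
    diffs = np.empty(reps)
    for r in range(reps):
        x = rng.normal(size=n)
        # Assumed underlying model: linear in x with unit-variance noise.
        y = 1.0 + 2.0 * x + rng.normal(size=n)
        # Assumed missingness mechanism: completely at random.
        missing = rng.random(n) < miss_rate
        obs = ~missing
        # Fit the regression on the observed cases only.
        X_obs = np.column_stack([np.ones(obs.sum()), x[obs]])
        beta, *_ = np.linalg.lstsq(X_obs, y[obs], rcond=None)
        # Single imputation: replace each missing y by its fitted value.
        y_imp = y.copy()
        y_imp[missing] = beta[0] + beta[1] * x[missing]
        # theta_hat_I uses the imputed data set, theta_hat the full data set.
        diffs[r] = y_imp.mean() - y.mean()
    return diffs.mean()

print(pbias_regression_imputation())
```

Because the imputation model matches the assumed underlying model here, the estimate should be close to zero; replacing the linear model for y with a nonlinear one is a simple way to see pbias move away from zero.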

A variant of nearest neighbour imputation (NNI) based on weighted distance is also proposed. This method can represent a wide range of NNIs, such as Euclidean-based NNI and Mahalanobis-based NNI. The asymptotic form of this method and the circumstances in which it outperforms other imputation methods are investigated.
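
A minimal sketch of how one weighted distance can cover both special cases is given below. The abstract does not state the form of the distance, so the quadratic form d(a, b)^2 = (a - b)^T W (a - b) is an assumption, and the function name `weighted_nn_impute` is hypothetical.

```python
import numpy as np

def weighted_nn_impute(X, y, W):
    """Fill NaN entries of y with the y of the nearest donor,
    where nearness is d(a, b)^2 = (a - b)^T W (a - b) (assumed form)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    W = np.asarray(W, dtype=float)
    missing = np.isnan(y)
    donors_X, donors_y = X[~missing], y[~missing]
    y_imp = y.copy()
    for i in np.flatnonzero(missing):
        diff = donors_X - X[i]
        # Squared weighted distance from recipient i to every donor.
        d2 = np.einsum("ij,jk,ik->i", diff, W, diff)
        y_imp[i] = donors_y[np.argmin(d2)]
    return y_imp

X = np.array([[0.0, 0.0], [10.0, 1.0], [4.0, 1.0]])
y = np.array([1.0, 2.0, np.nan])
euclidean = weighted_nn_impute(X, y, np.eye(2))
reweighted = weighted_nn_impute(X, y, np.diag([0.01, 100.0]))
```

Under this assumption, `W = np.eye(p)` gives Euclidean NNI and `W = np.linalg.inv(np.cov(X, rowvar=False))` gives Mahalanobis NNI. In the toy data the donor changes once the second covariate is weighted heavily, which is the kind of behaviour a weighted distance is meant to control.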

In the simulation study, we create several situations to compare neural network imputation with regression imputation and with other imputation methods such as tree-based imputation and NNI. The results show when a competing imputation method outperforms the others.

In the numerical study, we use a subset of 1991 household census data to compare the performance of neural network imputation with that of logistic regression imputation, nearest neighbour imputation, weighted distance-based nearest neighbour imputation and classification tree imputation.

Zhao, Xinqiang
624c169e-a7fa-4a9a-bc78-a9c1b9d93992

Zhao, Xinqiang (2002) Imputation by Neural Networks and Related Methods. University of Southampton, Doctoral Thesis.

Record type: Thesis (Doctoral)

Text
884591.pdf - Version of Record
Available under License University of Southampton Thesis Licence.

More information

Published date: 2002

Identifiers

Local EPrints ID: 464833
URI: http://eprints.soton.ac.uk/id/eprint/464833
PURE UUID: f5571901-791b-4ecf-bd20-e5d28cf2d593

Catalogue record

Date deposited: 05 Jul 2022 00:04
Last modified: 16 Mar 2024 19:46

Contributors

Author: Xinqiang Zhao
