Clements, Lily Zara
(2022)
Multiple imputation of a derived variable in a survival analysis context.
*Doctoral Thesis*, 190pp.

## Abstract

A data set contains directly measured variables and can be expanded through non-trivial transformations of a measured variable, e.g., dichotomising a continuous variable. A new variable can also be constructed from several measured variables; e.g., body mass index (BMI) is the ratio of weight to height squared. The transformed or constructed variable is a derived variable, and the measured variable(s) from which it is built are its constituents.

A complication arises when at least one constituent value is missing, so that the derived variable is itself incomplete. Incomplete variables are a common problem when analysing data and can lead to incorrect inferences if mishandled. One approach to handling them is multiple imputation (MI). In MI, each missing value is replaced several times, yielding several complete, multiply imputed data sets. Each data set is analysed, and the results are then combined. Two approaches to imputing an incomplete derived variable are active and passive imputation. In active imputation, the derived variable is imputed directly, so its functional relationship with the constituents is ignored. In passive imputation, the constituents are imputed and the derived variable is then constructed from them.
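The difference between the two approaches can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the simulated data, the helper names `impute_once` and `pool`, the use of simple stochastic regression imputation (which, unlike proper MI, does not draw the regression parameters), and the use of the mean BMI as a stand-in estimand instead of a survival model. The results are combined with Rubin's rules.

```python
import numpy as np

def impute_once(y, X, rng):
    """Fill NaNs in y with a linear prediction from X plus random noise.

    Simplified stochastic regression imputation (illustrative only).
    """
    obs = ~np.isnan(y)
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A[obs], y[obs], rcond=None)
    resid = y[obs] - A[obs] @ beta
    sd = np.sqrt(resid @ resid / (obs.sum() - A.shape[1]))
    out = y.copy()
    out[~obs] = A[~obs] @ beta + rng.normal(0.0, sd, size=(~obs).sum())
    return out

def pool(estimates, variances):
    """Combine per-data-set results with Rubin's rules."""
    m = len(estimates)
    q_bar = np.mean(estimates)              # pooled point estimate
    within = np.mean(variances)             # within-imputation variance
    between = np.var(estimates, ddof=1)     # between-imputation variance
    return q_bar, within + (1 + 1 / m) * between

# Hypothetical data: weight is partly missing, so BMI is incomplete.
rng = np.random.default_rng(1)
n, m = 500, 5
height = rng.normal(1.70, 0.10, n)                      # metres
age = rng.normal(50, 10, n)                             # years
weight = 22 * height**2 + 0.1 * (age - 50) + rng.normal(0, 5, n)
weight[rng.random(n) < 0.3] = np.nan                    # missing completely at random
bmi = weight / height**2                                # NaN wherever weight is NaN

predictors = np.column_stack([height, age])
active, passive = [], []
for _ in range(m):
    # Active: impute the derived variable itself.
    bmi_a = impute_once(bmi, predictors, rng)
    # Passive: impute the constituent, then rebuild the derived variable.
    w_p = impute_once(weight, predictors, rng)
    bmi_p = w_p / height**2
    for store, x in ((active, bmi_a), (passive, bmi_p)):
        store.append((x.mean(), x.var(ddof=1) / n))

for name, res in (("active", active), ("passive", passive)):
    est, var = pool(*zip(*res))
    print(f"{name}: pooled mean BMI = {est:.2f}, total variance = {var:.4f}")
```

In the passive branch the imputed BMI always equals the imputed weight divided by height squared, whereas the active branch imputes BMI without reference to any weight value.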

Previous literature finds that the performance of active and passive MI can depend on the model fitted to the multiply imputed data. A gap in the literature is how active and passive MI perform in a survival analysis context.

In this thesis, a simulation study is run to investigate the performance of active and passive imputation for three functional forms in a survival analysis context: ratio, additive, and index.

In an additive form, the derived variable is a weighted sum of the constituents. In an index form, a numerical variable is categorised as a factor.
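As a small illustration of the three functional forms, the following Python sketch builds one derived variable of each kind from two constituents. The data values, the additive weights, and the cut points (chosen to resemble common BMI bands) are hypothetical and are not taken from the thesis. It also shows how a missing constituent leaves the derived variable incomplete.

```python
import numpy as np

# Hypothetical constituents; NaN marks a missing value.
weight = np.array([70.0, np.nan, 95.0, 58.0])   # kg
height = np.array([1.75, 1.62, 1.80, np.nan])   # m

# Ratio form: BMI is weight divided by height squared.
bmi = weight / height**2

# Additive form: a weighted sum of the constituents (weights are illustrative).
coef = np.array([0.5, 20.0])
score = coef[0] * weight + coef[1] * height

# Index form: a numerical variable categorised as a factor (cut points are illustrative).
cuts = [18.5, 25.0, 30.0]
labels = np.array(["underweight", "normal", "overweight", "obese"], dtype=object)
codes = np.digitize(np.nan_to_num(bmi), cuts)
bmi_cat = np.where(np.isnan(bmi), None, labels[codes])   # None where BMI is missing

print(bmi)
print(score)
print(bmi_cat)
```

Every derived value is missing in exactly the rows where at least one of its constituents is missing.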

Conditions investigated include how the missingness is imposed and the number of predictors used to impute the missing values. A special case of passive imputation outperforms active imputation for the ratio and additive functional forms. Active imputation outperforms passive imputation for the index functional form.

**Thesis_Lily_Clements - Version of Record**
