Design of experiments with mixed effects and discrete responses plus related topics

Waite, Timothy (2012) Design of experiments with mixed effects and discrete responses plus related topics. University of Southampton, Mathematical Sciences, Doctoral Thesis, 237pp.

Record type: Thesis (Doctoral)

Abstract

For certain types of experiment, the response cannot be adequately modelled using a normal distribution. When this is the case, it is common to use a Generalised Linear Model (GLM) to analyse the data. Such models allow us to fit a wide range of response distributions including Bernoulli and Poisson.
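As a rough illustration of how such models enter the design problem, the sketch below computes the Fisher information matrix of a canonical-link GLM (logistic for Bernoulli, log-linear for Poisson) under a weighted design, and evaluates a log-determinant criterion. The function names, the choice of a D-optimality-style criterion, the two-point design and the parameter values are all assumptions made for illustration; none is taken from the thesis.

```python
import numpy as np

def glm_information(X, weights, beta, family="bernoulli"):
    """Fisher information of a canonical-link GLM under a weighted design.

    X       : (n, p) model matrix, one row per support point (hypothetical design)
    weights : (n,) design weights, assumed to sum to 1
    beta    : (p,) assumed parameter values
    """
    eta = X @ beta
    if family == "bernoulli":          # logit link: variance function mu(1 - mu)
        mu = 1.0 / (1.0 + np.exp(-eta))
        var = mu * (1.0 - mu)
    elif family == "poisson":          # log link: variance function mu = exp(eta)
        var = np.exp(eta)
    else:
        raise ValueError("family must be 'bernoulli' or 'poisson'")
    # M = sum_i weights_i * var_i * x_i x_i^T
    return X.T @ (X * (weights * var)[:, None])

def log_det_criterion(X, weights, beta, family="bernoulli"):
    """Log-determinant of the information matrix; -inf if it is singular."""
    sign, logdet = np.linalg.slogdet(glm_information(X, weights, beta, family))
    return logdet if sign > 0 else -np.inf

# Illustrative two-point design on one explanatory variable, with an intercept.
X = np.array([[1.0, -1.0], [1.0, 1.0]])
weights = np.array([0.5, 0.5])
beta = np.array([0.0, 1.0])
print(log_det_criterion(X, weights, beta, "bernoulli"))
print(log_det_criterion(X, weights, beta, "poisson"))
```

Because the variance function depends on the linear predictor, the information matrix, and hence the preferred design, changes with the assumed parameter values.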

If responses in the same block are correlated, it may be appropriate to model the impact of blocking using random effects. The GLM can be extended in several ways to include random effects; Generalised Linear Mixed Models (GLMMs) and Hierarchical Generalised Linear Models (HGLMs) are two common examples of such extensions. A third example is a random intercept model for a binary response bioassay study with repeated measurements on heterogeneous individuals. This bioassay model is related to a GLMM but is not strictly within that class.

Obtaining designs for non-normal models with random effects is complicated by the fact that the information matrix, on which most optimality criteria are based, is computationally expensive to evaluate. Indeed, if the information matrix is computed naively, the search for a typical optimal GLMM design is likely to take several months.
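To give a feel for where such cost can arise, the sketch below assumes a logistic model with a single normally distributed random intercept per block. Under that assumption the marginal probability of each possible response vector in a block is an integral over the random effect, approximated here by Gauss-Hermite quadrature, and a likelihood-based information matrix would need quantities of this kind for all 2^m outcomes of a block of size m. The model, the function names and the numerical values are illustrative assumptions, not the computational scheme used in the thesis.

```python
import itertools
import numpy as np

def block_marginal_likelihood(y, X, beta, sigma, n_nodes=30):
    """P(Y = y) for one block of a random-intercept logistic model (assumed form).

    The random intercept u ~ N(0, sigma^2) is integrated out numerically.
    """
    nodes, w = np.polynomial.hermite.hermgauss(n_nodes)
    u = np.sqrt(2.0) * sigma * nodes            # change of variables for N(0, sigma^2)
    eta = (X @ beta)[:, None] + u[None, :]      # shape (m, n_nodes)
    p = 1.0 / (1.0 + np.exp(-eta))
    y = np.asarray(y)[:, None]
    # Conditional likelihood of the whole block at each quadrature node.
    cond = np.prod(np.where(y == 1, p, 1.0 - p), axis=0)
    return float(cond @ w / np.sqrt(np.pi))

def all_outcome_probabilities(X, beta, sigma):
    """Marginal probability of every one of the 2^m binary outcomes in a block."""
    m = X.shape[0]
    return {y: block_marginal_likelihood(y, X, beta, sigma)
            for y in itertools.product((0, 1), repeat=m)}

# Hypothetical block of three runs on one explanatory variable.
X = np.array([[1.0, -1.0], [1.0, 0.0], [1.0, 1.0]])
probs = all_outcome_probabilities(X, beta=np.array([0.0, 1.0]), sigma=1.0)
print(len(probs), sum(probs.values()))
```

The number of outcomes doubles with each additional run in the block, and a design search repeats this work for every candidate design and every parameter value considered.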

When estimating GLMMs, it is common to use analytical approximations such as marginal quasi-likelihood (MQL) and penalised quasi-likelihood (PQL) in place of full maximum likelihood estimation. In Chapters 2 and 3, we consider the use of such computationally cheap approximations to construct surrogates for the information matrix when producing optimal designs. These reduce the computational burden substantially, enabling us to obtain designs within a practical time frame. The accuracy of the analytical approximations is explored through the use of a detailed computational approximation, which enables us to compute the optimal maximum likelihood design in the case where there are at most two points per block. It is found that one of the analytical approximations appears to perform consistently better than the others for the purposes of producing designs.
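A minimal sketch of the kind of cheap surrogate this refers to, assuming a random-intercept logistic model and one commonly used MQL-type form: the responses in a block are treated as having working covariance V = W^{-1} + sigma^2 11^T on the linear predictor scale, with W the diagonal matrix of GLM weights, and the block information is approximated by X^T V^{-1} X. Whether this matches the exact surrogates studied in Chapters 2 and 3 is an assumption; the function name and the example values are hypothetical.

```python
import numpy as np

def mql_block_information(X, beta, sigma2):
    """MQL-type approximate information for one block (assumed form).

    X      : (m, p) model matrix for the m runs in the block
    beta   : (p,) assumed fixed-effect values
    sigma2 : assumed variance of the random block intercept
    """
    m = X.shape[0]
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    w = mu * (1.0 - mu)                              # logistic GLM weights
    V = np.diag(1.0 / w) + sigma2 * np.ones((m, m))  # working covariance
    return X.T @ np.linalg.solve(V, X)               # X^T V^{-1} X, no explicit inverse

# Hypothetical block with two runs.
X = np.array([[1.0, -1.0], [1.0, 1.0]])
beta = np.array([0.0, 1.0])
print(mql_block_information(X, beta, sigma2=0.0))    # equals the ordinary GLM information
print(mql_block_information(X, beta, sigma2=1.0))
```

Setting sigma2 to zero recovers the ordinary GLM information matrix, and no numerical integration is involved, which is what makes an approximation of this kind inexpensive inside a design search.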

In Chapters 4 and 5, designs for an individual variation bioassay model are obtained in the cases where (i) there is a single observation, or (ii) there are multiple observations, per individual. In the former case, designs on the basis of both maximum likelihood and analytical approximations are found and compared. In the multiple observation case, a restriction on the design space enables optimal designs to be computed using a computational approximation related to that for GLMMs. This involves extensive precomputation of numerical integrals.

In Chapter 6, designs for HGLMs are studied using a computationally inexpensive asymptotic approximation to the variance-covariance matrix of the parameter estimators. This allows us to derive designs which are also efficient for the estimation of the random effects.

Throughout, the dependence of the optimal design on the unknown values of the model parameters is addressed through the use of Bayesian methods, which codify uncertainty about the parameter values using a prior distribution. We often assess the performance of a design obtained by optimising a Bayesian objective function in terms of the distribution of its local efficiencies that is induced by the prior distribution.
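The two ideas in this paragraph can be sketched as follows, assuming for simplicity a fixed-effects logistic model and a log-determinant criterion. The Bayesian objective is approximated by averaging the criterion over draws from the prior, and the local efficiency compares a design with a reference design at a single parameter value. The prior, the designs and all function names are hypothetical; in practice the reference would be the locally optimal design at each parameter value, which is not computed here.

```python
import numpy as np

def logistic_information(X, weights, beta):
    """Information matrix of a logistic GLM under a weighted design."""
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
    return X.T @ (X * (weights * mu * (1.0 - mu))[:, None])

def bayesian_log_det(X, weights, prior_draws):
    """Monte Carlo estimate of E_prior[log det M(design, beta)]."""
    vals = [np.linalg.slogdet(logistic_information(X, weights, b))[1]
            for b in prior_draws]
    return float(np.mean(vals))

def local_d_efficiency(X, weights, X_ref, weights_ref, beta):
    """(det M(design) / det M(reference))^(1/p) at one parameter value."""
    p = len(beta)
    _, ld = np.linalg.slogdet(logistic_information(X, weights, beta))
    _, ld_ref = np.linalg.slogdet(logistic_information(X_ref, weights_ref, beta))
    return float(np.exp((ld - ld_ref) / p))

rng = np.random.default_rng(0)
# Assumed prior: independent normals on the intercept and slope.
prior_draws = rng.normal([0.0, 1.0], [0.5, 0.5], size=(200, 2))

X_a = np.array([[1.0, -1.0], [1.0, 1.0]])   # candidate design
X_b = np.array([[1.0, -2.0], [1.0, 2.0]])   # reference design (not claimed optimal)
w = np.array([0.5, 0.5])

print(bayesian_log_det(X_a, w, prior_draws))
# Efficiencies across prior draws: the induced distribution of local efficiencies.
effs = [local_d_efficiency(X_a, w, X_b, w, b) for b in prior_draws]
print(np.quantile(effs, [0.05, 0.5, 0.95]))
```

Summarising the efficiencies by quantiles, rather than only by their mean, shows how badly a design can perform at the less favourable parameter values the prior allows.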

When the parameter space contains degenerate values, there is a problem with potential non-convergence of the Bayesian objective function used to select designs. This issue is explored in depth in Chapter 7, and results are obtained for a number of standard models.