We start with basic model assessment. The first model, "base", includes only the type of mask. In "base_E" I add ethnic group, in "base_E_S" sex, in "base_E_S_A" age, and in "base_E_S_A_B" BMI. To compare performance we look at BIC, which penalises extra parameters more heavily than AIC and is therefore less prone to favouring over-complex models in very large data sets.

Akaike's information criterion and Bayesian information criterion

-----------------------------------------------------------------------------
       Model |          N   ll(null)  ll(model)      df        AIC        BIC
-------------+---------------------------------------------------------------
        base |      9,592          .  -5716.536       9   11451.07   11515.59
      base_E |      9,592          .  -5697.238      12   11418.48    11504.5
    base_E_S |      9,592          .  -5683.342      13   11392.68   11485.88
  base_E_S_A |      9,592          .  -5619.072      17   11272.14   11394.01
base_E_S_A_B |      9,592          .  -5612.042      20   11264.08   11407.46
-----------------------------------------------------------------------------

The smaller the BIC, the better the model. The winner is "base_E_S_A": adding BMI lowers the AIC slightly but raises the BIC, so we do not need BMI. Here are the details for the best model:

Mixed-effects logistic regression               Number of obs     =      9,592
Group variable:              id                 Number of groups  =      5,544

                                                Obs per group:
                                                              min =          1
                                                              avg =        1.7
                                                              max =          5

Integration method: mvaghermite                 Integration pts.  =          7

                                                Wald chi2(15)     =     261.82
Log likelihood = -5619.0725                     Prob > chi2       =     0.0000
----------------------------------------------------------------------------------
            PASS | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-----------------+----------------------------------------------------------------
           MODEL |
         3M 1863 |          1  (base)
        3M 1863+ |   .9325028   .0914436    -0.71   0.476     .7694478    1.130111
        3M 1873V |   1.528529   .2086359     3.11   0.002      1.16974    1.997368
         3M 8833 |    1.53392   .1563768     4.20   0.000     1.256106    1.873178
 Cardinal RFP3FV |   .5125445   .0596051    -5.75   0.000      .408078    .6437541
           FSM18 |   .8064716    .140645    -1.23   0.217     .5729852    1.135102
          Other* |   1.012831   .0877289     0.15   0.883     .8546885    1.200234
         3M 1873 |   .8933268   .1301806    -0.77   0.439     .6713801    1.188645
                 |
               E |
           white |          1  (base)
           black |   .6613418    .086852    -3.15   0.002     .5112582    .8554835
           asian |   .5965401   .0520652    -5.92   0.000     .5027452    .7078339
           mixed |   .7923941   .0893187    -2.06   0.039     .6353218    .9882997
                 |
             SEX |
          Female |          1  (base)
            Male |   1.571049   .1452736     4.89   0.000     1.310629    1.883214
                 |
             AGE |
 18-24 years old |          1  (base)
 25-34 years old |   3.410393   .3997582    10.47   0.000      2.71037    4.291215
 35-44 years old |    2.55482   .3124884     7.67   0.000     2.010236    3.246934
 45-54 years old |   2.749823    .338829     8.21   0.000     2.159837    3.500971
   55+ years old |   3.457362   .5043512     8.50   0.000     2.597609    4.601675
                 |
           _cons |   1.433216   .1664456     3.10   0.002     1.141453    1.799556
-----------------+----------------------------------------------------------------
id               |
      var(_cons) |   2.485466   .2160378                      2.096143    2.947098
----------------------------------------------------------------------------------
Note: Estimates are transformed only in the first equation.
Note: _cons estimates baseline odds (conditional on zero random effects).
LR test vs. logistic model: chibar2(01) = 524.33      Prob >= chibar2 = 0.0000

You can clearly see which mask models perform significantly better or worse than the reference (3M 1863): the 3M 1873V and 3M 8833 have significantly higher odds of a pass, and the Cardinal RFP3FV significantly lower. The results are also clear for age, sex and ethnic group. Of course, the question is whether there is any interaction between mask model and ethnic group, sex or age. An interaction with ethnic group, for example, would mean that the effect of a mask differs between ethnic groups. This is what I have now focussed on.
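
To make concrete what such an interaction model costs in parameters, here is a small Python sketch. It assumes each interaction model adds the full set of MODEL-by-factor product terms to base_E_S_A; that is my assumption, since the fitting commands are not shown here. The level counts are read off the odds-ratio table above, and the names `levels`, `interaction_params` and `total_df` are purely illustrative.

```python
# Number of levels of each factor, counted from the odds-ratio table
# (including the base category of each factor).
levels = {"MODEL": 8, "E": 4, "SEX": 2, "AGE": 5}

BASE_DF = 17  # df reported for base_E_S_A


def interaction_params(a, b):
    """Extra coefficients added by a full two-factor interaction a x b.

    Assumption: every non-base combination gets its own coefficient.
    """
    return (levels[a] - 1) * (levels[b] - 1)


# Extra parameters and resulting total df for each MODEL-by-factor interaction.
extra = {f: interaction_params("MODEL", f) for f in ("E", "SEX", "AGE")}
total_df = {f: BASE_DF + k for f, k in extra.items()}

for f in extra:
    print(f"MODEL x {f:<3}: +{extra[f]:2d} parameters, total df = {total_df[f]}")
```

Under that assumption the totals come out as 38, 24 and 45 for ethnic group, sex and age, which agrees with the df column reported for the three interaction models in the comparison table. The age interaction is the most expensive, with 28 extra parameters, so it needs the largest gain in log likelihood before BIC prefers it.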
The result is that there is no evidence of interaction.

Akaike's information criterion and Bayesian information criterion

-----------------------------------------------------------------------------
       Model |          N   ll(null)  ll(model)      df        AIC        BIC
-------------+---------------------------------------------------------------
  base_E_S_A |      9,592          .  -5619.072      17   11272.14   11394.01
     model_E |      9,592          .  -5595.729      38   11267.46   11539.87
     model_S |      9,592          .  -5617.152      24    11282.3   11454.35
     model_A |      9,592          .  -5584.465      45   11258.93   11581.52
-----------------------------------------------------------------------------
Note: BIC uses N = number of observations. See [R] BIC note.

The last three models are the interaction models of mask model with ethnic group, sex and age, respectively. The top line is our best model from above. You can see that the BIC is worse for all three interaction models (the AIC is slightly lower for model_E and model_A, but BIC is the criterion we chose above). So there are clear and significant effects of type of mask, ethnic group, sex and age on fit, but judged by BIC these effects do not interact and work independently. A nice and clear result.
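
For anyone who wants to check the comparison by hand, the AIC and BIC columns follow directly from the log likelihood, the df and N, using the standard definitions AIC = -2 ll + 2 df and BIC = -2 ll + df ln(N). The Python sketch below recomputes them from the numbers in the two tables. It is only an arithmetic check, not the code that produced the Stata output, and the names `models`, `aic`, `bic` and `table` are illustrative.

```python
import math

N = 9592  # number of observations, as reported in the Stata output

# (log likelihood, df) copied from the two comparison tables
models = {
    "base":         (-5716.536, 9),
    "base_E":       (-5697.238, 12),
    "base_E_S":     (-5683.342, 13),
    "base_E_S_A":   (-5619.072, 17),
    "base_E_S_A_B": (-5612.042, 20),
    "model_E":      (-5595.729, 38),
    "model_S":      (-5617.152, 24),
    "model_A":      (-5584.465, 45),
}


def aic(ll, df):
    """Akaike's information criterion: -2*ll + 2*df."""
    return -2.0 * ll + 2.0 * df


def bic(ll, df, n):
    """Bayesian information criterion: -2*ll + df*ln(n)."""
    return -2.0 * ll + df * math.log(n)


# Recompute both criteria for every model.
table = {name: (aic(ll, df), bic(ll, df, N)) for name, (ll, df) in models.items()}

for name, (a, b) in table.items():
    print(f"{name:<13} AIC = {a:10.2f}   BIC = {b:10.2f}")

# The model with the smallest BIC is the preferred one.
best_bic = min(table, key=lambda name: table[name][1])
print("lowest BIC:", best_bic)
```

With N = 9,592 each extra parameter costs about 9.17 points of BIC against 2 points of AIC, which is why the interaction models lose on BIC despite their higher log likelihoods.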