Multiple testing in GLMM models
6.4 years ago
MHMMH • 0

Hello all,

I have a question regarding multiple testing. I have been looking everywhere but I haven't found a suitable answer yet.

So, I have two different situations.

A) In the first situation, I have a GLMM with the following structure:

Response1 ~ Variable1 + Confounder1 + Confounder2 + (1|Random Effect)

In this case, I have data from 1000 individuals, and I am interested in the relationship (and p-value) between Response1 and Variable1. Both Response1 and Variable1 are numeric. My question is: do I have to correct this p-value for the number of observations in the model (1000, because of the number of individuals)? Do I have to correct it at all?

B) The second situation is very similar to the first. This time, I have 50 Responses and 4 Variables. What I do now is fix one Variable and run a separate GLMM for each of the 50 Responses:

Response1 ~ Variable1 + Confounder1 + Confounder2 + (1|Random Effect)

Response2 ~ Variable1 + Confounder1 + Confounder2 + (1|Random Effect)

Response3 ~ Variable1 + Confounder1 + Confounder2 + (1|Random Effect)

...

Response50 ~ Variable1 + Confounder1 + Confounder2 + (1|Random Effect)

In this case, for the same Variable (Variable1), I would have 50 different p-values from the 50 different models. Here I would correct for the number of tests (50), but the question from A remains: do I also have to account for the number of within-model tests?

R multiple testing glmm statistics
6.4 years ago

Did you take a look here: Do P-Values From A Generalised Linear Model Need Correction For Multiple Testing?

A

In your example A), it's just a single independent test, so you don't have to adjust (how could you?). To test the model robustly, follow it up by performing cross-validation (CV), e.g. 10-fold, using cv.glm() in R. If you're not up to speed with CV analysis, take a look at my explanation here: Multinomial elastic net implementation on microarray dataset, error at lognet and nfold explanation. Also take a look at my thread on A: Resources for gene signature creation

CV analysis on a GLM will produce a prediction standard error and a cross-validated prediction standard error. They should both be similar and relate, in your case, to the error in predicting the response.
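To make the cv.glm() suggestion concrete, here is a minimal sketch of 10-fold CV, assuming a data frame called mydata with the columns from your formula (all names are placeholders). Note that cv.glm() from the boot package operates on a plain glm fit, so the random effect is dropped in this illustration; cross-validating the full GLMM would need a manual fold loop instead.

library(boot)

# Placeholder data frame and column names; adjust to your data.
# cv.glm() works on glm objects, so the random effect is omitted here.
fit <- glm(Response1 ~ Variable1 + Confounder1 + Confounder2,
           data = mydata, family = gaussian())

set.seed(1)
cv_res <- cv.glm(data = mydata, glmfit = fit, K = 10)

# delta[1] is the raw 10-fold estimate of prediction error,
# delta[2] is the bias-adjusted estimate; the two should be similar.
cv_res$delta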

B

For your example B), the adjustment could be done 4 times, i.e., separately for each variable's 50 p-values. However, as you have 50 different outcomes/responses, it's slightly different from the typical setting. My preference would actually be not to adjust in this situation and to treat each as an entirely independent model, which can be further validated through cross-validation.

There is a publication that backs up my comment on example B): Do multiple outcome measures require p-value adjustment?
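If you do decide to adjust within each variable, a rough sketch of the loop for one fixed variable could look like the following, using lmer() from lme4/lmerTest for a Gaussian response; mydata, RandomEffect, and the Response1..Response50 column names are assumptions, not anything from your data. The 50 p-values for Variable1 are then adjusted together, e.g. with Benjamini-Hochberg, and the same would be repeated for each of the 4 variables.

library(lmerTest)  # loads lme4 and adds p-values to lmer() summaries

# Placeholder names: mydata with columns Response1..Response50,
# Variable1, Confounder1, Confounder2 and RandomEffect.
resp_cols <- paste0("Response", 1:50)

pvals <- sapply(resp_cols, function(resp) {
  f <- reformulate(c("Variable1", "Confounder1", "Confounder2",
                     "(1 | RandomEffect)"), response = resp)
  fit <- lmer(f, data = mydata)
  # p-value for Variable1; for non-Gaussian responses use glmer()
  # and the "Pr(>|z|)" column instead
  coef(summary(fit))["Variable1", "Pr(>|t|)"]
})

# One adjustment across the 50 tests for this variable
p_adj <- p.adjust(pvals, method = "BH")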
