Limma package, how to correct by age and sex?
13 months ago
ellen2270 • 0

Hello everyone,

I am analyzing microarray data with the limma package and have a couple of questions, as this is the first time I have used it. I have 3 conditions that I want to compare. My data look like this:

Condition   AGE SEX
0   64  Male
0   65  Female
1   67  Male
1   60  Male
2   58  Male
2   65  Female

I want to adjust the results for sex and age, as there are slight differences in sex and age between groups. I have come up with two different pieces of code to adjust for age and sex, and they give different results. Is the code I am using correct for adjusting by age and sex? Which strategy is better?

  1. Introducing sex and age in the model:

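    # Treat Condition as a categorical factor (it is coded 0/1/2 in targets)
    targets$Condition <- factor(targets$Condition)
    design <- model.matrix(~0 + Condition + as.numeric(AGE) + SEX, data = targets)
    # makeContrasts requires syntactically valid column names
    colnames(design) <- make.names(colnames(design))
    fit <- lmFit(y, design)
    # Contrasts refer to the design columns Condition0, Condition1, Condition2
    cont.matrix <- makeContrasts(P1 = Condition1 - Condition0,
                                 P2 = Condition1 - Condition2,
                                 P3 = Condition2 - Condition0, levels = design)

    fit2 <- contrasts.fit(fit, cont.matrix)
    fit3 <- eBayes(fit2)
    topTable(fit3, coef = 1, number = Inf, adjust.method = "BH")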
    
  2. Using the function removeBatchEffect:

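    # Remove sex and age effects from the expression matrix before fitting
    y_correct <- removeBatchEffect(y, batch = targets$SEX, covariates = targets$AGE)
    # Treat Condition as a categorical factor (it is coded 0/1/2 in targets)
    targets$Condition <- factor(targets$Condition)
    design <- model.matrix(~0 + Condition, data = targets)
    fit <- lmFit(y_correct, design)
    # Contrasts refer to the design columns Condition0, Condition1, Condition2
    cont.matrix <- makeContrasts(P1 = Condition1 - Condition0,
                                 P2 = Condition1 - Condition2,
                                 P3 = Condition2 - Condition0, levels = design)

    fit2 <- contrasts.fit(fit, cont.matrix)
    fit3 <- eBayes(fit2)
    topTable(fit3, coef = 1, number = Inf, adjust.method = "BH")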
    

Can anyone tell me which method is more correct and confirm that it makes sense? Thank you.


I am not a statistician, so I am adding this as a comment rather than an answer. From a biological perspective, I think these slight differences in AGE probably have a minor effect. All study participants are approaching or have already reached old age. You are not comparing kids to adults, so the confounding effect is probably limited (or absent). I also doubt that using AGE as a continuous variable makes sense, as it assumes a roughly linear influence of AGE on gene expression. Do you think this is justified? SEX might indeed influence gene expression, but you do not have any replication of SEX within each group (e.g. 2 men and 2 women per group), so you cannot really tell whether SEX adds variability beyond what the non-SEX variation already explains. I would start by checking how the samples cluster in a PCA or MDS plot, and then decide whether the effort really makes sense.
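The clustering check suggested here could look roughly like the sketch below. It uses base R's `prcomp` on simulated data: the matrix `y` and the data frame `targets` are hypothetical stand-ins for the real expression matrix (genes in rows, samples in columns) and sample sheet, so only the plotting logic carries over. limma's `plotMDS(y)` would be an alternative for the MDS version.

```r
# Hypothetical stand-ins: 'y' is a genes x samples matrix of log-expression
# values and 'targets' has one row per sample. Replace both with the real data.
set.seed(1)
targets <- data.frame(
  Condition = factor(c(0, 0, 1, 1, 2, 2)),
  AGE       = c(64, 65, 67, 60, 58, 65),
  SEX       = factor(c("Male", "Female", "Male", "Male", "Male", "Female"))
)
y <- matrix(rnorm(1000 * nrow(targets)), nrow = 1000)
rownames(y) <- paste0("gene", seq_len(nrow(y)))
colnames(y) <- paste0("s", seq_len(ncol(y)))

# PCA on samples: transpose so that rows are samples and columns are genes
pca <- prcomp(t(y), center = TRUE, scale. = FALSE)
var_explained <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)

# Colour by condition, plotting symbol by sex
plot(pca$x[, 1], pca$x[, 2],
     col  = as.integer(targets$Condition) + 1,
     pch  = c(Female = 16, Male = 17)[as.character(targets$SEX)],
     xlab = paste0("PC1 (", var_explained[1], "%)"),
     ylab = paste0("PC2 (", var_explained[2], "%)"))
legend("topright", legend = levels(targets$SEX), pch = c(16, 17), bty = "n")
```

If the samples separate by symbol (sex) along one of the leading components, that would suggest sex is worth keeping in the model; if they only separate by colour (condition), the adjustment probably matters less.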


Thank you for your input. I had only shown part of the data in my original post, but I think you are correct and age may not have much influence. Regarding sex, I have between 7 and 8 individuals per condition with different sex distributions (e.g. condition 1: 5 males and 2 females, condition 2: 5 males and 3 females, and condition 3: 5 males and 3 females). With these data, and knowing that sex influences gene expression, it probably makes sense to keep adjusting the analysis for sex.
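For reference, a minimal sketch of a sex-adjusted analysis with sex in the design matrix is shown below. The data are simulated: `y`, `targets`, and the group sizes (taken from the counts above, with conditions labelled 0, 1, 2 as in the original table) are illustrative assumptions, not the real dataset. Note that the limma help page for `removeBatchEffect` states that the function is not intended to be used prior to linear modelling and recommends including such factors in the linear model instead, which is what this sketch does.

```r
library(limma)

# Hypothetical stand-ins for the real data: 'y' is a genes x samples matrix of
# log-expression values and 'targets' has one row per sample.
set.seed(1)
targets <- data.frame(
  Condition = factor(rep(c(0, 1, 2), times = c(7, 8, 8))),
  SEX = factor(c(rep(c("Male", "Female"), times = c(5, 2)),
                 rep(c("Male", "Female"), times = c(5, 3)),
                 rep(c("Male", "Female"), times = c(5, 3))))
)
y <- matrix(rnorm(500 * nrow(targets)), nrow = 500)
rownames(y) <- paste0("gene", seq_len(nrow(y)))

# One column per condition plus a sex term:
# Condition0, Condition1, Condition2, SEXMale
design <- model.matrix(~0 + Condition + SEX, data = targets)

fit <- lmFit(y, design)
cont.matrix <- makeContrasts(
  P1 = Condition1 - Condition0,
  P2 = Condition1 - Condition2,
  P3 = Condition2 - Condition0,
  levels = design
)
fit2 <- eBayes(contrasts.fit(fit, cont.matrix))

# Results for the first contrast, adjusted for sex
res <- topTable(fit2, coef = "P1", number = Inf, adjust.method = "BH")
head(res)
```

Because sex is estimated in the same model as the condition effects, the residual degrees of freedom and the moderated statistics account for the adjustment, which is not the case when a pre-corrected matrix is passed to `lmFit`.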


Ok I see. As said, I would first check by PCA or MDS that there is indeed a confounding effect based on SEX.
