Hi all,
I am trying to find differentially expressed genes in a dataset where samples from subjects treated with a drug were collected at three time points (baseline, week 16, week 52). There are no controls in the study. I am using limma in R and want to analyze the samples as paired, testing with the F-statistic. My challenge is that not every subject has samples at all three time points; a few observations are missing. For instance, a subject may have baseline and week 16 samples while the week 52 sample is missing. The easiest way to handle the missing observations, I thought, would be to keep only the subjects that have all three time points. But from a statistical point of view I think this might not be the correct approach. Can anybody suggest a method for handling this? I have been reading a lot of papers and came across imputation, but I am not sure whether it would be applicable to this scenario.
I am attaching a sample of the data here:
Accession   Title                     time      timepoints  subjectid
GSM2352693  SUBJ.1720, SLE, baseline  baseline  3           SUBJ.1720
GSM2352694  SUBJ.1720, SLE, week16    week 16   3           SUBJ.1720
GSM2352695  SUBJ.1720, SLE, week52    week 52   3           SUBJ.1720
GSM2352696  SUBJ.0003, SLE, baseline  baseline  3           SUBJ.0003
GSM2352697  SUBJ.0003, SLE, week16    week 16   3           SUBJ.0003
GSM2352698  SUBJ.0003, SLE, week52    week 52   3           SUBJ.0003
GSM2352699  SUBJ.0065, SLE, baseline  baseline  2           SUBJ.0065
GSM2352700  SUBJ.0065, SLE, week52    week 52   2           SUBJ.0065
GSM2352701  SUBJ.1587, SLE, baseline  baseline  3           SUBJ.1587
GSM2352702  SUBJ.1587, SLE, week16    week 16   3           SUBJ.1587
GSM2352703  SUBJ.1587, SLE, week52    week 52   3           SUBJ.1587
GSM2352704  SUBJ.1028, SLE, baseline  baseline  3           SUBJ.1028
GSM2352705  SUBJ.1028, SLE, week16    week 16   3           SUBJ.1028
GSM2352706  SUBJ.1028, SLE, week52    week 52   3           SUBJ.1028
GSM2352707  SUBJ.0901, SLE, baseline  baseline  3           SUBJ.0901
GSM2352708  SUBJ.0901, SLE, week16    week 16   3           SUBJ.0901
GSM2352709  SUBJ.0901, SLE, week52    week 52   3           SUBJ.0901
GSM2352710  SUBJ.1544, SLE, baseline  baseline  3           SUBJ.1544
GSM2352711  SUBJ.1544, SLE, week16    week 16   3           SUBJ.1544
GSM2352712  SUBJ.1544, SLE, week52    week 52   3           SUBJ.1544
GSM2352713  SUBJ.0200, SLE, baseline  baseline  3           SUBJ.0200
GSM2352714  SUBJ.0200, SLE, week16    week 16   3           SUBJ.0200
GSM2352715  SUBJ.0200, SLE, week52    week 52   3           SUBJ.0200
GSM2352716  SUBJ.0032, SLE, baseline  baseline  3           SUBJ.0032
GSM2352717  SUBJ.0032, SLE, week16    week 16   3           SUBJ.0032
GSM2352718  SUBJ.0032, SLE, week52    week 52   3           SUBJ.0032
GSM2352719  SUBJ.1545, SLE, week16    week 16   2           SUBJ.1545
GSM2352720  SUBJ.1545, SLE, week52    week 52   2           SUBJ.1545
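To make the "keep only complete subjects" option concrete, here is a minimal base R sketch that tabulates which time points each subject has. It uses only a handful of rows copied from the sample sheet above; the object names (pheno, presence, complete_subjects, incomplete_subjects) are placeholders for illustration and are not part of my actual script.

```r
# A few rows copied from the sample sheet above (subset, for illustration only)
pheno <- data.frame(
  accession = c("GSM2352693", "GSM2352694", "GSM2352695",
                "GSM2352699", "GSM2352700",
                "GSM2352719", "GSM2352720"),
  time      = c("baseline", "week 16", "week 52",
                "baseline", "week 52",
                "week 16", "week 52"),
  subjectid = c("SUBJ.1720", "SUBJ.1720", "SUBJ.1720",
                "SUBJ.0065", "SUBJ.0065",
                "SUBJ.1545", "SUBJ.1545"),
  stringsAsFactors = FALSE
)
pheno$time <- factor(pheno$time, levels = c("baseline", "week 16", "week 52"))

# Subject-by-time table: 1 = sample present, 0 = sample missing
presence <- table(pheno$subjectid, pheno$time)
print(presence)

# Subjects that have a sample at every time point
complete_subjects <- rownames(presence)[rowSums(presence > 0) == nlevels(pheno$time)]

# Subjects that would be dropped by a complete-case analysis
incomplete_subjects <- setdiff(rownames(presence), complete_subjects)
```

On the full sample sheet the same table would show how many subjects, and therefore how many arrays, a complete-case analysis would discard.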
The R code I am trying to use is:
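library(limma)

# One coefficient per time point (no intercept)
design <- model.matrix(~ 0 + eset_filtered$cohort)
colnames(design) <- levels(eset_filtered$cohort)
# Note: subjectid is not in the design, so the pairing is not modelled here
fit <- lmFit(eset_filtered, design)
# Change from baseline at each later time point
contrast.matrix <- makeContrasts("week16-baseline", "week52-baseline", levels = design)
fit2 <- contrasts.fit(fit, contrast.matrix)
fit2 <- eBayes(fit2)
# No coef given, so genes are ranked by the F-statistic across both contrasts
top.active <- topTable(fit2, adjust.method = "BH", number = nrow(eset_filtered))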
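For reference, limma can keep every sample in an unbalanced paired design by treating the subject as a blocking factor through duplicateCorrelation(), so subjects with a missing time point are not dropped. The following is only a minimal sketch under stated assumptions: the expression values are simulated, the subject labels S1 to S5 and the object names (pheno, expr, corfit, top) are hypothetical, and the time labels are written without spaces so that they are valid names for makeContrasts(). Depending on the limma version, the statmod package may also need to be installed for duplicateCorrelation(). I have not checked this against the real dataset.

```r
library(limma)

set.seed(1)

# Hypothetical sample annotation: three complete subjects plus two subjects
# with one missing time point each (layout mirrors the situation described above)
pheno <- data.frame(
  subjectid = c(rep(c("S1", "S2", "S3"), each = 3), "S4", "S4", "S5", "S5"),
  time      = c(rep(c("baseline", "week16", "week52"), times = 3),
                "baseline", "week52", "week16", "week52")
)
pheno$time <- factor(pheno$time, levels = c("baseline", "week16", "week52"))

# Simulated log-expression: 100 hypothetical genes x 13 samples,
# with a per-subject shift so samples from one subject are correlated
n_genes <- 100
subj_index  <- as.integer(factor(pheno$subjectid))
subj_effect <- matrix(rnorm(n_genes * 5), nrow = n_genes)
expr <- matrix(rnorm(n_genes * nrow(pheno)), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))
expr <- expr + subj_effect[, subj_index]

# One coefficient per time point; subject enters as a block, not as a column
design <- model.matrix(~ 0 + time, data = pheno)
colnames(design) <- levels(pheno$time)

# Estimate the within-subject correlation shared across genes
corfit <- duplicateCorrelation(expr, design, block = pheno$subjectid)

# Fit the model using all samples, including those from incomplete subjects
fit <- lmFit(expr, design, block = pheno$subjectid,
             correlation = corfit$consensus.correlation)

contrast.matrix <- makeContrasts(week16 - baseline, week52 - baseline,
                                 levels = design)
fit2 <- eBayes(contrasts.fit(fit, contrast.matrix))

# With two contrasts and no coef argument, genes are ranked by the moderated F-statistic
top <- topTable(fit2, number = Inf, adjust.method = "BH")
```

An alternative that stays closer to a classic paired analysis is to put the subject directly into the design, for example model.matrix(~ subjectid + time), which also accepts subjects with missing time points but spends one coefficient per subject.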
Any suggestions would be really appreciated. Thanks in advance.