13 months ago

oghzzang

I'm working with data from the TCGA pipeline. We have 'rsem_isoforms_normalized.txt' and 'rsem_gene_normalized.txt' for 215 samples, each covering 20501 genes.

I have this question:

I want to find differentially expressed genes among the 215 samples. Group 1 has dimensions 195 (samples) x 20501 (genes); Group 2 has dimensions 20 (samples) x 20501 (genes).

I have already run two t-tests.

multtest's mt.maxT in R:

```
PP=mt.maxT(Counts, groups, test="t", B=5000)
PP$fdr=p.adjust(PP$rawp, method = "fdr")
```

Only 35 genes have FDR <= 0.05.

stats's t.test in R:

```
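# Welch t-test per gene: columns 1..n1 are group 1 (Counts.C), the remaining columns are group 2
n1 <- ncol(Counts.C)
t.result <- apply(TotalCounts, 1, function(x) t.test(x[1:n1], x[(n1 + 1):ncol(TotalCounts)], paired = FALSE, var.equal = FALSE))
# f is assumed to be an existing per-gene data frame; adjust the p-values stored in it
f$p_value <- unlist(lapply(t.result, function(x) x$p.value))
f$fdr <- p.adjust(f$p_value, method = "fdr")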
```

2000 genes have FDR <= 0.05.
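
When columns are split by position, as in the t.test call above, it is worth checking the index range for the second group: in R, `:` binds more tightly than `+`, so `n + 1:m` means `n + (1:m)` rather than `(n + 1):m`. A small sketch with a made-up vector (`x` and `n1` are illustrative names only, not objects from the post):

```
# Toy vector standing in for one gene's values across 5 samples
x <- c(10, 20, 30, 40, 50)
n1 <- 2  # hypothetical size of the first group

idx_unparenthesized <- n1 + 1:length(x)   # evaluated as n1 + (1:5)
idx_parenthesized <- (n1 + 1):length(x)   # positions 3 to 5

# Indexing past the end of the vector yields NA for the extra positions
second_group_unparenthesized <- x[idx_unparenthesized]
second_group_parenthesized <- x[idx_parenthesized]

print(second_group_unparenthesized)
print(second_group_parenthesized)
```

The unparenthesized form runs past the end of the vector and pads the result with NA, which a downstream function may drop silently or reject, depending on how it handles missing values.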

The BH-adjusted p-values from these two tests are completely different.

In this situation, how can I get DEGs?

You might consider using either edgeR or DESeq2, both of which are designed to deal with RNA-seq data.
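
As a rough illustration of what an edgeR run might look like, here is a minimal sketch. It assumes a matrix of raw (un-normalized) counts, which is the input edgeR is documented to expect, and it uses simulated data with the same group sizes as in the question; the object names (`counts`, `group`, `tab`) and the simulation settings are placeholders, not anything from the original files.

```
library(edgeR)

# Simulated raw counts: 500 hypothetical genes, 195 + 20 samples
set.seed(1)
n_genes <- 500
group <- factor(c(rep("G1", 195), rep("G2", 20)))
counts <- matrix(rnbinom(n_genes * length(group), mu = 100, size = 5),
                 nrow = n_genes,
                 dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Build the DGEList and drop genes with too little expression
y <- DGEList(counts = counts, group = group)
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes = FALSE]

# TMM normalization factors, dispersion estimation, quasi-likelihood F-test
y <- calcNormFactors(y)
design <- model.matrix(~ group)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)
res <- glmQLFTest(fit, coef = 2)  # coefficient 2 compares G2 against G1

# Full results table with BH-adjusted p-values in the FDR column
tab <- topTags(res, n = Inf)$table
n_sig <- sum(tab$FDR <= 0.05)
print(n_sig)
```

With real data, `counts` would be replaced by the actual raw count matrix and `group` by the real sample labels; the filtering and testing choices here are one common path through edgeR, not the only one.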

Thanks Davis,

Can I use edgeR with quartile-normalized RNA-seq data?

Quartile normalization was applied to produce 'rsem_gene_normalized.txt'.