edgeR and GO analysis using goana
2
0
Entering edit mode
7.6 years ago
moxu ▴ 510

I used RSEM to align and quantify RNAseq, then used edgeR to get differentially expressed genes with someting like the following:

library("edgeR");

d0<-read.delim("mydata.txt", row.names="gene_id");
d<-d0[c("c1","c2","t1","t2")];
attach(d);
group<-factor(c(1,1,2,2));
dgeList <- DGEList(counts=d, group=group);
normFac <- calcNormFactors(dgeList); # ERCC normalization
dsgn <- model.matrix(~group);
disp <- estimateDisp(normFac, dsgn);
qlFit <- glmQLFit(disp, dsgn);
qlfTest <- glmQLFTest(qlFit, coef=2);
topTags(qlfTest);

# do GO analysis
go <- goana(qlfTest, specifes="Hs");

The code works up to "topTags(qlfTest)", but stopped at the last line with the following error:

Error in goana.default(qlfTest, specifes = "Hs", de = list(Up = c("A1BG",  : 
  No genes found in universe

Any ideas?

Thanks!

RNA-Seq next-gen gene • 11k views
ADD COMMENT
1
Entering edit mode

show us a snippet of mydata.txt

ADD REPLY
0
Entering edit mode
         c1 c2 t1 t2
A1BG        33.61    35.00    100.26    129.68
A1BG-AS1    33.25    35.13     93.40    112.64
A1CF         1.00     2.06      3.55      4.30
A2M       3843.77  3995.67      4.03      5.03
A2M-AS1     11.23    17.34     65.97     88.99
ADD REPLY
0
Entering edit mode

Also it should be species = "Hs" not specifes.

ADD REPLY
0
Entering edit mode

Thanks for the correction. However, with "species", I got the same error.

ADD REPLY
0
Entering edit mode

And what's the output of topTags(qlfTest)?

ADD REPLY
0
Entering edit mode
> topTags(qlfTest)
Coefficient:  group2 
             logFC    logCPM        F       PValue          FDR
ALDH1A3   7.268353  9.470591 18973.47 1.274423e-28 1.542331e-24
KRT7      7.941622  9.301393 18935.93 1.297276e-28 1.542331e-24
BCAT1    -4.480013 10.361213 17991.53 2.053146e-28 1.627324e-24
FLG     -12.436994  8.134986 15132.03 9.703443e-28 5.768212e-24
SLPI     -4.676582  8.770861 14112.19 1.814653e-27 8.629763e-24
ADD REPLY
1
Entering edit mode
7.6 years ago

You need to convert your gene identifiers to Entrez gene identifiers. E.g KRT7 is 3855.

I suggest BioMart from Ensembl or https://github.com/stephenturner/annotables

ADD COMMENT
0
Entering edit mode

Great!

I've installed all the packages in the link, then for my edgeR variables, what exactly should I do to be able to run "goana"?

Thanks!

ADD REPLY
0
Entering edit mode

You need to perform a join using the required table (e.g. grch38) and your dataframe of differential expressed genes (qlfTest). The dplyr library is great for jobs like this. A few examples are also included on the github page and http://www.gettinggeneticsdone.com/2015/11/annotables-convert-gene-ids.html.

ADD REPLY
0
Entering edit mode

hello Is there a way to do the goanna and topGo with Ensenmbl or external_gene_name? I lose a lot of genes when I do it with entrez Thanks

ADD REPLY
1
Entering edit mode
7.6 years ago

I believe your problem lies in that you are using the wrong objects for the goana function.

You cannot just feed it the qflTest and expected it to work. You need to give it a list of genes.

degs <- topTags(qlfTest)[[1]]
go <- goana(degs, species="Hs");

That should work. Ideally in this case you want to feed it:

de = Your differentially expressed genes (up and/or down)

universe = All expressed genes (all row.names of counts/dge object) or alternatively All annotated genes

ADD COMMENT
0
Entering edit mode

According to the help information of goana:

de: a vector of Entrez Gene IDs, or a list of such vectors, or an "MArrayLM" fit object.

ADD REPLY
0
Entering edit mode

Yes, you are right.

The problem now is that "topGO(go)" invariantly (I have other differential gene expression analyses) returns the same list of GO pathways as the following:

> go <- goana(as.vector(degs), species="Hs");
> topGO(go)
                                                                                                 Term
GO:0000002                                                           mitochondrial genome maintenance
GO:0000003                                                                               reproduction
GO:0000009                                                     alpha-1,6-mannosyltransferase activity
GO:0000010                                                  trans-hexaprenyltranstransferase activity
GO:0000012                                                                 single strand break repair
GO:0000014                                         single-stranded DNA endodeoxyribonuclease activity
GO:0000015                                                          phosphopyruvate hydratase complex
GO:0000016                                                                           lactase activity
GO:0000018                                                            regulation of DNA recombination

Certainly something is wrong.

ADD REPLY

Login before adding your answer.

Traffic: 1487 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6