A good pipeline for GO term analysis on RNA-seq Clusters in R?
5.9 years ago
dtatarak ▴ 30

Hi all,

I'm a relative newcomer to RNA-seq analysis, and I am now at the point where I want to do GO term analysis on my dataset.

I have done hierarchical clustering of my dataset consisting of 800 differentially expressed genes from Zebrafish samples. I have identified clusters that are interesting based on their expression patterns, and I now want to look at the gene ontology within these clusters.

I have looked at several R packages for GO term analysis online including clusterProfiler and GOexpress. But the documentation leaves something to be desired for an R newbie like myself. Does anyone have a suggestion for a GO term analysis pipeline they have used in R? Thank you very much!

Best,
David Tatarakis

RNA-Seq GO terms R • 7.2k views

Could you please share your RNA-Seq pipeline, with commands (DESeq2 and hierarchical clustering), with me? I am also a newcomer and am trying to analyze RNA-Seq data from zebrafish. Thank you in advance.

5.9 years ago

Coincidentally, my recommendation for you, David, is to use DAVID, the Database for Annotation, Visualization and Integrated Discovery. It is quite possibly the easiest tool for someone just starting out with gene enrichment. To help, I have shown how to do the enrichment in my tutorial here: Clustering of DAVID gene enrichment results from gene expression studies

There are many other tools out there, but implementing them can be tricky due to annotation issues. With DAVID, you can supply your genes in various annotation formats, as you'll see, and it will even attempt to identify the annotation format automatically, if you wish.

Kevin


DAVID is definitely good, but is there a way to present the results graphically instead of the standard tables?


I had a tutorial on Biostars previously, specifically for how to plot the results of DAVID as a heatmap; however, as new package versions were released, the tutorial fell into disrepair.

Essentially, you could create a gene X GO term [or KEGG pathway, etc] binary matrix, and shade cells in the heatmap white for 0, and green or any other colour for 1.
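As a rough sketch of that idea in base R (this is not the original tutorial code, and the term and gene names below are made-up placeholders), one could build the binary matrix from a list of terms and their member genes and then draw it with heatmap():

```r
# Hypothetical enrichment output: each enriched term with its member genes.
# Replace with the terms and gene lists from your own DAVID results.
term_genes <- list(
  "term A" = c("geneA", "geneB", "geneC"),
  "term B" = c("geneB", "geneD"),
  "term C" = c("geneA", "geneD", "geneE")
)

# All genes that appear in at least one term.
genes <- sort(unique(unlist(term_genes)))

# Binary matrix: rows = genes, columns = terms, 1 = gene belongs to term.
mat <- sapply(term_genes, function(g) as.integer(genes %in% g))
rownames(mat) <- genes

# White for 0, green for 1; no clustering and no scaling of the 0/1 values.
heatmap(mat, Rowv = NA, Colv = NA, scale = "none",
        col = c("white", "forestgreen"), margins = c(10, 6))
```

With many terms the column labels get crowded, so it probably makes sense to restrict the matrix to the top few terms first.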


I used PANTHER to identify enriched biological processes among ~4000 genes and obtained ~500 "GO biological process complete" terms at FDR < 0.05. I guess that many terms would not work well in a heatmap. REVIGO treemaps helped reduce the redundancy, though, and made a good figure for publication.


You could just plot the top 20 terms as a barplot based on -log10(FDR)?

Here, I use base R: A: DAVID functional Analysis and its visualization of GO terms using Bar plot
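A minimal base R sketch of that kind of plot, independent of the linked answer, might look like the following. The data frame is randomly generated for illustration, and the column names Term and FDR are an assumption about how your exported table is laid out:

```r
# Hypothetical enrichment table; replace with your own exported results.
set.seed(42)
res <- data.frame(Term = paste("term", 1:30),
                  FDR = 10^(-runif(30, 0, 8)),
                  stringsAsFactors = FALSE)

# Keep the 20 terms with the smallest FDR and convert to -log10 scale.
top <- head(res[order(res$FDR), ], 20)
top$score <- -log10(top$FDR)

# A horizontal barplot draws the first row at the bottom,
# so sort ascending to put the most significant term on top.
top <- top[order(top$score), ]

op <- par(mar = c(4, 8, 1, 1))  # widen the left margin for term labels
barplot(top$score, names.arg = top$Term, horiz = TRUE, las = 1,
        xlab = "-log10(FDR)")
par(op)
```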

Using ggplot2 would be nicer, though.

5.9 years ago
Zhilong Jia ★ 2.2k
  1. The ToppCluster web server, although it has not been working recently.
  2. Co-expressed gene set enrichment analysis with cogena. However, this needs zebrafish GO gene sets as a GMT file.
  3. clusterProfiler. Its GO analyses (groupGO(), enrichGO() and gseGO()) support any organism that has an OrgDb object available, so zebrafish is supported. Ref: http://bioconductor.org/packages/release/bioc/vignettes/clusterProfiler/inst/doc/clusterProfiler.html

In summary, clusterProfiler is probably the easiest way if you are comfortable programming. Otherwise, use the DAVID web server as recommended by @Kevin, analyzing one cluster at a time.
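For the clusterProfiler route, a hedged sketch of going from a hierarchical clustering to one enrichGO() call per cluster could look like this. The expression matrix is random toy data, the helper run_go() and the choice of k = 3 are hypothetical, and I am assuming your gene IDs are zebrafish symbols with org.Dr.eg.db installed from Bioconductor:

```r
# Toy stand-in for a genes x samples matrix of the DE genes (hypothetical data).
set.seed(1)
expr <- matrix(rnorm(60), nrow = 12,
               dimnames = list(paste0("gene", 1:12), paste0("s", 1:5)))

# Hierarchical clustering of the genes, then cut the tree into k clusters.
hc <- hclust(dist(expr))
membership <- cutree(hc, k = 3)

# One character vector of gene IDs per cluster.
genes_by_cluster <- split(names(membership), membership)

# Hypothetical helper: GO biological process enrichment for one gene vector.
# Needs the Bioconductor packages clusterProfiler and org.Dr.eg.db.
run_go <- function(genes, key_type = "SYMBOL") {
  clusterProfiler::enrichGO(gene = genes,
                            OrgDb = org.Dr.eg.db::org.Dr.eg.db,
                            keyType = key_type,
                            ont = "BP",
                            pAdjustMethod = "BH",
                            pvalueCutoff = 0.05)
}

# With real zebrafish IDs (the toy names above will not map to anything):
# go_results <- lapply(genes_by_cluster, run_go)
# head(as.data.frame(go_results[[1]]))
```

If your IDs are Ensembl or Entrez rather than symbols, keyType would need to change accordingly. It may also be worth passing all expressed genes as the universe argument of enrichGO() rather than relying on the default background.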

Another related post: C: Compare sets of GO enrichments

5.9 years ago
caggtaagtat ★ 1.9k

For gene set enrichment analysis (GSEA), I use the R package "EGSEA". It combines 12 prominent GSEA algorithms available for R and produces a consensus ranking of biologically relevant results.

The results can then be passed to REVIGO, for example, to visualize changes in GO families.


Hello, does this work for zebrafish?
