Clustering using GO analysis (DAVID): heatmap between different groups
4.9 years ago
khoang3 • 0

Hello,

I have an Excel/txt file with the genes listed in column 1, the fold change of these genes in one group in column 2, and the fold change in another group in column 3. I performed a DAVID analysis to get the GO terms.

What I am trying to do is to see if I can group the genes into the different GO terms with their associated fold change between column 2 and 3.
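
To make the grouping described above concrete, here is a minimal base-R sketch. It assumes the DAVID chart export has a `Term` column and a `Genes` column holding comma-separated gene identifiers, and that those identifiers match the ones in the fold-change table. The column names `gene`, `fc_group1`, and `fc_group2`, the gene names, and the term memberships are all made up for illustration; in practice both tables would be read from files with `read.table()`.

```r
# Toy stand-in for the fold-change table (column names are assumptions).
fc <- data.frame(
  gene      = c("geneA", "geneB", "geneC", "geneD"),
  fc_group1 = c(1.8, -0.6, 2.4, 0.3),
  fc_group2 = c(0.9, -1.2, 2.1, -0.4),
  stringsAsFactors = FALSE
)

# Toy stand-in for a DAVID chart export: one term per row, with its member
# genes as a comma-separated string (memberships here are invented).
david <- data.frame(
  Term  = c("GO:0006915~apoptotic process", "GO:0006955~immune response"),
  Genes = c("geneA, geneB, geneC", "geneC, geneD"),
  stringsAsFactors = FALSE
)

# Split each comma-separated gene string into a character vector.
genes_per_term <- strsplit(david$Genes, ",\\s*")

# Expand to long format: one row per (term, gene) pair.
long <- data.frame(
  term = rep(david$Term, lengths(genes_per_term)),
  gene = unlist(genes_per_term),
  stringsAsFactors = FALSE
)

# Attach both fold-change columns to every (term, gene) pair, then sort so
# that genes belonging to the same term sit next to each other.
grouped <- merge(long, fc, by = "gene")
grouped <- grouped[order(grouped$term, grouped$gene), ]
rownames(grouped) <- NULL

print(grouped)
```

A gene that belongs to several terms (here `geneC`) appears once per term, so overlapping terms produce repeated rows. `split(grouped, grouped$term)` would then give one small table per GO term.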

I've been looking online and haven't been able to find a solution that I can really comprehend.

I tried using Galaxy to input the list myself, but was wondering if there is a way to group the genes into GO terms before I generate the heatmap.

Sorry, I am new to this. Thanks for any suggestions.

RNA-Seq Heatmap DAVID gene • 2.3k views
4.9 years ago

Perhaps this will help, if you can follow my code: Clustering of DAVID gene enrichment results from gene expression studies

Kevin


Hello Kevin,

Thanks for the reply.

I was wondering how I would go about changing this part of your code:

#Create heatmap annotations
dfMinusLog10FDRGenes <- data.frame(-log10(topTable[which(topTable[,1] %in% rownames(annGSEA)),"padj"]))
dfMinusLog10FDRGenes[dfMinusLog10FDRGenes=="Inf"] <- 0
dfFoldChangeGenes <- data.frame(topTable[which(topTable[,1] %in% rownames(annGSEA)),"log2FoldChange"])
dfGeneAnno <- data.frame(dfMinusLog10FDRGenes, dfFoldChangeGenes)
colnames(dfGeneAnno) <- c("DEG\nsignificance\nscore", "Regulation")
dfGeneAnno[,2] <- ifelse(dfGeneAnno[,2]>0, "Up-regulated", "Down-regulated")
colours <- list("Regulation"=c("Up-regulated"="royalblue", "Down-regulated"="yellow"))
haGenes <- rowAnnotation(df=dfGeneAnno, col=colours, width=unit(1,"cm"))

dfMinusLog10BenjaminiTerms <- data.frame(-log10(read.table(DAVIDfile, sep="\t", header=TRUE)[which(read.table(DAVIDfile, sep="\t", header=TRUE)$Term %in% colnames(annGSEA)),"Benjamini"]))
colnames(dfMinusLog10BenjaminiTerms) <- "GO Term\nsignificance\nscore"
haTerms <- HeatmapAnnotation(df=dfMinusLog10BenjaminiTerms,
                             colname=anno_text(colnames(annGSEA), rot=40, just="right", offset=unit(1,"npc")-unit(2,"mm"), gp=gpar(fontsize=termLab)),
                             annotation_height=unit.c(unit(1, "cm"), unit(8, "cm")))

pdf("GO.pdf", width=7, height=12)
hmapGSEA <- Heatmap(annGSEA,

                name="My enrichment",

                split=dfGeneAnno[,2],

                col=c("0"="white", "1"="forestgreen"),

                rect_gp=gpar(col="grey85"),

                cluster_rows=T,
                show_row_dend=T,
                row_title="Statistically-significant genes",
                row_title_side="left",
                row_title_gp=gpar(fontsize=12, fontface="bold"),
                row_title_rot=0,
                show_row_names=TRUE,
                row_names_gp=gpar(fontsize=geneLab, fontface="bold"),
                row_names_side="left",
                row_names_max_width=unit(15, "cm"),
                row_dend_width=unit(10,"mm"),

                cluster_columns=T,
                show_column_dend=T,
                column_title="Enriched terms",
                column_title_side="top",
                column_title_gp=gpar(fontsize=12, fontface="bold"),
                column_title_rot=0,
                show_column_names=FALSE,
                #column_names_gp=gpar(fontsize=termLab, fontface="bold"),
                #column_names_max_height=unit(15, "cm"),

                show_heatmap_legend=FALSE,

                #width=unit(12.5, "cm"),

                clustering_distance_columns="euclidean",
                clustering_method_columns="ward.D2",
                clustering_distance_rows="euclidean",
                clustering_method_rows="ward.D2",

                bottom_annotation=haTerms)

draw(hmapGSEA + haGenes, heatmap_legend_side="right", annotation_legend_side="right")
dev.off()

so that it just generates a heatmap based on fold change between two different experimental conditions. I've just started to learn R. My topTable has 3 columns (column 1 is the list of genes, column 2 is the fold change in one condition, and column 3 is the fold change in another experimental condition). I have about 51 GO terms with many overlapping genes that I want to visualize on a heatmap across these two conditions.

Thanks for the help and any suggestions!


I see. This function is not really meant for that type of data; it is intended for data coming straight from the DAVID website. Essentially, the input to the Heatmap() function is a matrix of 1s and 0s (1 = gene present in GO term; 0 = gene not present in GO term).
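
As a rough illustration of that 1/0 input, the sketch below builds a gene-by-term membership matrix in base R from a toy DAVID-style table. The `Term` and `Genes` column names follow the DAVID chart export; the gene names, term memberships, and the fold-change columns `fc_group1` and `fc_group2` are assumptions made up for the example, not part of the code quoted above.

```r
# Toy DAVID-style table: one term per row, member genes comma-separated
# (memberships are invented for illustration).
david <- data.frame(
  Term  = c("GO:0006915~apoptotic process", "GO:0006955~immune response"),
  Genes = c("geneA, geneB, geneC", "geneC, geneD"),
  stringsAsFactors = FALSE
)

# Toy fold-change table with assumed column names.
fc <- data.frame(
  gene      = c("geneA", "geneB", "geneC", "geneD"),
  fc_group1 = c(1.8, -0.6, 2.4, 0.3),
  fc_group2 = c(0.9, -1.2, 2.1, -0.4),
  stringsAsFactors = FALSE
)

# One character vector of genes per term, named by the term.
genes_per_term <- strsplit(david$Genes, ",\\s*")
names(genes_per_term) <- david$Term

# All distinct genes across terms become the rows of the matrix.
all_genes <- sort(unique(unlist(genes_per_term)))

# For each term, flag which genes are members: 1 = present, 0 = absent.
membership <- vapply(
  genes_per_term,
  function(g) as.integer(all_genes %in% g),
  FUN.VALUE = integer(length(all_genes))
)
rownames(membership) <- all_genes

# Fold changes for the same genes, in the same row order, as a numeric matrix.
fc_mat <- as.matrix(fc[match(all_genes, fc$gene), c("fc_group1", "fc_group2")])
rownames(fc_mat) <- all_genes

print(membership)
print(fc_mat)
```

`membership` has the shape that `annGSEA` takes in the quoted code (genes in rows, terms in columns). Because `fc_mat` shares its row order, one option with ComplexHeatmap would be to draw the two matrices side by side, for example `Heatmap(fc_mat, name="Fold change") + Heatmap(membership, col=c("0"="white", "1"="forestgreen"))`, so that each gene's fold changes line up with its term memberships. That combination is a suggestion rather than something tested against the original data.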
