Create a new dataframe that has '1' for when the gene is part of a term, and '0' when not

Question

Annotating DEG Heatmap by Go terms

6.1 years ago

lawarde.ankita1 ▴ 70

Hello,

I have a heatmap of DEGs from microarray data, built with ComplexHeatmap. I have already annotated the rows (genes), and I want to add one more row annotation to the existing heatmap based on GO terms.

For those DEGs I have GO terms from the DAVID tool.

I have looked at the GOexpress package. As I understand it, GOexpress performs a supervised analysis and retrieves the GO terms itself through the GO_analyse function (please correct me if I have got this wrong).

But if I already have the GO terms from DAVID, how can I use those terms to plot row annotations on the heatmap? Can I use ComplexHeatmap for this? If yes, could somebody explain it with sample code?

In short, I want to add the GO terms obtained from DAVID as one more annotation on the same DEG heatmap. Is there any way to add such annotations?
What R packages are there for this purpose?

Any suggestions would be very helpful. Thank you all in advance.

Heatmap Annotation GO terms DEG • 6.5k views

ADD COMMENT • link updated 6.1 years ago by Kevin Blighe 87k • written 6.1 years ago by lawarde.ankita1 ▴ 70

score 2 · Answer 1 · 2018-02-27

6.1 years ago

Kevin Blighe 87k

You may take inspiration from the tutorial that I posted here: Clustering of DAVID gene enrichment results from gene expression studies (posted in response to a question from another user: Ideas to plot several enrichment results (BP) in one).

In that, I create a heatmap based on DAVID GO terms and then add 2 extra row annotations and 1 extra column annotation.
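A heatmap of DAVID GO terms needs a gene-by-term membership matrix (1 when the gene belongs to a term, 0 when not). The following is only a minimal sketch of one way to build it in base R, not the tutorial's exact code. The `david` data frame, its column names `Term` and `Genes`, and the gene-to-term assignments are all invented for illustration; a real DAVID export may use different column names or identifiers.

```r
# Hypothetical stand-in for a DAVID functional annotation chart.
# The column names 'Term' and 'Genes' and the gene assignments are assumptions.
david <- data.frame(
  Term  = c("GO:0006955~immune response", "GO:0007155~cell adhesion"),
  Genes = c("MARCO, FCN2, ITLN1", "KCNAB1, FCN2"),
  stringsAsFactors = FALSE
)

# Gene symbols used as heatmap rows (symbols taken from the thread's example)
genes <- c("ITLN1", "MARCO", "KCNAB1", "FCN2", "NDST3", "OIT3")

# Split each comma-separated gene list into a trimmed character vector
term_genes <- lapply(strsplit(david$Genes, ","), trimws)

# One column per term: 1 if the gene belongs to the term, 0 otherwise
membership <- sapply(term_genes, function(g) as.integer(genes %in% g))
dimnames(membership) <- list(genes, david$Term)

membership
```

The resulting matrix has genes on the rows and terms on the columns, so it can be passed to Heatmap() directly. The gene symbols in the DAVID table must be spelled exactly like the heatmap row names for `%in%` to match them.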

-----------------------------

To then add this alongside a heatmap for DEGs, just create both heatmaps separately and merge them with the '+' operator:

combinedHeatmap <- Heatmap(...) + Heatmap(...)
draw(combinedHeatmap, heatmap_legend_side="left", annotation_legend_side="left", newpage=FALSE)


Kevin

ADD COMMENT • link 6.1 years ago by Kevin Blighe 87k

Hello Kevin Blighe,

Thank you so much for the solution.

ADD REPLY • link 6.1 years ago by lawarde.ankita1 ▴ 70

Hello Kevin Blighe,

I get an error when following your tutorial. I already have the DEGs in a matrix, so when I run the annGSEA step for matching the gene patterns, it throws an error. Below are my DEG set, with probe set IDs as row names, and the error:

            GSM1244729_BH2010140_5_NT105.CEL GSM1244733_BH2010140_5_OT105.CEL
223597_at                           4.541022                         5.019449
205819_at                           6.091944                         7.889634
210078_s_at                         3.422433                         4.007747
207804_s_at                         3.334840                         3.534835
220429_at                           2.322924                         2.429732
230478_at                           3.384412                         5.656653
            GSM1244737_BH2010140_5_QT105.CEL GSM1244741_BH2010140_5_RT105_2.CEL
223597_at                           3.937667                           3.707921
205819_at                           5.393037                           5.851137
210078_s_at                         3.274880                           3.352664
207804_s_at                         3.012993                           3.083688
220429_at                           2.157992                           2.377862
230478_at                           2.601378                           3.540798

Then I changed the row names to gene symbols using this:

row.names(DEG_set)<- DEG_without_na$Symbol

colnames(DEG_set)<- c(rep("cancer",24),rep("control",24))

head(DEG_set1)
         cancer   cancer   cancer   cancer   cancer   cancer   cancer   cancer   cancer
ITLN1  4.541022 5.019449 3.937667 3.707921 4.602628 4.648486 4.126351 4.675421 5.008449
MARCO  6.091944 7.889634 5.393037 5.851137 6.855235 7.243431 6.820261 8.558179 6.528510
KCNAB1 3.422433 4.007747 3.274880 3.352664 2.498456 3.217542 3.670437 2.986598 3.075708
FCN2   3.334840 3.534835 3.012993 3.083688 4.484046 4.877473 4.450210 3.681277 2.732536
NDST3  2.322924 2.429732 2.157992 2.377862 2.557548 2.452419 2.704905 2.600509 2.846537
OIT3   3.384412 5.656653 2.601378 3.540798 3.518944 3.759138 4.855136 4.191087 3.056716

But when I run this step from the tutorial,

#Create a new dataframe that has '1' for when the gene is part of a term, and '0' when not
annGSEA <- data.frame(row.names=rownames(topMatrix))

on my own matrix, I get the following error:

 annGSEA <- data.frame(row.names=rownames(DEG_set1))

Error in data.frame(row.names = rownames(DEG_set1)) : duplicate row.names: KCNAB1, LYVE1, CLEC4M, GPM6A, LIFR, FCN2, CDKN3, P2RY12, CD55, NAPSB, CXCL12, RRM2, TGFBR3, TOP2A, FST, RBMS3, RARRES1, PTPRB, COL27A1, KLB, IL17RB, RIPOR2, SULT1C2, TMEM97, CAVIN2, APOM, DNASE1L3, ERBB3, A1CF, JAM2, AJUBA, PLOD2, AK4, ENAH, SDC2, NR1H4, MGP, AKR1C2, GPAM, AKR1C1, PDK4

I know I don't have to match the row names, as I only have my genes of interest, but in the next command you assign the column names of annGSEA from the DAVID object:

colnames(annGSEA) <- DAVID[,2]

So I'm stuck at this step.

How can I make a data frame like annGSEA with my genes from DEG_set?

Thank you very much.

ADD REPLY • link 6.1 years ago by lawarde.ankita1 ▴ 70

I'm sorry, I pasted the command output in the wrong way. Please refer to the following outputs:

 head(DEG_set)



   GSM1244729_BH2010140_5_NT105.CEL GSM1244733_BH2010140_5_OT105.CEL

223597_at                           4.541022                         5.019449
205819_at                           6.091944                         7.889634
210078_s_at                         3.422433                         4.007747
207804_s_at                         3.334840                         3.534835
220429_at                           2.322924                         2.429732
230478_at                           3.384412                         5.656653

DEG set after setting the row names to gene symbol,

DEG_set1<- DEG_set

head(DEG_set1)

row.names(DEG_set1)<- DEG_without_na$Symbol

head(DEG_set1)

colnames(DEG_set1)<- c(rep("cancer",24),rep("control",24))

head(DEG_set1)

         cancer   cancer   cancer   cancer   cancer   cancer   cancer   cancer   cancer
ITLN1  4.541022 5.019449 3.937667 3.707921 4.602628 4.648486 4.126351 4.675421 5.008449
MARCO  6.091944 7.889634 5.393037 5.851137 6.855235 7.243431 6.820261 8.558179 6.528510
KCNAB1 3.422433 4.007747 3.274880 3.352664 2.498456 3.217542 3.670437 2.986598 3.075708

#Create a new dataframe that has '1' for when the gene is part of a term, and '0' when not
annGSEA <- data.frame(row.names=rownames(topMatrix))

Error in data.frame(row.names = rownames(DEG_set1)) : 
duplicate row.names: KCNAB1, LYVE1, CLEC4M, GPM6A, LIFR, FCN2, CDKN3, P2RY12, CD55, NAPSB, CXCL12,

I have checked that there are no duplicate names.

I'm very sorry for the inconvenience. I hope this helps in understanding the question.

Thank you so much.

ADD REPLY • link 6.1 years ago by lawarde.ankita1 ▴ 70

Hello Kevin Blighe, following your solution above, I could plot the heatmap, but the new heatmap does not match the main DEG heatmap: here the genes are on the rows and the GO terms are the columns, which is not what I want.

I want to add GO annotations to my existing DEG heatmap, which I have attached here (image: Heatmap).

Is there any way to add the GO annotations from DAVID to this existing heatmap using ComplexHeatmap or any other R package?

Thank you.

ADD REPLY • link 6.1 years ago by lawarde.ankita1 ▴ 70

Sorry, I now see this other message. Since your 2 heatmaps are more different than I thought, you should plot them not using the '+' operator. Instead, do something like this as independent heatmap objects (it took me a long time to figure this out when I first had to do it):

library(ComplexHeatmap)
library(grid)

heatmapDEG <- Heatmap(...)
heatmapDAVID <- Heatmap(...)

#Create a new grid viewport with rows=1 and columns=2 (one cell per heatmap)
grid.newpage()
pushViewport(viewport(layout=grid.layout(nrow=1, ncol=2)))

#Draw the DEG heatmap in the left cell
pushViewport(viewport(layout.pos.row=1, layout.pos.col=1))
draw(heatmapDEG, heatmap_legend_side="left", annotation_legend_side="left", newpage=FALSE)
upViewport()

#Draw the DAVID heatmap in the right cell
pushViewport(viewport(layout.pos.row=1, layout.pos.col=2))
draw(heatmapDAVID, heatmap_legend_side="left", annotation_legend_side="left", newpage=FALSE)
upViewport()

#Return to the top-level viewport
upViewport()
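
If the goal is a GO strip next to the gene rows of the existing DEG heatmap rather than a second heatmap, a different technique may fit better: a row annotation. The following is a rough sketch under assumptions, not the tutorial's method. The matrix values are random, the `immune_genes` set and the `immune_response` column name are invented, and the block only draws when ComplexHeatmap is installed.

```r
# Toy expression matrix: gene symbols from the thread, values random
set.seed(1)
genes <- c("ITLN1", "MARCO", "KCNAB1", "FCN2", "NDST3", "OIT3")
mat <- matrix(rnorm(6 * 4), nrow = 6,
              dimnames = list(genes, paste0("sample", 1:4)))

# Hypothetical gene set for one DAVID term (membership invented for illustration)
immune_genes <- c("MARCO", "FCN2", "ITLN1")

# One annotation column per term, in the same row order as the matrix
ann_df <- data.frame(
  immune_response = ifelse(rownames(mat) %in% immune_genes, "yes", "no"),
  row.names = rownames(mat),
  stringsAsFactors = FALSE
)

if (requireNamespace("ComplexHeatmap", quietly = TRUE)) {
  library(ComplexHeatmap)
  go_annotation <- rowAnnotation(
    df  = ann_df,
    col = list(immune_response = c(yes = "black", no = "grey90"))
  )
  # '+' appends the annotation as an extra strip beside the heatmap rows
  draw(Heatmap(mat, name = "expression") + go_annotation)
}
```

Adding further columns to `ann_df` (one per GO term) gives one strip per term. The annotation rows must be in the same order as the rows of the matrix passed to Heatmap(), since the strip is reordered together with the main heatmap.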

ADD REPLY • link 6.1 years ago by Kevin Blighe 87k

~~Hi lawarde, are you sure?~~

What is the output of:

length(rownames(topMatrix))

and

length(unique(rownames(topMatrix)))

?
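
If the two numbers differ, several probe sets map to the same gene symbol. A matrix tolerates duplicated row names, but data.frame() requires unique ones, which would explain the error. Below is a small sketch of the check and two possible ways around it. The toy values, the `expr` and `symbols` names, and the probe ID "hypothetical_at" are made up for illustration, and picking the probe with the highest mean is only one of several reasonable rules.

```r
# Toy data mimicking several probe sets mapping to the same gene symbol
expr <- matrix(
  c(4.5, 5.0,
    3.4, 4.0,
    3.3, 3.5,
    2.5, 3.2),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("223597_at", "210078_s_at", "207804_s_at", "hypothetical_at"),
                  c("s1", "s2"))
)
symbols <- c("ITLN1", "KCNAB1", "FCN2", "KCNAB1")  # KCNAB1 appears twice

# The two checks: differing lengths mean duplicates exist
length(symbols)
length(unique(symbols))
unique(symbols[duplicated(symbols)])  # which symbols are duplicated

# Option 1: keep every probe and make the names unique (KCNAB1, KCNAB1.1)
expr_unique <- expr
rownames(expr_unique) <- make.unique(symbols)

# Option 2: keep one probe per gene, here the one with the highest mean
ord  <- order(rowMeans(expr), decreasing = TRUE)
keep <- sort(ord[!duplicated(symbols[ord])])
expr_collapsed <- expr[keep, , drop = FALSE]
rownames(expr_collapsed) <- symbols[keep]

# Either result now works as row names for a data frame
annGSEA <- data.frame(row.names = rownames(expr_collapsed))
```

Option 1 keeps all rows of the heatmap but shows suffixed names; Option 2 gives one row per gene, which matches gene-level DAVID terms more directly.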

ADD REPLY • link 6.1 years ago by Kevin Blighe 87k