Plotting GO annotation results
8.1 years ago
seta ★ 1.9k

Hi all friends,

I used TRAPID for GO annotation of a de novo assembled plant transcriptome. The output is a large text file (about 16 MB) that looks like this:

ab29703	GO:0019219
ab29703	GO:0009059
ab29707	GO:0044446
ab29707	GO:0006810
ab29707	GO:0044424
Contig6742	GO:0044260
Contig6742	GO:0003824
Contig6742	GO:0016772

I plan to use WEGO to categorize and plot the GO annotation results, but WEGO expects input with one gene per line, followed by all of its GO terms:

ab29703	GO:0019219	GO:0009059
ab29707	GO:0044446	GO:0006810	GO:0044424
Contig6742	GO:0044260	GO:0003824	GO:0016772

Could you please help me convert the original output into the format accepted by WEGO? Any comments or suggestions about other tools for plant GO annotation and plotting would be highly appreciated.

GO annotation transcriptome Plotting • 3.8k views
8.1 years ago
Benn 8.3k

Hi Seta,

I had a somewhat similar question on the Bioconductor support site: https://support.bioconductor.org/p/77134/

You can use dplyr in R to get what you want.

Good luck!

Ben


Thanks for your reply. However, my question differs from yours: I would like to have all GO terms for one gene on each row of a text file. Please let me know if you have any suggestions.

Best


Yes, but the principle is the same, right? Just swap the roles of the genes and the GO terms.


Sorry, I am not very familiar with R. Could you please post the appropriate command for using "dplyr" to get all GO terms for one gene on each row of a text file?

Thanks for your help,


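Assuming you have a tab-delimited text file named "exampl.txt":

library(dplyr)

# Read the two-column file (gene ID, GO term); the columns are named V1 and V2
df <- read.table("exampl.txt", header = FALSE, stringsAsFactors = FALSE)

# Collapse all GO terms of each gene into one tab-separated string
reshaped <- group_by(df, V1) %>% summarise(GO = paste(V2, collapse = "\t"))

# Write one gene per line, without quotes, row names, or a header
write.table(reshaped, "reshaped.txt", sep = "\t", quote = FALSE, row.names = FALSE, col.names = FALSE)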
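
For anyone without dplyr installed, the same reshaping can be sketched in base R. This is only an illustration under a few assumptions: the toy data frame below copies a few rows from the question instead of reading the real file, and the output file name is a placeholder.

```r
# Toy input copied from the question (gene ID in V1, GO term in V2)
df <- data.frame(
  V1 = c("ab29703", "ab29703", "ab29707", "ab29707", "ab29707"),
  V2 = c("GO:0019219", "GO:0009059", "GO:0044446", "GO:0006810", "GO:0044424"),
  stringsAsFactors = FALSE
)

# Split the GO terms by gene, keeping genes in order of first appearance
go_by_gene <- split(df$V2, factor(df$V1, levels = unique(df$V1)))

# Build one tab-separated line per gene: the ID followed by all of its GO terms
wego_lines <- vapply(
  names(go_by_gene),
  function(gene) paste(c(gene, go_by_gene[[gene]]), collapse = "\t"),
  character(1)
)

# Write the lines to a file (placeholder name, written to the temp directory)
out_file <- file.path(tempdir(), "reshaped_base.txt")
writeLines(wego_lines, out_file)
```

Unlike the grouped summary, this version keeps the genes in their original order, which may or may not matter for the downstream tool.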

Thank you very much, b.nota. I will try it.


I tried your suggested command and it worked well.

Thanks again.

8.1 years ago
ivivek_ngs ★ 5.2k

Are you restricted to using WEGO? You can also do it yourself in R, as @b.nota said. In that case you will have to write a small script that counts the number of GO terms each gene has, prints that matrix, and then plots the counts for those genes, either as a barplot or even as a heatmap.
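
A minimal sketch of that counting-and-plotting idea in base R, assuming the two-column gene/GO table from the question. The toy data frame is copied from the example rows, and the PDF file name is a placeholder.

```r
# Toy input copied from the question (gene ID in V1, GO term in V2)
df <- data.frame(
  V1 = c("ab29703", "ab29703", "ab29707", "ab29707", "ab29707",
         "Contig6742", "Contig6742", "Contig6742"),
  V2 = c("GO:0019219", "GO:0009059", "GO:0044446", "GO:0006810", "GO:0044424",
         "GO:0044260", "GO:0003824", "GO:0016772"),
  stringsAsFactors = FALSE
)

# Count how many GO terms are attached to each gene (named integer vector)
go_counts <- vapply(split(df$V2, df$V1), length, integer(1))

# Put the genes with the most GO terms first
go_counts <- sort(go_counts, decreasing = TRUE)

# Draw a barplot of the counts into a PDF (placeholder file name)
plot_file <- file.path(tempdir(), "go_terms_per_gene.pdf")
pdf(plot_file)
barplot(go_counts, las = 2, ylab = "Number of GO terms", main = "GO terms per gene")
dev.off()
```

With thousands of genes, one bar per gene becomes unreadable; a histogram of the counts, e.g. hist(go_counts), is probably a more useful summary in that case.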


Thanks, vchris_ngs. No, I'm not restricted to using WEGO. Regarding your comment, I also have another text file with the following information in every row:

go  description     num_transcripts transcripts

GO:0000033  alpha-1,3-mannosyltransferase activity  3   Contig12621 Contig15208 ab76639

This file gives the number of transcripts annotated with each GO term. However, I don't know how to use it for plotting.
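
One possible way to plot such a per-GO-term summary is a horizontal barplot of the most frequent terms. The sketch below is base R and rests on several assumptions: the column names follow the header shown above, only the GO:0000033 row comes from this thread, the other two rows and their counts are made-up placeholders, and the PDF file name is hypothetical.

```r
# Toy summary table; only the first row is taken from the thread,
# the other rows and their counts are illustrative placeholders
go_summary <- data.frame(
  go = c("GO:0000033", "GO:0003824", "GO:0006810"),
  description = c("alpha-1,3-mannosyltransferase activity",
                  "catalytic activity", "transport"),
  num_transcripts = c(3, 120, 45),
  stringsAsFactors = FALSE
)
# For the real file, something like read.delim("your_file.txt", quote = "")
# may work if it is tab-delimited with a header line (an assumption)

# Keep the top_n GO terms with the most transcripts
top_n <- 20
ord <- order(go_summary$num_transcripts, decreasing = TRUE)
top_terms <- head(go_summary[ord, ], top_n)

# Horizontal barplot, largest category at the top (placeholder file name)
plot_file <- file.path(tempdir(), "transcripts_per_go_term.pdf")
pdf(plot_file, width = 10, height = 6)
par(mar = c(5, 18, 4, 2))  # wide left margin for the long GO descriptions
barplot(rev(top_terms$num_transcripts),
        names.arg = rev(top_terms$description),
        horiz = TRUE, las = 1,
        xlab = "Number of transcripts",
        main = "Transcripts per GO term")
dev.off()
```

Limiting the plot to the top terms keeps the labels legible; top_n is an arbitrary choice and can be adjusted.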


Now you have it: for each gene, the corresponding GO terms, as in your example.


No, this file lists multiple genes for a given GO term, whereas I want all corresponding GO terms for each gene on one row of a text file.


That's what I meant: b.nota has already shown how to do it. Sorry, I did not mention his name in my comment.
