Comparing three genomes?
5.5 years ago
kxd419 • 10
United Kingdom

Hello,

I have sequenced a bacterial genome. I want to draw a Venn diagram comparing it to two other, already sequenced genomes. Something like this:
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2953697/figure/f2/

However, I am unsure how to go about it. Should I use my genome as a BLAST database and then BLAST the two known genomes against it?
If so, do I then take the genes present and absent in both known genomes and BLAST them against each other?

Kind regards,

KXD

17 months ago
5heikki 8.4k
Finland

One option would be to query the proteins of the three genomes against e.g. Pfam (with HMMER) and then count the features shared between the proteomes. Also, the paper's text or its Materials and Methods may say what they actually did there.
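As a rough sketch of that idea, assuming each proteome was searched with hmmscan and the results saved with --tblout (where the first whitespace-separated field of each row is the Pfam family name), the shared families could be counted with plain set operations. The function name and the example rows below are made up for illustration.

```python
def families_from_tblout(lines):
    """Collect the set of Pfam family names hit in one proteome.

    Assumes hmmscan --tblout rows: the first field is the target
    (Pfam family) name; lines starting with '#' are comments.
    """
    families = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        families.add(line.split()[0])
    return families


# Hypothetical, heavily truncated rows: family, accession, protein, ...
genome_a = ["# target name accession query name", "ABC_tran PF00005 a_0001 -", "HATPase_c PF02518 a_0002 -"]
genome_b = ["ABC_tran PF00005 b_0007 -", "Response_reg PF00072 b_0008 -"]
genome_c = ["ABC_tran PF00005 c_0003 -", "HATPase_c PF02518 c_0004 -"]

fam_a = families_from_tblout(genome_a)
fam_b = families_from_tblout(genome_b)
fam_c = families_from_tblout(genome_c)

shared_all = fam_a & fam_b & fam_c      # centre of the Venn diagram
only_a_and_c = (fam_a & fam_c) - fam_b  # shared by A and C but absent from B
print(len(shared_all), len(only_a_and_c))
```

Note that this counts shared domain families rather than shared genes, so it may not reproduce the numbers in a gene-based figure.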

16 months ago
HG ♦ 1.1k
Germany

A simple way:
1. Annotate the genomes.
2. Cluster the genes (CD-HIT, OrthoMCL, ...).
3. Find the genes shared among all the genomes.
4. Draw a Venn diagram (perhaps using http://bioinfogp.cnb.csic.es/tools/venny/).
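Steps 3 and 4 could be sketched roughly as follows, assuming the clustering step has already been reduced to a mapping from cluster ID to the set of genomes that contribute at least one gene to that cluster. The cluster IDs and genome labels A, B and C are hypothetical.

```python
from collections import Counter


def venn_region_counts(clusters):
    """Count clusters per Venn region.

    `clusters` maps a cluster ID to the set of genome labels present in it.
    Each region is keyed by the sorted tuple of those labels, so
    ("A", "B") means "found in A and B but not in C".
    """
    return Counter(tuple(sorted(genomes)) for genomes in clusters.values())


# Hypothetical clustering result for three genomes.
clusters = {
    "c1": {"A", "B", "C"},
    "c2": {"A", "B"},
    "c3": {"A"},
    "c4": {"A", "B", "C"},
    "c5": {"C"},
}

counts = venn_region_counts(clusters)
for region, n in sorted(counts.items()):
    print("+".join(region), n)
```

The seven possible region counts are the numbers to type into a Venn drawing tool. This counts clusters (gene families), so paralogs within one genome collapse into a single entry.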

5.5 years ago
kxd419 • 10
United Kingdom

Hi HG,

Thanks for your reply.

The genome is annotated; however, all three genomes use different gene names.

Can you explain step two in more detail?

16 months ago
HG ♦ 1.1k
Germany

Extract all the genes from each file > compare all vs all (with your desired cutoff value, using CD-HIT) > you will get a list of unique sequences and shared sequences > count them and plot

http://weizhongli-lab.org/cd-hit/
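To get from the CD-HIT output to "unique" and "shared" sequences, the .clstr file has to be mapped back to genomes. Here is a minimal sketch, assuming the sequence IDs were renamed to the form genome|gene before the three FASTA files were pooled and clustered (that naming scheme, the separator, and the example IDs are assumptions, not something CD-HIT does for you).

```python
import re

# A member line looks like: "0<TAB>310aa, >A|gene_0001... *"
MEMBER = re.compile(r">(.+?)\.\.\.")


def genomes_per_cluster(clstr_lines, sep="|"):
    """Map each CD-HIT cluster to the set of genome labels it contains.

    Assumes sequence IDs of the form "<genome><sep><gene>".
    """
    clusters = {}
    current = None
    for line in clstr_lines:
        if line.startswith(">Cluster"):
            current = line[1:].strip()
            clusters[current] = set()
        elif current is not None:
            match = MEMBER.search(line)
            if match:
                clusters[current].add(match.group(1).split(sep, 1)[0])
    return clusters


# Hypothetical .clstr content for three pooled genomes.
example = """>Cluster 0
0\t310aa, >A|gene_0001... *
1\t310aa, >B|gene_0417... at 98.00%
2\t305aa, >C|gene_0033... at 95.00%
>Cluster 1
0\t120aa, >A|gene_0002... *
""".splitlines()

clusters = genomes_per_cluster(example)
print(clusters)
```

A cluster containing all three labels is shared by all genomes, while a cluster with a single label is unique to that genome. CD-HIT may truncate long names in the .clstr file, so it is worth checking that the genome prefix survives.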


I don't see this for a set of three transcriptomes; it only presents an option for comparing two nucleotide databases. Can you explain how you do this if you have three databases? Thank you.


We are talking about a "bacterial genome" here, not transcriptomes!


And what difference does that make? It still consists of FASTA files with sequences, right? The question is how to do it for three sets of 'genes' (if you like) instead of two. The problem is that what you propose doesn't work when the gene names are not the same, or when a gene has expanded in copy number in one genotype versus another. Also, you need to use reciprocal best BLAST hits, not just all-vs-all.
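For what it's worth, the reciprocal best hit step can be sketched as below, assuming BLAST was run in both directions with the default 12-column tabular output (-outfmt 6), where column 1 is the query, column 2 the subject and column 12 the bit score. The function names and example rows are made up for illustration.

```python
def best_hits(rows):
    """Return {query: subject}, keeping the highest-bitscore hit per query.

    `rows` are tab-separated lines in BLAST tabular format (12 columns).
    """
    best = {}
    for row in rows:
        fields = row.rstrip("\n").split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if backward.get(b) == a}


# Hypothetical hits: genome A proteins vs genome B, and the reverse search.
a_vs_b = [
    "a1\tb1\t98.0\t300\t6\t0\t1\t300\t1\t300\t1e-150\t550",
    "a1\tb2\t60.0\t280\t100\t2\t1\t280\t5\t284\t1e-40\t150",
    "a2\tb2\t70.0\t250\t75\t0\t1\t250\t1\t250\t1e-60\t200",
]
b_vs_a = [
    "b1\ta1\t98.0\t300\t6\t0\t1\t300\t1\t300\t1e-150\t550",
    "b2\ta1\t60.0\t280\t100\t2\t5\t284\t1\t280\t1e-40\t150",
]

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

For three genomes this would be run on each pair (A-B, A-C, B-C), and a gene counted as shared by all three only when the pairwise results agree. Ties and expanded gene families still need a decision of their own; this sketch simply keeps the first top-scoring hit.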
