Help with genome comparison between two strains.
6 months ago
Andrea Laura ▴ 10

Hi, I'm new to bioinformatics and I need to compare two bacterial genomes. They come from different strains of the same species, and both are available in NCBI.

What I need is a genome-wide comparison that finds the homologous genes in both sequences and gives the locus tag of each.

In other words I need a table that says Gene A corresponds to Gene B.

I've searched around and found some tools that might help, for example microbializer (which is not working) and orthologr. With orthologr, a power outage terminated my run after 10 days, and the results I could recover were expressed in terms of proteins, not genes.

Does anyone know of a tool that might help me?

genome-wide-comparison • 913 views

Couldn't you BLAST every gene in one strain against the genome of the other strain and vice versa?


That is basically what a reciprocal best hit (RBH) search is: two genes are paired only when each is the other's top-scoring match. This is the approach suggested here: Help with genome comparison between two strains.
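For anyone who wants to see the RBH logic spelled out, here is a minimal Python sketch. It assumes you have already run BLAST in both directions and saved tabular output (`-outfmt 6`, where column 1 is the query, column 2 the subject and column 12 the bit score). The function names are made up for illustration, and a real analysis would probably also want e-value or coverage cut-offs.

```python
import csv
from io import StringIO


def best_hits(tabular_text):
    """Return {query: (subject, bitscore)}, keeping the top-scoring hit per query.

    Assumes BLAST tabular output (-outfmt 6): 12 tab-separated columns,
    query id first, subject id second, bit score last.
    """
    best = {}
    for row in csv.reader(StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = []
    for gene_a, (gene_b, _score) in best_ab.items():
        # Keep the pair only if B's best hit points back to the same A gene.
        if best_ba.get(gene_b, (None,))[0] == gene_a:
            pairs.append((gene_a, gene_b))
    return sorted(pairs)
```

You would read each BLAST result file into a string and pass the two strings to `reciprocal_best_hits`; the returned pairs are the "Gene A corresponds to Gene B" table. Ties are broken by whichever hit appears first in the file.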

6 months ago
dthorbur ★ 1.9k

What about microbializer isn't working? I see their GitHub repository was updated a year ago. What have you tried to get it working? And what are you running things on? 10 days to compare 2 similar genomes seems excessive.

Other tools you could look at include OrthoMCL and OrthoFinder. They're good for genome-wide analyses as they cluster genes based on sequence similarity. If you want to be more sure that a transcript/protein is the same in both assemblies, you could use an HMM-based tool like HMMER.

If you also download an annotation file (GFF/GTF), then you can see what proteins/transcripts are part of which genes and which chromosome/contig they are located on.
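As a rough illustration of the annotation step, the sketch below pulls a protein-to-locus-tag lookup out of a GFF3 file. It assumes the CDS rows carry both `protein_id` and `locus_tag` in the attributes column, which is how NCBI prokaryotic GFF3 files usually look, but check your own file. The function name is hypothetical.

```python
from collections import defaultdict


def protein_to_locus_tags(gff_lines):
    """Map each protein_id to the locus tag(s) of the CDS features that encode it.

    Assumes GFF3 with 9 tab-separated columns and 'key=value;key=value'
    attributes, with protein_id and locus_tag present on CDS rows.
    """
    mapping = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip headers, directives and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        protein_id = attrs.get("protein_id")
        locus_tag = attrs.get("locus_tag")
        # One protein accession can appear at several loci, so keep a list.
        if protein_id and locus_tag and locus_tag not in mapping[protein_id]:
            mapping[protein_id].append(locus_tag)
    return dict(mapping)
```

Passing an open file handle works, e.g. `protein_to_locus_tags(open("strainA.gff"))`, where `strainA.gff` is a placeholder file name. The values are lists because the same protein accession can be annotated at more than one locus.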


As far as I could see, microbializer is a web tool, and that isn't working. I tried to find out how to run it locally but couldn't. I ran the other program on my personal laptop; that's all I have access to :/

6 months ago
Roberto ▴ 20

You can use the MMseqs2 reciprocal best hit workflow: https://github.com/soedinglab/mmseqs2/wiki#reciprocal-best-hit-using-mmseqs-rbh. For two bacterial genomes you can run it easily on a laptop in a matter of minutes. It's basically like blasting genome vs genome, and it returns the best match for each feature.
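To get from that output to a locus-tag table, something like the Python sketch below might do. It assumes the RBH result is a BLAST-style tab-separated file (protein from genome A in column 1, protein from genome B in column 2, sequence identity in column 3) and that you already have a protein-to-locus-tag dictionary for each strain, for instance built from the GFF files. The file names in the comment and the function name are placeholders.

```python
# Assumed upstream step (placeholder file names), run in a terminal:
#   mmseqs easy-rbh strainA.faa strainB.faa rbh.m8 tmp


def rbh_locus_table(m8_lines, tags_a, tags_b):
    """Convert RBH rows into (locus_a, locus_b, identity) tuples.

    tags_a / tags_b map protein IDs to locus tags for each strain.
    Proteins missing from a mapping keep their protein ID, so no row is lost.
    """
    table = []
    for line in m8_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 3:
            continue  # skip blank or malformed lines
        prot_a, prot_b, identity = cols[0], cols[1], float(cols[2])
        table.append(
            (tags_a.get(prot_a, prot_a), tags_b.get(prot_b, prot_b), identity)
        )
    return table
```

Each tuple is one "Gene A corresponds to Gene B" row, which you can write out with the `csv` module or load into a spreadsheet.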


This is the way. Something like OrthoFinder is overkill for a single pairwise comparison.
