Question: Mapping contigs to a reference genome
1
3 months ago
SpamChop • 10

Hi!

I have ~163,000 contigs (in a single FASTA file) and I want to map them to a reference genome (also a FASTA file). Is there a way to do this? I tried bowtie2 but could not work out its nuances.

Any help appreciated!

Thanks

2

Is this a related reference (i.e. do you expect high sequence similarity)? If so, you could use BLAT or even BLAST+. If the contigs are very large, a program like LASTZ would also be useful.

genomax • 68k • 3 months ago
1

Try minimap2 with one of the assembly presets (-x):

asm5/asm10/asm20: asm-to-ref mapping, for ~0.1/1/5% sequence divergence
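
As a rough sketch of how that might look on the command line, assuming minimap2 is installed and on your PATH. The helper name map_contigs and the file names are placeholders, not anything from this thread; choose the preset that matches the divergence you expect.

```shell
# Hypothetical helper: align contigs to a reference with a minimap2 assembly preset.
# Usage: map_contigs <preset> <ref.fa> <contigs.fa> <out.sam>
map_contigs() {
    local preset="$1"
    local ref="$2"
    local contigs="$3"
    local out="$4"

    # Only accept the three assembly-to-reference presets quoted above.
    case "$preset" in
        asm5|asm10|asm20) ;;
        *) echo "preset must be asm5, asm10 or asm20" >&2; return 1 ;;
    esac

    # -a: write SAM output; -x: select the preset.
    minimap2 -a -x "$preset" "$ref" "$contigs" > "$out"
}
```

For example, `map_contigs asm5 ref.fa contigs.fa aln.sam` would suit contigs from a very closely related genome. Without -a, minimap2 writes PAF instead of SAM, which is often easier to inspect by eye.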

SMK ♦ 1.3k • 3 months ago
2
3 months ago
h.mon 25k
Brazil

Bowtie2 is a short-read mapper and, although it can be used to map long sequences, it probably won't be good at it, especially if the query and reference are even moderately divergent.

In addition to the suggestions by SMK and genomax, there is also LAST.

Another option to consider is QUAST, an assembly evaluation tool. QUAST uses minimap2 to align one (or more) query genomes to a reference genome and, in addition to the alignment, provides a number of metrics comparing the query to the reference.
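
A minimal sketch of a QUAST run, assuming the executable is available as quast.py (depending on how it was installed it may be called quast instead). The wrapper name run_quast and the file and directory names are placeholders.

```shell
# Hypothetical wrapper around QUAST; all names here are placeholders.
# Usage: run_quast <ref.fa> <contigs.fa> <out_dir>
run_quast() {
    local ref="$1"
    local contigs="$2"
    local out_dir="$3"

    # -r: reference genome to compare against; -o: directory for the reports.
    quast.py "$contigs" -r "$ref" -o "$out_dir"
}
```

Calling `run_quast ref.fa contigs.fa quast_out` should leave the reports in quast_out.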

1
3 months ago
Corentin • 290

Hi,

You can use D-Genies, which aligns two FASTA files and produces a dot plot (http://dgenies.toulouse.inra.fr/)

Alternatives are:

If you are more interested in ordering your contigs, you could try "show-tiling" from MUMmer

0
3 months ago
evelyn • 30

You can try nucmer from MUMmer:

nucmer -p output_prefix ref.fa example-contigs.fa
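
nucmer writes its alignments to output_prefix.delta, which is not very readable on its own. As a sketch of a possible follow-up, assuming the MUMmer utilities show-coords and show-tiling are on your PATH (the helper name summarize_delta is made up for illustration):

```shell
# Hypothetical helper: turn a nucmer delta file into readable summaries.
# Usage: summarize_delta <prefix>   (expects <prefix>.delta from "nucmer -p <prefix>")
summarize_delta() {
    local prefix="$1"

    # Alignment coordinates: -r sorts by reference, -c adds percent coverage,
    # -l adds sequence lengths.
    show-coords -r -c -l "$prefix.delta" > "$prefix.coords" &&
    # Tiling path that places the contigs along the reference.
    show-tiling "$prefix.delta" > "$prefix.tiling"
}
```

With the command above, `summarize_delta output_prefix` would produce output_prefix.coords and output_prefix.tiling.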