Question

Hybrid assembly using MinION data + correction with Illumina: which strategy to adopt?

0

6.0 years ago

lagartija ▴ 160

Hi there,

I want to assemble a bacterial genome using Canu for the MinION data and then correct it with Illumina reads (I will also compare the result with a hybrid assembler like SPAdes), but I do not know which strategy to use. As I understand it, there are several ways to proceed:
- Assemble with Canu, then correct the errors with a hybrid polisher like Pilon.
- Correct the MinION reads with the Illumina reads (with NaS, for example), then assemble the genome.

Which method would be best? I would prefer the first method because Canu already includes its own error-correction step, but I am not sure, as the coverage of our Illumina data is better.
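To make the first strategy concrete, here is a minimal sketch that only builds the command lines (it runs nothing). The function name, file names, `genomeSize` value and `pilon.jar` path are placeholders I made up; the flags are the commonly documented ones for Canu 1.x, BWA, samtools and Pilon, so check them against the versions you have installed.

```python
import shlex


def canu_then_pilon_commands(nanopore_fastq, illumina_r1, illumina_r2,
                             genome_size="5m", prefix="asm",
                             pilon_jar="pilon.jar"):
    """Return shell commands for 'assemble with Canu, polish with Pilon'.

    Hypothetical helper: it only builds strings, nothing is executed.
    """
    q = shlex.quote
    canu_dir = prefix + "_canu"
    # Canu writes <prefix>.contigs.fasta inside the directory given with -d.
    draft = f"{canu_dir}/{prefix}.contigs.fasta"
    bam = f"{prefix}.illumina.sorted.bam"
    return [
        # 1. Long-read assembly (-nanopore-raw is the Canu 1.x spelling;
        #    newer releases may expect -nanopore instead).
        f"canu -p {q(prefix)} -d {q(canu_dir)} "
        f"genomeSize={q(genome_size)} -nanopore-raw {q(nanopore_fastq)}",
        # 2. Map the Illumina pairs onto the draft assembly.
        f"bwa index {q(draft)}",
        f"bwa mem {q(draft)} {q(illumina_r1)} {q(illumina_r2)} "
        f"| samtools sort -o {q(bam)} -",
        f"samtools index {q(bam)}",
        # 3. Polish the draft with the mapped short reads.
        f"java -jar {q(pilon_jar)} --genome {q(draft)} "
        f"--frags {q(bam)} --output {q(prefix + '_pilon')}",
    ]


if __name__ == "__main__":
    for cmd in canu_then_pilon_commands("minion.fastq", "R1.fastq.gz", "R2.fastq.gz"):
        print(cmd)
```

Polishing is often repeated for a few rounds (remapping the reads to the polished output each time) until Pilon stops making changes.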

Thank you very much for your help, Cheers, Sofia

assembly • 3.8k views

updated 4.9 years ago by predeus ★ 1.9k • written 6.0 years ago by lagartija ▴ 160

0

Try all of them. In my experience there isn't one method that just works all the time. Sometimes my SPAdes assembly is better than my Canu one. Sometimes my miniasm assembly is better than my SPAdes one, etc.

6.0 years ago by Damian Kao 16k

0

Thanks! You are right, I will try both. I found a good comparison here: https://pdfs.semanticscholar.org/ee7d/0333bf46248497ecdf2d9141903e5501e3d3.pdf

And another question: I don't really get the difference between SPAdes and hybridSPAdes. Isn't hybridSPAdes an algorithm implemented in SPAdes?

Cheers,

6.0 years ago by lagartija ▴ 160

score 2 · Answer 1 · 2018-04-24

2

6.0 years ago

Rayan Chikhi ★ 1.5k

Have a look at https://github.com/rrwick/Unicycler


score 1 · Answer 2 · 2019-05-29

It depends on what kind of bacteria you have. If it's something very non-repetitive (which would be the most common case for bacteria), e.g. E. coli, Salmonella, Staph, etc., Unicycler would give you the fewest misassemblies (and often a completely finished genome), which is the most important thing for downstream analysis. Unicycler relies heavily on Illumina-based contigs.
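For reference, a hybrid Unicycler run takes the paired Illumina reads and the long reads in a single call. The sketch below just builds the argument list; the function name, file names and thread count are placeholders, and the flags (`-1`, `-2`, `-l`, `-o`, `-t`) are the ones from the Unicycler README.

```python
import shlex


def unicycler_hybrid_args(illumina_r1, illumina_r2, long_reads, out_dir, threads=8):
    """Build the argument list for a hybrid Unicycler run.

    Hypothetical helper: it returns the arguments and does not run anything.
    """
    return [
        "unicycler",
        "-1", illumina_r1,   # Illumina forward reads
        "-2", illumina_r2,   # Illumina reverse reads
        "-l", long_reads,    # MinION (or other long) reads
        "-o", out_dir,       # output directory
        "-t", str(threads),  # number of threads
    ]


if __name__ == "__main__":
    args = unicycler_hybrid_args("R1.fastq.gz", "R2.fastq.gz",
                                 "minion.fastq.gz", "unicycler_out")
    # Print the command; pass `args` to subprocess.run(args, check=True) to run it.
    print(shlex.join(args))
```

Unicycler writes the final assembly as `assembly.fasta` in the output directory.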

If your genome is repetitive (e.g. Neisseria or other species with a bunch of transposons), you'll get better results with a long-read-only assembly followed by Illumina polishing. And don't use Canu: it does not perform well with circular genomes, very often adding a false duplicated region where the circle is closed. Flye does a much better job with circular chromosomes (and is also a lot faster).
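A long-read-only Flye run on raw MinION reads could look roughly like this. Again, this only builds the argument list; the function name, file names and defaults are placeholders, and `--genome-size` is left optional because only older Flye releases require it.

```python
import shlex


def flye_nanopore_args(nanopore_fastq, out_dir, threads=8, genome_size=None):
    """Build the argument list for a Flye assembly of raw Nanopore reads.

    Hypothetical helper: it returns the arguments and does not run anything.
    """
    args = [
        "flye",
        "--nano-raw", nanopore_fastq,  # uncorrected MinION reads
        "--out-dir", out_dir,
        "--threads", str(threads),
    ]
    if genome_size is not None:
        # Estimated genome size, e.g. "5m"; required by older Flye versions.
        args += ["--genome-size", genome_size]
    return args


if __name__ == "__main__":
    # Print the command; pass the list to subprocess.run(..., check=True) to run it.
    print(shlex.join(flye_nanopore_args("minion.fastq.gz", "flye_out", genome_size="5m")))
```

Flye writes `assembly.fasta` to the output directory, and that is the file you would then polish with the Illumina reads.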