Using Bandage to finish ambiguous long-read assembly?
5.3 years ago
predeus ★ 1.9k

Hello all,

I have a Unicycler assembly of a bacterial genome from PacBio and Illumina reads. It's a rather small but repetitive genome, and it didn't assemble into one circular chromosome, despite having 500x long-read coverage.

I've tried a few other options (e.g. long-read-only assemblers), and they didn't produce a finished genome either.

I've read that it's possible to finish the assembly by inspecting it with Bandage and aligning reads to it. However, it's not obvious how to do this or which reads to use. Bandage offers BLAST functionality, but I don't think I can BLAST 500x of PacBio reads against the graph. Would it make sense to get a consensus set of well-corrected reads with Racon? And how does one identify the subset of long reads that potentially span the contig ends?

Thank you for any suggestions.

Assembly Bandage long reads Unicycler • 2.3k views
What's the quality and length of the long reads? If you have ILMN reads handy, I'd recommend using Ryan Wick's Filtlong to keep only the longest and highest-quality reads, maybe down to about 100X coverage. You can use the ILMN reads as a reference for filtering the Nanopore reads. Filtlong also ships with a very handy script to quickly generate stats on the reads before and after filtering: https://github.com/rrwick/Filtlong/tree/master/scripts
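For illustration, here is a rough Python sketch of how the Filtlong call suggested above might be put together. The `-1`/`-2`, `--min_length` and `--target_bases` options are Filtlong's own; the genome size, the file names and the `filtlong_command` helper are made-up placeholders, so treat this as a starting point rather than a recipe.

```python
# Sketch: build a Filtlong command that subsamples long reads to a target depth.
# genome_size_bp and all file names below are hypothetical placeholders.

def filtlong_command(long_reads, illumina_r1, illumina_r2,
                     genome_size_bp, target_coverage=100, min_length=1000):
    """Return the Filtlong argument list that keeps roughly
    target_coverage x genome_size_bp bases of the best-scoring reads."""
    target_bases = genome_size_bp * target_coverage
    return [
        "filtlong",
        "-1", illumina_r1,                 # Illumina reads used as an external reference
        "-2", illumina_r2,
        "--min_length", str(min_length),   # discard reads shorter than this
        "--target_bases", str(target_bases),
        long_reads,
    ]


cmd = filtlong_command("pacbio.fastq.gz", "illumina_1.fastq.gz",
                       "illumina_2.fastq.gz", genome_size_bp=5_000_000)
print(" ".join(cmd))
```

Filtlong writes the retained reads to stdout, so in practice the printed command would be redirected or piped, e.g. `... | gzip > filtered.fastq.gz`.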

As a matter of fact, I do have some Illumina! Thank you, very good suggestion.

Wow, I completely mistook the PB reads for Nanopore, sorry. Does Filtlong work on PacBio reads?

Yes, I ran it on my data and it worked very well (I ran it against Illumina reads).

How many contigs did you have in your assembly? I've played with Bandage only a little; it helped me decide where to design PCR primers to link contigs, so that the gaps could (possibly later) be sequenced and the assembly finished. I had only MiSeq data; PacBio is still a rarity around here. Of course, this approach is only useful if you have small gaps, and not too many.

I think you can try to map consensus reads with minimap2 and identify chimeric reads mapping to different contigs.
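A minimal sketch of that idea, assuming the reads were mapped to the contigs with minimap2 (e.g. with the `map-pb` preset) and the default PAF output was saved. The function name, the 1 kb end margin and the MAPQ cutoff are arbitrary illustrative choices, not values from this thread. It reports reads whose alignments reach the ends of at least two different contigs, which are the candidates for joining those contigs.

```python
from collections import defaultdict


def find_bridging_reads(paf_lines, end_margin=1000, min_mapq=20):
    """Return {read_name: sorted list of (contig, 'start' | 'end')} for reads
    whose alignments reach the ends of at least two different contigs."""
    ends_hit = defaultdict(set)
    for line in paf_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:          # PAF has 12 mandatory columns
            continue
        read, contig = fields[0], fields[5]
        tlen, tstart, tend = int(fields[6]), int(fields[7]), int(fields[8])
        if int(fields[11]) < min_mapq:
            continue
        if tstart <= end_margin:            # alignment touches the contig start
            ends_hit[read].add((contig, "start"))
        if tlen - tend <= end_margin:       # alignment touches the contig end
            ends_hit[read].add((contig, "end"))
    return {
        read: sorted(ends)
        for read, ends in ends_hit.items()
        if len({contig for contig, _ in ends}) >= 2
    }


# Synthetic PAF records (made up for illustration only).
example = [
    "read1\t12000\t0\t6000\t+\tcontig_1\t50000\t44000\t50000\t5900\t6000\t60",
    "read1\t12000\t6000\t12000\t+\tcontig_2\t30000\t0\t6000\t5900\t6000\t60",
    "read2\t8000\t0\t8000\t+\tcontig_1\t50000\t20000\t28000\t7900\t8000\t60",
]
bridges = find_bridging_reads(example)
print(bridges)
```

Here only `read1` is reported, because it covers the end of `contig_1` and the start of `contig_2`. A read that closes a single circular contig (end to start of the same contig) would not be reported by this sketch and would need a separate check.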

I have just a few contigs - the assembly is almost complete.
