Improve Illumina short-read assembly using PacBio long reads
4.9 years ago

I am trying to assemble the goat genome (genome size = 2.9 Gb), and I have sequencing data from both short and long reads:

  1. Short-read data from Illumina (genome coverage ~37x).
  2. Long-read data from PacBio (genome coverage ~1.5x).

I have assembled the Illumina short reads using ABySS and SOAPdenovo; the best N50 I got was 1884, at a k-mer size of 41. I would like to improve the short-read assembly using the PacBio long-read data. Because of the low coverage of the PacBio data (1.5x of the genome), I cannot decide which software would be best for improving the N50 with long reads.
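For readers comparing assemblies by N50 as above, here is a minimal sketch of how the statistic is usually computed: sort the contig lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are made up for illustration and are not taken from the ABySS or SOAPdenovo output.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L
    together contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example with made-up contig lengths (total = 1500 bases).
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))  # 500 + 400 = 900 >= 750, so this prints 400
```

Because N50 depends only on the length distribution, it says nothing about correctness; a scaffolder that joins contigs wrongly will also raise it.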

I tried hybridSPAdes for a hybrid assembly of my short- and long-read data, but it runs out of memory.

Please let me know how I could improve the short-read assembly using low-coverage (~1.5x) long reads.

Assembly Illumina genome PacBio • 1.8k views

What was the input read length of your Illumina data?

An optimal k-mer of 41 seems pretty low; what range did you evaluate?

4.9 years ago

Maybe you can't. 1.5X actually means 0X for a good proportion of the genome.

Generally, you want 20X+ PacBio coverage to make a good assembly.

It might pay to use another, better assembly (I think a goat one is available) for orienting your short scaffolds.
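To put a rough number on the "0X for a good proportion" point, here is a back-of-the-envelope sketch. It assumes reads land uniformly at random so that per-base depth is approximately Poisson distributed (the usual Lander-Waterman style simplification); real PacBio data has biases this ignores. The function names are my own.

```python
import math


def uncovered_fraction(mean_depth):
    """Expected fraction of bases with zero reads under a Poisson model."""
    return math.exp(-mean_depth)


def fraction_with_depth_at_most(mean_depth, k):
    """Expected fraction of bases covered by k or fewer reads (Poisson CDF)."""
    return sum(
        math.exp(-mean_depth) * mean_depth ** i / math.factorial(i)
        for i in range(k + 1)
    )


for depth in (1.5, 20.0):
    print(
        f"{depth:>4}x: uncovered {uncovered_fraction(depth):.2e}, "
        f"depth <= 1 {fraction_with_depth_at_most(depth, 1):.2e}"
    )
```

Under these assumptions, roughly 22% of bases get no long read at all at 1.5X, and a bit over half are covered by at most one read, whereas at 20X both fractions are negligible.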

4.9 years ago
jean.elbers ★ 1.7k

You could use your best Illumina assembly as input for a whole-genome alignment with Cactus (https://github.com/ComparativeGenomicsToolkit/cactus), with NCBI accession GCA_004361675.1 as the reference. You could then use Ragout (https://github.com/fenderglass/Ragout) to generate a reference-guided assembly of your individual based on the best available goat genome assembly, GCA_004361675.1.

4.9 years ago

Since you're already on the ABySS route, you could give LINKS a try: it's a long-read scaffolder from the same group as ABySS.

But as others here have mentioned, 1.5x coverage will likely not get you very far.

4.9 years ago
Vitis ★ 2.5k

Filtlong may help you filter and trim your long reads, using your short reads as an external reference.

https://github.com/rrwick/Filtlong

Then the filtered long reads may help you scaffold some contigs. But I agree with the other answers: 1.5X of long reads wouldn't get you very far.
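A hedged sketch of how a Filtlong run with Illumina reads as the reference could be put together from Python. The file names are hypothetical placeholders, and the thresholds are illustrative starting values, not tuned recommendations for this dataset. The flags used (`-1`, `-2`, `--min_length`, `--keep_percent`, `--trim`, `--split`) are the ones described in the Filtlong README; please check them against your installed version.

```python
import shlex


def build_filtlong_command(short_r1, short_r2, long_reads,
                           min_length=1000, keep_percent=90):
    """Build a Filtlong argument list (nothing is executed here).

    All default thresholds are placeholders chosen for illustration.
    """
    return [
        "filtlong",
        "-1", short_r1,              # Illumina reads used as external reference
        "-2", short_r2,
        "--min_length", str(min_length),
        "--keep_percent", str(keep_percent),
        "--trim",                    # trim read ends that do not match reference k-mers
        "--split", "500",            # split reads at long stretches without matches
        long_reads,
    ]


# Hypothetical file names; replace with your own paths.
cmd = build_filtlong_command(
    "illumina_R1.fastq.gz", "illumina_R2.fastq.gz", "pacbio.fastq.gz"
)
print(shlex.join(cmd))
```

Filtlong writes the surviving reads to standard output, so in practice the printed command would be redirected or piped into a compressor. With only 1.5X of long reads, you may want to raise `keep_percent` so that little data is thrown away.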
