Hybrid assembly of PacBio and Illumina reads
7.7 years ago
int11ap1 ▴ 470

I have a ~30X PacBio dataset and a ~40X Illumina dataset, as well as a mate-pair dataset. I am trying to assemble them (the expected genome size is ~230 Mb; it's a plant) on a server with 160 GB of RAM. However, ALLPATHS runs out of memory at the CorrectLongReads step.

  • Is it possible to assemble my data on a cluster with 160 GB of RAM?

  • Is it possible with ALLPATHS?
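
For a sense of scale, the coverage figures above can be turned into approximate base counts. This is only a rough sketch: it assumes the ~230 Mb genome size estimate is accurate and reads "30X" and "40X" as average depth over the whole genome. The function and variable names are made up for illustration, and the result says nothing by itself about how much RAM a given assembler needs.

```python
def bases_for_coverage(genome_size_bp: int, coverage: int) -> int:
    """Total sequenced bases implied by an average coverage depth."""
    return genome_size_bp * coverage


# Assumption: the expected genome size quoted in the post (~230 Mb).
GENOME_SIZE_BP = 230_000_000

pacbio_bases = bases_for_coverage(GENOME_SIZE_BP, 30)    # ~30X PacBio
illumina_bases = bases_for_coverage(GENOME_SIZE_BP, 40)  # ~40X Illumina

print(f"PacBio:   {pacbio_bases / 1e9:.1f} Gb of sequence")
print(f"Illumina: {illumina_bases / 1e9:.1f} Gb of sequence")
```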

pacbio illumina • 4.3k views
7.7 years ago
GenoMax 141k

Have you seen this wiki page from PacBio?

As for the first question, you have already answered it: there is no substitute for RAM. If the server does not have enough, finding alternate hardware may be the only option.

7.6 years ago

Given the amount of Illumina and PacBio data you have, I would suggest a hybrid assembly using DBG2OLC. It gave me good results with 100x Illumina + 30x PacBio data, and it is memory efficient, so you could probably run it on your 160 GB RAM cluster. I would then scaffold the DBG2OLC assembly with your mate-pair data and finish with a base-correction step using Pilon.
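
The order of the three steps can be written down as a small plan and checked for consistency before anything is run. This is a minimal sketch, not a working pipeline: the file names are hypothetical, the scaffolder is left unnamed because the answer does not specify one, and no real command lines or tool options are shown. Intermediate steps such as read alignment before Pilon are folded into the stages for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    tool: str
    inputs: tuple
    outputs: tuple


# Hypothetical file names standing in for the three datasets in the question.
RAW_INPUTS = {"illumina_pe.fastq", "pacbio.fasta", "mate_pair.fastq"}

PLAN = [
    Stage("hybrid assembly", "DBG2OLC",
          ("illumina_pe.fastq", "pacbio.fasta"), ("contigs.fasta",)),
    # The answer does not name a scaffolder, so none is assumed here.
    Stage("scaffolding", "mate-pair scaffolder (unspecified)",
          ("contigs.fasta", "mate_pair.fastq"), ("scaffolds.fasta",)),
    Stage("base correction", "Pilon",
          ("scaffolds.fasta", "illumina_pe.fastq"), ("polished.fasta",)),
]


def check_plan(plan, raw_inputs):
    """Return stage names in order, or raise if a stage needs a file
    that neither the raw data nor an earlier stage provides."""
    available = set(raw_inputs)
    order = []
    for stage in plan:
        missing = [f for f in stage.inputs if f not in available]
        if missing:
            raise ValueError(f"{stage.name} is missing inputs: {missing}")
        available.update(stage.outputs)
        order.append(stage.name)
    return order


for step, name in enumerate(check_plan(PLAN, RAW_INPUTS), start=1):
    print(step, name)
```

Reordering the list (for example, putting Pilon first) makes `check_plan` raise, which is the point: each step consumes the previous step's output.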


How would you use the mate-pair reads after DBG2OLC?

