How to decide how many Ion Torrent reads to use for contig assembly with the MIRA assembler?
13 months ago
DanielC • 80
Canada

Dear Friends,

I am running the MIRA assembler on Ion Torrent reads from a sequenced bacteriophage, stored in a single fastq file. The file contains about 1800000 reads with an average read length of 300 bp, and there is no known reference genome. To run the program efficiently, I have split the fastq file into smaller files with different numbers of reads (for example, "fastq1.fastq" has 10000 reads). I am now testing which number of reads gives the best assembly; ideally the result would be a single contig. Given the information above, could you please tell me how many reads I should use to get the best resulting contig? Thanks much!


Since you are working with a phage (assuming your DNA is pure phage), you have far more data than the genome needs, so the DNA is heavily oversampled. Too much coverage tends to hurt assembly quality rather than help it. You can either follow your method of incrementally adding reads, or use a normalization method that intelligently looks at the entire dataset at the same time.
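To put rough numbers on the oversampling point, here is a minimal Python sketch of the usual coverage arithmetic (coverage = reads × read length / genome size), plus a random subsampler for the incremental approach. The 50 kb genome size and the target coverage values are assumptions chosen only for illustration, since the real genome size is unknown; the 300 bp read length and the 1800000 read count come from the question. The function names are hypothetical helpers, not part of any assembler.

```python
import math
import random


def reads_for_coverage(genome_size_bp, mean_read_len_bp, target_coverage):
    """Reads needed so that reads * read_length / genome_size >= target."""
    return math.ceil(target_coverage * genome_size_bp / mean_read_len_bp)


def coverage_from_reads(n_reads, mean_read_len_bp, genome_size_bp):
    """Average coverage implied by a given number of reads."""
    return n_reads * mean_read_len_bp / genome_size_bp


def reservoir_sample(records, k, seed=0):
    """Pick k records uniformly at random from an iterable in one pass.

    Random sampling avoids the bias of simply taking the first k reads
    of the file. `records` can be any iterable, e.g. 4-line FASTQ records.
    """
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < k:
            sample.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = rec
    return sample


# ASSUMPTION: a 50 kb phage genome; replace with your own estimate.
GENOME_SIZE = 50_000
READ_LEN = 300  # average read length from the question

for cov in (40, 60, 100):  # illustrative targets, not recommendations
    print(cov, "x ->", reads_for_coverage(GENOME_SIZE, READ_LEN, cov), "reads")

# Coverage of the full 1800000-read dataset under the same assumption.
print(coverage_from_reads(1_800_000, READ_LEN, GENOME_SIZE), "x")
```

Under these assumed values a subset of a few thousand to a few tens of thousands of reads already gives deep coverage, while the full file would be oversampled by orders of magnitude. If the phage turns out to be larger or smaller, the read counts scale in proportion.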


Thanks genomax! If I do normalization, should I run it on the main fastq file with 180000 reads, or on the smaller fastq files I generated from it (10000 reads, 20000 reads, etc.)? I would really appreciate your suggestion on this and the rationale behind the choice. Thanks much!


Do the normalization on the entire dataset.
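To illustrate why normalization wants to see all the data, here is a toy Python sketch of one common idea, median k-mer based digital normalization: a read is kept only if the median count of its k-mers among the reads kept so far is still below a target. This is a simplified illustration under stated assumptions, not the method of any particular tool; dedicated programs such as BBNorm or khmer are far more memory efficient and also handle reverse complements and sequencing errors. The function names and the default `k` and `target` values are hypothetical.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """All overlapping k-mers of a sequence (empty list if len(seq) < k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_by_median(reads, k=20, target=40):
    """Keep a read only while its region is still under-covered.

    ASSUMPTIONS: forward strand only, exact k-mer matching, all counts
    held in memory. Fine for a toy example, not for real datasets.
    """
    counts = Counter()
    kept = []
    for seq in reads:
        ks = kmers(seq, k)
        if not ks:
            continue  # read shorter than k, nothing to score
        # Counter returns 0 for unseen k-mers without storing them.
        if median(counts[km] for km in ks) < target:
            kept.append(seq)
            counts.update(ks)
    return kept


# Toy input: the same read repeated 100 times, i.e. heavy oversampling.
read = "ACGTTGCAAGCTTAGGCTAACGGATCCGTA"
kept = normalize_by_median([read] * 100, k=8, target=5)
print(len(kept), "of 100 reads kept")
```

Because the decision for each read depends on what has already been seen across the whole input, running this kind of filter on a small pre-cut subset would only flatten coverage within that subset, which is the rationale for feeding it the entire dataset.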
