Does SPAdes produce consensus sequence?

5.1 years ago

DanielC ▴ 170

Dear All,

I have assembled contigs from FASTQ reads of a sequenced phage. The largest contig is about 88,000 bp with a k-mer coverage of 49. I am now trying to get a consensus by mapping the reads back to this contig. My question is: do I need this step to get the consensus, or is the SPAdes contig already a consensus that I can use directly for gene prediction and annotation? Please let me know if anything is unclear.

Thanks, DK

consensus Assembly • 2.2k views

ADD COMMENT • link 5.1 years ago by DanielC ▴ 170

Assembly is inherently a consensus process. Each base in a contig reflects the most likely (most common) base among the reads covering that position. There is nothing more to be done with the contigs unless you have additional sequencing data.
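
To make the "most common base per position" idea concrete, here is a toy sketch in Python. It assumes the reads are already aligned to each other and padded to the same length, which is a simplification for illustration only: SPAdes itself builds contigs from a de Bruijn graph rather than by column voting, and the function name `column_consensus` and the example reads are made up.

```python
from collections import Counter


def column_consensus(aligned_reads):
    """Return the most common base in each column of pre-aligned, equal-length reads.

    Toy illustration only; this is not how SPAdes builds contigs.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all reads must have the same aligned length")

    consensus = []
    for column in zip(*aligned_reads):
        # Ignore gap/padding characters when counting bases.
        counts = Counter(base for base in column if base != "-")
        # Emit N when no read covers the column.
        consensus.append(counts.most_common(1)[0][0] if counts else "N")
    return "".join(consensus)


# Hypothetical reads: the third one carries a likely sequencing error (C instead of G).
reads = [
    "ACGTTAGC",
    "ACGTTAGC",
    "ACCTTAGC",
    "ACGTTAG-",
]
print(column_consensus(reads))
```

The single disagreeing read is outvoted, which is why a well-covered contig usually does not need a separate consensus-calling step.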

ADD REPLY • link 5.1 years ago by Joe 21k

Thanks! My largest phage contig is 88,315 bp, the FASTQ file contains 1,821,574 reads, the contig's k-mer coverage is 48.95, and the average read length is 303 bp. Do you think this contig is good enough to use for gene prediction and annotation? Thanks.
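
For anyone wanting to pull these numbers out programmatically: SPAdes names contigs like `NODE_1_length_88315_cov_48.95`, so length and k-mer coverage can be read from the FASTA headers. A minimal sketch, assuming that header layout; the helper names and the second header are hypothetical and only there for illustration.

```python
import re

# Matches SPAdes-style contig names, with or without the leading '>'.
HEADER_RE = re.compile(r"^>?NODE_(\d+)_length_(\d+)_cov_([\d.]+)")


def parse_spades_header(header):
    """Extract node number, length, and k-mer coverage from one contig header."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        raise ValueError(f"not a SPAdes-style header: {header!r}")
    return {
        "node": int(match.group(1)),
        "length": int(match.group(2)),
        "kmer_cov": float(match.group(3)),
    }


def contigs_by_length(headers):
    """Return parsed contig records sorted from longest to shortest."""
    records = [parse_spades_header(h) for h in headers]
    return sorted(records, key=lambda r: r["length"], reverse=True)


# The first header uses the numbers from this thread; the second is invented.
headers = [
    ">NODE_2_length_5120_cov_3.10",
    ">NODE_1_length_88315_cov_48.95",
]
best = contigs_by_length(headers)[0]
print(best["length"], best["kmer_cov"])
```

In practice the headers would come from the lines starting with `>` in the assembler's contig FASTA file.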

ADD REPLY • link 5.1 years ago by DanielC ▴ 170

Sounds reasonable, but your read quality matters more. I trust you did QC prior to assembly?

ADD REPLY • link 5.1 years ago by Joe 21k

Thanks! I did the QC: I used fastx to discard reads with a quality score below 20, and almost all reads passed the filter, so the sequencing seems to have gone well. Any comments?
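
For readers unsure what such a filter does, here is a rough Python sketch of one common interpretation: keep a read only if a minimum percentage of its bases have a Phred score of at least 20. The 80% threshold, the Phred+33 encoding, and the function names are assumptions for illustration; the exact settings used in this thread are not stated.

```python
def passes_quality(quality_string, min_q=20, min_percent=80, offset=33):
    """Return True if at least min_percent of bases have Phred quality >= min_q.

    Assumes Phred+33 encoding (offset=33); adjust for older Phred+64 data.
    """
    if not quality_string:
        return False
    scores = [ord(ch) - offset for ch in quality_string]
    good = sum(1 for score in scores if score >= min_q)
    return 100.0 * good / len(scores) >= min_percent


def filter_fastq(lines, **kwargs):
    """Yield (header, sequence, quality) for 4-line FASTQ records that pass."""
    # Group the flat list of lines into records of four lines each.
    for header, seq, _plus, qual in zip(*[iter(lines)] * 4):
        if passes_quality(qual.strip(), **kwargs):
            yield header.strip(), seq.strip(), qual.strip()


# Two made-up records: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
fastq_lines = [
    "@read1", "ACGTACGT", "+", "IIIIIIII",
    "@read2", "ACGTACGT", "+", "########",
]
kept = list(filter_fastq(fastq_lines))
print(len(kept), "of 2 reads kept")
```

If almost every read passes a filter like this, as reported above, the run quality is unlikely to be the limiting factor for the assembly.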

ADD REPLY • link 5.1 years ago by DanielC ▴ 170

Nope, sounds good to me. What is the estimated real genome size for this phage?

ADD REPLY • link 5.1 years ago by Joe 21k

There is no reference genome for the assembled phage; however, the estimated genome size is in the range of 50 to 100 kb. The BLAST hit for the assembled contig shows >95% identity to a Salmonella phage genome of about 87,000 bp. Please let me know if you have any comments. Thanks!

ADD REPLY • link 5.1 years ago by DanielC ▴ 170

Then it sounds like you probably have a single contig representing the whole genome, which is quite common for phages. You can't ask for much better than that! Carry on with your annotation and the rest of your workflow.

ADD REPLY • link 5.1 years ago by Joe 21k
