I am assembling a 30 kb viral genome de novo using SPAdes. I have FASTA files for the scaffolds and contigs. How do I process these into one final genome sequence, i.e. one FASTA file with a single sequence? What software is available for doing this? I would like to do pairwise alignments against published viral genome sequences from other labs/sources. Thanks in advance!
Hi, thanks for your reply! Yes, I'm sequencing SARS-CoV-2. My largest contig is about 3 kb, and all scaffolds/contigs together cover only about 85-90% of the genome. Before assembly, I trimmed adapters and primers and normalized to a depth of 100x. The per-nucleotide coverage looked fine before assembly and after normalization, so I'm not quite sure why I am getting poor assembly results.
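For reference, numbers like the largest contig and the rough fraction of the genome covered can be pulled from the SPAdes FASTA output with a few lines of Python. This is only a minimal sketch: `read_fasta` and `summarize_contigs` are hypothetical helper names, the default genome size of 29,903 bp assumes the Wuhan-Hu-1 reference (NC_045512.2), and total contig length divided by genome size is just an approximation of breadth, since overlapping or redundant contigs inflate it.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA lines into {record_id: sequence} (hypothetical helper)."""
    chunks: Dict[str, list] = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {k: "".join(v) for k, v in chunks.items()}


def summarize_contigs(records: Dict[str, str], genome_size: int = 29903) -> dict:
    """Report contig count, largest contig, total bp and a rough covered fraction.

    genome_size defaults to the SARS-CoV-2 reference length (an assumption);
    the fraction is an upper-bound style estimate, not a true breadth of coverage.
    """
    lengths = sorted((len(seq) for seq in records.values()), reverse=True)
    total = sum(lengths)
    return {
        "n_contigs": len(lengths),
        "largest": lengths[0] if lengths else 0,
        "total_bp": total,
        "approx_fraction": total / genome_size,
    }
```

Usage would look something like `with open("contigs.fasta") as fh: print(summarize_contigs(read_fasta(fh)))`. A true breadth-of-coverage figure would need the contigs aligned to the reference.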
Maybe take a look at the assembly graph. Also, you can do amplicon sequencing: https://eu.idtdna.com/pages/landing/coronavirus-research-reagents/ngs-assays
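If you want a quick text summary of the graph before opening it in a viewer such as Bandage, something like the sketch below might help. It assumes the GFA 1 file that SPAdes writes to its output directory (`assembly_graph_with_scaffolds.gfa`) and only looks at `S` (segment) and `L` (link) lines; `summarize_gfa` is a hypothetical helper name, not part of SPAdes. Segments with no links at all are a crude hint at where the assembly is fragmented.

```python
from collections import defaultdict
from typing import Iterable


def summarize_gfa(lines: Iterable[str]) -> dict:
    """Count segments and links in a GFA 1 file and list unlinked segments."""
    seg_len = {}
    degree = defaultdict(int)
    n_links = 0
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] == "S" and len(fields) >= 3:
            # S <name> <sequence>; "*" means the sequence is not stored.
            seq = fields[2]
            seg_len[fields[1]] = 0 if seq == "*" else len(seq)
        elif fields[0] == "L" and len(fields) >= 6:
            # L <from> <from_orient> <to> <to_orient> <overlap>
            n_links += 1
            degree[fields[1]] += 1
            degree[fields[3]] += 1
    # Segments that never appear in any link are disconnected pieces.
    isolated = sorted(name for name in seg_len if degree[name] == 0)
    return {
        "n_segments": len(seg_len),
        "n_links": n_links,
        "total_bp": sum(seg_len.values()),
        "isolated": isolated,
    }
```

You could call it as `with open("assembly_graph_with_scaffolds.gfa") as fh: print(summarize_gfa(fh))`. Many short, isolated segments would be consistent with the fragmented result you describe, though this does not by itself say why the graph breaks.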