Hello everyone,
I'm currently working on a genome assembly using SPAdes 3.13.0, but first I tried several assembly programs to see which one works best for my data (Illumina reads). At first I ran default assemblies with MIRA, Velvet, SPAdes and idba_hybrid without specifying the kmer size. As expected, the programs produced several assemblies using kmer sizes that they chose themselves (I have yet to find out how these are chosen).

I then ran Quast on the contigs in each kmer folder to compare the metrics of all the assemblies, and picked the best ones based on total length, coverage, largest contig, number of contigs and N50. For example, the best metrics came from SPAdes with a kmer size of 77 (K77) and from idba_hybrid with a kmer size of 60 (K60).

For a second round of assemblies, I reran each program with the kmer size that had given the best metrics, plus a few values close to it (for example, with SPAdes I now use kmers of 73, 75, 77*, 79 and 81, where * marks the previous best). The problem is that the new assembly at the same kmer size the program chose before is not the same as the one from the first run, when kmers weren't specified. Usually the second-round assemblies are worse than the first: more contigs, a shorter total length and a shorter largest contig.
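To make the comparison concrete, here is a minimal Python sketch of the metrics I am comparing between runs (number of contigs, total length, largest contig and N50). The contig lengths below are made up for illustration and are not from my data, and the function names are my own, not part of Quast.

```python
def n50(contig_lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


def summarize(contig_lengths):
    """Collect the basic metrics used to compare two assemblies."""
    return {
        "num_contigs": len(contig_lengths),
        "total_length": sum(contig_lengths),
        "largest_contig": max(contig_lengths, default=0),
        "n50": n50(contig_lengths),
    }


# Hypothetical contig lengths for two runs at the same kmer size.
first_run = [500, 400, 300, 200, 100]
second_run = [300, 250, 200, 150, 100, 100, 50]

for name, lengths in (("first", first_run), ("second", second_run)):
    print(name, summarize(lengths))
```

In this toy example the second run shows the pattern I am describing: more contigs, a shorter total length, a shorter largest contig and a lower N50.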
How can the program generate different results if the only parameter changed was the kmer size that was chosen based on the assembly made previously using the same program?
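For reference, the two SPAdes runs differ roughly as in the sketch below. This is only an illustration: the read and output file names are hypothetical, I am assuming paired-end reads passed with `-1`/`-2`, and the helper function is my own. The only difference between the two command lines is the `-k` option, which takes a comma-separated list of odd kmer sizes below 128.

```python
def spades_command(reads_1, reads_2, out_dir, kmers=None):
    """Build a spades.py command line as a list of arguments.

    With kmers=None the -k option is omitted, so SPAdes picks the
    kmer sizes itself. Otherwise the given sizes are passed via -k.
    """
    cmd = ["spades.py", "-1", reads_1, "-2", reads_2, "-o", out_dir]
    if kmers is not None:
        for k in kmers:
            # SPAdes expects odd kmer sizes smaller than 128.
            if k % 2 == 0 or k >= 128:
                raise ValueError(f"invalid kmer size for SPAdes: {k}")
        cmd += ["-k", ",".join(str(k) for k in sorted(kmers))]
    return cmd


# Hypothetical file names, for illustration only.
first_round = spades_command("reads_R1.fastq", "reads_R2.fastq", "spades_default")
second_round = spades_command(
    "reads_R1.fastq", "reads_R2.fastq", "spades_k73_81",
    kmers=[73, 75, 77, 79, 81],
)

print(" ".join(first_round))
print(" ".join(second_round))
```

Note that the two runs share kmer 77 but not the rest of the kmer list, which may be relevant to the difference I am seeing.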
Thanks for your attention.