Hi all,
For certain reasons, I need to build a custom reference genome to align my Illumina reads against and then call SNPs. I built the reference by assembling the reads de novo with SPAdes, then indexed it for the programs I plan to use for alignment and SNP calling (BWA, GATK and Samtools).
I only want to call SNPs, so I'm not interested in gene annotation. What I want to know is: am I missing anything crucial in this approach to making a custom reference sequence?
Thanks
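For anyone following along, the indexing step described above can be sketched roughly as below. This is only a sketch: the file name `custom_reference.fasta` and the helper `index_commands` are hypothetical, and it assumes GATK4's `gatk` launcher is on the PATH (older setups would typically use Picard's CreateSequenceDictionary instead). The function only builds the command lines; pass each one to `subprocess.run(cmd, check=True)` to actually execute it.

```python
from pathlib import Path
import shlex


def index_commands(reference_fasta):
    """Build the three indexing commands for a reference FASTA.

    Hypothetical helper: it only returns the command lines, it does not run them.
    """
    ref = Path(reference_fasta)
    # GATK expects the sequence dictionary next to the FASTA, with a .dict suffix.
    dict_path = ref.with_suffix(".dict")
    return [
        ["bwa", "index", str(ref)],       # BWA index files for alignment
        ["samtools", "faidx", str(ref)],  # .fai index used by Samtools and GATK
        ["gatk", "CreateSequenceDictionary", "-R", str(ref), "-O", str(dict_path)],
    ]


if __name__ == "__main__":
    # Print the commands so they can be inspected before running them.
    for cmd in index_commands("custom_reference.fasta"):
        print(shlex.join(cmd))
```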
When you map the reads back to your assembly, how do the alignment stats look?
Well, according to flagstat, the alignment stats look fine: 96% of reads mapped, 94% properly paired, and a low percentage (0.9%) of singletons.
How about multi-mapped reads?
Not really sure - how do I check?
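One rough way to check, assuming the reads were aligned with BWA: BWA gives a read that fits equally well in more than one place a mapping quality of 0, and it may also list the alternative hits in an `XA` tag. The sketch below counts primary alignments that look ambiguous by either signal. The function name `count_ambiguous` and the threshold are my own choices for illustration, and other aligners report multi-mapping differently, so treat the number as an estimate.

```python
def count_ambiguous(sam_lines, mapq_threshold=1):
    """Return (mapped_primary, ambiguous) counts from SAM-format text lines.

    A read is counted as ambiguous if its MAPQ is below mapq_threshold
    or it carries an XA:Z: tag (alternative hits, as written by BWA).
    """
    mapped_primary = 0
    ambiguous = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        mapq = int(fields[4])
        # Skip unmapped (0x4), secondary (0x100) and supplementary (0x800) records.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        mapped_primary += 1
        has_alt_hits = any(tag.startswith("XA:Z:") for tag in fields[11:])
        if mapq < mapq_threshold or has_alt_hits:
            ambiguous += 1
    return mapped_primary, ambiguous
```

To use it, feed it the text output of `samtools view` on your BAM file, for example by piping that output into a small script that calls `count_ambiguous(sys.stdin)`. A high ambiguous fraction would point to repeats or duplicated contigs in the assembly.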