I have used Bowtie 2 to align my reads to reference genome and Samtools to call variances (SNPs and InDels). Does anyone know how to count how many SNPs and InDels I got? Thanks!
I have used Bowtie 2 to align my reads to reference genome and Samtools to call variances (SNPs and InDels). Does anyone know how to count how many SNPs and InDels I got? Thanks!
You could write a quick awk line ignoring multi-allelic loci:
SNPs:
awk '! /\#/' variants.vcf | awk '{if(length($4) == 1 && length($5) == 1) print}' | wc -l
Indels:
awk '! /\#/' variants.vcf | awk '{if(length($4) > 1 || length($5) > 1) print}' | wc -l
but for something more comprehensive, use bcftools stats:
Via BEDOPS convert2bed
:
$ vcf2bed --snvs < foo.vcf | wc -l
$ vcf2bed --insertions < foo.vcf | wc -l
$ vcf2bed --deletions < foo.vcf | wc -l
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
in a BAM or in a VCF ?