Biostar Beta. Not for public use.
Total SNP calculation using command line vs picardtools
0
Entering edit mode
2.0 years ago
bitpir • 130

Hello, I am trying to calculate the total number of SNPs in my VCF file. When I tried using Picardtool's CollectVariantCallingMetrics, it calculates 174387 SNPs. However, when I tried to use the command line to calculate my SNPs, I get 199320. The command line I used is as follow:

grep -v "#" VARIANT_VCF | wc -l

Does anyone know why there is a difference in 24933 SNPs called in the 2 calculation methods?

Thank you very much for your help!

picardtools SNP • 777 views
ADD COMMENTlink
0
Entering edit mode
23 months ago
igor 7.7k
United States

Does your VCF contain both SNPs and indels? I think Picard only counts SNPs.

ADD COMMENTlink
0
Entering edit mode

Yes, my VCF contains both SNPs and indels. Under the CollectVariantCallingMetrics-generated file, theTOTAL SNP = 174387, TOTAL INDEL = 24542,TOTAL_MULTIALLELIC_SNPS = 6659, TOTAL_COMPLEX_INDELS=278. Adding those up, I get 205886 variants. I am using picardtools 2.0.1. I was reading GATK's website and it says that "Many factors affect these counts, including cohort size, relatedness between samples, strictness of filtering, ethnicity of samples, and even algorithm improvement due to updated software." I've also tried using picardtools 2.4.1 but did not get the same number as 2.0.1: TOTAL_SNP=174387, TOTAL_INDEL=24542, TOTAL_MULTIALLELIC_SNPS=646,TOTAL_COMPLEX_INDELS=292. Total variant = 199287.

ADD REPLYlink
0
Entering edit mode

Looks like TOTAL_SNP and TOTAL_INDEL stayed the same, which makes sense because those are simple events. The difference is in TOTAL_MULTIALLELIC_SNPS and TOTAL_COMPLEX_INDELS. They might be doing some overlapping. For example, a single indel may sometimes be split into multiple events by the variant caller and they might be trying to correct for that.

Since GATK discusses this tool on their website, it's probably a good idea to ask about it on the GATK forum.

ADD REPLYlink

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.3