Quality Scores in a Multi-Sample VCF File
11.2 years ago
Kssr ▴ 110

How is the error probability estimated when calculating Phred quality scores? I have multiple samples in my VCF file and I see many SNPs with a quality score of '999'. Is there a threshold value above which SNPs are assigned a quality score of '999'?

variant samtools bcftools vcf quality • 4.3k views
11.2 years ago

The limit is probably hard-coded into the software, and a value of 999 is probably unrealistic anyway. There may be good mathematical reasons for a value to come out that high, but from a biological or experimental perspective it is not realistic. The underlying error rates of the experimental protocol or sequencing platform most likely far exceed the error probability that the reported value implies, since that value is computed only from the results, which are almost certainly biased.


I agree that the '999' value might be hard-coded into the software, but I am not clear about what you have tried to explain.

I am using samtools mpileup output followed by bcftools to do the variant calling. I am assuming bcftools writes this value into the VCF file. I see quality scores ranging from some minimum value up to 300, and all the rest are placed in the 999 category. I wonder whether these SNP qualities could not be computed. In general, I am not sure how to interpret this quality score.

Any help appreciated.
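
To check how the QUAL values are actually distributed, the column can be tallied directly from the VCF text. The following is only a rough sketch: it relies on QUAL being the sixth tab-separated column of a VCF record, treats 999 as the cap purely because that is what is observed in this thread, and uses made-up example records. The function names `qual_values` and `summarize_qual` are hypothetical helpers, not part of samtools or bcftools.

```python
def qual_values(vcf_lines):
    """Yield the QUAL column (6th tab-separated field) of each record as a float."""
    for line in vcf_lines:
        # Skip meta-information lines, the header line, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        qual = line.rstrip("\n").split("\t")[5]
        if qual != ".":  # "." marks a missing QUAL value
            yield float(qual)


def summarize_qual(vcf_lines, cap=999.0):
    """Count records at the assumed cap and report the largest QUAL below it."""
    quals = list(qual_values(vcf_lines))
    below = [q for q in quals if q < cap]
    return {
        "n": len(quals),
        "at_cap": sum(1 for q in quals if q >= cap),
        "max_below_cap": max(below) if below else None,
    }


# Made-up records for illustration only; they are not real variant calls.
example = [
    "##fileformat=VCFv4.1\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t999\t.\t.\n",
    "chr1\t200\t.\tC\tT\t212.5\t.\t.\n",
    "chr1\t300\t.\tG\tA\t999\t.\t.\n",
    "chr1\t400\t.\tT\tC\t.\t.\t.\n",
]

summary = summarize_qual(example)
print(summary)
```

For a real file, the same function can be given an open file handle, for example `summarize_qual(open("calls.vcf"))`, where `calls.vcf` is a placeholder name. If `max_below_cap` sits far below 999 while many records are at 999, that would be consistent with values above some limit being clamped rather than computed exactly.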


My guess would be that below a certain minimum error probability it reports a value that effectively stands for a probability of 0, that is, about 1E-99 (strictly, a Phred score of 999 corresponds to 10^-99.9).
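
The relationship behind that guess is the standard Phred scaling, Q = -10 * log10(p), where p is the error probability. A minimal sketch of the conversion in both directions follows. The `cap=999.0` clamp is an assumption chosen to match what is seen in this thread, not something taken from the bcftools source, and both function names are hypothetical.

```python
import math


def phred_to_error_prob(q):
    """Error probability implied by a Phred-scaled quality q."""
    return 10.0 ** (-q / 10.0)


def error_prob_to_phred(p, cap=999.0):
    """Phred-scaled quality for error probability p, clamped at `cap`.

    The cap is an assumption for illustration; the real limit used by any
    given caller may differ.
    """
    if p <= 0.0:
        # log10(0) is undefined, so a zero probability maps to the cap.
        return cap
    return min(-10.0 * math.log10(p), cap)


print(phred_to_error_prob(30))      # error probability for Q = 30
print(phred_to_error_prob(999))     # error probability for Q = 999
print(error_prob_to_phred(1e-150))  # clamped to the assumed cap
```

Under this scaling, a QUAL of 300 already corresponds to an error probability of 10^-30, so any call reported at 999 is best read as "extremely confident according to the model" rather than as a meaningful exact probability.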
