VCF PL Values not making sense
6.2 years ago
fwuffy ▴ 110

Hi, can someone help me interpret the PL (Phred-scaled genotype likelihood) values for this multi-allelic indel row in a VCF?

The VCF row:

chr1    36418926    .   caagaaa cAAAAaagaaa,caa,cAAaagaaa   9.95049 .   INDEL;IDV=4;IMF=0.666667;DP=6;VDB=0.0340507;SGB=-0.556411;MQSB=1;MQ0F=0;AF1=0.502508;AC1=1;DP4=1,0,3,1;MQ=59;FQ=-14.6836;PV4=1,1,0.342519,1 GT:PL:DP    0/1:72,25,45,27,0,47,40,13,15,59:5

With 3 alt alleles, I expect 6 values, not 10:

A/A: 72,
A/B: 25,
B/B: 45,
A/C: 27,
B/C: 0,
C/C: 47,
??: 40,
??: 13,
??: 15,
??: 59

These are unphased calls for a single sample, made with bcftools 1.6+htslib-1.6. I used this to call variants:

samtools mpileup -v -u -B -t DP -f full.fa mapped-sorted-mkdup.bam | bcftools call -c -v - > germline-variants-all.vcf

Thanks

6.2 years ago

You have the reference allele and 3 alt alleles, so there are 10 possibilities.

That locus is very poorly covered, and those alignments might not be right. I'd either eyeball it, to see if you can figure out what's going on there, or ignore it, on the grounds that it's too poorly covered to assess.
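To spell out where the 10 comes from: for a diploid sample with n alleles (reference included) there are n(n+1)/2 unordered genotypes, and the VCF specification lists the PL entry for genotype j/k (j <= k) at index k(k+1)/2 + j. Below is a minimal Python sketch of that ordering applied to the PL field from the row in the question. The helper names `diploid_genotype_order` and `pl_index` are made up for illustration and are not part of bcftools or any library.

```python
def diploid_genotype_order(n_alleles):
    """Return diploid genotypes (j, k), j <= k, in VCF PL/GL order."""
    return [(j, k) for k in range(n_alleles) for j in range(k + 1)]


def pl_index(j, k):
    """Index of genotype j/k in the PL array (diploid case only)."""
    j, k = sorted((j, k))
    return k * (k + 1) // 2 + j


# PL values copied from the sample column of the row in the question.
pl = [72, 25, 45, 27, 0, 47, 40, 13, 15, 59]
n_alleles = 4  # 1 reference allele + 3 alt alleles

genotypes = diploid_genotype_order(n_alleles)
# 4 alleles give 4 * 5 / 2 = 10 genotypes, matching the 10 PL values.
assert len(genotypes) == n_alleles * (n_alleles + 1) // 2 == len(pl)

# Label each PL value with its genotype, using allele indices (0 = ref).
labelled = {f"{j}/{k}": value for (j, k), value in zip(genotypes, pl)}
for gt, value in labelled.items():
    print(gt, value)

# Genotype with the smallest PL, i.e. the most likely one by PL alone.
best = min(labelled, key=labelled.get)
print("lowest PL:", best)
```

Under this ordering the genotypes run 0/0, 0/1, 1/1, 0/2, 1/2, 2/2, 0/3, 1/3, 2/3, 3/3, so the PL of 0 in this row sits on 1/2 rather than on the called 0/1.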


Oh right!!! Duh. It's been a while...

So, with A as the reference and B, C, D as the alt alleles, in VCF genotype order I'd have

A/A: 72,
A/B: 25,
B/B: 45,
A/C: 27,
B/C: 0,
C/C: 47,
A/D: 40,
B/D: 13,
C/D: 15,
D/D: 59

The coverage is low because it's a sub-sampled fastq. I also know there are issues around indels, but for now I'm setting aside whether the call is correct; I just want to know what I'm supposed to be looking at.

*edit: This also confuses me because the genotype call is 0/1, whose PL is 25, not 0.

Thanks.
