Hi,
I used GATK HaplotypeCaller to generate gVCFs for 9 samples (BP_RESOLUTION mode), and then used GenotypeGVCFs to do the joint calling.
It is very important for me to know whether each site is called or not, so I checked the joint-genotyped VCF with all sites kept (no filters applied). When I extracted the records for a single individual, I found many 'no-call' sites:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT M_FR_BOU_ln1
12 20283174 . C . . . . GT:AD:DP:RGQ ./.:8:12:0
12 20283175 . T . . . . GT:AD:DP:RGQ ./.:8:11:0
12 20283176 . G . . . . GT:AD:DP:RGQ ./.:11:12:0
12 20283177 . C . . . . GT:AD:DP:RGQ ./.:11:12:0
12 20283178 . G . 27.48 LowQual . GT:DP:RGQ 0/0:12:36
12 20283179 . C . . . . GT:AD:DP:RGQ ./.:12:13:0
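To make "called or not" concrete, here is a minimal sketch that tallies called versus no-call sites for the one sample shown. It assumes a genotype counts as called only when no allele in GT is the missing value '.', and it splits on whitespace because the records are pasted here with spaces (a real VCF is tab-delimited). The function names are hypothetical helpers, not part of GATK.

```python
# Joint-genotyped records copied from the excerpt (header line omitted).
JOINT_RECORDS = """\
12 20283174 . C . . . . GT:AD:DP:RGQ ./.:8:12:0
12 20283175 . T . . . . GT:AD:DP:RGQ ./.:8:11:0
12 20283176 . G . . . . GT:AD:DP:RGQ ./.:11:12:0
12 20283177 . C . . . . GT:AD:DP:RGQ ./.:11:12:0
12 20283178 . G . 27.48 LowQual . GT:DP:RGQ 0/0:12:36
12 20283179 . C . . . . GT:AD:DP:RGQ ./.:12:13:0
"""


def sample_fields(line):
    """Return (position, {FORMAT key: value}) for a single-sample record."""
    cols = line.split()
    keys = cols[8].split(":")      # FORMAT column, e.g. GT:AD:DP:RGQ
    values = cols[9].split(":")    # the one sample column
    return int(cols[1]), dict(zip(keys, values))


def is_called(gt):
    """Assumption: a genotype is 'called' only if no allele is '.'."""
    alleles = gt.replace("|", "/").split("/")
    return "." not in alleles


def tally_calls(text):
    """Split positions into (called, no_call) lists based on the GT field."""
    called, no_call = [], []
    for line in text.strip().splitlines():
        if line.startswith("#"):
            continue  # skip header lines if present
        pos, fields = sample_fields(line)
        (called if is_called(fields["GT"]) else no_call).append(pos)
    return called, no_call


called, no_call = tally_calls(JOINT_RECORDS)
print("called: ", called)
print("no-call:", no_call)
```

On these six records only position 20283178 is called; the other five are './.'.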
However, when I went back to the single-sample gVCF for this individual, the same sites had been called with a genotype:
12 20283174 . C <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:8,4:12:0:0,0,164
12 20283175 . T <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:8,3:11:0:0,0,204
12 20283176 . G <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:11,1:12:0:0,0,399
12 20283177 . C <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:11,1:12:0:0,0,419
12 20283178 . G A,<NON_REF> 0 . . GT:AD:DP:GQ:PGT:PID:PL:PS:SB 0|0:12,0,0:12:36:0|1:20283171_A_*:0,36,535,36,535,535:20283171:4,8,0,0
12 20283179 . C <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:12,1:13:0:0,0,454
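The two excerpts can be lined up by position to see which per-sample fields differ. The sketch below is only an illustration on the six records shown; the helper names are hypothetical, and it assumes single-sample records separated by whitespace. It prints the gVCF GT, GQ and PL next to the joint GT and RGQ.

```python
# Records copied from the two excerpts (single sample, header omitted).
GVCF_RECORDS = """\
12 20283174 . C <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:8,4:12:0:0,0,164
12 20283175 . T <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:8,3:11:0:0,0,204
12 20283176 . G <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:11,1:12:0:0,0,399
12 20283177 . C <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:11,1:12:0:0,0,419
12 20283178 . G A,<NON_REF> 0 . . GT:AD:DP:GQ:PGT:PID:PL:PS:SB 0|0:12,0,0:12:36:0|1:20283171_A_*:0,36,535,36,535,535:20283171:4,8,0,0
12 20283179 . C <NON_REF> . . . GT:AD:DP:GQ:PL 0/0:12,1:13:0:0,0,454
"""

JOINT_RECORDS = """\
12 20283174 . C . . . . GT:AD:DP:RGQ ./.:8:12:0
12 20283175 . T . . . . GT:AD:DP:RGQ ./.:8:11:0
12 20283176 . G . . . . GT:AD:DP:RGQ ./.:11:12:0
12 20283177 . C . . . . GT:AD:DP:RGQ ./.:11:12:0
12 20283178 . G . 27.48 LowQual . GT:DP:RGQ 0/0:12:36
12 20283179 . C . . . . GT:AD:DP:RGQ ./.:12:13:0
"""


def parse(text):
    """Map position -> {FORMAT key: sample value} for single-sample records."""
    out = {}
    for line in text.strip().splitlines():
        cols = line.split()
        out[int(cols[1])] = dict(zip(cols[8].split(":"), cols[9].split(":")))
    return out


def compare(gvcf_text, joint_text):
    """Return one row per position present in both inputs."""
    gvcf, joint = parse(gvcf_text), parse(joint_text)
    rows = []
    for pos in sorted(gvcf.keys() & joint.keys()):
        g, j = gvcf[pos], joint[pos]
        rows.append((pos, g["GT"], int(g["GQ"]), g["PL"], j["GT"], int(j["RGQ"])))
    return rows


rows = compare(GVCF_RECORDS, JOINT_RECORDS)
print("POS\tgVCF_GT\tgVCF_GQ\tgVCF_PL\tjoint_GT\tjoint_RGQ")
for row in rows:
    print(*row, sep="\t")
```

In this excerpt, every site that is './.' in the joint VCF has GQ 0 in the gVCF (with PL starting 0,0) and RGQ 0 in the joint VCF, while the one site kept as 0/0 has GQ 36 and RGQ 36. That is an observation about these six records, not a statement of the rule GenotypeGVCFs applies.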
I am not very clear on how joint genotyping works. Is there an explanation of how a single individual's genotypes are affected by this process? Are these sites called or not, and what does './.' mean in both VCFs? Any suggestions and comments would be very helpful!
Best, Monica