GT field in VCF only has one number?
6.7 years ago
DVA ▴ 630

I recently got some VCF files from Illumina (IVC output) and noticed that on certain lines the GT field contains only a single number. For example, see the third line below (adjacent lines included for context):

chr1    21723729        .       T       C       123.0   PASS    SNVSB=-17.2;SNVHPOL=4;EFF=INTERGENIC(MODIFIER||||||||||1);dbSNP138_ID=rs2176878;dbSNP142_ID=rs2176878;1000G_phase1_release_v3_AF=0.45   GT:GQ:GQX:DP:DPF:AD     0/1:156:123:19:1:11,8
chr1    21723798        .       TTA     T       281.0   PASS    CIGAR=1M2D;RU=TA;REFREP=1;IDREP=0;EFF=INTERGENIC(MODIFIER||||||||||1);dbSNP138_ID=rs202144135;dbSNP142_ID=rs202144135;1000G_phase1_release_v3_AF=0.48   GT:GQ:GQX:DPI:AD       0/1:321:281:19:12,7
####################### see the line below; the sample column containing GT is at the end of the line ########################
chr1    21723800        .       A       T       221.0   PASS    SNVSB=-22.2;SNVHPOL=15;EFF=INTERGENIC(MODIFIER||||||||||1);dbSNP138_ID=rs2682358;dbSNP142_ID=rs2682358  GT:GQ:GQX:DP:DPF:AD     1:30:30:11:0:0,11
chr1    21724072        .       G       C       163.0   PASS    SNVSB=-18.9;SNVHPOL=2;EFF=INTERGENIC(MODIFIER||||||||||1);dbSNP138_ID=rs1568407;dbSNP142_ID=rs1568407;1000G_phase1_release_v3_AF=0.52   GT:GQ:GQX:DP:DPF:AD     0/1:163:160:19:1:8,11

(I find this in almost all chromosomes. This sample is from a healthy tissue.)

I expected to see only values like 0/1, 1/1, 1/2... and have never seen a single number before. Any idea what it means? (Cases like line 3 make up only a portion of my file.)

Thank you.

6.7 years ago

Some ways for this to happen:

  1. While females have two copies of chromosomes 1-22 and chrX, males have only one copy of chrX and one copy of chrY; in those cases, you'd only expect to see a single number.
  2. A single number is also common for mitochondrial DNA.
  3. Suppose you have a large deletion on one copy of another chromosome. If a SNP overlaps such a deletion, some software will report the genotype as a single number (though recent versions of the VCF specification recommend a '*' allele code for an allele that is missing because of an overlapping upstream deletion).

This looks like case #3, since rs2682358 is on chromosome 1.
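
If you want to check case #3 against your own file, here is a minimal Python sketch (my own illustration, not how IVC works internally). In the lines shown in the question, the heterozygous TTA>T deletion at 21723798 removes positions 21723799-21723800, so the SNV at 21723800 sits on bases that are deleted on one haplotype. The sketch looks for exactly that pattern. The names `Record`, `parse_records`, and `haploid_calls_in_deletions` are hypothetical, and it assumes a single-sample VCF, whitespace-separated columns, and left-anchored deletions where ALT is a prefix of REF. The example lines are the ones from the question with INFO shortened to "." for brevity.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Tuple


class Record(NamedTuple):
    chrom: str
    pos: int   # 1-based position of the first REF base
    ref: str
    alt: str   # may hold several comma-separated alleles
    gt: str


def parse_records(lines: Iterable[str]) -> Iterator[Record]:
    """Yield minimal records from single-sample VCF data lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        format_keys = fields[8].split(":")
        sample_values = fields[9].split(":")
        gt = sample_values[format_keys.index("GT")]
        yield Record(fields[0], int(fields[1]), fields[3], fields[4], gt)


def ploidy(gt: str) -> int:
    """Number of alleles in a GT string such as '0/1', '1|0' or '1'."""
    return len(re.split(r"[/|]", gt))


def haploid_calls_in_deletions(
    records: Iterable[Record],
) -> List[Tuple[Record, Record]]:
    """Pair each haploid call with any earlier deletion that covers its position."""
    deletions = []  # (chrom, first deleted base, last deleted base, record)
    hits = []
    for rec in records:
        if ploidy(rec.gt) == 1:
            for chrom, start, end, del_rec in deletions:
                if chrom == rec.chrom and start <= rec.pos <= end:
                    hits.append((rec, del_rec))
        for alt in rec.alt.split(","):
            # Assumption: deletions are left-anchored, so ALT is a prefix of REF.
            if len(rec.ref) > len(alt) and rec.ref.startswith(alt):
                first_deleted = rec.pos + len(alt)
                last_deleted = rec.pos + len(rec.ref) - 1
                deletions.append((rec.chrom, first_deleted, last_deleted, rec))
    return hits


# Lines from the question, INFO replaced by "." to keep the example short.
EXAMPLE = [
    "chr1 21723729 . T C 123.0 PASS . GT:GQ:GQX:DP:DPF:AD 0/1:156:123:19:1:11,8",
    "chr1 21723798 . TTA T 281.0 PASS . GT:GQ:GQX:DPI:AD 0/1:321:281:19:12,7",
    "chr1 21723800 . A T 221.0 PASS . GT:GQ:GQX:DP:DPF:AD 1:30:30:11:0:0,11",
    "chr1 21724072 . G C 163.0 PASS . GT:GQ:GQX:DP:DPF:AD 0/1:163:160:19:1:8,11",
]

for snv, deletion in haploid_calls_in_deletions(parse_records(EXAMPLE)):
    print(f"{snv.chrom}:{snv.pos} GT={snv.gt} lies inside the deletion at {deletion.pos}")
```

On a real file you would pass the open file handle instead of `EXAMPLE`. Haploid calls that are not matched to any deletion are the ones worth checking against cases #1 and #2.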

Thank you so much for the reply. Great points! I added some adjacent lines from the original file, and interestingly all the "1" GT cases are next to an indel. I wonder why IVC did not merge line 3 with line 2 into a single indel record...

By the way, could you please comment on why the software did not report line 2 and line 3 as one single indel record?
