adding custom INFO tag to vcf
6.9 years ago
bioguy24 ▴ 230

I am trying to add a few custom INFO tags to a VCF 4.1 file. In the VCF below, the last four tab-delimited fields of each record (GOOD 103 hom 16 and GOOD 139 het 8) are not defined by any INFO header line.

My thought (though probably not the best) was to add four INFO header lines for these:

sed -i '10i\##INFO=<ID=,Type=Integer,Description="Variant quality">\'
sed -i '11i\##INFO=<ID=,Type=String,Description="Reads">\'
sed -i '12i\##INFO=<ID=,String,Type=Float,Description="Zygosity">\'
sed -i '13i\##INFO=<ID=,Type=Integer,Description="Score">\'

I am not sure whether the above will work, since none of the four fields has a value for ID=.

VCF:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  xxxx
chr1    948846  .   T   TA  529.927 PASS    AF=0.970874;AO=97;DP=106;FAO=100;FDP=103;  FR=.;FRO=3;FSAF=52;FSAR=48;FSRF=3;FSRR=0;FWDB=-0.0127942;FXX=0.00961446;HRUN=1;LEN=1;MLLD=26.521;OALT=A;OID=.;OMAPALT=TA;OPOS=948847;OREF=-;PB=.;PBP=.;QD=20.5797;RBI=0.0732214;REFB=0.0962764;REVB=0.0720949;RO=7;SAF=49;SAR=48;SRF=6;SRR=1;SSEN=0;SSEP=0;SSSB=-0.0448565;STB=0.514016;STBP=0.111;TYPE=ins;VARB=-0.0047395  GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    1/1:90:106:103:7:3:97:100:0.970874:48:49:6:1:48:52:3:0:1    GOOD    103 hom 16
chr1    948870  .   C   G   279.296 PASS    AF=0.482014;AO=67;DP=139;FAO=67;FDP=139;FR=.,REALIGNEDx0.4964;FRO=72;FSAF=34;FSAR=33;FSRF=34;FSRR=38;FWDB=-0.000997446;FXX=0;HRUN=2;LEN=1;MLLD=60.2134;OALT=G;OID=.;OMAPALT=G;OPOS=948870;OREF=C;PB=.;PBP=.;QD=8.0373;RBI=0.00460624;REFB=-0.0184382;REVB=0.00449694;RO=72;SAF=34;SAR=33;SRF=34;SRR=38;SSEN=0;SSEP=0;SSSB=0.0329868;STB=0.518243;STBP=0.7;TYPE=snp;VARB=0.0213678   GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    0/1:279:139:139:72:72:67:67:0.482014:33:34:34:38:33:34:34:38:1  GOOD    139 het 8

desired vcf:

##INFO=<ID=,Type=Integer,Description="Variant quality">
##INFO=<ID=,Type=String,Description="Reads">
##INFO=<ID=,String,Type=Float,Description="Zygosity">
##INFO=<ID=,Type=Integer,Description="Score">
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  xxxx
chr1    948846  .   T   TA  529.927 PASS    AF=0.970874;AO=97;DP=106;FAO=100;FDP=103;  FR=.;FRO=3;FSAF=52;FSAR=48;FSRF=3;FSRR=0;FWDB=-0.0127942;FXX=0.00961446;HRUN=1;LEN=1;MLLD=26.521;OALT=A;OID=.;OMAPALT=TA;OPOS=948847;OREF=-;PB=.;PBP=.;QD=20.5797;RBI=0.0732214;REFB=0.0962764;REVB=0.0720949;RO=7;SAF=49;SAR=48;SRF=6;SRR=1;SSEN=0;SSEP=0;SSSB=-0.0448565;STB=0.514016;STBP=0.111;TYPE=ins;VARB=-0.0047395  GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    1/1:90:106:103:7:3:97:100:0.970874:48:49:6:1:48:52:3:0:1    GOOD    103 hom 16
chr1    948870  .   C   G   279.296 PASS    AF=0.482014;AO=67;DP=139;FAO=67;FDP=139;FR=.,REALIGNEDx0.4964;FRO=72;FSAF=34;FSAR=33;FSRF=34;FSRR=38;FWDB=-0.000997446;FXX=0;HRUN=2;LEN=1;MLLD=60.2134;OALT=G;OID=.;OMAPALT=G;OPOS=948870;OREF=C;PB=.;PBP=.;QD=8.0373;RBI=0.00460624;REFB=-0.0184382;REVB=0.00449694;RO=72;SAF=34;SAR=33;SRF=34;SRR=38;SSEN=0;SSEP=0;SSSB=0.0329868;STB=0.518243;STBP=0.7;TYPE=snp;VARB=0.0213678   GT:GQ:DP:FDP:RO:FRO:AO:FAO:AF:SAR:SAF:SRF:SRR:FSAR:FSAF:FSRF:FSRR:QT    0/1:279:139:139:72:72:67:67:0.482014:33:34:34:38:33:34:34:38:1  GOOD    139 het 8
6.9 years ago

I don't think you can have an empty ID in a VCF INFO header line, and <ID=,String,Type=Flo... looks really wrong. Anyway.

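The following awk script inserts the four INFO header lines immediately before the #CHROM line. Each tag gets a placeholder ID (ID1 to ID4) and a Number field, which VCF 4.1 requires, and the types are set to match the values in your file (GOOD, 103, hom, 16):

awk '/^#CHROM/ {
    # VCF 4.1 INFO header lines need ID, Number, Type and Description
    print "##INFO=<ID=ID1,Number=1,Type=String,Description=\"Variant quality\">"
    print "##INFO=<ID=ID2,Number=1,Type=Integer,Description=\"Reads\">"
    print "##INFO=<ID=ID3,Number=1,Type=String,Description=\"Zygosity\">"
    print "##INFO=<ID=ID4,Number=1,Type=Integer,Description=\"Score\">"
}
{ print }' input.vcf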
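
Note that header lines only declare tags; the four trailing columns would still sit outside the INFO column. If the goal is for those values to actually carry the tags, one option is to fold them into INFO as key=value pairs. Below is a minimal sketch, assuming tab-separated input, a single sample column, and exactly four extra columns after it. The tag names VQ, READS, ZYG and SCORE and the function name fold_extra_columns are hypothetical placeholders, not anything defined by the VCF spec or by the original file.

```shell
# Sketch only: VQ, READS, ZYG and SCORE are made-up tag names.
# Assumes tab-separated records with four extra columns after the sample column.
fold_extra_columns() {
  awk 'BEGIN { FS = OFS = "\t" }
  /^#CHROM/ {
    # Declare the new tags just before the column header line.
    print "##INFO=<ID=VQ,Number=1,Type=String,Description=\"Variant quality\">"
    print "##INFO=<ID=READS,Number=1,Type=Integer,Description=\"Reads\">"
    print "##INFO=<ID=ZYG,Number=1,Type=String,Description=\"Zygosity\">"
    print "##INFO=<ID=SCORE,Number=1,Type=Integer,Description=\"Score\">"
  }
  /^#/    { print; next }   # pass header lines through unchanged
  NF < 14 { print; next }   # leave records without the extra columns alone
  {
    n = NF
    # An INFO value of "." means empty, so do not keep it as a prefix.
    info = ($8 == "." ? "" : $8 ";")
    $8 = info "VQ=" $(n-3) ";READS=" $(n-2) ";ZYG=" $(n-1) ";SCORE=" $n
    # Rebuild the record without the four trailing columns.
    line = $1
    for (i = 2; i <= n - 4; i++) line = line OFS $i
    print line
  }' "$@"
}
```

It could then be run as fold_extra_columns input.vcf > output.vcf. The record is rebuilt field by field rather than by assigning to NF, since not every awk implementation truncates the record when NF is decreased.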

Thank you very much :).
