I would like to annotate a VCF file with the output of a filter that detects false positives. I know a very similar question has been posted here before, but the answer did not work in my case. The VCF file was produced by VarScan 2. Here are the first few rows of my VCF:
##FORMAT=<ID=RDF,Number=1,Type=Integer,Description="Depth of reference-supporting bases on forward strand (reads1plus)">
##FORMAT=<ID=RDR,Number=1,Type=Integer,Description="Depth of reference-supporting bases on reverse strand (reads1minus)">
##FORMAT=<ID=ADF,Number=1,Type=Integer,Description="Depth of variant-supporting bases on forward strand (reads2plus)">
##FORMAT=<ID=ADR,Number=1,Type=Integer,Description="Depth of variant-supporting bases on reverse strand (reads2minus)">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1
chr1 13273 . G C . PASS ADP=69;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:150:71:69:30:39:56.52%:8.7033E-16:37:37:22:8:37:2
chr1 13417 . C CGAGA . PASS ADP=61;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:97:62:61:35:27:43.55%:1.9451E-10:34:34:0:35:0:27
chr1 14653 . C T . PASS ADP=37;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:29:37:37:28:9:24.32%:1.1256E-3:34:35:27:1:9:0
chr1 15903 . G GC . PASS ADP=35;WT=0;HET=0;HOM=1;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 1/1:118:35:35:8:27:77.14%:1.2926E-12:34:35:6:2:15:12
chr1 16495 . G C . PASS ADP=17;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:38:18:17:7:10:58.82%:1.4831E-4:36:35:7:0:10:0
Here are the first few lines of my annotation file:
CHROM FROM INFO/FP
chr1 13273 PASS
chr1 13417 NoReadCounts
chr1 14653 Strandedness
chr1 15903 NoReadCounts
chr1 16495 PASS
chr1 69511 PASS
chr1 129285 VarDist3
chr1 137622 PASS
chr1 137825 PASS
Common problems, such as the 'chr' prefix differing between the two files, can be ruled out. In the tabix command I set '-s' to 1 to indicate the chromosome-name column, and both '-b' and '-e' to 2, because the documentation suggests doing this if only one position is supplied in the annotation file. Here is the code I used to attempt the annotation:
bgzip annotation_file
tabix -s 1 -b 2 -e 2 -f annotation_file.gz
cat vcf_file.vcf | vcf-annotate -a annotation_file.gz \
-d key=INFO,ID=FP,Number=1,Type=String,Description='fpfilter annotation' \
-c CHROM,FROM,INFO/FP > annot_vcf.vcf
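One thing the pasted excerpts cannot show is whether the annotation file is really tab-delimited, which tabix expects. Below is a minimal sanity check to run before `bgzip`, offered as a sketch rather than a confirmed diagnosis. The function name `check_annotation_file` is hypothetical, and whether an uncommented header line (such as `CHROM FROM INFO/FP`) actually affects tabix or vcf-annotate is an assumption, not something established in this thread.

```shell
# Hypothetical helper: flag lines that do not look like
# CHROM <TAB> integer position <TAB> annotation value.
# Lines starting with "#" are treated as comments and skipped.
check_annotation_file() {
    awk -F'\t' '
        /^#/ { next }
        NF != 3 {
            printf "line %d: %d tab-separated field(s), expected 3\n", NR, NF
            bad = 1
            next
        }
        $2 !~ /^[0-9]+$/ {
            printf "line %d: position \"%s\" is not an integer\n", NR, $2
            bad = 1
        }
        END { exit (bad + 0) }
    ' "$1"
}
```

Running `check_annotation_file annotation_file` prints nothing and returns 0 when every non-comment line has three tab-separated columns with an integer in the second one. Otherwise it lists the offending lines and returns 1.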
And here is the output of the annotation process:
##FORMAT=<ID=RDR,Number=1,Type=Integer,Description="Depth of reference-supporting bases on reverse strand (reads1minus)">
##FORMAT=<ID=ADF,Number=1,Type=Integer,Description="Depth of variant-supporting bases on forward strand (reads2plus)">
##FORMAT=<ID=ADR,Number=1,Type=Integer,Description="Depth of variant-supporting bases on reverse strand (reads2minus)">
##INFO=<ID=FP,Number=1,Type=String,Description="fpfilter annotation">
##source_20170210.1=vcf-annotate(v0.1.14-12-gcdb80b8) -a path/to/annotation_file.gz -d key=INFO,ID=FP,Number=1,Type=String,Description=fpfilter annotation -c CHROM,FROM,INFO/FP
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Sample1
chr1 13273 . G C . PASS ADP=69;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:150:71:69:30:39:56.52%:8.7033E-16:37:37:22:8:37:2
chr1 13417 . C CGAGA . PASS ADP=61;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:97:62:61:35:27:43.55%:1.9451E-10:34:34:0:35:0:27
chr1 14653 . C T . PASS ADP=37;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:29:37:37:28:9:24.32%:1.1256E-3:34:35:27:1:9:0
chr1 15903 . G GC . PASS ADP=35;WT=0;HET=0;HOM=1;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 1/1:118:35:35:8:27:77.14%:1.2926E-12:34:35:6:2:15:12
chr1 16495 . G C . PASS ADP=17;WT=0;HET=1;HOM=0;NC=0 GT:GQ:SDP:DP:RD:AD:FREQ:PVAL:RBQ:ABQ:RDF:RDR:ADF:ADR 0/1:38:18:17:7:10:58.82%:1.4831E-4:36:35:7:0:10:0
A line has been added above where the variants start, describing what 'should' be in the annotation, but no annotation has been added to the INFO column (or any other column). Any help or suggestions would be much appreciated, as this tool does not seem to work at all.
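To make the intended result concrete, here is a minimal awk sketch of the join I am after. It is not vcf-annotate and not a fix for it. The function name `annotate_fp` is hypothetical, and the sketch assumes both files are uncompressed and tab-delimited, that the annotation file has exactly one header line, and that the INFO column of every record is non-empty. It also does not write the `##INFO=<ID=FP,...>` header line, which would still need to be added separately.

```shell
# Hypothetical helper: append FP=<value> to the INFO column (column 8) of each
# VCF record whose CHROM and POS match a line of the annotation file.
# usage: annotate_fp annotation_file vcf_file > annotated.vcf
annotate_fp() {
    awk -F'\t' -v OFS='\t' '
        NR == FNR {                          # first file: annotation table
            if (FNR > 1) fp[$1 FS $2] = $3   # skip its one-line header
            next
        }
        /^#/ { print; next }                 # pass VCF header lines through
        ($1 FS $2) in fp { $8 = $8 ";FP=" fp[$1 FS $2] }
        { print }                            # unmatched records stay unchanged
    ' "$1" "$2"
}
```

Records with no matching position in the annotation file are printed unchanged, so the output has the same number of lines as the input VCF.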
Could you provide the first few lines of desired output? I think I have a workaround.
Dear Vincent! If you have a workaround, please share it. I have the same problem.
Hello! Excuse me, did you finally fix it? I have the same problem...