bcftools filtering on INFO fields
3.8 years ago

Below is a piece of my VCF, from which I am trying to extract the rows with SUPPORT>10, but I cannot work out how to specify the filter.

I tried (typo edited):

bcftools filter -sFilterName -e 'INFO/SUPPORT>10' variants.vcf

The result is no filtering at all: every input record passes through. I cannot find a page with working examples.

bcftools 1.9

##fileformat=VCFv4.2
##fileDate=2020-06-26|04:40:51PM|CEST|+0200
##source=SVIM-v1.4.0
##contig=<ID=CA_Cp,length=155185>
##ALT=<ID=DEL,Description="Deletion">
##ALT=<ID=INV,Description="Inversion">
##ALT=<ID=DUP,Description="Duplication">
##ALT=<ID=DUP:TANDEM,Description="Tandem Duplication">
##ALT=<ID=DUP:INT,Description="Interspersed Duplication">
##ALT=<ID=INS,Description="Insertion">
##ALT=<ID=BND,Description="Breakend">
##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">
##INFO=<ID=CUTPASTE,Number=0,Type=Flag,Description="Genomic origin of interspersed duplication seems to be deleted">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the variant described in this record">
##INFO=<ID=SVLEN,Number=1,Type=Integer,Description="Difference in length between REF and ALT alleles">
##INFO=<ID=SUPPORT,Number=1,Type=Integer,Description="Number of reads supporting this variant">
##INFO=<ID=STD_SPAN,Number=1,Type=Float,Description="Standard deviation in span of merged SV signatures">
##INFO=<ID=STD_POS,Number=1,Type=Float,Description="Standard deviation in position of merged SV signatures">
##INFO=<ID=STD_POS1,Number=1,Type=Float,Description="Standard deviation of breakend 1 position">
##INFO=<ID=STD_POS2,Number=1,Type=Float,Description="Standard deviation of breakend 2 position">
##FILTER=<ID=hom_ref,Description="Genotype is homozygous reference">
##FILTER=<ID=not_fully_covered,Description="Tandem duplication is not fully covered by a single read">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read depth">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Read depth for each allele">
##FORMAT=<ID=CN,Number=1,Type=Integer,Description="Copy number of tandem duplication (e.g. 2 for one additional copy)">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  Sample
CA_Cp   0       svim.BND.1      N       ]CA_Cp:150583]N 1       PASS    SVTYPE=BND;SUPPORT=1;STD_POS1=.;STD_POS2=.      GT:DP:AD        ./.:.:.,.
CA_Cp   2       svim.BND.2      N       ]CA_Cp:152088]N 3       PASS    SVTYPE=BND;SUPPORT=3;STD_POS1=1;STD_POS2=500    GT:DP:AD        ./.:.:.,.
CA_Cp   3       svim.INV.1      N       <INV>   0       PASS    SVTYPE=INV;END=85158;SUPPORT=95;STD_SPAN=3.49;STD_POS=1.44      GT:DP:AD        ./.:.:.,.
CA_Cp   3       svim.BND.3      N       ]CA_Cp:153260]N 4       PASS    SVTYPE=BND;SUPPORT=4;STD_POS1=2;STD_POS2=418    GT:DP:AD        ./.:.:.,.
CA_Cp   7       svim.BND.4      N       ]CA_Cp:154652]N 29      PASS    SVTYPE=BND;SUPPORT=27;STD_POS1=26;STD_POS2=309  GT:DP:AD        ./.:.:.,.
CA_Cp   7       svim.BND.5      N       ]CA_Cp:155004]N 89      PASS    SVTYPE=BND;SUPPORT=87;STD_POS1=17;STD_POS2=268  GT:DP:AD        ./.:.:.,.
CA_Cp   7       svim.DUP_TANDEM.1       N       <DUP:TANDEM>    1       not_fully_covered       SVTYPE=DUP:TANDEM;END=87179;SVLEN=87172;SUPPORT=1;STD_SPAN=.;STD_POS=.  GT:CN:DP:AD     ./.:2:.:.,.
CA_Cp   8       svim.BND.6      N       ]CA_Cp:154580]N 27      PASS    SVTYPE=BND;SUPPORT=25;STD_POS1=26;STD_POS2=309  GT:DP:AD        ./.:.:.,.
CA_Cp   8       svim.BND.7      N       ]CA_Cp:155166]N 94      PASS    SVTYPE=BND;SUPPORT=985;STD_POS1=17;STD_POS2=88  GT:DP:AD        ./.:.:.,.
CA_Cp   11      svim.BND.8      N       ]CA_Cp:154272]N 15      PASS    SVTYPE=BND;SUPPORT=14;STD_POS1=35;STD_POS2=259  GT:DP:AD        ./.:.:.,.
CA_Cp   13      svim.BND.9      N       ]CA_Cp:154140]N 13      PASS    SVTYPE=BND;SUPPORT=12;STD_POS1=38;STD_POS2=326  GT:DP:AD        ./.:.:.,.
CA_Cp   17      svim.BND.10     N       ]CA_Cp:153931]N 9       PASS    SVTYPE=BND;SUPPORT=9;STD_POS1=44;STD_POS2=367   GT:DP:AD        ./.:.:.,.
CA_Cp   122     svim.DEL.1      N       <DEL>   1       PASS    SVTYPE=DEL;END=165;SVLEN=-43;SUPPORT=1;STD_SPAN=.;STD_POS=.     GT:DP:AD        ./.:.:.,.
CA_Cp   368     svim.DEL.2      N       <DEL>   1       PASS    SVTYPE=DEL;END=424;SVLEN=-56;SUPPORT=1;STD_SPAN=.;STD_POS=.     GT:DP:AD        ./.:.:.,.
CA_Cp   699     svim.BND.11     N       ]CA_Cp:153209]N 1       PASS    SVTYPE=BND;SUPPORT=1;STD_POS1=.;STD_POS2=.      GT:DP:AD        ./.:.:.,.
CA_Cp   910     svim.DEL.3      N       <DEL>   1       PASS    SVTYPE=DEL;END=970;SVLEN=-60;SUPPORT=1;STD_SPAN=.;STD_POS=.     GT:DP:AD        ./.:.:.,.
CA_Cp   1346    svim.DEL.4      N       <DEL>   1       PASS    SVTYPE=DEL;END=1397;SVLEN=-51;SUPPORT=1;STD_SPAN=.;STD_POS=.    GT:DP:AD        ./.:.:.,.
CA_Cp   1547    svim.INS.1      N       <INS>   1       PASS    SVTYPE=INS;END=1547;SVLEN=56;SUPPORT=1;STD_SPAN=.;STD_POS=.     GT:DP:AD        ./.:.:.,.

Isn't it INFO/ and not INFO\? Maybe that's why the filter doesn't work well.


You are right, only '/' is valid, but it still does not work for me. Should the VCF be bgzipped and indexed, or does this normally work on plain text?


It was one of those bad days: I used -e (exclude) instead of -i (include). Sorry about this! It now works (of course).
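
For later readers, the -i / -e difference can be sketched on a toy file. This is a minimal example, assuming bcftools is on the PATH; the file name toy.vcf, the record IDs demo.1 to demo.3, and the filter name LowSupport are made up for illustration and do not come from the question. One point worth noting: with -s (soft filter), bcftools filter still writes every record and only changes the FILTER column, so extracting rows means either dropping -s or selecting rows afterwards.

```shell
# Build a three-record toy VCF (SUPPORT = 1, 9, 27).
# Plain, uncompressed text is enough for expression filtering.
vcf=toy.vcf   # hypothetical file name
{
  printf '%s\n' '##fileformat=VCFv4.2'
  printf '%s\n' '##contig=<ID=CA_Cp,length=155185>'
  printf '%s\n' '##INFO=<ID=SUPPORT,Number=1,Type=Integer,Description="Number of reads supporting this variant">'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'CA_Cp\t122\tdemo.1\tA\tG\t1\tPASS\tSUPPORT=1\n'
  printf 'CA_Cp\t368\tdemo.2\tA\tG\t9\tPASS\tSUPPORT=9\n'
  printf 'CA_Cp\t910\tdemo.3\tA\tG\t29\tPASS\tSUPPORT=27\n'
} > "$vcf"

# -i keeps records for which the expression is true: only demo.3 is printed.
bcftools view -H -i 'INFO/SUPPORT>10' "$vcf"

# -e drops those same records: demo.1 and demo.2 are printed.
bcftools view -H -e 'INFO/SUPPORT>10' "$vcf"

# Soft filter: all three records are still printed; the ones that fail the
# -i expression carry LowSupport in the FILTER column instead of PASS.
bcftools filter -s LowSupport -i 'INFO/SUPPORT>10' "$vcf" | grep -v '^#'
```

If the soft-filtered output is wanted as a reduced file after all, piping it through bcftools view -f PASS should keep only the rows that passed.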


Glad you found the problem. In the future, please use Add Comment when you're adding a comment or Add Reply when you're replying to a comment. Only use Add Answer when you're answering the top-level question.
