Biostar Beta. Not for public use.
BCF Tools Filter on 1000Genomes Annotation
0
Entering edit mode
15 months ago

I am trying to filter a VCF based on the allele frequency as given by ANNOVAR annotation, which has annotations in the info column. I am doing this because I only want the rare variants for analysis. I have been trying bcf query but it is not acting well behaved....

$ bcftools view -i 'INFO/controls_AF_popmax < 0.05' File_Name.vcf.gz  | bgzip > output.vcf.gz

or

$ bcftools query File_Name.vcf.gz -f '%ID\t %INFO/Gene.ensGene\t %INFO/controls_AF_popmax\n' -i 'INFO/controls_AF_popmax < 0.05' >  output.vcf

I get output but not wat expected in terms of the filter.

I want a vcf in the end so I can make ped, map, bed, bim, fam from it using PLINK (I know how to do that part).

UPDATE: I thoguht maybe it was because of the format of the INFO/controls_AF_popmax field in the header. I tried to change the designation to the same designation AF is but that did not work either.

UPDATE: It works for 'INFO/AF > 0.05' but not for 'INFO/controls_AF_popmax > 0.05' do not know why

ADD COMMENTlink
0
Entering edit mode

Hello,

could you please show us the complete header and some variants of your input file?

fin swimmer

ADD REPLYlink
0
Entering edit mode

UPDATED VCF header where i deleted all the contig information and filtering information so it would fit in a single screenshot:

PREVIOUS VCF HEADER https://ibb.co/rZRtR7F

UPDATED VCF HEADER https://ibb.co/n8cWxm9

Sorry i am not sure the best way to share...

and here is a record, not including sample information:

chr1    861219  chr1:861219:G:C G   C   490.79  PASS    AC=0;AF=0.0007669;AN=922;BaseQRankSum=2.72;DP=18855;ExcessHet=3.0103;FS=3.736;InbreedingCoeff=-0.0011;MLEAC=1;MLEAF=0.0007669;MQ=60;MQRankSum=0;QD=18.88;ReadPosRankSum=0.416;SOR=2.247;VQSLOD=13.64;culprit=MQRankSum;ANNOVAR_DATE=2018-04-16;Func.refGene=intronic;Gene.refGene=SAMD11;GeneDetail.refGene=.;ExonicFunc.refGene=.;AAChange.refGene=.;Func.ensGene=intronic;Gene.ensGene=ENSG00000187634;GeneDetail.ensGene=.;ExonicFunc.ensGene=.;AAChange.ensGene=.;AF=0.0001;AF_popmax=0.0004;AF_male=0.0001;AF_female=0.0001;AF_raw=0.0001;AF_afr=0;AF_sas=0.0004;AF_amr=0;AF_eas=0;AF_nfe=0.0001;AF_fin=0;AF_asj=0.0002;AF_oth=0;non_topmed_AF_popmax=0.0004;non_neuro_AF_popmax=0.0004;non_cancer_AF_popmax=0.0004;controls_AF_popmax=0.0003;MCAP13=.;CADD13_RawScore=.;CADD13_PHRED=.;gerp++elem=.;CLNALLELEID=.;CLNDN=.;CLNDISDB=.;CLNREVSTAT=.;CLNSIG=.;ALLELE_END GT:AD:DP:GQ:PL
ADD REPLYlink
0
Entering edit mode

i will try to clean up my header completely and double check the positioning, maybe the annovar annotation didnt properly align the header and that is why i am getting the issue

ADD REPLYlink
0
Entering edit mode

Looks like there were numerous extra info fields specified in the header that were not actually present. I just purged them, going to double check the header and try this all again.

UPDATE:

AFTER UPDATING INFO IT THE IT APPEARS TO BE WORKING. Lets see

what I changed:

  1. removed superfluous headers
  2. Changed format of the Annotation of interested to Float from String:

## INFO=<ID=controls_AF_popmax,Number=.,Type=Float,Description="controls_AF_popmax annotation provided by ANNOVAR">
ADD REPLYlink

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.1