grep command on VCF
1
1
Entering edit mode
5.0 years ago
jon.klonowski ▴ 150

Summery: my grep command is not working properly when applied to a vcf but worked fine on a dummy test file. The grep command is putting too many records..

I am trying to pull all records with 1000 Genomes AF < 0.5 from a vcf. the vcf is annotated and for each SNP the AF from 1KGenomes is under an info column "controls_AF_popmax"

This is the surrounding area from an entry:

;non_cancer_AF_popmax=0.0004;controls_AF_popmax=0.0003;MCAP13=.;

This is my grep:

zless my_file.vcf.gz  | grep -v '^#' | grep ';controls_AF_popmax=0\.0[0-4]\|;controls_AF_popmax=\.;' > output.txt

It is pulling records where the AF value is between 1 and 2.338e-05 and the "."

I tried a test .txt and the function worked well:

fadsfad;controls_AF_popmax=0.0003;adsfadsf

dsafdsaf;controls_AF_popmax=.;fadsf

fdasfasd;controls_AF_popmax=0.1;fasdf

where the result is:

fadsfad;controls_AF_popmax=0.0003;adsfadsf

dsafdsaf;controls_AF_popmax=.;fadsf

vcf unix grep bcf • 8.2k views
ADD COMMENT
0
Entering edit mode

I don't see the problem here, where does it fail?

ADD REPLY
0
Entering edit mode

when i reintroduce the header and query the file, the controls_AF_popmax are all possible numbers between 1-NA

ADD REPLY
0
Entering edit mode

I strongly recommend using existing programs for filtering vcf files, like bcftools or SnpSift.

bcftools view -i "controls_AF_popmax<0.5" input.vcf
ADD REPLY
0
Entering edit mode

This command is not functioning properly for me.

We are talking about it on: BCF Tools Filter on 1000Genomes Annotation

ADD REPLY
1
Entering edit mode
5.0 years ago
jon.klonowski ▴ 150

The function works. The issue was something with the VCF file: unknown reason.

bcftools view -i "controls_AF_popmax<0.5" input.vcf

is the best answer if the VCF is formated corrected.

Future people:

Make sure your info field is formatted correctly. here I wanted the format liek this:

INFO=<id=controls_af_popmax,number=.,type=float,description="controls_af_popmax annotation="" provided="" by="" annovar"="">

Previously the type was string

ADD COMMENT

Login before adding your answer.

Traffic: 1514 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6