Extracting Info From A Column Of Vcf File?
11.7 years ago
bioinfo ▴ 830

I used the command below to get columns 2, 6 and 8 of my VCF file:

cut -f 2,6,8 my.vcf  > out.vcf

output:

12 AA DP=23;AC=4;DC=5
13 BB DP=24;AC=6;DC=9
14 CC DP=34;AC=6;DC=65

For the INFO column (column 8), I want to get just DP from the list of entries in that column (the INFO column contains e.g. AN=2;DP=87;DC=56;Dels=0.00..)

What else should I add to that command? It would be great to get all the output in CSV format.

vcftools perl vcf • 15k views
11.7 years ago
Random ▴ 160

Normally I would also use awk, but in this case I think the use of VCFtools is warranted.

You can specify which fields and/or sub-fields you want to retrieve, as well as the separator in the output file.

In your case I would just use the VCFtools perl modules, and do:

vcf-query -f '%POS,%QUAL,%INFO/DP\n' my.vcf > out.vcf


If you are using bcftools, the equivalent command is:

bcftools query -f '%POS,%QUAL,%INFO/DP\n' my.vcf > out.vcf

11.7 years ago

Using my tool extractinfo: https://code.google.com/p/variationtoolkit/wiki/ExtractInfo

extractinfo -t DP < my.vcf | cut -d ' ' -f 1,2,12

This tool is deprecated. Use bioalcidae instead: http://lindenb.github.io/jvarkit/BioAlcidaeJdk.html

11.7 years ago

If your out.vcf is:

12 AA DP=23;AC=4;DC=5
13 BB DP=24;AC=6;DC=9
14 CC DP=34;AC=6;DC=65

simply do:

cut -f 2,6,8 my.vcf | awk -F \; '{print $1}' > out.vcf

The output looks like this:

DP=23
DP=24
DP=34

I want just the numbers, i.e.:

23
24
34

I also want the other two columns in the same output, so I was expecting something like this:

12 AA 23
13 BB 24
14 CC 34


Then modify the command to:

cut -f 2,6,8 my.vcf | awk  '{OFS="\t";  split($3,x,";"); split(x[1],y,"="); print $1,$2, y[2]}'

generates:

12  AA  23
13  BB  24
14  CC  34
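
Note that the awk command above takes the first INFO entry, so it only works when DP happens to come first. A slightly more defensive variant, as a sketch only: it searches the whole INFO column for the DP tag, skips header lines, and writes comma-separated output as asked in the question. The sample file, the `NA` fallback and the variable names are made up for illustration; the QUAL values AA/BB/CC are the placeholders from the question.

```shell
#!/bin/sh
# Hypothetical sample data: a tiny tab-delimited VCF in which DP is the
# first, a middle, and the last INFO entry respectively.
tmpdir=$(mktemp -d)
vcf="$tmpdir/sample.vcf"
printf '##fileformat=VCFv4.1\n' > "$vcf"
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n' >> "$vcf"
printf 'chr1\t12\t.\tA\tG\tAA\tPASS\tDP=23;AC=4;DC=5\n' >> "$vcf"
printf 'chr1\t13\t.\tC\tT\tBB\tPASS\tAN=2;DP=24;AC=6\n' >> "$vcf"
printf 'chr1\t14\t.\tG\tA\tCC\tPASS\tAC=6;DC=65;DP=34\n' >> "$vcf"

# Print POS, QUAL and the DP value as CSV.
result=$(awk -F '\t' -v OFS=',' '
    /^#/ { next }                       # skip meta and header lines
    {
        dp = "NA"                       # assumed fallback when DP is absent
        n = split($8, tags, ";")        # INFO entries are ";"-separated
        for (i = 1; i <= n; i++) {
            if (tags[i] ~ /^DP=/) {     # exact tag name, not e.g. MDP=
                dp = substr(tags[i], 4) # text after "DP="
                break
            }
        }
        print $2, $6, dp
    }
' "$vcf")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

On a real file you would drop the sample-data lines and run the same awk program directly on my.vcf, redirecting to something like out.csv.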

It worked, thanks.

By the way, when we post commands, how do we put them in a box like you did?


I think it makes it easier to read - just add four spaces at the beginning of a line to format it as code
