Extracting multiple regions of interest with vcftools
1
0
Entering edit mode
6.0 years ago
someler • 0

Hi all,

New to this, and I'm trying to work through our pipeline. I am having difficulty extracting my regions of interest from my merged VCF file. The code is correct, however, I get an error that I can only extract one chr at a time. Wondering if there is something I can do to this to make it extract multiple at once, or if I can use something else.

Note: do not have an available bed file, I have a .txt file with these regions in it. Also, I ran this script with just one chr, and it worked. (Which is how I know it's correct).

Thank you!

Code (imagine this with multiple chr regions to extract):

for FILE in XXX
do
vcftools --gzvcf $FILE --chr chr8 --from-bp 89933325 --to-bp 90003238 --out output --recode --recode-INFO-all
done
vcftools • 5.4k views
ADD COMMENT
2
Entering edit mode

Switch to bcftools. It's faster and has a LOT of extended options.

ADD REPLY
0
Entering edit mode

Can I extract more than one region at once?

ADD REPLY
2
Entering edit mode
6.0 years ago

You can use bedtools intesect. Keep regions of interest in a separate bed file. Intersect vcf file with bed file with regions of interest. Your text file should have 3 columns (chromosome name, start and end positions). Example code and example bed for regions of interest:

$ bedtools intersect -header -wa -a test.vcf -b test.txt

$ cat test.txt
chr12   133433173   133433373
chr12   133781605   133781705

-header will print header from VCF

-wa will print matching lines from VCF (file a)

From your code, it seems you have multiple VCF files and you are trying to extract single region of interest from each vcf file in a loop. Let us say you want to extract multiple regions of interest from each vcf file. You can use parallel:

parallel bedtools intersect -header -wa -a test.vcf -b test.txt >{.}.out ::: *.vcf
ADD COMMENT
0
Entering edit mode

Thank you, I will try this!

ADD REPLY

Login before adding your answer.

Traffic: 2810 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6