Filtering SNPs by minimum LD value in bcftools?
5.1 years ago
RNAseqer ▴ 260

Hello everyone,

I have just started using bcftools to 'prune' some VCF files. I have found some helpful examples of how to discard SNPs in high LD:

 bcftools +prune -l 0.6 -w 1000 frag.vcf -Ov -o output1.vcf

What I was actually hoping for is the opposite: output in which the SNPs kept are those with r2 values higher than 0.6, and the other SNPs are discarded. Is there a straightforward way to do this?

bcftools vcftools vcf LD r2 • 5.3k views

If the functionality is not built directly into bcftools +prune, then I would compare the lists of SNPs in the filtered and unfiltered files and infer which ones were removed. bcftools query can print selected VCF fields (for example, chromosome and position) as plain tab-delimited text, and you could then use awk arrays to compare the two lists.
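As a rough sketch of that comparison, the snippet below uses an awk array to find sites present in the unfiltered list but missing from the pruned one. The file names all.sites.txt, kept.sites.txt, and removed.sites.txt are made up for illustration, and the two input lists are tiny stand-ins so the awk step can be run without bcftools; the commented bcftools query lines show how the real lists might be produced.

```shell
# In real use the two lists would come from bcftools query, e.g.:
#   bcftools query -f '%CHROM\t%POS\n' frag.vcf    > all.sites.txt
#   bcftools query -f '%CHROM\t%POS\n' output1.vcf > kept.sites.txt
# Stand-in lists (hypothetical data) so the comparison can be tried on its own.
printf 'chr1\t100\nchr1\t250\nchr1\t400\nchr2\t50\n' > all.sites.txt
printf 'chr1\t100\nchr2\t50\n' > kept.sites.txt

# While reading the first file, store each kept site as an array key;
# while reading the second file, print only sites that are not in the array.
awk -F'\t' '
    NR == FNR { kept[$1 FS $2]; next }
    !(($1 FS $2) in kept)
' kept.sites.txt all.sites.txt > removed.sites.txt

cat removed.sites.txt   # prints the chr1 250 and chr1 400 lines
```

The resulting list could then probably be handed back to bcftools, for example with bcftools view -T removed.sites.txt frag.vcf, to get those records as a VCF. Note that this gives the sites that pruning dropped, which is not necessarily the same set as "every SNP with r2 above the threshold".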


I was thinking along the same lines. I think that would work. However, I did find vcftools has a command line option for minimum r2:

vcftools --vcf frag.vcf --hap-r2 --min-r2 .7 --ld-window-bp 50000 --out minr2_ld_window_50000

This outputs a table of r2 values for SNP pairs rather than the VCF data lines themselves... but I'm thinking it may be most efficient to just pull out these SNPs using a custom Perl script that takes the vcftools output as its input and pulls the matching lines from the original VCF file. Also, I am just starting to look at the Tagger program in the Broad Institute's Haploview software package, since I am really interested in getting tagging SNPs alone...
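For what it's worth, the extraction step might not need Perl at all; an awk one-off along the following lines could do it. This is only a sketch: it assumes the vcftools LD output is tab-delimited with a header line and columns CHR, POS1, POS2, N_CHR, R^2, ... (worth checking against the actual .hap.ld file), and demo.hap.ld, demo.vcf, and demo.highld.vcf are invented stand-in files rather than real data.

```shell
# Stand-in for the vcftools LD table (assumed layout: CHR POS1 POS2 N_CHR R^2 D Dprime).
printf 'CHR\tPOS1\tPOS2\tN_CHR\tR^2\tD\tDprime\n'  > demo.hap.ld
printf 'chr1\t100\t250\t40\t0.82\t0.10\t0.95\n'   >> demo.hap.ld

# Stand-in for the original VCF (hypothetical records).
printf '##fileformat=VCFv4.2\n'                                > demo.vcf
printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'      >> demo.vcf
printf 'chr1\t100\t.\tA\tG\t.\tPASS\t.\n'                     >> demo.vcf
printf 'chr1\t250\t.\tC\tT\t.\tPASS\t.\n'                     >> demo.vcf
printf 'chr1\t400\t.\tG\tA\t.\tPASS\t.\n'                     >> demo.vcf

# Pass 1 (LD table): skip the header, remember both sites of every reported pair.
# Pass 2 (VCF): keep header lines, plus records whose CHROM/POS was remembered.
awk -F'\t' '
    NR == FNR && FNR == 1 { next }
    NR == FNR             { keep[$1 FS $2]; keep[$1 FS $3]; next }
    /^#/ || (($1 FS $2) in keep)
' demo.hap.ld demo.vcf > demo.highld.vcf

cat demo.highld.vcf
```

Because --min-r2 has already limited the table to pairs at or above the threshold, every site that appears in it is in high LD with at least one neighbour inside the chosen window, so no further r2 filtering is done here.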
