plink --indep LD calculation and filtering to VCF file in one command
3.2 years ago
rjzotti • 0

I want to use plink's linkage disequilibrium feature to filter my VCF file. I'm new to genomics, but after reading plink's documentation, I assumed I could do this in one command:

plink \
--bcf /input/${CHROMOSOME_ID}.vcf.gz \
--recode vcf \
--out /output/ch${CHROMOSOME_ID} \

I then use the output file, e.g., ch6.vcf, for downstream analysis. I never touched the .prune.in and .prune.out files that plink writes, because according to the plink data docs:

--recode creates a new text fileset, after applying sample/variant filters and other operations.

So I assumed plink's --recode would treat my $VIF_THRESHOLD as a variant filter. However, older Biostars posts say that the filtering has to be done in a separate command, using the .in or .out file. Is my original command incorrect?

plink • 1.1k views
3.2 years ago
Sam ★ 4.7k

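Based on my understanding, --indep doesn't perform any filtering itself. It only writes two lists of variant IDs: <out prefix>.prune.in (the variants retained after pruning) and <out prefix>.prune.out (the variants pruned away). To actually filter, run plink a second time and pass one of those lists to --extract (with .prune.in) or --exclude (with .prune.out). The plink documentation on --indep covers this.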
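To make the two-step workflow concrete, here is a minimal sketch. The window size (50 variants), step (5) and VIF threshold (2) are only illustrative values, the default chromosome ID of 6 is a placeholder, and the ".pruned" suffix on the final output is my own naming choice. It assumes plink 1.9 and a VCF whose variants have real IDs; if the ID column is all ".", the lists written by --indep will not identify anything useful.

```shell
# Illustrative values only; replace with your own.
CHROMOSOME_ID="${CHROMOSOME_ID:-6}"
VIF_THRESHOLD="${VIF_THRESHOLD:-2}"
WINDOW=50   # window size in variants (assumed)
STEP=5      # step size in variants (assumed)

INPUT="/input/${CHROMOSOME_ID}.vcf.gz"
PREFIX="/output/ch${CHROMOSOME_ID}"
KEEP_LIST="${PREFIX}.prune.in"   # written by step 1, read by step 2

# Step 1: compute the pruning lists (no filtering happens here).
prune() {
  plink --vcf "$INPUT" \
        --indep "$WINDOW" "$STEP" "$VIF_THRESHOLD" \
        --out "$PREFIX"
}

# Step 2: keep only the variants listed in .prune.in and write a VCF.
filter() {
  plink --vcf "$INPUT" \
        --extract "$KEEP_LIST" \
        --recode vcf \
        --out "${PREFIX}.pruned"
}

if command -v plink >/dev/null 2>&1 && [ -f "$INPUT" ]; then
  prune && filter
else
  echo "plink or $INPUT not found; nothing was run."
fi
```

The file to use downstream would then be ${PREFIX}.pruned.vcf rather than the output of a single --recode run. Using --exclude with the .prune.out list in step 2 should give the same result.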

