Pruning with plink finds a majority of SNPs in very high LD
13 days ago

Hello everyone,

I'm using plink v1.9 to test several r² cutoffs from 0.1 to 0.99. However, from 0.8 to 0.99 the same number of SNPs is extracted, and ~450,000 SNPs (my total set is about 520,000 SNPs) are pruned. So it seems they all have an r² > 0.99.

I'm really new to this subject. Does this seem plausible to you?

My SNPs were called with GATK and imputed with Beagle5 without reference haplotypes. To do the pruning, I'm running the following script:

#!/bin/bash

# Script to do pruning with different r2 thresholds
# Argument 1: file to prune (.vcf format)
# Argument 2: where to store results

# Create the directory where the results will be stored
mkdir -p "$2/Pruned"

# Create chrom map
bcftools view -H "$1" | cut -f 1 | uniq | awk '{print $0"\t"$0}' > chrom-map.txt

# Convert vcf to plink format
vcftools --gzvcf "$1" --plink --out "$2/Pruned/plink.data" --chrom-map chrom-map.txt

# Pruning
for r2 in 0.99 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1
do
    plink --file "$2/Pruned/plink.data" --indep-pairwise 200 10 "$r2" --out "$2/Pruned/pruned.$r2" --allow-extra-chr
    plink --file "$2/Pruned/plink.data" --extract "$2/Pruned/pruned.$r2.prune.in" --recode vcf --out "$2/Pruned/pruned.$r2" --allow-extra-chr
    bgzip "$2/Pruned/pruned.$r2.vcf"
done
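
To compare the thresholds side by side, the number of retained and pruned SNPs can be counted from the `.prune.in` and `.prune.out` files that plink writes for each run. This is only a minimal sketch, assuming the `pruned.<r2>.prune.in` / `pruned.<r2>.prune.out` naming used by the script above; the function name `summarize_pruning` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: tabulate kept/pruned SNP counts per r2 threshold.
# Assumes files named pruned.<r2>.prune.in and pruned.<r2>.prune.out
# in the directory given as the first argument.
summarize_pruning() {
    local dir=$1 in_file out_file r2 kept pruned
    printf 'r2\tkept\tpruned\n'
    for in_file in "$dir"/pruned.*.prune.in; do
        [ -e "$in_file" ] || continue            # skip if nothing matched the glob
        out_file=${in_file%.prune.in}.prune.out  # matching list of removed SNPs
        r2=$(basename "$in_file" .prune.in)
        r2=${r2#pruned.}                         # keep only the threshold, e.g. 0.8
        # One variant ID per line, so the line count is the SNP count.
        kept=$(( $(wc -l < "$in_file") ))
        pruned=$(( $(wc -l < "$out_file") ))
        printf '%s\t%d\t%d\n' "$r2" "$kept" "$pruned"
    done
}
```

Calling `summarize_pruning "$2/Pruned"` at the end of the script would print one tab-separated row per threshold, which makes it easy to see where the counts stop changing.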

Thanks in advance for your answers, Xillanne

LD SNP plink pruning • 185 views