Biostar Beta. Not for public use.
Question: Plotting Linkage Disequilibrium Decay Across The Genome
3
Entering edit mode

Hi everyone:

I am trying to plot the linkage disequilibrium decay across the genome using genome-wide snps and get a plot that looks something like this:

http://postimg.org/image/5bkdju6jx/

In the methods sections the authors specified that they used the --r2 command in plink. I read the details of this command in the plink website, but I am still not sure how should I use the output file to plot a similar thing.

Any help would be really appreciated.

Javier

Ps: Heres the article's title: Population Genetic Structure of the People of Qatar (2010)

Entering edit mode
0

Can you post the first 4 lines of the plink.ld file containing the pai wise correlations?

Irsan
♦ 6.9k
Entering edit mode
0

Hi Irsan:

Heres the output file I get from plink using default values

http://postimg.org/image/87hq8nodp/

I got an answer from another post, and what I need to do is calculate the distance between SNP A and SNP B and calculate the r2 averages for every distance binned for example every 1kb. Do you know how to do a script that could do that?

Thank you!

Javier2013
• 110
Entering edit mode
0

Hi

i have two file:ped and map

which command of plink is requir to build input file for R?(for linkage disequilibrium decay across the genome)

mpt.gene.93
• 0
Entering edit mode
0

WouterDeCoster
39k
1
Entering edit mode

In the plink.ld file you have colums CHR_A, BP_A, SNP_A, CHR_B, BP_B, SNP_B and R2.

in R do

``````# install and load ggplot2
install.packages("ggplot2")
library(ggplot2)

# import the data

# plot the average correlation for each snp distance
ggplot(ld) +
geom_line(aes(x=BP_B - BP_A, y = R2))
``````
Entering edit mode
0

If you don't like scripting you can also do this with Excel pivot tables

Irsan
♦ 6.9k
Entering edit mode
0

Hi Irsan,

I just tried plotting my .ld file but I couldn't get any clear result. I just get a plot where everything is black http://s24.postimg.org/9mhyytpqt/Screen_Shot_2013_10_27_at_3_43_34_PM.png

So I did some scripting on my own and found out that the r2 average is too high even at 70kb apart. http://s12.postimg.org/812gxfza5/Screen_Shot_2013_10_27_at_3_49_53_PM.png

I usually do not use the ggplot package so I am not very sure why am I getting that plot...

But I am also thinking that theres something wrong with the my output file. Maybe something with the commands? Have you done a LD decay plot before??

Here is my log file:

Options in effect: --noweb --bfile testLWK --r2 --ld-window-kb 70 --ld-window-r2 0 --ld-window 10 --out LWK_70kb_10snps

Thanks again.

Javier2013
• 110
Entering edit mode
0

No i have never used plink before, for me this is just a visualization question. And it should be really straightforward. Check if your data frame is alright by dim(ld), str(ld), summary(ld)

Irsan
♦ 6.9k
Entering edit mode
0

In R version 3.3.2 (2016-10-31) the line

ggplot(ld) + geom_line(aes(x=BP_B - BP_A, y = R2))

gives

veronicaschroeder78
• 110
0
Entering edit mode

I had the same problem as described by veronicaschroeder78: Error in eval(expr, envir, enclos) : object 'BP_B' not found. I solved the problem by modifying a little be in Iran's codes: change sep="\t" to sep="", others keep the same, and also change "geom_line" to "geom_point" will give you scatter plot of the LD.