Long-Range Linkage Disequilibrium Between SNPs
12.3 years ago
Ryan D ★ 3.4k

I want to check linkage disequilibrium between pairs of SNPs that may be more than 500kb away from one another or on different chromosomes (trans-LD).

A 2009 paper described a tool called GLIDERS that did this, but its website is currently down. SNAP and most other tools limit comparisons to cis-LD within 500 kb. Short of downloading genotype data from HapMap, are there any tools that allow this to be done in batch, as was possible with GLIDERS, beyond the cis/500 kb limits mentioned above?

Thanks,

Ryan

linkage snp • 5.9k views

I would be very cautious about the GLIDERS tool. After skimming the article, I could not find how they actually calculated genome-wide LD or which method or software they used, and the resulting LD data would take up terabytes given that they store it in text files.

12.3 years ago

I have been writing code to measure LD for unphased genotypes. One thing I quickly found was that with N markers I was doing N(N-1)/2 pairwise comparisons, which became too computationally intensive. You may need to constrain the distance between the markers, as I did. Maybe you could restrict to markers between 500 kb and 600 kb apart?

I used the method from:

C. Huff

I know this isn't the answer you were looking for, but I thought I would share my experiences. I found it to be pretty easy to code it myself...
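
For anyone who wants to try coding it, here is a minimal sketch of one common approximation for unphased data: the squared Pearson correlation between genotype dosages (0/1/2 copies of one allele). To be clear, this is a swapped-in technique for illustration and not necessarily the method referenced above; the function name `genotype_r2` and the use of `None` for missing calls are assumptions of this sketch.

```python
import math


def genotype_r2(g1, g2):
    """Squared Pearson correlation between two genotype vectors.

    Genotypes are coded as 0, 1 or 2 copies of one allele; None marks a
    missing call. Samples missing at either SNP are skipped.
    """
    pairs = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two samples typed at both SNPs")

    mean1 = sum(a for a, _ in pairs) / n
    mean2 = sum(b for _, b in pairs) / n
    cov = sum((a - mean1) * (b - mean2) for a, b in pairs) / n
    var1 = sum((a - mean1) ** 2 for a, _ in pairs) / n
    var2 = sum((b - mean2) ** 2 for _, b in pairs) / n

    # A monomorphic SNP has zero variance, so r2 is undefined.
    if var1 == 0 or var2 == 0:
        return math.nan
    return (cov * cov) / (var1 * var2)
```

Because nothing here depends on genomic position, the same function works for two SNPs on different chromosomes. The cost is in the number of pairs, not in any single comparison.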

12.3 years ago
Caddymob ▴ 1000

If you are just looking at pairs of SNPs, you can do this in PLINK. I just did it as a test with two SNPs, one on chr5 and one on chr19.

plink --file MY_PED --ld rs17070145 rs7412

Which gives the result:

LD information for SNP pair [ rs17070145 rs7412 ]

   R-sq = 0.000     D' = 0.108

   Haplotype     Frequency    Expectation under LE
   ---------     ---------    --------------------
       TT          0.018            0.020
       CT          0.044            0.042
       TC          0.303            0.302
       CC          0.636            0.637

   In phase alleles are TC/CT
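
To see where the R-sq and D' figures come from, here is a rough sketch that recomputes them from the four haplotype frequencies in the table. It assumes the first letter of each haplotype is the rs17070145 allele and the second is the rs7412 allele; the function name `ld_stats` is hypothetical. PLINK works from unrounded frequencies, so with the three-decimal values above D' comes out near 0.1 rather than exactly 0.108, while r2 still rounds to 0.000.

```python
import math


def ld_stats(f11, f12, f21, f22):
    """Return (D, D', r2) from four two-locus haplotype frequencies.

    f11: allele 1 at SNP 1 with allele 1 at SNP 2
    f12: allele 1 at SNP 1 with allele 2 at SNP 2
    f21: allele 2 at SNP 1 with allele 1 at SNP 2
    f22: allele 2 at SNP 1 with allele 2 at SNP 2
    """
    # Renormalise, since rounded frequencies may not sum to exactly 1.
    total = f11 + f12 + f21 + f22
    f11, f12, f21, f22 = (f / total for f in (f11, f12, f21, f22))

    p = f11 + f12  # frequency of allele 1 at SNP 1
    q = f11 + f21  # frequency of allele 1 at SNP 2
    d = f11 - p * q  # departure from linkage equilibrium

    # D' scales |D| by the largest value it could take given p and q.
    if d >= 0:
        d_max = min(p * (1 - q), (1 - p) * q)
    else:
        d_max = min(p * q, (1 - p) * (1 - q))
    d_prime = abs(d) / d_max if d_max > 0 else math.nan

    denom = p * (1 - p) * q * (1 - q)
    r2 = (d * d) / denom if denom > 0 else math.nan
    return d, d_prime, r2


# Haplotypes from the PLINK output above: TT, TC, CT, CC.
d, d_prime, r2 = ld_stats(0.018, 0.303, 0.044, 0.636)
print(f"D = {d:.4f}, D' = {d_prime:.3f}, r2 = {r2:.4f}")
```

The "Expectation under LE" column is just the product of the marginal allele frequencies, for example 0.321 x 0.062 for TT, which is about 0.020.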

Gotcha. I wasn't sure how many you were doing. Another way to do this, and to constrain on r2, is the snpMatrix package in R. It will read a PLINK BED file, and you can get the LD measures and put them into a data frame filtered by your r2 (or D', or both) constraints.


Thanks, GLIDERS is back up (http://mather.well.ox.ac.uk/GLIDERS/). But I did, in fact, end up using a solution like the one above. I used tabix to download my regions, converted the VCF files to PLINK format, combined them into a single BED file, and then did the LD comparisons in the manner above.


Thanks, GLIDERS is back up (http://www.sanger.ac.uk/Software/GLIDERS/). But I did, in fact, end up using a solution like the one above. I used tabix to download my regions, converted the VCF files to PLINK format, combined them into a single BED file, and then did the LD comparisons in the manner above. That said, I take Michael's and Zev's point. Pairwise comparisons for 10^7 variants would be a bit much, running to the order of 10^14 values. GLIDERS constrains this by keeping only pairs with an r2 of 0.3 or better. A tool to do this for any two SNPs or regions would be nice.
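
As a back-of-the-envelope check on those numbers, here is a small sketch that counts the distinct pairs for a given number of variants and filters a stream of results by an r2 cutoff. The function names and the `(snp_a, snp_b, r2)` tuple layout are assumptions made for illustration, and the 0.3 default simply mirrors the GLIDERS threshold mentioned above.

```python
from math import comb


def n_pairs(n_variants):
    """Number of distinct unordered SNP pairs, n * (n - 1) / 2."""
    return comb(n_variants, 2)


def keep_strong(results, min_r2=0.3):
    """Yield only the (snp_a, snp_b, r2) records with r2 at or above min_r2."""
    for snp_a, snp_b, r2 in results:
        if r2 >= min_r2:
            yield snp_a, snp_b, r2


# Growth of the all-against-all problem with the number of variants.
for n in (10**3, 10**5, 10**7):
    print(f"{n:>10,} variants -> {n_pairs(n):,} pairs")
```

Filtering as a generator means the weak pairs never have to be held in memory or written out, which is what makes a thresholded genome-wide table feasible at all.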


Is it possible to do the calculation for a subpopulation? Such as CHB?
