Question

Understanding How SNPs Affect The Association Score In A GWAS

7

13.2 years ago

User 6659 ▴ 970

Hi

Please excuse the basic question, but I am new to GWAS. It is my understanding that haplotypes are blocks of DNA sequence whose bases are always coinherited because (as yet) the blocks have not undergone recombination. Haplotypes can be characterised by key tagSNPs.

In a GWAS you find haplotypes or tagSNPs associated statistically with a particular trait. As a general point, how does the distance between a causal variant of the trait and the tagSNP affect the resulting association? I thought that, by definition of a haplotype, all of the bases in the region are ALWAYS coinherited with the tagSNP (otherwise it wouldn't be a haplotype) so the distance of the causal SNP from the tagSNP does not affect the strength of the association. In other words, excluding other confounding factors like environment etc, would all SNPs inside a haplotype have the same signal strength if they had the same association with the disease?

Are the signals from SNPs additive? Let's say a haplotype had 2 SNPs associated with a disease: would their signals 'add up' to give a stronger signal?

Is it possible that SNPs outside the haplotype may sometimes segregate with the haplotype and cause weak signals for the haplotype?

Thanks

gwas • 5.0k views

Updated 3.1 years ago by Ram 43k • written 13.2 years ago by User 6659 ▴ 970

Ram · Answer 1 · 2011-02-11

6

13.2 years ago

Jarretinha 3.4k

Your last point is the most interesting. But first things first. The definition of a haplotype depends on the context. You cited the classical sense. In the HapMap sense, only the statistical association counts when defining one. And in both cases recombination isn't excluded; it's just low enough to keep the signal detectable. Haplotypes come and go over evolutionary time.
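To put rough numbers on how distance matters here: a common way to quantify the statistical association between two SNPs is r², and under random mating the LD coefficient D is expected to shrink by a factor (1 - c) per generation, where c is the recombination fraction (which grows with distance). The sketch below is only an illustration under those textbook assumptions (no drift, selection or mutation); the function names and the example frequencies are made up.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, r2) for two biallelic SNPs.

    p_ab: frequency of haplotypes carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2


def expected_d(d0, c, generations):
    """Expected D after some generations, given recombination fraction c."""
    return d0 * (1 - c) ** generations


def r2_after(p_ab, p_a, p_b, c, generations):
    """Expected r2 after recombination has eroded D.

    Allele frequencies are assumed constant, so only D changes.
    """
    d0, _ = ld_stats(p_ab, p_a, p_b)
    d_t = expected_d(d0, c, generations)
    return d_t * d_t / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical example: two SNPs that start in perfect LD
d, r2 = ld_stats(p_ab=0.5, p_a=0.5, p_b=0.5)
near = r2_after(0.5, 0.5, 0.5, c=0.0001, generations=100)  # tightly linked pair
far = r2_after(0.5, 0.5, 0.5, c=0.01, generations=100)     # more distant pair
print(r2, near, far)
```

With these made-up numbers the tightly linked pair keeps r² close to 1 after 100 generations, while the more distant pair drops well below 0.2. As a rule of thumb, the association statistic at a tag SNP is roughly r² times what the causal SNP itself would give, so a more distant tag usually shows a weaker signal.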

SNP signals aren't additive in a strict sense. They are affected by penetrance, dominance and epistasis in non-trivial ways, which makes it very complicated to build a metric.
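One way to pin down what "additive" could mean (an assumption for illustration, not something defined in this thread) is on the log-odds scale: two risk alleles are additive if carrying both shifts the log-odds by the sum of their individual shifts. An interaction (epistasis) term breaks that. The function and coefficients below are hypothetical.

```python
def log_odds(g1, g2, b0, b1, b2, b12=0.0):
    """Log-odds of disease for carrier status g1, g2 (0 or 1) at two SNPs.

    b1, b2: main effects; b12: interaction term (0 means purely additive).
    """
    return b0 + b1 * g1 + b2 * g2 + b12 * g1 * g2


b0, b1, b2 = -2.0, 0.5, 0.25  # made-up coefficients

# Purely additive: the joint shift equals the sum of the single-SNP shifts
additive_shift = log_odds(1, 1, b0, b1, b2) - b0

# With a negative interaction the two effects do not stack up
epistatic_shift = log_odds(1, 1, b0, b1, b2, b12=-0.75) - b0

print(additive_shift, epistatic_shift)
```

In the second case each SNP looks associated on its own, yet carrying both gives no extra shift at all, which is the kind of situation that makes a simple "sum of signals" metric unreliable.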

Now you can go to your last point. We don't know for sure what keeps a haplotype structured in the long run. But we know that haplotypes aren't isolated from the influence of other parts of the genome. It might be possible that a haplotype has emerged in response to forces like genetic draft, the Hill-Robertson effect and linkage disequilibrium in general. A haplotype isn't a uniform block of SNPs under exactly the same intensity of evolutionary forces.

So higher orders of organization are entirely possible. That is, sets of haplotypes found together more commonly than others, and so on. That's why it is so hard to detect weak/rare associations. You'll need sufficient statistical power to sort out the different population-level effects.

I'll add some refs soon to help in practice.

Updated 3.1 years ago by Ram 43k • written 13.2 years ago by Jarretinha 3.4k

0

Thanks for the answer. It's interesting that my 'classical' definition isn't the HapMap definition, as I got my 'classical' definition from the HapMap website! I'm not contradicting you, just explaining why it's easy to be confused.

Reply • updated 3.1 years ago by Ram 43k • written 13.2 years ago by User 6659 ▴ 970

0

So, to clarify: I appreciate that SNP signals aren't strictly additive, but are they 'loosely' additive? If a haplotype has 2 SNPs associated with the disease, will there be a stronger signal for that haplotype than for the 'same' haplotype where one of those SNPs wasn't present?

Reply • updated 3.1 years ago by Ram 43k • written 13.2 years ago by User 6659 ▴ 970

0

It's possible to imagine a situation where SNPs are totally additive. What I said was that such a situation is very unusual. It's not easy to find SNPs with the same effect on a phenotype. An extreme example: SNPs in hemoglobin loci (no recombination involved) can "cause" falciform (sickle cell) anaemia, thalassemia or HPFH. Are they additive? I don't think so. Their effects simply don't stack up. You can find real examples for QTLs.

Reply • updated 3.1 years ago by Ram 43k • written 13.2 years ago by Jarretinha 3.4k

Ram · Answer 2 · 2011-05-31

I agree it is easy to get confused...

LD blocks are stretches of the genome in high LD. They are created by population-genetic events, and their patterns differ between populations.
A haplotype may or may not refer to SNPs in an LD block.
Tag SNPs are SNPs that are good surrogates for others in an LD block, because they capture the common haplotypes within that LD block. They are generally not perfect surrogates. Setting an LD (e.g. r²) threshold of >0.8 may give 5 tag SNPs, while >0.2 might give 1 tag SNP.
Back to haplotypes. Technically any set of adjacent SNPs can constitute a haplotype, where adjacent generally means adjacent among the typed SNPs. So these SNPs could be quite far apart if SNP density is low. It is also possible to define non-adjacent SNPs as a haplotype, but generally NOT on separate chromosomes. For a single meiosis, one can conceptually think of an entire stretch of chromosome being transmitted together (sandwiched between two recombination events), which leads to the confusion with the LD block. For different people (meioses) recombination will happen at different places, so there is no single haplotype that is "always" coinherited with a single SNP at a population level.
Finally, regarding additivity of association, take a 3-SNP haplotype. Each haplotype, e.g. (1, 1, 0), can be thought of as a bin in the 2 × 2 × 2 table of these 3 SNPs (A/a × B/b × C/c). So modelling the effect of each haplotype separately is equivalent to a saturated model (no linearity or additivity assumption; interactions of all orders are allowed). But power is quickly lost as bins become too sparse for longer haplotypes. So in practice each haplotype is compared with the rest (pooled together). The underlying assumption is generally that a particular (rare) haplotype, say (0, 0, 1), will tag an ungenotyped rare SNP in that region with high LD.
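As a rough illustration of the bin argument, the sketch below enumerates the haplotype bins for k biallelic SNPs and collapses a handful of observed haplotypes into "target vs. rest". The function names and the toy data are hypothetical, and this only shows the bookkeeping, not an actual association test.

```python
from collections import Counter
from itertools import product


def haplotype_bins(k):
    """All 2**k possible haplotypes over k biallelic SNPs, coded 0/1."""
    return list(product((0, 1), repeat=k))


def one_vs_rest(haplotypes, target):
    """Count the target haplotype against all other haplotypes pooled."""
    counts = Counter(h == target for h in haplotypes)
    return counts[True], counts[False]


bins = haplotype_bins(3)  # 8 bins, e.g. (1, 1, 0)

# A saturated model needs one effect per bin beyond the baseline (2**k - 1),
# whereas an additive main-effects model needs only k.
saturated_params = len(bins) - 1
additive_params = 3

# Toy sample of observed haplotypes (made up)
observed = [(0, 0, 1), (1, 1, 0), (0, 0, 1), (1, 1, 1)]
target_count, rest_count = one_vs_rest(observed, (0, 0, 1))
print(saturated_params, additive_params, target_count, rest_count)
```

The number of bins doubles with every SNP added, so for a fixed sample size the average count per bin halves each time, which is why the one-vs-rest comparison is used for longer haplotypes.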