Question

The Variance in HWE

0

7.2 years ago

GabrielMontenegro ▴ 670

I'm reading Design, Analysis, and Interpretation of Genome-Wide Association Scans by Daniel O. Stram.

In chapter 2 I found:

If we have a sample of N unrelated individuals in a population, the total count of A alleles summed over all individuals follows a binomial distribution with number of trials = 2N and frequency of the A allele = p

p can be found as:

p = (1/(2N)) * SUM_i (niA)

where niA = number of A alleles in individual i

And the variance:

Var(p) = (1/(2N)^2) * SUM_i Var(niA)

But I do not understand why the 2N is squared in the second equation.

Thank you.

HWE • 1.3k views

score 1 · Answer 1 · 2017-02-16

1

7.2 years ago

atks ▴ 10

It's a property derived from the definition of variance.

https://en.wikipedia.org/wiki/Variance#Basic_properties

0

OK, it's a property, but why that particular number?

7.2 years ago by GabrielMontenegro ▴ 670

0

The estimate of the population allele frequency is p^{hat} = sum_i{niA} / (2N), where niA is the number of copies of the A allele carried by individual i and N is the number of individuals. The denominator is 2N because each individual is assumed to be diploid, so the sample contains 2N alleles in total.

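So Var(p^{hat}) = Var(sum_i{niA} / (2N)) = (1/(2N))^2 Var(sum_i{niA}) [because of the property linked above: Var(aX) = a^2 Var(X) for any constant a, here a = 1/(2N)] = (1/(2N))^2 sum_i Var(niA) [because the individuals are unrelated, the niA are independent, so the variance of their sum is the sum of their variances]. In short, the 2N is squared because variance is measured in squared units, so any constant factor comes out of Var() squared.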
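
If it helps to see this numerically, here is a minimal simulation sketch. It assumes Hardy-Weinberg equilibrium, so each niA is Binomial(2, p) and Var(niA) = 2p(1-p), which makes the formula above collapse to p(1-p)/(2N). The function names, the sample size N = 50, p = 0.3, and the seed are illustrative choices of mine, not something taken from the book.

```python
import random
import statistics


def simulate_p_hat(n_individuals, p, rng):
    """Draw one sample of diploid genotypes under HWE and return p-hat."""
    total_a = 0
    for _ in range(n_individuals):
        # Each individual carries 2 alleles, each one A with probability p,
        # so n_iA is Binomial(2, p).
        total_a += (rng.random() < p) + (rng.random() < p)
    # p-hat = sum_i n_iA / (2N)
    return total_a / (2 * n_individuals)


def empirical_and_theoretical_variance(n_individuals=50, p=0.3,
                                       n_reps=10000, seed=1):
    """Compare the simulated variance of p-hat with (1/(2N))^2 * sum_i Var(n_iA)."""
    rng = random.Random(seed)
    estimates = [simulate_p_hat(n_individuals, p, rng) for _ in range(n_reps)]
    empirical = statistics.pvariance(estimates)
    # Under HWE, Var(n_iA) = 2p(1-p) for every individual (assumption).
    var_single = 2 * p * (1 - p)
    theoretical = (1 / (2 * n_individuals)) ** 2 * n_individuals * var_single
    return empirical, theoretical


if __name__ == "__main__":
    emp, theo = empirical_and_theoretical_variance()
    print(f"empirical Var(p-hat):   {emp:.6f}")
    print(f"theoretical Var(p-hat): {theo:.6f}")
```

The two numbers should agree closely. If you drop the square and use 1/(2N) instead, the "theoretical" value is off by a factor of 2N, which is a quick way to convince yourself the square belongs there.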

7.2 years ago by atks ▴ 10