Biostar Beta. Not for public use.
Question: Why are quantitative designs the preferred GWAS approach?

Hello!

I am writing something about GWAS; however, it is not really my field, so I have been doing a lot of reading. I came across this statement in Bush and Moore, 2012 (Chapter 11: Genome-Wide Association Studies, 2012):

There are two primary classes of phenotypes: categorical (often binary case/control) or quantitative. From the statistical perspective, quantitative traits are preferred because they improve power to detect a genetic effect, and often have a more interpretable outcome. For some disease traits of interest, quantitative disease risk factors have already been identified.

Can anyone help me understand why a quantitative trait gives more power (some formulas would be great)? Are they referring to QTL somehow?

Thank you very much


I think one reason could be that quantitative traits follow a known distribution, such as the normal distribution, so they can be tested against the null hypothesis with a parametric test, for example Student's t-test, whereas qualitative traits usually have to be tested with non-parametric tests because these data don't follow such distributions.

Za
• 120

Do you think that using a parametric or a non-parametric test makes any difference in a GWAS? I am not saying that is not the case; I just want to understand your point.

ste.lu
• 40

Actually, I don't know this in depth either. I only know that quantitative data can be modelled and tested with more flexibility. I am sorry :(

Za
• 120

Quantitative (continuous) traits are preferred because they contain more information. However, we are strictly referring to quantitative traits that already follow a distribution that your chosen statistical test can model. Usually, this means a Gaussian (normal) distribution. If you have a very unusual variable with a skewed distribution that cannot be modelled, then converting it to a qualitative (categorical) variable would be better.

Think about it: we have a beautiful variable of n=1000000 that 'perfectly' follows our expected distribution (in R):

``````
million <- rnorm(1000000)  # one million draws from a standard normal
hist(million)
``````

Now let's dichotomise it:

``````
# Values above 2 become 1; everything else becomes 0
million <- as.integer(million > 2)
hist(million)
``````

The two histograms look completely different: the dichotomised variable keeps only whether each value was above or below 2. Whilst we can treat the new data as categorical, we have clearly thrown out a great deal of information.
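One rough way to put a number on the lost information is to correlate the continuous variable with its dichotomised version. The sketch below is only an illustration: the helper names `cor_after_cut` and `theory` and the seed are my own choices, and the closed-form value assumes a standard normal variable. In theory the correlation is about 0.80 for a median split and about 0.36 for the cutoff of 2 used above, so the tail cutoff retains far less of the original signal.

```r
set.seed(1)       # arbitrary seed, for reproducibility only
x <- rnorm(1e6)   # continuous variable, as in the example above

# Correlation between x and its dichotomised version at a given cutoff
# (hypothetical helper, named here for illustration)
cor_after_cut <- function(x, cutoff) cor(x, as.integer(x > cutoff))

# Theoretical value for a standard normal: dnorm(c) / sqrt(p * (1 - p)),
# where p is the proportion of values above the cutoff
theory <- function(cutoff) {
  p <- pnorm(cutoff, lower.tail = FALSE)
  dnorm(cutoff) / sqrt(p * (1 - p))
}

r_median <- cor_after_cut(x, 0)   # median split
r_tail   <- cor_after_cut(x, 2)   # the cutoff used above

# Simulated and theoretical correlations side by side
c(r_median = r_median, theory_median = theory(0),
  r_tail = r_tail, theory_tail = theory(2))
```

The further the cutoff sits from the centre of the distribution, the less the binary variable tells you about the original values.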

--------------------------------------

It is this lost information that increases the type II error rate and, therefore, reduces statistical power. Remember that, generally speaking, statistical power is our ability to identify an effect when an effect is actually present in our cohort. By throwing out useful information, we therefore reduce our power.
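To see the effect on power directly, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: a single SNP with additive effect `beta`, minor allele frequency `maf`, normally distributed noise, a median split to create "cases", and the sample size and number of replicates. It tests the same simulated data twice, once with linear regression on the continuous trait and once with logistic regression on the dichotomised trait, and reports how often each rejects the null at `alpha`.

```r
set.seed(42)      # arbitrary seed, for reproducibility only
n_sim <- 500      # number of simulated cohorts (assumed)
n     <- 500      # individuals per cohort (assumed)
maf   <- 0.3      # minor allele frequency (assumed)
beta  <- 0.2      # true additive effect per allele, in SD units (assumed)
alpha <- 0.05     # significance threshold

one_run <- function() {
  g    <- rbinom(n, 2, maf)              # genotype coded 0/1/2
  y    <- beta * g + rnorm(n)            # continuous trait with a real effect
  case <- as.integer(y > median(y))      # dichotomise at the median

  # P value for the SNP from linear regression on the continuous trait
  p_quant  <- summary(lm(y ~ g))$coefficients["g", "Pr(>|t|)"]
  # P value for the SNP from logistic regression on the binary trait
  p_binary <- summary(glm(case ~ g, family = binomial))$coefficients["g", "Pr(>|z|)"]

  c(quantitative = p_quant, binary = p_binary)
}

p_values <- replicate(n_sim, one_run())  # 2 x n_sim matrix of P values
power    <- rowMeans(p_values < alpha)   # fraction of runs that detect the effect
power
```

Under these assumptions the quantitative analysis should detect the effect more often than the binary one, even though both are run on exactly the same individuals. The size of the gap depends on the cutoff, the effect size, and the sample size, so treat the numbers as illustrative only.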

Kevin

PS - wrote a bit more here: A: Log-tranformation and GWAS
