Biostar Beta. Not for public use.
Question: QQ-plot for microarray t-test?

Hello,

We submitted a paper in which we used a t-test and fold change to determine genes differentially expressed between two sample sets (the sets are not equal in size: 12 vs 8 in one case, 12 vs 10 in another). A referee is asking us for a Q-Q plot for the t-tests. I do not understand what the referee intends: the distribution of one set versus the other, or the distribution of genes in all samples versus a normal distribution? And what is the simplest way to do it?

Thank you in advance.


Did you analyze microarray data with non-standard tools or even homemade statistics instead of something like limma?

ATpoint (17k), 9 months ago

It is commercial software; I would not know whether it can be called "non-standard".

sig93618, 9 months ago

The reviewer might suspect that the assumptions of the t-test are violated. A quantile-quantile plot is a good way to compare two distributions, in this case the theoretical distribution and the empirical distribution. Ideally, the two would be equal, and the points would fall on a straight diagonal line. But empirical distributions often have heavier tails, that is, more extreme values are observed than expected, so the Q-Q plot bends away from the line at both ends. You were lucky, though: the reviewer might have requested more advanced methods like limma or CyberT, but you might be fine with a t-test because you have a good number of samples.

Now, the question remains which distributions to compare. It could be debated whether the whole expression data should follow a single normal distribution, or whether that should only apply to an individual transcript and its measurement error. For a t-test we assume that the values for each transcript are sampled from normal distributions with the same or different means. Because each single t-test 'sees' only the data from a single transcript, the latter should suffice, and one does not need to assume normality of all gene-expression values or their differences in total.

A t-test is made under the assumption that its t-statistic follows a Student's t distribution under the null hypothesis. Therefore, instead of making a plot of all the expression data, I would make a Q-Q plot of the test statistics against a theoretical Student's t distribution with the same degrees of freedom. The degrees of freedom depend on sample size: for an equal-variance two-sample t-test they are n1 + n2 - 2, that is, 18 for 12 vs 8 and 20 for 12 vs 10.

This can be done easily with the functions qqplot and qt in R.
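As a minimal sketch of that approach: the matrix `expr`, the factor `group`, and the simulated values below are all made up for illustration and stand in for your real, normalized expression data. The sketch also assumes the equal-variance t-test, which may not be what your software used.

```r
# Simulated stand-in for a real expression matrix (rows = genes, columns = samples).
# Replace `expr` and `group` with your own data; both names are hypothetical.
set.seed(1)
n_genes <- 2000
n1 <- 12                      # samples in group A (from the question)
n2 <- 8                       # samples in group B (from the question)
expr <- matrix(rnorm(n_genes * (n1 + n2)), nrow = n_genes)
group <- factor(rep(c("A", "B"), times = c(n1, n2)))

# One equal-variance (Student) t-test per gene; keep only the t-statistic.
t_stats <- apply(expr, 1, function(x) {
  unname(t.test(x[group == "A"], x[group == "B"], var.equal = TRUE)$statistic)
})

# Degrees of freedom of the equal-variance two-sample t-test.
df <- n1 + n2 - 2

# Theoretical quantiles of the Student t distribution with `df` degrees of freedom.
theo <- qt(ppoints(length(t_stats)), df = df)

# Q-Q plot: observed t-statistics against the theoretical t quantiles.
qqplot(theo, t_stats,
       xlab = paste0("Theoretical t quantiles (df = ", df, ")"),
       ylab = "Observed t-statistics",
       main = "Q-Q plot of per-gene t-statistics")
abline(0, 1, col = "red")     # reference line: observed equals theoretical
```

With real data, genes that are truly differentially expressed will leave the line in the tails, so the bulk of the points is what shows whether the null distribution fits. If the software used Welch's unequal-variance test, the degrees of freedom differ per gene and a single `qt` reference no longer applies; one alternative would be to compare the p-values against a uniform distribution instead.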

Michael Dondrup (46k), 9 months ago

Thank you very much for your extensive and very helpful reply. I will follow your instructions. Best

sig93618, 9 months ago
