Computing T-Test For Genes With Help Of Apply Function
1
0
10.5 years ago
jack ▴ 520

I have a matrix:

> data

      A  A  A  B  B  C
gene1 1  6 11 16 21 26
gene2 2  7 12 17 22 27
gene3 3  8 13 18 23 28
gene4 4  9 14 19 24 29
gene5 5 10 15 20 25 30

I want to test, for each gene (row), whether the mean value differs between groups. I want to use a t-test for this. The function should take all columns belonging to group A, all columns belonging to group B, all columns belonging to group C, and so on, and compute the t-test between each pair of groups for each gene (every group contains several columns). One possible implementation is:

    > Results <- combn(colnames(data), 2, function(x) t.test(data[, x]), simplify = FALSE)
    > sapply(Results, "[", c("statistic", "p.value"))

But it computes the test between all pairs of columns rather than between groups for every row (every gene). Can somebody help me modify this code to calculate the t-test between groups for each gene?

r bioinformatician statistics • 19k views
4

Don't you want to run an ANOVA instead, as you have more than 2 groups (i.e. factor levels)? Also, I guess you will need to correct for multiple testing.
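For what it's worth, here is a rough sketch of what a per-gene one-way ANOVA followed by a multiple-testing correction could look like. It rebuilds the toy matrix from the question; the names `group`, `anova_p` and `anova_padj` are made up for illustration, and Benjamini-Hochberg is just one possible choice of correction, not something the thread settles on.

```r
# Toy matrix from the question: 5 genes x 6 samples, columns labelled by group
data <- matrix(1:30, nrow = 5,
               dimnames = list(paste0("gene", 1:5),
                               c("A", "A", "A", "B", "B", "C")))
group <- factor(colnames(data))

# One-way ANOVA per gene (row); keep the p-value of the F test for `group`
anova_p <- apply(data, 1, function(x) {
  fit <- aov(x ~ group)
  summary(fit)[[1]][["Pr(>F)"]][1]
})

# Adjust across genes (Benjamini-Hochberg is an assumption, pick what suits you)
anova_padj <- p.adjust(anova_p, method = "BH")
```

A significant F test only says that at least one group mean differs for that gene; it does not say which pair of groups is responsible.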

1

Have you tried to make any modifications to that yourself or are you just relying on others to feed working R code to you?

1

Don't have time to write a proper answer/test this but a quick and dirty solution would presumably be to transpose your data before running the same code? i.e. data <- t(data)

0

No, that'll end up comparing genes against each other (as is, it compares samples against each other across genes).

4
10.5 years ago

In your particular case, you appear to have 3 groups. To do a test based on the first two groups, something like:

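d <- matrix(rnorm(60000), ncol = 6)  # simulated data: 10000 genes x 6 samples
# Per-gene Welch t-test: group A (columns 1:3) vs group B (columns 4:5), as laid out in the question
pvals <- apply(d, 1, function(x) t.test(x[1:3], x[4:5])$p.value)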
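If you would rather not hard-code column indices, the same idea can be driven by the column labels. The following is only a sketch on simulated data with the A, A, A, B, B, C layout from the question; `group`, `usable` and `pairs` are names invented here. Groups with a single sample (C in this layout) are skipped, because t.test() cannot estimate a variance from one value.

```r
set.seed(1)  # simulated data, for illustration only
d <- matrix(rnorm(60), ncol = 6,
            dimnames = list(paste0("gene", 1:10),
                            c("A", "A", "A", "B", "B", "C")))
group <- colnames(d)

# Keep only groups with at least 2 samples
usable <- names(which(table(group) >= 2))
pairs <- combn(usable, 2, simplify = FALSE)

# One row per gene, one column of Welch t-test p-values per pair of groups
pvals <- vapply(pairs, function(p) {
  apply(d, 1, function(x) t.test(x[group == p[1]], x[group == p[2]])$p.value)
}, numeric(nrow(d)))
colnames(pvals) <- sapply(pairs, paste, collapse = "_vs_")
```

With this layout only the A versus B comparison survives, so `pvals` has a single column; with more replicated groups you get one column per pair. These p-values are unadjusted.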

While writing t-tests by hand is quite useful as a learning exercise, the Bioconductor limma package (like several other packages) is designed for data where each test has only a few samples but the number of tests is very large, and where the experimental design might be more complicated than a simple t-test can handle.

0

jack could also just apply(data, 1, function(x) pairwise.t.test(x, colnames(data))), assuming that each of the groups has at least 2 samples. Limma is probably the better route, of course :o)
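Spelled out a little, that suggestion might look like the sketch below. It uses simulated data with a hypothetical A, A, B, B, C, C layout so that every group has two samples (this is not the layout from the question), and `grp` and `res` are illustrative names. lapply() is used instead of apply() so that each gene keeps its own matrix of pairwise p-values.

```r
set.seed(1)  # simulated data, for illustration only
grp <- c("A", "A", "B", "B", "C", "C")  # hypothetical layout, 2 samples per group
d <- matrix(rnorm(60), ncol = 6,
            dimnames = list(paste0("gene", 1:10), grp))

# For each gene, a lower-triangular matrix of pairwise p-values
# (pooled SD and Holm adjustment are the pairwise.t.test() defaults)
res <- lapply(rownames(d), function(g) pairwise.t.test(d[g, ], grp)$p.value)
names(res) <- rownames(d)

res$gene1  # rows B and C, columns A and B
```

Note that the default Holm adjustment only corrects across the group pairs within one gene, not across genes.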

