Question

Computing T-Test For Genes With Help Of Apply Function

10.5 years ago

jack ▴ 520

I have a matrix:

>data

      A  A  A  B  B  C
gene1 1  6 11 16 21 26
gene2 2  7 12 17 22 27
gene3 3  8 13 18 23 28
gene4 4  9 14 19 24 29
gene5 5 10 15 20 25 30

I want to test, for each gene (row), whether its mean value differs between groups, and I want to use a t-test for this. The function should take all columns belonging to group A, all columns belonging to group B, all columns belonging to group C, and so on, and compute the t-test between each pair of groups for each gene (every group contains several columns). One possible implementation is:

    > Results <- combn(colnames(data), 2, function(x) t.test(data[, x]), simplify = FALSE)
    > sapply(Results, "[", c("statistic", "p.value"))

But this computes the test between all pairs of columns rather than between groups for every row (every gene). Can somebody help me modify this code to calculate the t-test between groups for each gene?

r bioinformatician statistics • 19k views

updated 10.5 years ago by Sean Davis 26k • written 10.5 years ago by jack ▴ 520

Don't you want to run an ANOVA instead, since you have more than 2 groups (i.e. more than two levels of the grouping factor)? Also, I guess you will need to correct for multiple testing.

10.5 years ago by Manu Prestat 4.1k
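A rough sketch of what the ANOVA suggestion could look like, using the matrix from the question. The names `groups`, `anova_p` and `anova_padj` are made up for illustration, and Benjamini-Hochberg is only one possible choice of multiple-testing correction:

```r
# The matrix from the question: 5 genes (rows) by 6 samples (columns).
data <- matrix(1:30, nrow = 5,
               dimnames = list(paste0("gene", 1:5),
                               c("A", "A", "A", "B", "B", "C")))

# Group label of each column, taken from the column names.
groups <- factor(colnames(data))

# One-way ANOVA per gene: fit value ~ group and keep the F-test p-value.
anova_p <- apply(data, 1, function(x) anova(lm(x ~ groups))[["Pr(>F)"]][1])

# Adjust across genes (assumption: Benjamini-Hochberg; other methods exist).
anova_padj <- p.adjust(anova_p, method = "BH")
```

This only says whether any group differs for a gene; it does not say which pair of groups is responsible.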

Have you tried to make any modifications to that yourself or are you just relying on others to feed working R code to you?

10.5 years ago by Devon Ryan 104k

Don't have time to write a proper answer/test this but a quick and dirty solution would presumably be to transpose your data before running the same code? i.e. data <- t(data)

10.5 years ago by Ben ★ 2.0k

No, that'll end up comparing genes against each other (as is, it compares samples against each other across genes).

10.5 years ago by Devon Ryan 104k

score 4 · Answer 1 · 2013-10-02

10.5 years ago

Sean Davis 26k

In your particular case, you appear to have 3 groups. To do a test based on the first two groups, something like:

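# Simulated example: 10000 genes (rows) by 6 samples laid out as A A A B B C
d <- matrix(rnorm(60000), ncol = 6)
# Per gene, compare group A (columns 1-3) with group B (columns 4-5)
pvals <- apply(d, 1, function(x) t.test(x[1:3], x[4:5])$p.value)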

While writing hand-made t-tests is quite useful as a learning exercise, the Bioconductor limma package (like several other packages) is designed for small sample sizes combined with a very large number of tests, and for experimental designs more complicated than a simple t-test can handle.
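For the hand-made route, the same `apply` pattern can be keyed on group labels instead of hard-coded column positions, and looped over every pair of groups. This is only a sketch on simulated data: `groups`, `testable`, `pairs` and `pvals_by_pair` are illustrative names, and groups with a single column (like C in the question) are skipped because `t.test` needs at least two observations per group.

```r
set.seed(1)

# Simulated data: 10 genes (rows) by 6 samples with the question's layout.
d <- matrix(rnorm(60), ncol = 6)
groups <- c("A", "A", "A", "B", "B", "C")

# Keep only groups with at least 2 samples; t.test cannot use a single value.
sizes <- table(groups)
testable <- names(sizes)[sizes >= 2]

# Every unordered pair of testable groups.
pairs <- combn(testable, 2, simplify = FALSE)

# For each pair, run a Welch t-test per gene and collect the p-values.
pvals_by_pair <- vapply(pairs, function(p) {
  apply(d, 1, function(x) t.test(x[groups == p[1]], x[groups == p[2]])$p.value)
}, numeric(nrow(d)))

# One column per comparison, one row per gene.
colnames(pvals_by_pair) <- vapply(pairs, paste, character(1), collapse = "_vs_")
```

With the question's layout only the A versus B comparison survives the size filter, so the result has a single column.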

jack could also just apply(data, 1, function(x) pairwise.t.test(x, colnames(data))), assuming that each of the groups has at least 2 samples. Limma is probably the better route, of course :o)

10.5 years ago by Devon Ryan 104k
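A minimal sketch of the `pairwise.t.test` route on simulated data. The layout here (two columns each for A, B and C) is an assumption chosen so that every group has at least 2 samples; it is not the layout from the question, where C has only one column. `expr`, `groups` and `pw` are illustrative names.

```r
set.seed(42)

# Simulated data: 10 genes by 6 samples, two samples per group (assumed layout).
expr <- matrix(rnorm(60), ncol = 6,
               dimnames = list(paste0("gene", 1:10),
                               c("A", "A", "B", "B", "C", "C")))
groups <- factor(colnames(expr))

# One pairwise.t.test per gene; keep only the matrix of pairwise p-values.
# By default the p-values are Holm-adjusted within each gene and a pooled SD is used.
pw <- lapply(seq_len(nrow(expr)), function(i) {
  pairwise.t.test(expr[i, ], groups)$p.value
})
names(pw) <- rownames(expr)

# Lower-triangular matrix for one gene: rows B and C, columns A and B.
pw[["gene1"]]
```

The within-gene adjustment does not account for testing many genes, so a correction across genes would still be needed.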