Question

geom_signif() uses t-test to compare between more than 3 groups... Isn't this wrong?


7 months ago

RM123 ▴ 10

Hello guys,

I have a question that is partly about R and partly about biostatistics. I want to make boxplots with the ggplot2 package and include significance bars between groups. I have 4 groups in the boxplots. In my short statistics classes I learned that I should run a one-way ANOVA and then use a multiple comparison test (like Tukey HSD). However, in this package I can only use "t.test" or "wilcox.test", and I also learned that repeating t-tests within the same set of comparisons inflates the error rate.

Therefore, my question is whether I can use the t-test for the multiple comparisons between subgroups, or whether I should perform an ANOVA followed by Tukey HSD. If you think I should go for Tukey, can you explain why the package only includes the t-test?

I appreciate any help you can give!

Cheers!

ggplot2 ANOVA t-test • 1.5k views


It seems to me that ANOVA followed by Tukey HSD is a more sensible approach than applying independent t-tests. Granted, this is the first time I've seen the ggsignif package, but I'm surprised it doesn't provide an out-of-the-box mechanism for multiple testing correction. However, I see that the test parameter of geom_signif accepts a custom function name ("If you implement a custom test make sure that it returns a list that has an entry called p.value"), so you could write your own method.

Reply • 7 months ago by dariober 14k


Hello dariober, that seems like a good solution, but I'm pretty new to R. Could you tell me how I can do that?

Reply • 7 months ago by RM123 ▴ 10

score 0 · Answer 1 · 2023-09-25


7 months ago

LChart 3.9k

Yes and no. The only consideration is that there are multiple comparisons, so at a significance threshold of 5% you would expect 5% of all comparisons to come out significant purely by chance even when there is no real difference. If you're comparing 6 groups (15 pairwise comparisons), that means that in about 50% of such cases at least one comparison will be significant simply by chance.
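To make the arithmetic concrete, here is a rough sketch in Python (not part of the original answer). It assumes the pairwise comparisons are independent, which is only an approximation because each group takes part in several comparisons; the function name `familywise_error_rate` is made up for illustration.

```python
from math import comb

def familywise_error_rate(n_groups, alpha=0.05):
    """Chance of at least one false positive across all pairwise comparisons.

    Assumes independent tests, which is only approximately true here.
    """
    n_comparisons = comb(n_groups, 2)           # number of group pairs
    fwer = 1 - (1 - alpha) ** n_comparisons     # 1 - P(no false positive)
    return n_comparisons, fwer

for k in (3, 4, 6):
    m, fwer = familywise_error_rate(k)
    print(f"{k} groups: {m} comparisons, P(>=1 false positive) = {fwer:.3f}")
```

Under that assumption, 6 groups give roughly a 54% chance of at least one spurious hit, and 4 groups (6 comparisons) give roughly 26%.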

The Tukey HSD corrects for multiple testing; you could use HSD directly, apply a p-value adjustment method (BH or Bonferroni), or simply reduce the p-value threshold to 1 - exp(log(1-p)/(# groups choose 2)), where p is the overall error rate you want, when annotating the boxplot with *, **, *** (etc).
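As a hedged illustration of the last two options, the sketch below computes the reduced threshold from the formula above and applies Bonferroni and BH adjustments to a list of p-values. It is plain Python written for this thread, with hypothetical function names and made-up example p-values; in R, `p.adjust()` with `method = "bonferroni"` or `method = "BH"` does the adjustment step.

```python
from math import comb, exp, log

def reduced_threshold(alpha, n_groups):
    """Per-comparison threshold 1 - exp(log(1 - alpha) / choose(n_groups, 2))."""
    return 1 - exp(log(1 - alpha) / comb(n_groups, 2))

def bonferroni_adjust(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]

def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, keeping a running minimum
    # so the adjusted values stay monotone.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top               # 1-based rank, ascending
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Made-up p-values for the 6 pairwise comparisons among 4 groups.
raw = [0.001, 0.012, 0.03, 0.04, 0.2, 0.6]
print("threshold:", reduced_threshold(0.05, 4))
print("Bonferroni:", bonferroni_adjust(raw))
print("BH:", bh_adjust(raw))
```

With 4 groups and an overall level of 5%, the reduced threshold comes out at about 0.0085, so only raw p-values below that would earn a star.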


Thank you for your help. I am comparing 4 groups, but since I have gene expression data I have a considerable number of boxplots to make. I found this tutorial on how to apply ANOVA and Tukey HSD with geom_boxplot, and it seems to be a fast way to make several plots: https://statdoe.com/one-way-anova-and-box-plot-in-r/ . What do you think?

Reply • 7 months ago by RM123 ▴ 10