geom_signif() uses t-test to compare between more than 3 groups... Isn't this wrong?
7 months ago
RM123 ▴ 10

Hello guys,

I have a question that is part R and part biostatistics. I want to make boxplots with the ggplot2 package and add significance bars between groups. I have 4 groups in each boxplot. In my short statistics classes I learned that I should run a one-way ANOVA and then use a multiple comparison test (like Tukey HSD). However, with geom_signif() I can apparently only use "t-test" or "wilcox.test", and I also learned that repeating t-tests within the same set of comparisons inflates the false-positive (Type I) error rate.

Therefore, my question is whether I can use the t-test for the multiple comparisons between subgroups, or whether I should perform an ANOVA followed by Tukey HSD. If you think I should go with Tukey, can you explain why the package only includes the t-test?

I appreciate any help you can give!

Cheers!

ggplot2 ANOVA t-test • 1.5k views

It seems to me that ANOVA followed by Tukey HSD is a more sensible approach than applying separate t-tests. Granted, this is the first time I have seen the ggsignif package, but I'm surprised it doesn't provide an out-of-the-box mechanism for multiple testing correction. However, I see that the test parameter of geom_signif accepts a custom function name ("If you implement a custom test make sure that it returns a list that has an entry called p.value"), so you could write your own method.
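A minimal sketch of such a custom test, assuming geom_signif() calls the test with the two groups' values as its first two arguments and forwards anything given in test.args. The function name bonferroni_t_test and its n_comparisons argument are made up for illustration, and the data are simulated. Note that this applies a Bonferroni correction to Welch t-tests; it is not Tukey's HSD.

```r
# Hypothetical custom test for geom_signif(): Welch t-test whose p-value
# is Bonferroni-adjusted. The name and arguments are illustrative only.
bonferroni_t_test <- function(x, y, n_comparisons = 1, ...) {
  res <- t.test(x, y, ...)
  # Bonferroni: multiply by the number of comparisons, cap at 1
  res$p.value <- min(1, res$p.value * n_comparisons)
  res  # a list-like "htest" object that contains an entry called p.value
}

# Simulated data: 4 groups of 10 values each (illustration only)
set.seed(1)
df <- data.frame(
  group = factor(rep(c("A", "B", "C", "D"), each = 10)),
  value = rnorm(40, mean = rep(c(5, 5, 6, 8), each = 10))
)

# All 6 pairwise comparisons between the 4 groups
group_pairs <- combn(levels(df$group), 2, simplify = FALSE)

# Build the plot only if both packages are installed
if (requireNamespace("ggplot2", quietly = TRUE) &&
    requireNamespace("ggsignif", quietly = TRUE)) {
  p <- ggplot2::ggplot(df, ggplot2::aes(x = group, y = value)) +
    ggplot2::geom_boxplot() +
    ggsignif::geom_signif(
      comparisons = group_pairs,
      test = bonferroni_t_test,
      test.args = list(n_comparisons = length(group_pairs)),
      step_increase = 0.1  # stack the brackets so they do not overlap
    )
  # print(p) draws the plot
}
```

Passing the number of comparisons through test.args keeps the function reusable for plots with a different number of groups.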


Hello Dariober, that seems like a good solution, but I'm pretty new to R. Could you tell me how I can do that?

7 months ago
LChart 3.9k

Yes and no. The issue is not the t-test itself but the number of comparisons: at a significance threshold of 5%, you would expect about 5% of comparisons between groups that do not truly differ to come out significant purely by chance. If you're comparing 6 groups (15 pairwise comparisons), then in roughly 50% of such cases at least one comparison will be significant simply by chance.
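The arithmetic behind that estimate can be checked directly. The sketch below assumes the comparisons are independent and that no group truly differs, which is a simplification because pairwise comparisons that share a group are correlated.

```r
alpha <- 0.05

# Number of pairwise comparisons for k groups is "k choose 2"
n_comp_6 <- choose(6, 2)  # 15 comparisons for 6 groups
n_comp_4 <- choose(4, 2)  # 6 comparisons for 4 groups

# Probability of at least one false positive across m independent tests
fwer <- function(m, alpha) 1 - (1 - alpha)^m

fwer_6 <- fwer(n_comp_6, alpha)  # about 0.54
fwer_4 <- fwer(n_comp_4, alpha)  # about 0.26
```

So even with the 4 groups from the original question, uncorrected pairwise tests would give at least one spurious bar in roughly a quarter of plots under these assumptions.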

Tukey's HSD corrects for multiple testing. You could use HSD directly, apply a p-value adjustment method (BH or Bonferroni), or simply reduce the p-value threshold to 1 - exp(log(1 - p) / (# groups choose 2)), where p is the usual threshold such as 0.05 (this is the Šidák correction), when annotating the boxplot with *, **, *** (etc.).
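The three options can be sketched in base R as follows. This uses the built-in chickwts dataset (6 feed groups, hence 15 pairwise comparisons) purely as a stand-in for real data. Keep in mind that BH controls the false discovery rate rather than the family-wise error rate, so it answers a slightly different question than Tukey HSD or Bonferroni.

```r
# Built-in dataset: chick weights for 6 feed types (15 pairwise comparisons)
data(chickwts)
n_comp <- choose(nlevels(chickwts$feed), 2)

# Option 1: one-way ANOVA followed by Tukey's HSD
fit <- aov(weight ~ feed, data = chickwts)
tukey <- TukeyHSD(fit)
tukey_p <- tukey$feed[, "p adj"]  # one adjusted p-value per pair of groups

# Option 2: pairwise t-tests with adjusted p-values
# (pool.sd = FALSE runs a separate Welch t-test for each pair)
pw_bonf <- pairwise.t.test(chickwts$weight, chickwts$feed,
                           p.adjust.method = "bonferroni", pool.sd = FALSE)
pw_bh <- pairwise.t.test(chickwts$weight, chickwts$feed,
                         p.adjust.method = "BH", pool.sd = FALSE)

# Option 3: keep the raw p-values but lower the significance threshold
alpha <- 0.05
sidak_alpha <- 1 - exp(log(1 - alpha) / n_comp)  # = 1 - (1 - alpha)^(1 / n_comp)
```

With 15 comparisons the lowered threshold comes out at roughly 0.0034 instead of 0.05, so the cut-offs for *, **, and *** would each be scaled down in the same way.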


Thank you for your help. I am comparing 4 groups, but since I have gene expression data I have a considerable number of boxplots to make. I found this tutorial on how to apply ANOVA and Tukey HSD with geom_boxplot, and it seems to be a fast way to make several plots: https://statdoe.com/one-way-anova-and-box-plot-in-r/ . What do you think?
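For many genes, one possible pattern is to wrap the ANOVA and Tukey HSD steps in a small function and apply it to each gene column. This is only a sketch on simulated data, not the linked tutorial's code: the gene names, the column layout (one column per gene plus a group column), and the helper name tukey_for_gene are all assumptions.

```r
# Simulated expression data: 4 groups x 10 samples, 3 hypothetical genes
set.seed(42)
expr <- data.frame(
  group = factor(rep(c("A", "B", "C", "D"), each = 10)),
  geneA = rnorm(40, mean = rep(c(5, 5, 6, 8), each = 10)),
  geneB = rnorm(40, mean = 5),
  geneC = rnorm(40, mean = rep(c(3, 6, 3, 6), each = 10))
)

# Hypothetical helper: one-way ANOVA + Tukey HSD for a single gene column,
# returning the adjusted p-value for each pair of groups
tukey_for_gene <- function(gene, data) {
  f <- reformulate("group", response = gene)  # e.g. geneA ~ group
  fit <- aov(f, data = data)
  TukeyHSD(fit)$group[, "p adj"]
}

# Apply the helper to every gene column
genes <- setdiff(names(expr), "group")
tukey_results <- lapply(setNames(genes, genes), tukey_for_gene, data = expr)
```

Each element of tukey_results is a named vector with one adjusted p-value per pair (names such as "B-A"), which could then be turned into labels for the corresponding boxplot.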
