Thanks for your answer!


Hi,

I would like to perform a statistical test to see whether there are any significant differences between the proportions in three different groups (G1, G2, G3) across different runs. The ID refers to different subjects. G1, G2 and G3 always sum to 1 because they are proportions.

I’m interested in comparing different samples regarding the proportions in different groups (comparing the rows).

```
data <- data.frame(
  ID   = rep(paste0("ID", 1:3), 3),
  runs = rep(c("run1", "run2", "run3"), 3),
  G1 = c(0.58, 0.43, 0.43, 0.55, 0.45, 0.33, 0.55, 0.45, 0.43),
  G2 = c(0.22, 0.33, 0.35, 0.30, 0.20, 0.24, 0.15, 0.35, 0.24),
  G3 = c(0.20, 0.24, 0.22, 0.15, 0.35, 0.43, 0.30, 0.20, 0.33)
)
```

I would really appreciate it if you could help me find an appropriate statistical test.


Because you have animal groups, tissues and multiple experiments, I would recommend modelling this with a linear model, treating the animals and the tissue as fixed effects. There is some background you'll have to pick up, but here's a stub to get you started thinking about how to analyze this:

```
library(reshape2)

## Create the data frame
df <- data.frame(
  ID     = rep(paste0("ID", 1:3), 3),
  tissue = rep(c("liver", "brain", "heart"), 3),
  G1 = c(0.58, 0.43, 0.43, 0.55, 0.45, 0.33, 0.55, 0.45, 0.43),
  G2 = c(0.22, 0.33, 0.35, 0.30, 0.20, 0.24, 0.15, 0.35, 0.24),
  G3 = c(0.20, 0.24, 0.22, 0.15, 0.35, 0.43, 0.30, 0.20, 0.33)
)

## Reshape to long ("molten") format: one row per ID/tissue/group value
df.molten <- melt(df, id.vars = c("ID", "tissue"))

## Fit the linear model (G1 is the reference level of `variable`)
model.lm <- as.formula("value ~ variable + ID + tissue")
df.lm <- lm(model.lm, data = df.molten)

## Explore the results
summary(df.lm)

## Make G3 the reference level of `variable`, then refit
df.molten$variable <- relevel(df.molten$variable, ref = "G3")
df.lm <- lm(model.lm, data = df.molten)
summary(df.lm)
```

And the results of the two `summary(df.lm)` calls:

```
Call:
lm(formula = model.lm, data = df.molten)

Residuals:
     Min       1Q   Median       3Q      Max
-0.13667 -0.04667 -0.02444  0.07333  0.16111

Coefficients: (2 not defined because of singularities)
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  4.667e-01  3.609e-02  12.932 9.33e-12 ***
variableG2  -2.022e-01  3.953e-02  -5.115 3.99e-05 ***
variableG3  -1.978e-01  3.953e-02  -5.003 5.23e-05 ***
IDID2        3.036e-18  3.953e-02   0.000        1
IDID3       -3.925e-17  3.953e-02   0.000        1
tissueheart         NA         NA      NA       NA
tissueliver         NA         NA      NA       NA
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.08386 on 22 degrees of freedom
Multiple R-squared:  0.6081,  Adjusted R-squared:  0.5369
F-statistic: 8.535 on 4 and 22 DF,  p-value: 0.0002573
```

and

```
Call:
lm(formula = model.lm, data = df.molten)

Residuals:
     Min       1Q   Median       3Q      Max
-0.13667 -0.04667 -0.02444  0.07333  0.16111

Coefficients: (2 not defined because of singularities)
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  2.689e-01  3.609e-02   7.451 1.88e-07 ***
variableG1   1.978e-01  3.953e-02   5.003 5.23e-05 ***
variableG2  -4.444e-03  3.953e-02  -0.112    0.912
IDID2       -4.626e-17  3.953e-02   0.000    1.000
IDID3       -2.453e-17  3.953e-02   0.000    1.000
tissueheart         NA         NA      NA       NA
tissueliver         NA         NA      NA       NA
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.08386 on 22 degrees of freedom
Multiple R-squared:  0.6081,  Adjusted R-squared:  0.5369
F-statistic: 8.535 on 4 and 22 DF,  p-value: 0.0002573
```

In the first model, G1 is the reference level, so G2 and G3 are each compared to G1. The p-values for the coefficients `variableG2` and `variableG3` seem to indicate that both the G1 vs. G2 and the G1 vs. G3 differences are significant.
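As a rough sanity check on how to read those coefficients, here is a minimal base-R sketch using the same example data. It assumes (as is the case here) a balanced design in which every sample's proportions sum to 1, so the per-ID means are identical; under that assumption the intercept should match the mean of G1 and each `variable` coefficient should match a difference in group means. The names `group_means`, `diff_G2` and `diff_G3` are only illustrative.

```r
# Same example data as in the answer (tissue is not needed for this check)
df <- data.frame(
  ID = rep(paste0("ID", 1:3), 3),
  G1 = c(0.58, 0.43, 0.43, 0.55, 0.45, 0.33, 0.55, 0.45, 0.43),
  G2 = c(0.22, 0.33, 0.35, 0.30, 0.20, 0.24, 0.15, 0.35, 0.24),
  G3 = c(0.20, 0.24, 0.22, 0.15, 0.35, 0.43, 0.30, 0.20, 0.33)
)

# Mean proportion per group across all nine samples
group_means <- colMeans(df[, c("G1", "G2", "G3")])

# Differences relative to the reference level G1
diff_G2 <- group_means[["G2"]] - group_means[["G1"]]
diff_G3 <- group_means[["G3"]] - group_means[["G1"]]

print(round(group_means, 4))
print(round(c(G2_minus_G1 = diff_G2, G3_minus_G1 = diff_G3), 4))
```

Comparing the printed values with the `(Intercept)`, `variableG2` and `variableG3` estimates in the first summary is a quick way to convince yourself what each row of the coefficient table represents.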

In the second model, G3 becomes the reference level. The coefficient `variableG1` is the same contrast as the previous model's `variableG3` with the sign flipped, so it makes sense that you get the same p-value. However, you can see here that G3 vs. G2 is not significantly different (p = 0.912).
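The equivalence between the two fits can be checked directly. The following is a sketch in base R, not part of the original stub: it builds the long-format table by hand instead of using `melt`, and it leaves `tissue` out of the formula because the summaries above report its coefficients as not defined. The names `long`, `fit_g1` and `fit_g3` are made up for illustration.

```r
# Same example data as in the answer
df <- data.frame(
  ID = rep(paste0("ID", 1:3), 3),
  G1 = c(0.58, 0.43, 0.43, 0.55, 0.45, 0.33, 0.55, 0.45, 0.43),
  G2 = c(0.22, 0.33, 0.35, 0.30, 0.20, 0.24, 0.15, 0.35, 0.24),
  G3 = c(0.20, 0.24, 0.22, 0.15, 0.35, 0.43, 0.30, 0.20, 0.33)
)

# Long format: one row per sample/group combination
long <- data.frame(
  ID       = rep(df$ID, times = 3),
  variable = factor(rep(c("G1", "G2", "G3"), each = nrow(df))),
  value    = c(df$G1, df$G2, df$G3)
)

# Fit with G1 as the reference level (the default, alphabetical order)
fit_g1 <- lm(value ~ variable + ID, data = long)

# Refit with G3 as the reference level
long$variable <- relevel(long$variable, ref = "G3")
fit_g3 <- lm(value ~ variable + ID, data = long)

# The G1-vs-G3 contrast appears in both fits with opposite sign
est_g1 <- coef(fit_g1)[["variableG3"]]  # G3 minus G1
est_g3 <- coef(fit_g3)[["variableG1"]]  # G1 minus G3
p_g1 <- summary(fit_g1)$coefficients["variableG3", "Pr(>|t|)"]
p_g3 <- summary(fit_g3)$coefficients["variableG1", "Pr(>|t|)"]

print(c(est_g1 = est_g1, est_g3 = est_g3))
print(c(p_g1 = p_g1, p_g3 = p_g3))
```

Changing the reference level only changes which pairwise contrasts are printed; the fitted values and residuals of the model stay the same.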

Try to understand what's going on (starting with the basics of linear regression if you're not familiar with the method), and make sure you understand the model's output before you use this in any serious analysis.
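One part of the output worth understanding is the note "2 not defined because of singularities" and the `NA` rows for `tissue`. A likely explanation, which you can check with the small sketch below, is that in the example data every ID is paired with exactly one tissue, so the two factors carry the same information and the model cannot estimate both. The names `xt` and `one_tissue_per_id` are only illustrative.

```r
# ID and tissue columns exactly as constructed in the stub above
df <- data.frame(
  ID     = rep(paste0("ID", 1:3), 3),
  tissue = rep(c("liver", "brain", "heart"), 3)
)

# Cross-tabulate ID against tissue
xt <- table(df$ID, df$tissue)
print(xt)

# TRUE when each ID occurs with exactly one tissue (perfect confounding)
one_tissue_per_id <- all(rowSums(xt > 0) == 1)
print(one_tissue_per_id)
```

With real data, where the same animal contributes several tissues, the table would have more than one non-zero cell per row and both effects could be estimated.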


What did you measure?

It's not my own data. I only know that it's some kind of classification measurement that shows the proportion of cells that clustered together in each sample.

Deleting a post after getting a satisfactory answer is grounds for suspension!

I deleted it because the question itself was wrong! I was not interested in the differences between the groups (G1, G2 and G3), so I deleted the post in order to ask the correct question.

OK, but look: someone took the effort to answer your question. You should thank them and leave it be. It is still the correct answer to the question as asked, and we need to honor the effort that goes into answering questions.