Question

Determining If Experiment Data Can Be Pooled From Multiple Independent Experiments

10.3 years ago

Adamc ▴ 680

This is more of a straight-up biostats question (no transcriptomics involved, although I'll be popping the numbers into R), but I didn't get any feedback when I posted to CrossValidated, so if someone statistically-minded in this community could provide some insight, it would be appreciated.

We have a dataset with Time and Treatment as independent variables and tumor growth volume as the dependent variable. The experiment was run on three separate occasions, each with multiple time points and multiple observations per combination of factors (e.g., 8 observations at time = 12, treatment = 1, experiment 1). I am focusing on the subset of time points and treatments that is common to all three experiments. Note that the physical samples differed between experiments, so it's not strictly a repeated-measures design.

We'd like to figure out whether we can pool the data from the individual experiments, or whether the variation between batches is too large to merge them. Originally this seemed like a design for MANOVA, but when I put all of the data into the format I figured I'd need for R, it looks like there's actually only one dependent variable (growth volume) if we treat experiment as an independent variable. I also read in some other CrossValidated answers that there are regression methods that are preferable to MANOVA nowadays anyway, but I wasn't sure where to start in determining an appropriate approach.

Thanks!

Updated 14 months ago by Ram 43k • written 10.3 years ago by Adamc ▴ 680

Ram · Answer 1 · 2015-05-04

This is a dated question, but I'll give it a short answer: no. There are too many tripwires and too much overhead to handle before pooling would be statistically justified in your particular case. Steer clear. In general, unless your situation is very clear and your technical replicates are impeccable, be careful when pooling. Finally, ask yourself: why am I pooling? Am I trying to increase my statistical power for a specific reason in my particular experiment? Am I trying to isolate the signal from the noise in my particular experiment? Different questions frequently lead to different approaches to pooling.

My advice: get a biostatistician on board and explain your question in 1000X the depth that you've given here on Biostars, and I guarantee that the biostatistician will come back with another question for you. Standard procedure.
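For anyone who wants a first look at the batch question before that conversation, here is a rough sketch of one possible check. It is not a method anyone in this thread proposed. It is a permutation test, written in plain Python rather than R so that it has no package dependencies. Within each (time, treatment) cell it shuffles the experiment labels and asks how often the between-experiment sum of squares is at least as large as the observed one. The function names, the tuple layout `(time, treatment, experiment, volume)`, and the toy data generator are all made up for illustration. The test also assumes that observations within a cell are independent, which may not hold if the same animals are measured at several time points.

```python
import random
from collections import defaultdict
from statistics import fmean


def between_experiment_ss(rows):
    """Between-experiment sum of squares, summed over (time, treatment) cells.

    rows: iterable of (time, treatment, experiment, volume) tuples
    (a hypothetical long-format layout, one tuple per observation).
    """
    cells = defaultdict(lambda: defaultdict(list))
    for time, treatment, experiment, volume in rows:
        cells[(time, treatment)][experiment].append(volume)
    total = 0.0
    for by_experiment in cells.values():
        cell_mean = fmean(v for vals in by_experiment.values() for v in vals)
        for vals in by_experiment.values():
            total += len(vals) * (fmean(vals) - cell_mean) ** 2
    return total


def permutation_p_value(rows, n_permutations=2000, seed=0):
    """Shuffle experiment labels within each cell; return (observed SS, p-value)."""
    rows = list(rows)
    rng = random.Random(seed)
    observed = between_experiment_ss(rows)
    # Group row indices by cell so labels never move between cells.
    index_by_cell = defaultdict(list)
    for i, (time, treatment, _, _) in enumerate(rows):
        index_by_cell[(time, treatment)].append(i)
    hits = 0
    for _ in range(n_permutations):
        shuffled = list(rows)
        for indices in index_by_cell.values():
            labels = [rows[i][2] for i in indices]
            rng.shuffle(labels)
            for i, label in zip(indices, labels):
                time, treatment, _, volume = rows[i]
                shuffled[i] = (time, treatment, label, volume)
        if between_experiment_ss(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_permutations + 1)


def make_toy_rows(batch_shift, seed=1):
    """Simulated data only: 2 times x 2 treatments x 3 experiments x 8 obs."""
    rng = random.Random(seed)
    rows = []
    for time in (6, 12):
        for treatment in (0, 1):
            for experiment in (1, 2, 3):
                for _ in range(8):
                    base = 100 + 10 * time - 30 * treatment
                    shift = batch_shift * (experiment - 2)
                    rows.append((time, treatment, experiment,
                                 base + shift + rng.gauss(0, 10)))
    return rows


if __name__ == "__main__":
    for shift in (0, 40):
        ss, p = permutation_p_value(make_toy_rows(shift), n_permutations=500)
        print(f"batch_shift={shift}: SS={ss:.1f}, permutation p={p:.4f}")
```

A small p-value here only says the experiments differ by more than chance would explain within cells. A large one does not prove that pooling is safe, and neither result answers the "why am I pooling?" question above.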