Dear all,
I am working with a dataset that contains samples from 7 tissues (normal control vs. tumor), processed in 2 experimental batches. Within a tissue, the control and tumor samples are not evenly distributed across the two batches. Each tissue has 3~4 control samples and 3~4 tumor samples. The main aim of the analysis is to find the genes that are differentially expressed in each tissue, and then to see whether the gene lists from different tissues intersect.
So far I have normalized the data (by total read count) and removed the batch effect by setting the mean within each batch to zero. I have also tried ComBat to remove the batch effect, using the model: batch + labels + batch*labels. There are 7 x 2 = 14 label types; for example, tissue A has 2 labels, A_normal and A_tumor. But I got the error "At least one covariate is confounded with batch", so I gave up on that approach.
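A minimal sketch of the two steps described above (total-count normalization, then centering within each batch), to make the procedure concrete. The function names, the toy counts, the log2 transform, and the pseudo-count are all assumptions for illustration; the post does not say which tool or transform was actually used.

```python
import math
from collections import defaultdict

# Hypothetical toy data: raw read counts per sample, and each sample's batch.
counts = {
    "s1": {"g1": 100, "g2": 900},
    "s2": {"g1": 300, "g2": 700},
    "s3": {"g1": 50, "g2": 450},
    "s4": {"g1": 200, "g2": 300},
}
batch_of = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}


def log_cpm(counts, pseudo=1.0):
    """Scale each sample by its total read count (counts per million), then log2."""
    out = {}
    for sample, genes in counts.items():
        total = sum(genes.values())
        out[sample] = {
            g: math.log2(c / total * 1e6 + pseudo) for g, c in genes.items()
        }
    return out


def center_by_batch(expr, batch_of):
    """For every gene, subtract that gene's mean within the sample's batch."""
    sums = defaultdict(lambda: defaultdict(float))
    sizes = defaultdict(int)
    for sample, genes in expr.items():
        b = batch_of[sample]
        sizes[b] += 1
        for g, v in genes.items():
            sums[b][g] += v
    return {
        sample: {
            g: v - sums[batch_of[sample]][g] / sizes[batch_of[sample]]
            for g, v in genes.items()
        }
        for sample, genes in expr.items()
    }


centered = center_by_batch(log_cpm(counts), batch_of)
```

One caveat with this kind of centering: when tumor and normal samples are split unevenly between batches, the batch mean also contains part of the tumor/normal difference, so subtracting it can remove some of the biological signal along with the batch effect.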
After this, I plan to take one tissue (control vs. tumor) at a time and run a differential expression analysis to get a gene list. The problem is that when I draw a PCA plot for that tissue before the differential analysis, there still seems to be a batch effect, and the biological signal is still confounded with batch. So should I remove the batch effect again within that specific tissue?
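Before deciding whether to correct again within one tissue, it can help to tabulate batch against condition for that tissue. The sketch below does this for a made-up sample sheet; the sample names, the batch assignments, and the helper names are hypothetical and are not taken from the actual data.

```python
from collections import Counter

# Hypothetical sample sheet for one tissue: (sample, batch, condition).
samples = [
    ("s1", "A", "normal"), ("s2", "A", "normal"), ("s3", "A", "normal"),
    ("s4", "A", "tumor"),
    ("s5", "B", "tumor"), ("s6", "B", "tumor"), ("s7", "B", "tumor"),
]


def cross_tab(rows):
    """Count samples in every (batch, condition) cell."""
    return Counter((batch, cond) for _, batch, cond in rows)


def confounding_report(rows):
    """Return the table, the empty cells, and the batches holding every condition."""
    table = cross_tab(rows)
    batches = sorted({b for _, b, _ in rows})
    conds = sorted({c for _, _, c in rows})
    empty = [(b, c) for b in batches for c in conds if table[(b, c)] == 0]
    # Only a batch that contains all conditions allows a within-batch comparison.
    informative = [b for b in batches if all(table[(b, c)] > 0 for c in conds)]
    return table, empty, informative


table, empty, informative = confounding_report(samples)
for (batch, cond), n in sorted(table.items()):
    print(batch, cond, n)
print("empty cells:", empty)
print("batches with every condition:", informative)
```

If no batch contains both conditions, batch and condition are fully confounded and no correction can separate them. If some cells are empty but at least one batch holds both conditions (as in this toy sheet), an additive batch + condition model may still be estimable, while a batch-by-condition interaction is not.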
Thanks in advance:) Any comments will be much appreciated!
Figure: PC1 vs. PC2 for tissue1. First plot: red = batch A, green = batch B. Second plot: red = tumor, green = normal.
Hi LucyS,
It might help if you explain what the aim of your experiment is. Is it finding differentially expressed genes? Clustering? Machine learning? And what have you already tried? You say you normalized to remove the batch effect, but how, and with what tools? In my opinion your question is currently too vague and unclear for anyone to help you further. If you explain more about your design and goals, more people may be able to help you with it.
Hi b.nota. Thanks for your suggestions! I have modified my post and hope it is clearer now. My main aim is to find the differentially expressed genes for one tissue, and then to see whether they overlap with the results from the other tissues.
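The overlap step mentioned here could look roughly like the sketch below, once each tissue has its own list of differentially expressed genes. The tissue names and gene sets are placeholders chosen for illustration, not results.

```python
from collections import Counter
from itertools import combinations

# Placeholder gene lists, one set of differentially expressed genes per tissue.
de_genes = {
    "tissueA": {"TP53", "MYC", "EGFR", "KRAS"},
    "tissueB": {"MYC", "EGFR", "BRCA1"},
    "tissueC": {"EGFR", "MYC", "PTEN"},
}

# Genes called in every tissue.
shared_by_all = set.intersection(*de_genes.values())

# Overlap for each pair of tissues.
pairwise = {
    (a, b): de_genes[a] & de_genes[b]
    for a, b in combinations(sorted(de_genes), 2)
}

# Number of tissues in which each gene is called.
n_tissues = Counter(g for genes in de_genes.values() for g in genes)

print(sorted(shared_by_all))
for pair, genes in pairwise.items():
    print(pair, sorted(genes))
print(n_tissues.most_common())
```

With 7 tissues the all-tissue intersection can easily be empty, so the per-gene tissue count is often the more informative summary.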