Question: What is the best way to compare transcriptomes between different conditions?
1

My goal is to compare transcriptomes between different conditions. For example, I knock down (KD) gene A, gene B, and gene C, and I want to know whether the consequence of knocking down gene A is closer to that of gene B or of gene C. My first approach was to compare the CPM of the gene A control, KD gene A, the gene B control, KD gene B, and so on. But the result is that the gene A control and KD gene A are closest to each other, so I think I should account for the background effect. Next I compared the log2FoldChange values from the DESeq2 results, but then I lose the p-value information. So, what is the best way to compare transcriptomes from RNA-seq?

15 months ago ch8316f5eyu • 10 • updated 15 months ago bharata1803 • 420
2

If you are just interested in knowing which of the knockdowns, i.e. B or C, is closer to, let's say, A, you can do hierarchical clustering on the counts after applying a transform like vst() or rlog() in DESeq2. You can find an example here.
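To make the idea concrete, here is a minimal sketch in plain Python rather than R. Everything in it is an assumption for illustration: the counts are invented toy numbers, log2(CPM + 1) is only a simple stand-in for vst()/rlog() (it is not the same transform), and the small average-linkage routine is hand-written just to show the principle.

```python
import math

# Toy raw counts (invented for illustration): one list of gene counts per sample,
# with genes in the same order in every sample.
counts = {
    "KD_A": [100, 400, 30, 900],
    "KD_B": [110, 380, 35, 850],
    "KD_C": [300, 100, 200, 500],
}

def log_cpm(sample_counts):
    """log2(CPM + 1): a simple stand-in for DESeq2's vst()/rlog(), not the same transform."""
    total = sum(sample_counts)
    return [math.log2(c / total * 1e6 + 1) for c in sample_counts]

def euclidean(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def average_linkage(profiles):
    """Agglomerative clustering; returns merges in order as (cluster1, cluster2, distance)."""
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average distance over all pairs of samples from the two clusters.
                pairs = [(a, b) for a in clusters[i] for b in clusters[j]]
                d = sum(euclidean(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

transformed = {name: log_cpm(c) for name, c in counts.items()}
merges = average_linkage(transformed)
for left, right, dist in merges:
    print(left, right, round(dist, 3))
```

The first merge printed is the pair of samples that are most similar. In R, the usual route would be dist() followed by hclust() on the transformed matrix.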

15 months ago rizoic • 190
0

But there is a batch effect: I did not knock down those genes at the same time. Each KD sample has its own corresponding control. Can I cluster just the KD samples, without the controls? If I include the control samples, each KD sample clusters with its corresponding control.

15 months ago ch8316f5eyu • 10
0

I used the first way you mentioned. I'm not confident because I don't know whether it is acceptable. Thank you for your help.

15 months ago ch8316f5eyu • 10
2

Then you can either:

  1. Do the hierarchical clustering on the log2FC values produced by DESeq2.
  2. Batch-correct the entire expression matrix (using sva::ComBat (see section 7 here) or limma::removeBatchEffect (see page 190 here)) and do the hierarchical clustering on the corrected matrix.

Btw, for a global comparison of which knockdowns are more or less similar, I would not use p-values (or only the significant features) but rather the entire transcriptome.
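As a rough illustration of option 1, here is a sketch in plain Python. All of it is assumed for the example: the expression values are invented, the log2FC is computed naively with a pseudocount (in practice you would take the log2FoldChange column from the DESeq2 results for each KD vs its own control), and Pearson correlation over all genes is just one reasonable similarity measure to feed into the clustering.

```python
import math

# Toy normalized expression (invented): one value per gene, same gene order everywhere.
# Each knockdown is paired with the control from its own batch.
expr = {
    "A": {"kd": [50, 200, 80, 10, 400], "ctrl": [100, 100, 80, 40, 400]},
    "B": {"kd": [30, 260, 60, 9, 300], "ctrl": [70, 120, 65, 30, 310]},
    "C": {"kd": [300, 40, 90, 80, 150], "ctrl": [150, 90, 85, 20, 160]},
}

def log2fc(kd, ctrl, pseudocount=1.0):
    """Naive per-gene log2 fold change of KD over its matched control."""
    return [math.log2((k + pseudocount) / (c + pseudocount)) for k, c in zip(kd, ctrl)]

def pearson(x, y):
    """Pearson correlation between two equally long vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# One log2FC profile per knockdown, using every gene (no significance filter).
lfc = {name: log2fc(v["kd"], v["ctrl"]) for name, v in expr.items()}

r_ab = pearson(lfc["A"], lfc["B"])
r_ac = pearson(lfc["A"], lfc["C"])
print("A vs B:", round(r_ab, 3))
print("A vs C:", round(r_ac, 3))
```

Because each fold change is taken against the control from the same batch, the batch offset largely cancels out before the knockdowns are compared. A distance such as 1 - r can then be used for the hierarchical clustering.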

15 months ago kristoffer.vittingseerup ♦ 1.8k
0

If I understand your setting, the question is basically which treated-vs-control change is most similar across the knocked-down genes, right? In that case you first need to measure the change within each group (KD vs its control), and then compare those changes across genes. Clustering the log2FC is okay, I guess, but I think it will not show any direct relationship, because up- or down-regulation of two genes can be caused by many things.

I think calculating the correlation between the expression of two genes is better. Calculate it using normalized expression, either from the cpm() function in edgeR or from the VST in DESeq2.

Why do I think it is better? The correlation between the expression of two genes basically checks whether gene A is affected by gene B or vice versa. If one gene affects another, it will do so in both the control condition and the treatment condition. That is, regardless of the condition, there would be an effect of gene A on gene B.
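A small sketch of what I mean, in plain Python. The gene names and numbers are invented for the example, and the values are assumed to be already normalized (e.g. log-CPM or VST) with control and KD samples pooled together.

```python
import math

# Toy normalized expression (invented): one value per sample for each gene,
# with samples in the same order for every gene (controls and knockdowns together).
expression = {
    "geneA": [8.1, 7.9, 8.3, 3.2, 3.0, 3.5],
    "geneB": [6.0, 5.8, 6.2, 2.9, 3.1, 3.3],
    "geneC": [4.0, 6.5, 5.1, 5.9, 4.2, 5.0],
}

def pearson(x, y):
    """Pearson correlation between two equally long vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Correlate geneA with each other gene across all samples.
r_ab = pearson(expression["geneA"], expression["geneB"])
r_ac = pearson(expression["geneA"], expression["geneC"])
print("geneA vs geneB:", round(r_ab, 3))
print("geneA vs geneC:", round(r_ac, 3))
```

A high correlation only says that the two genes move together across the samples. With few samples it is a weak basis for claiming that one gene affects the other.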

15 months ago bharata1803 • 420
