Hello,
I am a beginner in RNA-seq analysis. Recently, I was trying to analyze the differential expression of a particular isoform in cancer vs normal samples using TCGA exon-specific data. I found that the exons of this isoform all have different read counts/RPKM values. My thinking was that if a full-length isoform is expressed in a particular cell, all the exons of this isoform should be expressed at the same level. Please correct me if this is not true.
Also, if I want to compare the expression of a particular isoform in cancer vs normal cells, do I have to compare the total reads/RPKM from all of its exons in each condition (cancer vs normal)?
Thank you very much for the clarification, Devon. But how do I know that the reads on a particular exon come from a particular isoform? For example, my gene of interest has two isoforms: one contains all the exons (exons 1 to 15), and the other contains 12 exons (exons 4 to 15). In that case, how can I distinguish which of the two isoforms the reads on exons 4-15 come from?
Usually TCGA provides isoform-level metrics, so just use those. If your question is how such metrics are actually derived, an expectation-maximization algorithm (or stochastic collapsed variational Bayes, in the case of Salmon) is typically used to assign reads that are compatible with more than one isoform.
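To make the idea concrete for your two-isoform case, here is a toy sketch of expectation maximization in Python. It is not what TCGA's pipeline or Salmon actually runs, only an illustration under simplifying assumptions: reads are spread uniformly along each transcript, the effective length equals the transcript length, and every read is either unique to the long isoform (exons 1-3) or shared by both (exons 4-15). The function name, the lengths, and the read counts are all made up for the example.

```python
def em_two_isoforms(n_unique_long, n_shared, len_long, len_short, n_iter=500):
    """Toy EM: split shared reads between a long and a short isoform.

    n_unique_long: reads that can only come from the long isoform (exons 1-3)
    n_shared:      reads compatible with both isoforms (exons 4-15)
    len_long, len_short: transcript lengths in bp (assumed effective lengths)
    Returns the estimated read counts (long, short).
    """
    total = n_unique_long + n_shared
    theta_long, theta_short = 0.5, 0.5  # initial guess: reads split evenly
    count_long, count_short = total / 2, total / 2
    for _ in range(n_iter):
        # E-step: probability that a shared read came from the long isoform,
        # proportional to (fraction of reads from isoform) / (isoform length).
        w_long = theta_long / len_long
        w_short = theta_short / len_short
        frac_long = w_long / (w_long + w_short)
        # M-step: re-estimate expected read counts and read fractions.
        count_long = n_unique_long + n_shared * frac_long
        count_short = n_shared * (1.0 - frac_long)
        theta_long = count_long / total
        theta_short = count_short / total
    return count_long, count_short


# Illustrative numbers only: 60 reads on exons 1-3, 840 reads on exons 4-15,
# a 1500 bp long isoform and a 1200 bp short isoform.
count_long, count_short = em_two_isoforms(60, 840, 1500, 1200)
print(f"long isoform:  {count_long:.1f} reads, {count_long / 1500:.3f} reads/bp")
print(f"short isoform: {count_short:.1f} reads, {count_short / 1200:.3f} reads/bp")
```

With these made-up numbers the estimate settles at about 300 reads for the long isoform and 600 for the short one. The reads on exons 1-3 pin down how abundant the long isoform is, and whatever coverage on exons 4-15 it cannot explain is attributed to the short isoform. That is also why per-exon values need not be equal within one gene.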
Thanks a lot, Devon.