Question

In TCGA RNA-seq exon-level data, why don't all the exons of a particular isoform have similar read counts (RPKM)?

0

8.7 years ago

ruhulmb • 0

Hello,

I am a beginner in RNA-seq analysis. Recently, I was trying to analyze the differential expression of a particular isoform in cancer vs normal samples using TCGA exon-level data. I found that the exons of this isoform all have different read counts/RPKM values. My thinking was that if a full-length isoform is expressed in a particular cell, all of its exons should be expressed at the same level. Please correct me if this is not true.

Also, if I want to compare the expression of a particular isoform in cancer vs normal cells, do I have to compare the total reads/RPKM from all of its exons in each condition (cancer vs normal)?

RNA-Seq next-gen rna-seq genome isoform • 3.4k views

score 1 · Answer 1 · 2015-08-08

1

8.7 years ago

Devon Ryan 104k

There are a number of possible causes for what you observed:

1) Exons are shared between isoforms, so you could be seeing different relative isoform expression levels, which would thereby affect the exon metrics.

2) Different exons will have different GC content, which frequently affects metrics like this.

3) Depending on how the library construction was done and how good the sample integrity was, you'll often observe a gradual change in read coverage along a transcript toward its 3' end, so exons at different positions in the transcript get different read counts.

There are probably a few other possible causes.
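To make point 1 concrete, here is a minimal sketch with made-up numbers. The two isoforms `X` and `Y`, their exon sets, and their expression values are all hypothetical, not TCGA data. It only shows that the signal at an exon is roughly the sum over every isoform that contains it, so exons of the same isoform can look very different.

```python
# Hypothetical gene with 5 exons and two isoforms (assumed, for illustration only).
isoform_exons = {
    "X": {1, 2, 3, 4, 5},  # full-length isoform
    "Y": {3, 4, 5},        # shorter isoform sharing exons 3-5
}
# Assumed expression level of each isoform (arbitrary units).
isoform_expr = {"X": 10.0, "Y": 30.0}


def expected_exon_signal(exon):
    """Sum the expression of every isoform that contains this exon."""
    return sum(
        expr for iso, expr in isoform_expr.items() if exon in isoform_exons[iso]
    )


exon_signal = {exon: expected_exon_signal(exon) for exon in range(1, 6)}
for exon, signal in exon_signal.items():
    print(f"exon {exon}: {signal}")
```

Exons 1-2 only reflect isoform X, while exons 3-5 reflect X plus Y, even though X itself is expressed uniformly.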

Anyway, yes, if you want to compare isoforms then you would use the total metrics for it, which are likely provided by TCGA already.
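For reference, a rough sketch of how a length-normalized total is computed. The exon lengths, read counts, and library size below are invented for illustration. Note that it sums reads and lengths before normalizing rather than adding per-exon RPKM values, and it assumes every read in these exons belongs to the one isoform, which does not hold when exons are shared.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical exons of one isoform: (length in bp, reads assigned to it).
exons = [(200, 40), (100, 10), (700, 150)]
total_mapped_reads = 20_000_000  # assumed library size

# Sum counts and lengths first, then normalize once.
isoform_length = sum(length for length, _ in exons)
isoform_reads = sum(reads for _, reads in exons)
isoform_rpkm = rpkm(isoform_reads, isoform_length, total_mapped_reads)

# Per-exon values can differ from each other and from the isoform total.
exon_rpkms = [rpkm(reads, length, total_mapped_reads) for length, reads in exons]
print(isoform_rpkm, exon_rpkms)
```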

0

Thank you very much, Devon, for the clarification. But how do I know which isoform the reads on a particular exon come from? For example, my gene of interest has two isoforms: one contains all 15 exons (exons 1 to 15), and the other contains 12 exons (exons 4 to 15). In that case, how can I tell which isoform the reads on exons 4-15 come from?

8.7 years ago by ruhulmb • 0

1

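Usually TCGA provides isoform-level metrics, so just use them. If your question is how one actually derives such metrics, typically an expectation maximization (or stochastic collapsed variational Bayes, in the case of Salmon) algorithm is employed. Roughly, reads that are compatible with more than one isoform get split between those isoforms in proportion to the current abundance estimates, and the estimates are then updated until they stop changing.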
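As a toy illustration of the expectation maximization idea for the two-isoform example above (A = exons 1-15, B = exons 4-15), here is a minimal sketch. The read counts and the 100 bp exon length are assumptions made up for the example, and this is plain EM on read proportions, not what RSEM, Salmon, or the TCGA pipeline actually implement (those also model effective lengths, fragment sizes, biases, and so on).

```python
# Toy EM for two isoforms; all numbers are illustrative assumptions.
EXON_LEN = 100  # assume every exon is 100 bp
iso_len = {"A": 15 * EXON_LEN, "B": 12 * EXON_LEN}

# Reads grouped by the set of isoforms they are compatible with.
read_classes = [
    ({"A"}, 60),        # reads in exons 1-3: only isoform A has these exons
    ({"A", "B"}, 940),  # reads in exons 4-15: ambiguous between A and B
]


def run_em(read_classes, iso_len, n_iter=500):
    """Estimate the fraction of reads originating from each isoform."""
    total = sum(n for _, n in read_classes)
    theta = {iso: 1.0 / len(iso_len) for iso in iso_len}  # uniform start
    for _ in range(n_iter):
        assigned = {iso: 0.0 for iso in iso_len}
        # E-step: split each read class among its compatible isoforms,
        # weighting by current abundance divided by isoform length.
        for compatible, n in read_classes:
            weights = {iso: theta[iso] / iso_len[iso] for iso in compatible}
            norm = sum(weights.values())
            for iso, w in weights.items():
                assigned[iso] += n * w / norm
        # M-step: new read fractions from the fractional assignments.
        theta = {iso: assigned[iso] / total for iso in iso_len}
    return theta


theta = run_em(read_classes, iso_len)
print(theta)
```

The reads in exons 1-3 are what pin down isoform A; the ambiguous reads in exons 4-15 are then divided so that both isoforms have coverage consistent with their lengths.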