In TCGA RNA-seq exon-specific data, why don't all the exons of a particular isoform have similar read counts (RPKM)?
8.7 years ago
ruhulmb • 0

Hello,

I am a beginner in RNA-seq analysis. Recently, I was trying to analyze the differential expression of a particular isoform in cancer vs. normal samples using TCGA exon-specific data. I found that the exons of that isoform all have different read counts/RPKM values. My thinking was that if a full-length isoform is expressed in a cell, all of its exons should be expressed at the same level. Please correct me if this is not true.

Another question: if I want to compare the expression of a particular isoform in cancer vs. normal cells, do I have to compare the total reads/RPKM from all of its exons in each condition (cancer vs. normal)?

RNA-Seq next-gen rna-seq genome isoform • 3.4k views
8.7 years ago

There are a number of possible causes for what you observed:

1) Exons can be shared between isoforms, so an exon's metric reflects the combined expression of every isoform that contains it. Differences in relative isoform expression levels will therefore show up as differences between exon metrics.

2) Different exons will have different GC content, which frequently biases coverage and therefore metrics like these.

3) Depending on how the library construction was done and how good the sample integrity was, you'll often observe a gradual change in the number of reads aligning to a transcript toward its 3' end.

There are probably a few other possible causes.
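
Cause 1 can be illustrated with toy numbers. The sketch below assumes a hypothetical gene with five exons and two isoforms whose structures and abundances are made up for illustration (they are not TCGA values), and it ignores GC and 3' bias entirely. Each exon's signal is simply the sum of the abundances of the isoforms that contain it.

```python
# Hypothetical isoform structures: which exons each isoform contains.
isoform_exons = {
    "isoform_A": {1, 2, 3, 4, 5},  # full-length isoform
    "isoform_B": {3, 4, 5},        # shorter isoform lacking exons 1-2
}

# Hypothetical abundances in arbitrary per-base coverage units.
isoform_abundance = {"isoform_A": 10.0, "isoform_B": 30.0}


def exon_level_signal(isoform_exons, isoform_abundance):
    """Sum the abundance of every isoform that contains each exon."""
    all_exons = sorted(set().union(*isoform_exons.values()))
    return {
        exon: sum(
            abundance
            for name, abundance in isoform_abundance.items()
            if exon in isoform_exons[name]
        )
        for exon in all_exons
    }


exon_signal = exon_level_signal(isoform_exons, isoform_abundance)
for exon, signal in exon_signal.items():
    print(f"exon {exon}: {signal}")
```

Even though isoform_A is expressed at a constant level along its whole length, exons 3-5 show a higher signal than exons 1-2 because isoform_B also contributes to them.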

Anyway, yes, if you want to compare isoforms between conditions, use the isoform-level totals rather than individual exon values. TCGA likely provides these already.
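
One caveat if you ever pool exon-level values yourself: RPKM is length-normalized, so adding per-exon RPKM values does not give the RPKM of the whole feature. The sketch below uses the standard RPKM definition with made-up counts, exon lengths, and library size (all hypothetical). It also does nothing about exons shared between isoforms, so it is not a substitute for proper isoform-level estimates.

```python
def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


# Hypothetical values for three exons in one sample.
exon_counts = [120, 300, 80]      # reads per exon
exon_lengths = [200, 1000, 100]   # exon lengths in bp
total_mapped = 20_000_000         # mapped reads in the library

# Per-exon RPKM values.
per_exon = [
    rpkm(count, length, total_mapped)
    for count, length in zip(exon_counts, exon_lengths)
]

# Pooled RPKM: sum the reads and the lengths first, then normalize once.
pooled = rpkm(sum(exon_counts), sum(exon_lengths), total_mapped)

print("per-exon RPKM:", per_exon)
print("sum of per-exon RPKM:", sum(per_exon))
print("pooled RPKM:", pooled)
```

The pooled value is a length-weighted average of the per-exon values, so it always lies between the smallest and largest exon RPKM, whereas the plain sum grows with the number of exons.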


Thank you very much, Devon, for the clarification. But how do I know that the reads on a particular exon come from a particular isoform? For example, my gene of interest has two isoforms: one contains all the exons (exons 1 to 15), and the other contains 12 exons (exons 4 to 15). In that case, how can I assign the reads from exons 4-15 to one isoform or the other?


Usually TCGA provides isoform-level metrics, so just use them. If your question is how one actually derives such metrics, typically an expectation-maximization algorithm (or stochastic collapsed variational Bayes, in the case of Salmon) is employed.
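
To give a feel for the expectation-maximization idea, here is a heavily simplified sketch for the two-isoform example in the question. It assumes every exon is 100 bp (so the long isoform is 1500 bp and the short one 1200 bp), uniform coverage, and made-up read counts. It ignores splice-junction reads, fragment lengths, and the bias corrections that real quantifiers apply, so treat it as an illustration of the principle rather than how any particular tool is implemented. The function name and parameters are hypothetical.

```python
def em_two_isoforms(n_unique_long, n_shared, len_long, len_short,
                    max_iter=1000, tol=1e-12):
    """Estimate the fraction of reads originating from each isoform.

    n_unique_long: reads in exons found only in the long isoform (exons 1-3).
    n_shared:      reads in exons present in both isoforms (exons 4-15).
    """
    total = n_unique_long + n_shared
    if total == 0:
        raise ValueError("no reads to assign")

    theta_long = 0.5  # initial guess: half of all reads from each isoform
    for _ in range(max_iter):
        theta_short = 1.0 - theta_long

        # E-step: probability that a shared read came from the long isoform,
        # proportional to read fraction divided by isoform length.
        w_long = theta_long / len_long
        w_short = theta_short / len_short
        resp_long = w_long / (w_long + w_short)

        # M-step: unique reads plus the expected share of ambiguous reads.
        new_theta_long = (n_unique_long + n_shared * resp_long) / total

        converged = abs(new_theta_long - theta_long) < tol
        theta_long = new_theta_long
        if converged:
            break

    return theta_long, 1.0 - theta_long


# Hypothetical counts: 80 reads in exons 1-3, 920 reads in exons 4-15.
frac_long, frac_short = em_two_isoforms(
    n_unique_long=80, n_shared=920, len_long=1500, len_short=1200
)
# For these toy counts the estimates settle at about 0.40 and 0.60.
print(f"long isoform read fraction:  {frac_long:.3f}")
print(f"short isoform read fraction: {frac_short:.3f}")
```

The reads in exons 1-3 are what anchor the estimate: they can only come from the long isoform, and under uniform coverage they tell the algorithm how many of the ambiguous exon 4-15 reads the long isoform should account for.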


Thanks a lot Devon.
