Biostar Beta. Not for public use.
Question: Differential gene expression based on "exon" or "CDS"?
1

Hi all,

When we talk about differential gene expression analysis using RNA-seq data, are we actually evaluating the expression of "exon" or "CDS" features?


ADD COMMENTlink 2.4 years ago SMILE • 100 • updated 2.4 years ago Devon Ryan 90k
1

Exons. Differential expression analysis uses a gene annotation file that contains full-length transcripts, including the 5' and 3' UTRs.

ADD REPLYlink 2.4 years ago
Satyajeet Khare
♦ 1.4k
0

In the annotation GTF file I see both "exon" and "CDS", so I thought differential gene expression analysis could be based on CDS, which is more meaningful in my understanding.

ADD REPLYlink 2.4 years ago
SMILE
• 100
1

Exons, since that's really what RNA is composed of.

ADD COMMENTlink 2.4 years ago Devon Ryan 90k
0

In the annotation GTF file I see both "exon" and "CDS" features:

chr1 unknown exon 3214482 3216968 …...

chr1 unknown stop_codon 3216022 3216024 …...

chr1 unknown CDS 3216025 3216968 …...

chr1 unknown CDS 3421702 3421901 …...

chr1 unknown exon 3421702 3421901 …...

Can I use GTF.featureType="CDS" to test for differentially expressed genes based on CDS? Is this acceptable? Does this approach have major flaws? (In my understanding, CDS is more meaningful, since differential expression at the CDS level indicates potentially different protein outputs...)
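To put rough numbers on the difference between the two feature types, here is a minimal Python sketch that uses only the coordinates from the excerpt above. It assumes all five rows belong to the same transcript; the attribute column is elided in the excerpt, so that grouping is an assumption, and `total_length` is a hypothetical helper rather than part of any counting tool.

```python
# Feature rows copied from the GTF excerpt above: (feature type, start, end).
# GTF coordinates are 1-based and inclusive.
# Assumption: all rows belong to one transcript (attributes are elided).
features = [
    ("exon", 3214482, 3216968),
    ("stop_codon", 3216022, 3216024),
    ("CDS", 3216025, 3216968),
    ("CDS", 3421702, 3421901),
    ("exon", 3421702, 3421901),
]


def total_length(rows, feature_type):
    """Sum the lengths of all intervals of one feature type."""
    return sum(end - start + 1 for ftype, start, end in rows if ftype == feature_type)


exon_bases = total_length(features, "exon")
cds_bases = total_length(features, "CDS")
# Exonic bases outside the CDS (UTR plus the stop codon in this excerpt).
non_cds_bases = exon_bases - cds_bases

print(exon_bases, cds_bases, non_cds_bases)  # 2687 1144 1543
```

Under that assumption, 1543 of the 2687 annotated exonic bases lie outside the CDS, so reads mapping there would not be counted with a CDS-only feature type.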

ADD REPLYlink 2.4 years ago
SMILE
• 100
0

You should use exons, not CDS. As Devon explained, RNA is composed of exons. Only a subset of these exons is coding.

ADD REPLYlink 2.4 years ago
Nicolas Rosewick
7.7k
0

I am not sure "RNA is composed of exons" is true after reading this paper. I don't want to be rude. I just want to know whether my understanding is right.

ADD REPLYlink 2.4 years ago
SMILE
• 100
0

Please brush up on your understanding of the central dogma of molecular biology.

ADD REPLYlink 2.4 years ago
Devon Ryan
90k
0

The biology is too complex...

I know that eukaryotic genomes are pervasively transcribed, that the majority of their bases can be found in primary transcripts, including non-protein-coding transcripts, and that many biological processes use and require noncoding RNAs.

But when we talk about measuring gene expression levels, isn't that based on the assumption that more abundant genes/transcripts are more important and that gene expression levels correspond to protein levels? Since comprehensive protein-level measurement is hard, we measure gene expression based on mRNA instead, and we are selectively blind to the fact that sometimes "mRNA level"≠"Protein level".

After reading this paper, "RNA-seq data analysis at the gene and CDS levels provides a comprehensive view of transcriptome responses induced by 4-hydroxynonenal", I found it reasonable to measure CDS.

Please correct me if my understanding is wrong.

ADD REPLYlink 2.4 years ago
SMILE
• 100
0

If you want to work with bioinformatics then the phrase "the biology is too complex" needs to not be in your vocabulary.

ADD REPLYlink 2.4 years ago
Devon Ryan
90k
0

Sorry about the misunderstanding caused by my poor English; I did not mean anything bad.

What I mean when I say biology is too complex is that when we analyze biological problems, we sometimes have to compromise. For example, when we talk about measuring gene expression levels, it is based on the assumption that more abundant genes/transcripts are more important and that gene expression levels correspond to protein levels (please correct me if my understanding is wrong), and we are selectively blind to the fact that sometimes "mRNA level"≠"Protein level".

After reading this paper, "RNA-seq data analysis at the gene and CDS levels provides a comprehensive view of transcriptome responses induced by 4-hydroxynonenal", I found it reasonable to measure CDS.

So I ask this question. ^^

ADD REPLYlink 2.4 years ago
SMILE
• 100
0

You're misunderstanding what that paper did. They didn't do "differential expression based on CDS"; they actually did differential isoform usage after collapsing isoforms with compatible CDSs. That's a completely different thing. "Differential expression" by itself always uses exons. Anything else needs different terms and will use very different tools (namely, for differential isoform usage or differential CDS usage one would use salmon or kallisto rather than something like featureCounts/STAR). The wording used in the paper you referenced is terrible; the referees should have had that fixed.
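As a rough illustration of what "collapsing isoforms with compatible CDSs" could look like, here is a minimal Python sketch. The transcript names, coordinates, and abundances are invented, `collapse_by_cds` is a hypothetical helper, and "compatible" is simplified here to "identical CDS blocks"; the paper's actual criterion and pipeline may differ.

```python
from collections import defaultdict

# Hypothetical transcript-level abundances (e.g. as estimated by a tool
# such as salmon or kallisto). Each CDS is a tuple of (start, end) blocks.
transcripts = {
    "tx1": {"cds": ((100, 200), (300, 400)), "abundance": 10.0},
    "tx2": {"cds": ((100, 200), (300, 400)), "abundance": 5.0},  # same CDS, different UTR
    "tx3": {"cds": ((100, 200), (500, 600)), "abundance": 2.5},
}


def collapse_by_cds(txs):
    """Sum abundances of transcripts that share an identical CDS."""
    groups = defaultdict(float)
    for tx in txs.values():
        groups[tx["cds"]] += tx["abundance"]
    return dict(groups)


collapsed = collapse_by_cds(transcripts)
for cds, abundance in collapsed.items():
    print(cds, abundance)
```

The point of the sketch is that the input is transcript-level quantification, not reads counted over CDS features, which is why the tooling differs from a featureCounts-style workflow.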

ADD REPLYlink 2.4 years ago
Devon Ryan
90k
0

I don't agree with what you said about differential isoform/CDS usage. Differential isoform expression (DIE) and differential isoform usage (DIU) are related but distinct concepts. DIE assesses the difference in absolute expression at the isoform level. In contrast, DIU assesses the difference in relative expression at the isoform level. This is because a gene may have higher or lower expression overall, and it may also switch the usage of some RNA isoforms. For example, a gene may predominantly use one isoform in one tissue and switch to another isoform in another tissue. Such relative expression of an RNA isoform is referred to as isoform usage. Reference1 Reference2 Reference3
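A small numeric sketch of the distinction, in Python. The gene names, isoform names, and abundance values are invented for illustration, and `isoform_usage` is a hypothetical helper, not a function from any package.

```python
# Hypothetical isoform abundances (e.g. TPM) for two genes in two tissues.
abundance = {
    "gene_X": {
        "tissue_1": {"iso_A": 80.0, "iso_B": 20.0},
        "tissue_2": {"iso_A": 160.0, "iso_B": 40.0},  # both isoforms double
    },
    "gene_Y": {
        "tissue_1": {"iso_A": 80.0, "iso_B": 20.0},
        "tissue_2": {"iso_A": 20.0, "iso_B": 80.0},  # isoform switch, same total
    },
}


def isoform_usage(isoforms):
    """Relative expression: each isoform's share of the gene total."""
    total = sum(isoforms.values())
    return {name: value / total for name, value in isoforms.items()}


usage = {
    gene: {tissue: isoform_usage(iso) for tissue, iso in tissues.items()}
    for gene, tissues in abundance.items()
}

for gene, tissues in usage.items():
    print(gene, tissues)
```

In this toy data, gene_X changes in absolute isoform expression (both isoforms double) while its usage stays at 0.8/0.2, whereas gene_Y keeps the same gene total but switches usage from iso_A to iso_B.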

ADD REPLYlink 2.4 years ago
SMILE
• 100
