Question

Finding significance between transcripts in an RNA-Seq experiment

0

7.7 years ago

Assa Yeroslaviz ★ 1.8k

I have a difficult experiment and I need some advice on how to proceed.

I have a data set of ~400 human RNA-seq samples from different risk groups, with various abnormalities as subgroups. I'm interested in finding out whether there is a significant difference in the expression of two specific transcripts of the same gene between two conditions. That is, I would like to know whether the change in expression of TX1 between condition1 and condition2 is significantly larger or smaller than the change in expression of TX2 between the same two conditions.

I don't know how to run this analysis!

What I have tried so far is as follows: I have run DESeq2, limma-voom and DEXSeq, but all they give me is the significance of a specific gene between two conditions. Even if I run them at the transcript level, I can only find out whether TX1 is differentially expressed between condition1 and condition2.

I have already asked a few questions about this experiment previously (e.g. here, or here), but they didn't really help me much. I think my problem was not understandable, so I hope I have made it clearer here.

For completeness, I have also run Kallisto (with bootstrapping) & sleuth and Salmon on the complete data set. I now have a list of the counts from Kallisto at the transcript level and at the gene level (imported by tximport), and the TPM values calculated by Kallisto and by Salmon. But I can still only compare one transcript between two conditions.

I can't seem to find a way to analyse the data in the way I need it and get the results for a comparison of two transcripts over two conditions.

I would appreciate your advice on how to analyse the data. Is there a statistically robust way to analyse this kind of data?

Thanks

Assa

tximport deseq2 edger limma kallisto • 3.0k views

ADD COMMENT • link 7.7 years ago by Assa Yeroslaviz ★ 1.8k

1

The other approach would be the cufflinks approach: calculate the Jensen-Shannon divergence between the transcript distributions in the two conditions.

ADD REPLY • link 7.7 years ago by i.sudbery 19k

0

I know cufflinks. What does this value give me?

ADD REPLY • link 7.6 years ago by Assa Yeroslaviz ★ 1.8k

0

The different isoform fractions for a gene form a distribution. If you have two conditions, you can calculate how different these distributions are using the JS divergence. Thus, if one isoform is dominant in one condition and a different isoform is dominant in the other, there will be a large JS divergence.

AFAIK there hasn't been a lot of benchmarking of this approach, and if it's anything like cufflinks' differential calling, I might be a bit wary of it. But you could possibly reimplement it in R to use TPMs or counts; I don't remember it being particularly difficult to calculate.
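
As a rough illustration of the calculation described above (not cufflinks' own implementation), here is a minimal Python sketch of the JS divergence between the isoform fractions of one gene in two conditions. The function names and the TPM values are hypothetical, and it assumes you have already summarised each isoform to one abundance per condition (e.g. a mean TPM).

```python
import math

def isoform_fractions(abundances):
    """Convert per-isoform abundances of one gene into fractions summing to 1."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("gene has no expression")
    return [a / total for a in abundances]

def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = no overlap)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical mean TPMs for TX1 and TX2 in each condition
cond1 = isoform_fractions([80.0, 20.0])
cond2 = isoform_fractions([20.0, 80.0])

same = js_divergence(cond1, cond1)      # identical distributions
switched = js_divergence(cond1, cond2)  # dominant isoform switches
print(same, switched)
```

This only gives a distance between two distributions; turning it into a significance call would still need some model of the variability between samples.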

ADD REPLY • link 7.6 years ago by i.sudbery 19k

0

To get an idea of whether the mean fold change differs at all between the conditions, plot two box plots:

Fold change values of TX1 vs TX2 in all samples of condition1
Fold change values of TX1 vs TX2 in all samples of condition2

You don't need to normalise the data here, as you are looking at the fold change within the same sample.
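
A minimal Python sketch of the within-sample comparison suggested above, under some assumptions: the counts are made up, the pseudocount of 0.5 is an arbitrary choice to avoid taking the log of zero, and the five-number summary stands in for the box plot itself.

```python
import math
import statistics

def log2_ratios(tx1, tx2, pseudocount=0.5):
    """Per-sample log2(TX1 / TX2); the pseudocount (an assumption) avoids log(0)."""
    return [math.log2((a + pseudocount) / (b + pseudocount))
            for a, b in zip(tx1, tx2)]

def box_summary(values):
    """Minimum, quartiles and maximum: the numbers a box plot would display."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}

# Hypothetical counts, one entry per sample
cond1_tx1 = [120, 95, 140, 110, 130]
cond1_tx2 = [60, 50, 65, 58, 70]
cond2_tx1 = [80, 75, 90, 85, 70]
cond2_tx2 = [150, 160, 140, 170, 155]

summary1 = box_summary(log2_ratios(cond1_tx1, cond1_tx2))
summary2 = box_summary(log2_ratios(cond2_tx1, cond2_tx2))
print(summary1["median"], summary2["median"])
```

Because both transcripts are measured in the same sample, any per-sample scaling factor cancels in the ratio, which is why no between-sample normalisation is applied here.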

ADD REPLY • link 7.7 years ago by GouthamAtla 12k

1

Basically what you want is a model which asks if the fold change in one transcript is predicted by the other.

Or, equivalently, you want to test whether the contrast (Trans1CondA - Trans1CondB) - (Trans2CondA - Trans2CondB) ≠ 0, i.e. an interaction between transcript and condition.

Have a look at DEXSeq, it might help you with what you want - it basically asks if the fold change in one exon of a gene is predicted by the fold change of other exons for the gene.
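
To make the contrast concrete: on the log scale, (Trans1CondA - Trans1CondB) - (Trans2CondA - Trans2CondB) rearranges to (Trans1CondA - Trans2CondA) - (Trans1CondB - Trans2CondB), i.e. the difference in mean per-sample log-ratio between the two conditions. The Python sketch below computes that and attaches a simple label-permutation p-value. This is only a stand-in to show the logic, not what DEXSeq does; the function names and the log-ratio values are hypothetical.

```python
import random
import statistics

def interaction_contrast(lr_a, lr_b):
    """Difference in mean per-sample log2(TX1/TX2) between conditions A and B."""
    return statistics.mean(lr_a) - statistics.mean(lr_b)

def permutation_p_value(lr_a, lr_b, n_perm=2000, seed=0):
    """Two-sided p-value obtained by shuffling the condition labels."""
    rng = random.Random(seed)
    observed = abs(interaction_contrast(lr_a, lr_b))
    pooled = list(lr_a) + list(lr_b)
    n_a = len(lr_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(interaction_contrast(pooled[:n_a], pooled[n_a:])) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero
    return (hits + 1) / (n_perm + 1)

# Hypothetical per-sample log2(TX1/TX2) values
cond_a = [1.0, 0.9, 1.1, 1.2, 0.8, 1.0]
cond_b = [-0.9, -1.0, -0.8, -1.1, -1.2, -1.0]

contrast = interaction_contrast(cond_a, cond_b)
p_value = permutation_p_value(cond_a, cond_b)
print(contrast, p_value)
```

A permutation test ignores covariates such as risk group, so with ~400 samples and several subgroups a proper model with an interaction term is likely the better route.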

ADD REPLY • link 7.7 years ago by i.sudbery 19k

0

Hi, thanks for the help. I have already had a look at DEXSeq, but as far as I can tell, it doesn't really help me. It shows me whether an exon is changed between two conditions (and these are DEXSeq-specific exons, which AFAIK can be only part of a biological exon), but not its relationship to a different transcript in the same condition, or the changes happening between two comparisons, which is what I need here. Am I missing something in DEXSeq?

ADD REPLY • link 7.6 years ago by Assa Yeroslaviz ★ 1.8k

0

When DEXSeq calls an exon as differential, it is not saying that the exon is different between conditions; it's saying that it's different relative to the other exons in the gene. Thus, if the whole gene goes up or down, none of the exons will be significant. However, if only a single exon changes while the rest of the exons in the gene don't, or, conversely, if one exon stays the same while the rest of the exons in the gene change, then that exon will be called significant.

You might be able to put counts for transcripts instead of counts for exons into DEXSeq to get what you want. But I'd check with the authors on support.bioconductor.org first.
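
To illustrate the "relative to the rest of the gene" idea with numbers, here is a small Python sketch that looks at each feature's share of the gene's reads. It is a toy illustration with made-up counts and hypothetical function names; DEXSeq itself fits a negative binomial GLM with an exon-by-condition interaction rather than comparing raw proportions.

```python
def relative_usage(counts):
    """Fraction of a gene's reads assigned to each feature (exon or transcript)."""
    total = sum(counts)
    return [c / total for c in counts]

def max_usage_shift(counts_a, counts_b):
    """Largest absolute change in relative usage between two conditions."""
    usage_a = relative_usage(counts_a)
    usage_b = relative_usage(counts_b)
    return max(abs(a - b) for a, b in zip(usage_a, usage_b))

baseline = [100, 200, 300]       # hypothetical counts for three exons
whole_gene_up = [200, 400, 600]  # every exon doubles: usage is unchanged
one_exon_up = [100, 200, 900]    # only the third exon changes

shift_gene = max_usage_shift(baseline, whole_gene_up)
shift_exon = max_usage_shift(baseline, one_exon_up)
print(shift_gene, shift_exon)
```

The same calculation applies unchanged if the three numbers are transcript counts instead of exon counts, which is the substitution suggested above.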

ADD REPLY • link 7.6 years ago by i.sudbery 19k

0

Is there a way to apply this kind of contrast matrix to either DESeq2 or limma? I have not yet found a way to add the transcript to this kind of analysis :-(

ADD REPLY • link 7.6 years ago by Assa Yeroslaviz ★ 1.8k

0

There is certainly no mechanism intended to allow you to do this.

ADD REPLY • link 7.6 years ago by i.sudbery 19k

0

Have you heard of RATS? It might be worth taking a look:

https://github.com/bartongroup/RATS

ADD REPLY • link 7.4 years ago by miyakokodama ▴ 20