Biostar Beta. Not for public use.
Question: Why Cuffdiff Output Not Significant?
0
Entering edit mode

Hello,

I am performing RNAseq analysis using TopHat and Cufflinks programs. I have a question for why a gene I am interested is being called Not Significant by Cuffdiff.

Here the output for my gene of interest Rh4:

gene sample_1 sample_2 status value_1 value_2 log2(fold_change) test_stat p_value q_value signific

Rh4 WT MUT OK 1158.87 0.856892 -10.4013 -4.12026 0.1178 0.404333 no

I am not familiar with the statistics Cuffdiff uses to call my gene of interest Not Significant. Can someone explain it to me?

Thank you very much.

Rodrigo

ADD COMMENTlink 4.7 years ago rodrigo.nicodemos82 • 10 • updated 4.6 years ago Biostar 20
Entering edit mode
0

what's the variance, do you have any replicates?

ADD REPLYlink 4.7 years ago
Michael Dondrup
46k
Entering edit mode
0

As Michael says, Variance or replicates will most likely be your issue. Have you thought about plotting the FPKM values of each replicate to see how variable it is? - You can use cummeRbund to read in a cuffdiff output and extract the FPKM values.

ADD REPLYlink 4.7 years ago
andrew.j.skelton73
5.7k
Entering edit mode
0

The FPKM values for the replicates were: WT_1: 1245.24, WT_2: 874.11 and MUT_1: 0.897, MUT_2: 0.764, MUT_3: 0.823. The third replicate for the WT condition gave very few reads, which is why I didn't include it. I only used two replicates from the MUT condition to compare to the two replicates from the WT condition. I am not convinced it is a replicate issue or a variance issue, since the trend it pretty obvious to me, but Cuffdiff it not picking it up statistically.

ADD REPLYlink 4.7 years ago
rodrigo.nicodemos82
• 10
Entering edit mode
0

Have you ran your results through DESeq2? - DESeq2 has a function (I think it's called a Cook's distance), to figure out if samples have too high a variance for a given condition, thus it excludes them. The other thing is to pull out a few of the "Top Hits" from your analysis and see what their FPKM's are like, see if the variance of your wild type fits with the difference your seeing in the case above.

ADD REPLYlink 4.7 years ago
andrew.j.skelton73
5.7k
Entering edit mode
0

this is actually an interesting situation - I agree with the OP that the gene should come up as differentially expressed - but the way this and probably all other tools work is a bit mysterious and frankly occasionally plain wrong.

Even before multiple comparison testing the p value is already huge 0.1

I have a hard time believing that getting a 1245 and 874 versus 0.9 and 0.7 could possibly be that likely to obtain.

Caveat emptor before buying into any tool.

ADD REPLYlink 4.6 years ago
Istvan Albert
80k
Entering edit mode
0

That seems pretty wacky, but we're only seeing 1 gene. In general the p-values you see will depend also on the rest of the population (though that gene is very extreme). The numbers above suggest that you have highly scattered data - that your wt and mutants are very different from each other overall (is this true?). The second thing is that Cuffdiff works by summing the values of all transcripts from the locus (which makes for odd numbers sometime). Is this a complicated locus? Do you have the individual transcript values? Have you looked at the read mappings in a genome browser? Any artifacts going on?

ADD REPLYlink 4.6 years ago
seidel
6.8k

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.0