Cufflinks & Cuffdiff Novel Isoform Detection
10.1 years ago

Hi,

I have an RNA-seq experiment with ~30M reads per sample and two sample types: 10 biological replicates of SampleA and 6 biological replicates of SampleB (human, hg19).

I ran the Tuxedo pipeline: TopHat -> Cufflinks -> Cuffmerge -> Cuffdiff.

While inspecting the list of potentially novel isoforms, I found one of interest. Its first exon contains a sequence that is a predicted protein-coding domain and fits perfectly between a start and a stop. However, Cufflinks/Cuffdiff also reports an extra ~34 bases at the front of the exon that seem out of place.

Looking at the coverage in every sample's BAM file, it is clear that the sequence I suspected, the one that fits perfectly, is the true sequence. So my question is: where could these extra ~34 bases have come from in the Cufflinks/Cuffdiff prediction?
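
To make the coverage comparison concrete, here is a minimal sketch of how the read-supported exon start could be located from per-base depth. It assumes the depth values are tab-separated lines of chromosome, 1-based position, and depth (the layout `samtools depth` writes); the function name, the coordinates, and the `min_depth` threshold are hypothetical and only for illustration.

```python
def first_supported_base(depth_lines, chrom, start, end, min_depth=5):
    """Return the lowest 1-based position in [start, end] on `chrom`
    whose depth is at least `min_depth`, or None if there is none.

    `depth_lines` is any iterable of "chrom<TAB>pos<TAB>depth" strings.
    Positions missing from the input are treated as having no coverage.
    """
    supported = []
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split("\t")
        line_chrom, pos, depth = fields[0], int(fields[1]), int(fields[2])
        if line_chrom == chrom and start <= pos <= end and depth >= min_depth:
            supported.append(pos)
    return min(supported) if supported else None


# Toy data (made up): the assembled exon starts at 100, but reads only
# begin to cover it at position 134.
toy_depth = [f"chr1\t{p}\t{0 if p < 134 else 20}" for p in range(100, 201)]
assembled_start = 100
supported_start = first_supported_base(toy_depth, "chr1", assembled_start, 200)
unsupported_bases = supported_start - assembled_start
```

Running this per sample and comparing `supported_start` with the exon start in the assembled GTF would show whether the extra bases have any read support at all.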

Could it have something to do with the RABT assembly using faux reads?
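
One rough way to test this idea, assuming the faux reads are tiled along the transcripts in the reference annotation given to Cufflinks, is to check whether any annotated exon overlaps the extra ~34 bases. The sketch below scans GTF lines for overlapping `exon` features; the function name and the coordinates are hypothetical, and a hit would only be a hint, not proof, that the extension comes from the annotation rather than from real reads.

```python
def reference_exons_overlapping(gtf_lines, chrom, start, end):
    """Return (start, end, strand, attributes) for every `exon` feature in
    `gtf_lines` that overlaps the 1-based inclusive interval [start, end].
    """
    hits = []
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # GTF has 9 columns; column 3 is the feature type
        f_chrom, f_start, f_end = fields[0], int(fields[3]), int(fields[4])
        # Two closed intervals overlap when each starts before the other ends.
        if f_chrom == chrom and f_start <= end and f_end >= start:
            hits.append((f_start, f_end, fields[6], fields[8]))
    return hits


# Toy annotation (made up): one annotated exon spans the extra bases.
toy_gtf = [
    'chr1\ttoy\texon\t90\t140\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr1\ttoy\texon\t500\t600\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    'chr2\ttoy\texon\t100\t133\t.\t+\t.\tgene_id "G2"; transcript_id "T2";',
]
overlaps = reference_exons_overlapping(toy_gtf, "chr1", 100, 133)
```

If `overlaps` is empty for the real annotation, the faux-read explanation becomes less likely and the extension would have to come from somewhere else, such as a few stray alignments.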

I've tried every combination of Cufflinks switches, and every time that novel isoform is detected, the 34 bases are prepended to the exon, which doesn't make sense.

If anyone has any insights, I'd be very grateful!

Thanks

isoform cufflinks cuffdiff