Cufflinks Skips Loci, Marks Them With HIDATA
10.1 years ago
sanderrr010 ▴ 10

Hi Guys,

When I try to run cufflinks, with the command:

cufflinks --GTF /.../B0510_manual_reindexed_v2.gff --min-isoform-fraction 0.5 --pre-mrna-fraction 0.05 --max-intron-length 2000 --small-anchor-fraction 0.06 --min-intron-length 30 --overlap-radius 1 --3-overhang-tolerance 0 --intron-overhang-tolerance 0 --no-faux-reads -p 8 -o /.../cufflinks_out_V3/Apo12B/ /media/cinerea/BGI_RNAseq_V2/.../Apo12B/accepted_hits.bam

Cufflinks just skips a huge part (about 3.4 Mb) of a scaffold at the following step:

You are using Cufflinks v2.1.1, which is the most recent release.
[14:00:50] Loading reference annotation.
[14:00:50] Inspecting reads and determining fragment length distribution.
Processing Locus B0510_5C01:490546-492362 [ ] 0%

I tried to tweak the parameter --max-bundle-frags up and down, but this does not make any difference. In isoforms.fpkm_tracking the transcripts are marked with HIDATA. The reads seem fine at this locus.
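To see exactly which transcripts were flagged, the tracking file can be filtered on its status column. This is only a minimal sketch: it assumes isoforms.fpkm_tracking is tab-separated with a header row that includes tracking_id, locus and FPKM_status columns. The function name and the sample rows are made up for illustration.

```python
import csv
import io


def transcripts_with_status(handle, status="HIDATA"):
    """Return (tracking_id, locus) pairs whose FPKM_status equals `status`.

    Assumes a tab-separated file with a header row that contains
    'tracking_id', 'locus' and 'FPKM_status' columns.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        (row["tracking_id"], row["locus"])
        for row in reader
        if row["FPKM_status"] == status
    ]


# Tiny made-up example; a real tracking file has more columns.
example = io.StringIO(
    "tracking_id\tlocus\tFPKM\tFPKM_status\n"
    "tx1\tscaffoldA:100-900\t12.5\tOK\n"
    "tx2\tscaffoldA:5000-3500000\t0\tHIDATA\n"
)
hidata = transcripts_with_status(example)
print(hidata)
```

On a real run, the same function could be given `open("isoforms.fpkm_tracking", newline="")` instead of the in-memory example.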

What is wrong? Any ideas?

EDIT: I inspected the verbose logs and saw that exactly the part being skipped is treated by Cufflinks as one big bundle with 1M reads on it. I lowered --max-bundle-length, but this does not seem to have any effect at all.

EDIT2: It filters out the large bundle after the "processing" step, resulting in no output at all for the genes in that locus. Where does Cufflinks get its bundle sizes from? Can I adjust this?

10.1 years ago

You have to increase the --max-bundle-frags option to a large enough number. By default, Cufflinks will skip those transcripts/regions that have >1 million reads mapped to them. I usually use something like 10^9 to be on the safe side :-)


It does not make any difference. When I adjust this parameter, it has no influence on the result. So the number of reads on that locus is not too high; I think the locus itself is too big. I want to know where Cufflinks gets its bundle sizes from, and whether I can change this.


OK. I think you need to increase the max bundle length then. Have you tried that? In the explanation above, you only wrote that you had lowered it. In a run I looked at yesterday I used 10 million for both the size and length flags and it worked in that case. I am not too sure how the bundles are defined, unfortunately.
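For reference, a rough sketch of assembling such a command with both limits raised to 10 million. It only builds and prints the argument list, it does not run Cufflinks. The helper name and the file names are placeholders, and the other options from the original command are left out for brevity.

```python
import shlex


def cufflinks_command(gff, bam, out_dir,
                      max_bundle_frags=10_000_000,
                      max_bundle_length=10_000_000,
                      threads=8):
    """Build (but do not run) a Cufflinks argument list with raised bundle limits."""
    return [
        "cufflinks",
        "--GTF", gff,
        "--max-bundle-frags", str(max_bundle_frags),
        "--max-bundle-length", str(max_bundle_length),
        "-p", str(threads),
        "-o", out_dir,
        bam,
    ]


# Placeholder file names, not the paths from the question.
args = cufflinks_command("annotation.gff", "accepted_hits.bam", "cufflinks_out")
print(shlex.join(args))
```

The resulting list could be passed to `subprocess.run(args, check=True)` once the paths point at real files.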


Something really stupid was the cause of all this. Inside my GFF file there was a gene of size 3.5 Mb, so Cufflinks takes it as one bundle. Not a Cufflinks problem after all.
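A quick way to catch this kind of annotation error is to scan the GFF for unusually long features before running anything. A minimal sketch, assuming standard nine-column, tab-separated GFF lines with 1-based inclusive start and end in columns 4 and 5. The function name, the 1 Mb threshold and the example records are illustrative only.

```python
import io


def long_features(handle, min_length=1_000_000, feature_types=("gene",)):
    """Yield (seqid, start, end, length, attributes) for long annotation features.

    Assumes GFF-style lines: nine tab-separated columns, with 1-based,
    inclusive start and end coordinates in columns 4 and 5.
    """
    for line in handle:
        # Skip blank lines and comment/directive lines.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] not in feature_types:
            continue
        start, end = int(cols[3]), int(cols[4])
        length = end - start + 1
        if length >= min_length:
            yield cols[0], start, end, length, cols[8]


# Made-up records: one ordinary gene and one spanning about 3.5 Mb.
example = io.StringIO(
    "##gff-version 3\n"
    "scaffoldA\tmanual\tgene\t1000\t2500\t.\t+\t.\tID=geneA\n"
    "scaffoldA\tmanual\tgene\t5000\t3505000\t.\t+\t.\tID=geneB\n"
)
suspicious = list(long_features(example))
for seqid, start, end, length, attrs in suspicious:
    print(seqid, start, end, length, attrs)
```

Any feature this reports is worth checking by hand, since a mistyped coordinate in a manually edited annotation can produce exactly this kind of oversized gene.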


So that would have been solved by increasing --max-bundle-length to 10 million I guess.
