cuffmerge does not give merged.gtf
9.6 years ago
Floydian_slip ▴ 170

Hi,

I am running cuffmerge to merge a few GTF files that I created with cufflinks. This is the command I use:

cuffmerge -o merged -s genome.fa list.txt

list.txt contains the list of GTF files that I want to merge, given as relative paths.

Here is the stdout (no error):

[Fri Oct  3 16:15:22 2014] Beginning transcriptome assembly merge
-------------------------------------------
[Fri Oct  3 16:15:22 2014] Preparing output location merged/
Warning: no reference GTF provided!
[Fri Oct  3 16:15:22 2014] Converting GTF files to SAM
[16:15:22] Loading reference annotation.
....
[16:15:24] Loading reference annotation.
[Fri Oct  3 16:15:24 2014] Assembling transcripts

But there are only these 2 directories in merged:

logs  tmp

There is no merged.gtf.

Has anybody seen and solved this error? There is nothing alarming in the run.log file either. This is how the log file ends:

cufflinks -o merged/ -F 0.05 -q --overhang-tolerance 200 --library-type=transfrags -A 0.0 --min-frags-per-transfrag 0 --no-5-extend -p 1 merged/tmp/mergeSam_fileoCJV2Y

Any help will be appreciated.

Thanks

RNA-Seq Assembly sequencing alignment • 5.6k views

Hi, please see my follow-up question below.

I am facing the same issue. Please help

Even if it appears to be the same issue, it always helps to post the command line you are using and the exact error you are encountering. There is always something slightly different in each of these cases.

I am having the same problem: cuffmerge isn't producing the merged GTF. I don't know what is going wrong. I am pretty sure I entered everything correctly. What did you do to fix this?

Are you getting an error?

No, I am not getting an error. It runs for hours and then there is no GTF file, just the run log and an empty temp file.

I have another question as well. When do I need to include the reference annotation and the reference sequence FASTA? Cufflinks has those options too, and I do not know when I should supply that information.

Thanks.

See this for the answer to your reference question. Do you have a lot of samples? Are the GTF files large?

Thank you so much.

I have ten files in total but I have been running two just to test it out. One is about 15,000 KB and the other is 6,000 KB.

I am running this command:

cuffmerge -g genes.gtf -s genome.fa 

file 1 path

file 2 path

I was reading something and it said there had to be one file per line but I don't exactly know how to format it that way. I tried it a bunch of times, just written a little differently, and it didn't work at all.

How could I make an assemblies.txt file in an editor that I can add to my script? Maybe it will work that way. But based on how things have been going, I am not too sure.

What could be going wrong?

Thanks again for the help.

You make the assemblies.txt file in a text editor (use any you like, just save the file as plain text).

The assemblies.txt file will have these contents, one path per line:

/path_to/file1
/path_to/file2
/path_to/file3

The command would then be:

cuffmerge -g genes.gtf -s genome.fa assemblies.txt
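
If you would rather not type the list by hand, it can also be generated from the shell. This is only a minimal sketch: it assumes the cufflinks outputs are files named transcripts.gtf somewhere under one directory, and the function name make_assembly_list is made up for illustration (it is not part of cuffmerge).

```shell
# Write the absolute path of every transcripts.gtf found under a directory
# to a list file, one path per line.
# Usage: make_assembly_list <search_dir> <list_file>
# (hypothetical helper, not a cuffmerge command)
make_assembly_list() {
    search_dir=$1
    list_file=$2
    # Resolve to an absolute path so the list works from any directory.
    abs_dir=$(cd "$search_dir" && pwd) || return 1
    find "$abs_dir" -type f -name 'transcripts.gtf' | sort > "$list_file"
    # Report how many assemblies were found.
    echo "$(wc -l < "$list_file" | tr -d ' ') GTF file(s) listed in $list_file"
}
```

For example, `make_assembly_list cufflinks_out assemblies.txt` (where cufflinks_out is a placeholder for your own output directory) writes the list, and assemblies.txt can then be passed to cuffmerge as in the command above.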

Thank you. I just ran it and it worked! The GTF file is in my folder.

What do these outputs mean though? There is quite a long list of these:

SAM error on line 104: found spliced alignment without XS attribute

Warning: couldn't find fasta record for 'chr1_GL456221_random'!

I just have a few more questions:

What is the difference between including the reference annotation and sequence in both cufflinks and cuffmerge as opposed to just including that information in cuffmerge? I am looking to find novel genes in my study, so will preexisting reference information be a hindrance?

How does an annotation file made by cufflinks differ from one included in a reference genome download?

Also, I plan to input this merged gtf file into Cuffdiff. Do I make one large file that includes the two different experimental groups I am comparing?

I am having trouble running Cuffdiff as well. I have been trying for a long time now. When I enter the command, this comes up:

You are using Cufflinks v2.2.1, which is the most recent release.

And then it produces a bunch of empty output files.

I don't know if I am formatting the command correctly or what. I have two groups of files (each group has 5 files in it). How do I format that command? I tried another text file but it said it doesn't recognize that file type.

Thank you for all of the help. I am very new at all of this.

9.6 years ago
Manvendra Singh ★ 2.2k

You probably need to check your genome.fa and list.txt files.

Use the same genome.fa that you used to build the bowtie index you supplied during mapping.

list.txt should list the GTF files from your cufflinks runs, one per line, with full paths.

If both are set up this way, cuffmerge should run smoothly.

I ran cuffmerge like this:

/usr/local/bin/cuffmerge \
  -p 4 \
  -o output \
  -g gtf_files/Human_gencodeV14_anno.gtf \
  -s ../../hg19.fa \
  --keep-tmp \
  gtf_assembly/gtf_assembly.txt
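
Both checks can be scripted before launching cuffmerge. The following is a rough sketch, assuming the list file holds one GTF path per line and the genome is a plain (uncompressed) FASTA. The function names check_gtf_list and missing_seqnames are made up for illustration.

```shell
# Print every path in the list file that is missing or empty.
# Returns non-zero if any such path is found.
check_gtf_list() {
    list_file=$1
    rc=0
    while IFS= read -r gtf || [ -n "$gtf" ]; do
        [ -z "$gtf" ] && continue          # skip blank lines
        if [ ! -s "$gtf" ]; then
            echo "missing or empty: $gtf"
            rc=1
        fi
    done < "$list_file"
    return $rc
}

# Print sequence names used in a GTF (column 1) that have no
# record of the same name in the FASTA headers.
missing_seqnames() {
    gtf=$1
    fasta=$2
    fasta_names=$(mktemp)
    gtf_names=$(mktemp)
    # FASTA record name = header text after '>' up to the first whitespace.
    grep '^>' "$fasta" | sed 's/^>//' | awk '{print $1}' | LC_ALL=C sort -u > "$fasta_names"
    # GTF sequence name = first tab-separated column of non-comment lines.
    awk -F'\t' '!/^#/ && NF {print $1}' "$gtf" | LC_ALL=C sort -u > "$gtf_names"
    # comm -23 keeps lines found only in the first file.
    LC_ALL=C comm -23 "$gtf_names" "$fasta_names"
    rm -f "$fasta_names" "$gtf_names"
}
```

Running `check_gtf_list list.txt` should print nothing if every listed file exists and is non-empty. Any name printed by `missing_seqnames some.gtf genome.fa` is a sequence that the GTF refers to but the FASTA does not contain, which is the kind of mismatch behind a "couldn't find fasta record" warning.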

Hi, thanks for the reply. I checked my list.txt file and did not find anything wrong with it. Here is what it looks like:

/full/path/to/1.transcripts.gtf
/full/path/to/2.transcripts.gtf
/full/path/to/3.transcripts.gtf
/full/path/to/4.transcripts.gtf

I also checked the genome.fa file and it is the same one I used for building transcripts. But still no merged.gtf! I only see the logs and tmp directories in the output directory.

Here is the command:

cuffmerge -o merged -s genome.fa list.txt

So, is there anything else that I can do/fix?

Do I absolutely need to provide a reference GTF file? How does it affect the results? I am merging transcripts that are non-coding and therefore are not present in the reference GTF file.

Please help.

Best.

Things look fine.

Does your system have enough memory to run this?

Maybe you need to use multiple cores by passing the -p parameter.

Yeah, the GTF files that I am merging are very small, so memory should not be an issue. I tried with -p and still no merged.gtf.

Would supplying a reference gtf help? But then how would it affect the results if the transcripts that I am merging are non-coding and not present in the reference gtf?

Thanks

Providing a GTF file helps assign the nearest reference IDs to the assembled transcripts, which can make downstream analyses easier. I provide the GTF file to tophat as well (along with cufflinks and cuffmerge). It can help avoid assigning multiple XLOC IDs to the same gene, which can improve quantitation. It doesn't matter if your gene of interest is not in the GTF: a transcript will still be assembled, provided tophat wasn't made to align to the GTF file ONLY.

Okay, thanks. I just have a few follow-up questions.

What is the difference between including the reference information in both cufflinks and cuffmerge as opposed to just including that information in cuffmerge? I am looking to find novel genes in my study, so will preexisting reference information be a hindrance?

Will using the GTF in the alignment (I use hisat2) help improve the lower alignment rates that occurred when just the gene FASTA file was used as a reference?

The pre-existing GTF file should not be a hindrance. It just won't be able to assign any known gene information to some genes and transcripts. But those genes will still have XLOC IDs, TCONS IDs, and location information, which you can use. As far as I know, including the GTF should not increase the alignment rate.
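
To get a feel for how many merged transcripts ended up without a known-gene match, the attribute column of merged.gtf can be tallied. This is a sketch under assumptions: it presumes each record carries transcript_id and class_code attributes (cuffmerge output usually has class_code when a reference GTF is given with -g, but check your own file), and the function name count_class_codes is made up. In the cuffcompare class codes, "u" marks an unknown, intergenic transcript and "=" a complete match to a reference transcript.

```shell
# Count transcripts per class_code in a GTF, counting each transcript_id once
# even though it appears on several exon lines.
# Prints "<count> <code>" per code, most frequent first.
# (hypothetical helper; assumes transcript_id and class_code attributes exist)
count_class_codes() {
    gtf=$1
    awk -F'\t' '
        !/^#/ {
            tid = ""; code = ""
            # Extract the quoted values from the attribute column (column 9).
            if (match($9, /transcript_id "[^"]*"/)) tid  = substr($9, RSTART + 15, RLENGTH - 16)
            if (match($9, /class_code "[^"]*"/))    code = substr($9, RSTART + 12, RLENGTH - 13)
            if (tid != "" && code != "" && !(tid in seen)) { seen[tid] = 1; n[code]++ }
        }
        END { for (c in n) print n[c], c }
    ' "$gtf" | sort -k1,1nr -k2,2
}
```

Calling `count_class_codes merged/merged.gtf` prints one line per class code. Transcripts counted under "u" are candidates for novel loci, which still have their XLOC and TCONS IDs.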

You can find some information on including a reference in cufflinks here: https://genomebiology.biomedcentral.com/articles/10.1186/gb-2011-12-3-r22
