Indexing a genome in tophat2
8.6 years ago
zizigolu ★ 4.3k

Hi all,

I have a reference FASTA named CDS.fa and a genes.gtf. I don't want to use a pre-built genome or transcriptome index, but I'm confused about what the command below should look like when I don't have a pre-built index:

tophat2 -p 8 -G genes.gtf genome file.fastq

tophat2 • 2.1k views

Which index? The genomic index or the transcriptomic one?


Devon,

I tried this command:

[izadi@lbox161 bowtie2-2.2.5]$ $TOP/tophat2 -G genes.gtf --transcriptome-index=transcriptome CDS

[2015-09-14 13:10:32] Building transcriptome files with TopHat v2.1.0
-----------------------------------------------
[2015-09-14 13:10:32] Checking for Bowtie
          Bowtie version:     2.2.5.0
[2015-09-14 13:10:32] Checking for Bowtie index files (transcriptome)..
[2015-09-14 13:10:32] Checking for Bowtie index files (genome)..
[2015-09-14 13:10:32] Checking for reference FASTA file
[2015-09-14 13:10:32] Using pre-built transcriptome data..
-----------------------------------------------
[2015-09-14 13:10:32] Transcriptome files prepared. This was the only task requested.

But when I then tried to run tophat2, this was the result:

[izadi@lbox161 bowtie2-2.2.5]$ $TOP/tophat2 -p 8 -G genes.gtf -o outtt CDS SRR1944936.fastq

[2015-09-14 13:12:39] Beginning TopHat run (v2.1.0)
-----------------------------------------------
[2015-09-14 13:12:39] Checking for Bowtie
          Bowtie version:     2.2.5.0
[2015-09-14 13:12:39] Checking for Bowtie index files (genome)..
[2015-09-14 13:12:39] Checking for reference FASTA file
[2015-09-14 13:12:39] Generating SAM header for CDS
[2015-09-14 13:12:40] Reading known junctions from GTF file
[2015-09-14 13:12:40] Preparing reads
     left reads: min. length=25, max. length=47, 22825804 kept reads (4 discarded)
[2015-09-14 13:14:51] Building transcriptome data files outtt/tmp/genes
[2015-09-14 13:14:52] Building Bowtie index from genes.fa
    [FAILED]
Error: Couldn't build bowtie index with err = 1

I don't know why I can't run tophat here, whereas with the pre-built bowtie2 index from iGenomes tophat runs properly.

Do you know the reason, please?


You need to specify --transcriptome-index=transcriptome every time. I haven't a clue why bowtie2 dies when you don't do this; you'd need to run it manually on genes.fa to find out (perhaps that would let you see a reasonable error message).
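As a rough sketch of that advice, the sequence might look like the following. The file names come from the thread. The `run` helper and the `DRY_RUN` variable are hypothetical additions so the commands can be previewed without executing anything, and the path `outtt/tmp/genes.fa` is only an assumption about where TopHat left its intermediate FASTA (it may already have been cleaned up).

```shell
#!/bin/sh
# Sketch only: run() and DRY_RUN are hypothetical helpers, not part of TopHat.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# 1. Build the Bowtie 2 genome index (skip if CDS.*.bt2 already exists).
run bowtie2-build CDS.fa CDS

# 2. Build the transcriptome index once, without aligning any reads.
run tophat2 -G genes.gtf --transcriptome-index=transcriptome CDS

# 3. Align, passing the same --transcriptome-index so it is reused.
run tophat2 -p 8 -G genes.gtf --transcriptome-index=transcriptome -o outtt CDS SRR1944936.fastq

# If the index build still fails, running bowtie2-build by hand on the
# intermediate FASTA may show the real error (path is an assumption):
#   bowtie2-build outtt/tmp/genes.fa genes_debug
```

By default this only prints the three commands; running it with `DRY_RUN=0` executes them.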


Naming a genomic sequence file CDS.fa makes about as much sense as naming a transcript file all_genomic_contigs.fa. SRR1944936 is from yeast, and both the genome and GTF files are available, e.g. from http://fungi.ensembl.org/, so checking that tophat2 indexes transcriptomes as Ben Langmead and co. intended (without the error posted above) is a 20-minute job at most.
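One quick way to see whether a FASTA and a GTF belong together is to compare their sequence names. The sketch below lists names from column 1 of the GTF that have no matching record in the FASTA. `missing_seqnames` is a hypothetical helper, not part of TopHat or Bowtie, and nothing in this thread confirms that a name mismatch caused the error above; it is only one possibility worth ruling out.

```shell
#!/bin/sh
# Sketch: list GTF sequence names that have no matching FASTA record.
# missing_seqnames is a hypothetical helper, not part of TopHat or Bowtie.
set -eu

missing_seqnames() {
    fasta="$1"
    gtf="$2"
    tmpdir="$(mktemp -d)"
    # FASTA names: text after '>' up to the first whitespace.
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$fasta" | sort -u > "$tmpdir/fasta.names"
    # GTF names: column 1 of tab-separated, non-comment lines.
    awk -F'\t' '!/^#/ && NF > 1 { print $1 }' "$gtf" | sort -u > "$tmpdir/gtf.names"
    # comm -13 keeps lines that appear only in the second (GTF) list.
    comm -13 "$tmpdir/fasta.names" "$tmpdir/gtf.names"
    rm -rf "$tmpdir"
}

# File names from the thread; runs only if both files are present.
if [ -f CDS.fa ] && [ -f genes.gtf ]; then
    missing_seqnames CDS.fa genes.gtf
fi
```

An empty result means every sequence name used in the GTF is present in the FASTA. If the GTF refers to chromosomes while the FASTA holds ORF records, every chromosome name would be listed.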


Yeah, I was rather hoping that she didn't actually have CDS in CDS.fa :)


Devon,

Actually, my adviser asked me to replace the yeast whole-genome FASTA downloaded from iGenomes with orf_coding.fasta.

8.6 years ago

Don't use tophat to align against the CDS, the results won't be worthwhile. If your advisor didn't know this then he/she should stop giving advice.

Edit: "Not worthwhile" is probably too harsh. I should have instead written that the process would be a waste of time. Tophat2 is meant to align reads in a spliced manner. CDS sequences already have the introns removed, so when one aligns against the CDS, one explicitly doesn't want spliced alignments.
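If reads really must be aligned against the CDS, one option (not something prescribed in this answer) is to skip TopHat and call Bowtie 2 directly, which produces unspliced alignments. A minimal sketch follows, assuming orf_coding.fasta and SRR1944936.fastq are in the current directory. The output name `SRR1944936.cds.sam`, the `run` helper and `DRY_RUN` are hypothetical.

```shell
#!/bin/sh
# Sketch only: unspliced alignment against CDS sequences with Bowtie 2.
# run() and DRY_RUN are hypothetical helpers for previewing the commands.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Build a Bowtie 2 index from the ORF/CDS FASTA.
run bowtie2-build orf_coding.fasta orf_coding

# Align single-end reads: -x index basename, -U unpaired reads, -S SAM output.
run bowtie2 -p 8 -x orf_coding -U SRR1944936.fastq -S SRR1944936.cds.sam
```

As written this only prints the two commands; `DRY_RUN=0` runs them.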


Thank you, your comment will save me from getting any more frustrated!
