Question

Indexing a genome in tophat2

0

8.6 years ago

zizigolu ★ 4.3k

Hey guys,

I have a reference genome FASTA named CDS.fa and a genes.gtf. I don't want to use the pre-built genome or transcriptome index, but I'm confused about how the command below should look when I don't have a pre-built index.

tophat2 -p 8 -G genes.gtf genome file.fastq

tophat2 • 2.1k views

ADD COMMENT • link updated 19 months ago by Ram 43k • written 8.6 years ago by zizigolu ★ 4.3k

0

Which index? The genomic index or the transcriptomic one?

ADD REPLY • link 8.6 years ago by Devon Ryan 104k

0

Devon,

I tried this command:

[izadi@lbox161 bowtie2-2.2.5]$ $TOP/tophat2 -G genes.gtf --transcriptome-index=transcriptome CDS

[2015-09-14 13:10:32] Building transcriptome files with TopHat v2.1.0
-----------------------------------------------
[2015-09-14 13:10:32] Checking for Bowtie
          Bowtie version:     2.2.5.0
[2015-09-14 13:10:32] Checking for Bowtie index files (transcriptome)..
[2015-09-14 13:10:32] Checking for Bowtie index files (genome)..
[2015-09-14 13:10:32] Checking for reference FASTA file
[2015-09-14 13:10:32] Using pre-built transcriptome data..
-----------------------------------------------
[2015-09-14 13:10:32] Transcriptome files prepared. This was the only task requested.

But when I tried running tophat2, this was the result:

[izadi@lbox161 bowtie2-2.2.5]$ $TOP/tophat2 -p 8 -G genes.gtf -o outtt CDS SRR1944936.fastq

[2015-09-14 13:12:39] Beginning TopHat run (v2.1.0)
-----------------------------------------------
[2015-09-14 13:12:39] Checking for Bowtie
          Bowtie version:     2.2.5.0
[2015-09-14 13:12:39] Checking for Bowtie index files (genome)..
[2015-09-14 13:12:39] Checking for reference FASTA file
[2015-09-14 13:12:39] Generating SAM header for CDS
[2015-09-14 13:12:40] Reading known junctions from GTF file
[2015-09-14 13:12:40] Preparing reads
     left reads: min. length=25, max. length=47, 22825804 kept reads (4 discarded)
[2015-09-14 13:14:51] Building transcriptome data files outtt/tmp/genes
[2015-09-14 13:14:52] Building Bowtie index from genes.fa
    [FAILED]
Error: Couldn't build bowtie index with err = 1

I don't know why I can't run tophat here, whereas tophat runs properly when I use the pre-built bowtie2 index from iGenomes.

Do you know the reason, please?

ADD REPLY • link updated 19 months ago by Ram 43k • written 8.6 years ago by zizigolu ★ 4.3k

1

You need to specify --transcriptome-index=transcriptome every time. I haven't a clue why bowtie2 dies when you don't do this; you'd need to run it manually on genes.fa to find out (perhaps that would let you see a reasonable error message).
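
As a rough sketch of the advice above (not a tested recipe): the transcriptome index is built once, and the same --transcriptome-index prefix is then passed on every alignment run. The file names are the ones used in this thread. The run helper, the RUN variable, the genes_debug prefix, and the location of genes.fa (guessed from the log output above, and possibly cleaned up after a failed run) are assumptions for illustration. Commands are only printed unless RUN=1 is set.

```shell
set -eu

GENOME_INDEX="CDS"             # Bowtie2 index prefix used in the thread
TX_INDEX="transcriptome"       # transcriptome index prefix used in the thread
GTF="genes.gtf"
READS="SRR1944936.fastq"
GENES_FA="outtt/tmp/genes.fa"  # assumption: inferred from the TopHat log

# Print each command; execute it only when RUN=1 (dry run by default).
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# 1. Build the transcriptome index once (no reads given).
run tophat2 -G "$GTF" --transcriptome-index="$TX_INDEX" "$GENOME_INDEX"

# 2. Pass the same transcriptome index on every alignment run.
run tophat2 -p 8 -G "$GTF" --transcriptome-index="$TX_INDEX" \
    -o outtt "$GENOME_INDEX" "$READS"

# 3. To see why index building failed, run bowtie2-build by hand on genes.fa.
run bowtie2-build "$GENES_FA" genes_debug
```

Running step 3 directly should surface bowtie2-build's own error message instead of TopHat's generic "err = 1".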

ADD REPLY • link 8.6 years ago by Devon Ryan 104k

1

Using CDS.fa as the name of a genomic sequence file makes as much sense as using all_genomic_contigs.fa to store transcripts. SRR1944936 is from yeast, and both the genome and GTF files are available, e.g. from http://fungi.ensembl.org/, so checking that tophat2 works as Ben Langmead and co. intended for indexing transcriptomes (i.e. without the error posted above) is a 20-minute job at most.
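
A minimal sketch of that check, assuming the yeast genome FASTA and GTF have already been downloaded: build a Bowtie2 index from the genome with bowtie2-build, then give tophat2 the index prefix. The names genome.fa, genome, and tophat_out are placeholders, and the run helper and RUN variable are illustrative; only SRR1944936.fastq and genes.gtf come from the thread. Commands are only printed unless RUN=1 is set.

```shell
set -eu

GENOME_FA="genome.fa"     # placeholder: whole-genome FASTA, not CDS
INDEX="genome"            # placeholder: Bowtie2 index prefix
GTF="genes.gtf"
READS="SRR1944936.fastq"

# Print each command; execute it only when RUN=1 (dry run by default).
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Build the genome index: bowtie2-build <reference_in> <index_prefix>
run bowtie2-build "$GENOME_FA" "$INDEX"

# Align against the freshly built index; tophat2 takes the prefix, not the FASTA.
run tophat2 -p 8 -G "$GTF" -o tophat_out "$INDEX" "$READS"
```

Keeping the FASTA next to the index under the same prefix (genome.fa for prefix genome) lets TopHat find the reference sequence without regenerating it from the index.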

ADD REPLY • link updated 19 months ago by Ram 43k • written 8.6 years ago by Darked89 4.6k

0

Yeah, I was rather hoping that she didn't actually have CDS in CDS.fa :)

ADD REPLY • link 8.6 years ago by Devon Ryan 104k

0

Devon,

Actually, my adviser asked me to replace the yeast whole-genome FASTA downloaded from iGenomes with orf_coding.fasta...

ADD REPLY • link 8.6 years ago by zizigolu ★ 4.3k

Ram · Accepted Answer · 2015-09-15

2

8.6 years ago

Devon Ryan 104k

Don't use tophat to align against the CDS; the results won't be worthwhile. If your advisor didn't know this, then he/she should stop giving advice.

Edit: "Not worthwhile" is probably too harsh. I should have instead written that the process would be a waste of time. Tophat2 is meant to align reads in a spliced manner. When one aligns against the CDS, one explicitly doesn't want spliced alignments.
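
If alignment against the coding sequences really is required, one option consistent with the point above is to skip TopHat and use an unspliced aligner such as bowtie2 directly. This is only a sketch and was not suggested in the thread: orf_coding.fasta and SRR1944936.fastq are the names mentioned by the poster, while the orf_coding index prefix, the output SAM name, and the run helper with its RUN variable are hypothetical. Commands are only printed unless RUN=1 is set.

```shell
set -eu

CDS_FA="orf_coding.fasta"       # CDS file named in the thread
CDS_INDEX="orf_coding"          # hypothetical index prefix
READS="SRR1944936.fastq"
OUT_SAM="SRR1944936.cds.sam"    # hypothetical output name

# Print each command; execute it only when RUN=1 (dry run by default).
run() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# Index the CDS sequences.
run bowtie2-build "$CDS_FA" "$CDS_INDEX"

# Unspliced alignment of single-end reads (-U) against the CDS index (-x).
run bowtie2 -p 8 -x "$CDS_INDEX" -U "$READS" -S "$OUT_SAM"
```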

ADD COMMENT • link updated 19 months ago by Ram 43k • written 8.6 years ago by Devon Ryan 104k

0

Thank you, your comment will save me from getting any madder!

ADD REPLY • link 8.6 years ago by zizigolu ★ 4.3k