Hey guys,
I have a reference genome FASTA named CDS.fa and a genes.gtf. I don't want to use a pre-built genome or transcriptome index, but I'm confused about what the command below should look like when I don't have a pre-built index.
tophat2 -p 8 -G genes.gtf genome file.fastq
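For context, a minimal sketch of the usual route when no pre-built index exists: build the Bowtie2 index from your own FASTA with bowtie2-build, then give tophat2 the resulting prefix. The file names come from the post; the index prefix `genome`, the `run` helper and the `DRY_RUN` switch are made up for illustration, and the sketch assumes the FASTA really holds genomic sequence matching the GTF.

```shell
#!/bin/sh
# Sketch only: prints the commands by default (DRY_RUN=1); set DRY_RUN=0 to execute.
set -eu

GENOME_FA="CDS.fa"        # FASTA named in the post; assumed to be genomic sequence
INDEX_BASE="genome"       # hypothetical prefix for the Bowtie2 index files
GTF="genes.gtf"
READS="file.fastq"
DRY_RUN="${DRY_RUN:-1}"
PLAN=""                   # collects every planned command, one per line

run() {
    # Record and print the command; execute it only when DRY_RUN=0.
    PLAN="${PLAN}$*
"
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# 1. Build the genome index: bowtie2-build <reference.fa> <index_prefix>
run bowtie2-build "$GENOME_FA" "$INDEX_BASE"

# 2. Put <index_prefix>.fa next to the index so tophat2 does not have to
#    reconstruct the FASTA from the index files.
run ln -sf "$GENOME_FA" "$INDEX_BASE.fa"

# 3. Align, passing the index prefix (not the FASTA file) as the positional argument.
run tophat2 -p 8 -G "$GTF" "$INDEX_BASE" "$READS"
```

The `genome` argument in the original command is therefore the prefix passed to bowtie2-build, not a file name.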
Which index? The genomic index or the transcriptomic one?
Devon,
I tried this command
but when I tried running tophat2, this is the result
I don't know why I can't run tophat here, whereas with the pre-built bowtie2 index from iGenomes tophat runs properly...
Do you know the reason please?
You need to specify
--transcriptome-index=transcriptome
every time. I haven't a clue why bowtie2 dies when you don't do this; you'd need to run it manually on genes.fa to find out (perhaps that would let you see a reasonable error message).
CDS.fa for a genomic sequence file makes as much sense as
all_genomic_contigs.fa
for storing transcripts. SRR1944936 is from yeast, and both genome and GTF files are available, e.g. from http://fungi.ensembl.org/, so checking that tophat2 works as Ben Langmead and co intended (without the error posted above) for indexing transcriptomes is a 20-minute job at most.
Yeah, I was rather hoping that she didn't actually have CDS in CDS.fa :)
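The --transcriptome-index advice above could look roughly like the sketch below. It assumes a Bowtie2 genome index with the prefix `genome` already exists; `other_sample.fastq`, the `run` helper and the `DRY_RUN` switch are hypothetical names added for illustration. As far as I understand the TopHat manual, -G together with --transcriptome-index builds the transcriptome index on the first run, and later runs can reuse it by passing the same --transcriptome-index value.

```shell
#!/bin/sh
# Sketch only: prints the commands by default (DRY_RUN=1); set DRY_RUN=0 to execute.
set -eu

GTF="genes.gtf"
GENOME_INDEX="genome"       # assumed prefix of an existing Bowtie2 genome index
TX_INDEX="transcriptome"    # prefix used in the reply above
DRY_RUN="${DRY_RUN:-1}"
PLAN=""                     # collects every planned command, one per line

run() {
    # Record and print the command; execute it only when DRY_RUN=0.
    PLAN="${PLAN}$*
"
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# First run: -G supplies the annotation, --transcriptome-index says where the
# transcriptome index files are written.
run tophat2 -p 8 -G "$GTF" --transcriptome-index="$TX_INDEX" "$GENOME_INDEX" file.fastq

# Later runs: keep passing the same --transcriptome-index so the files are reused
# (other_sample.fastq is a hypothetical second sample).
run tophat2 -p 8 --transcriptome-index="$TX_INDEX" "$GENOME_INDEX" other_sample.fastq
```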
Devon,
Actually, my adviser asked me to replace the yeast whole-genome FASTA downloaded from iGenomes with orf_coding.fasta...