Hi everyone,
I'm running RSEM on a sample using a de novo assembly from which some of the contigs were removed because they were duplicates (I used the CD-HIT-EST clustering method). Now it appears that this does not work, although it does work when I use the original assembly FASTA file. Is there a reason for this?
This is part of the error I get when using the assembly with the duplicates removed:
RSEM can not recognize reference sequence name 183617!
Error, cmd: rsem-calculate-expression --paired-end -p 4 --no-bam-output --bam FCH7F53BBXX-HKISCsixEAAARAAPEI-208_L7_one.bowtie.bam /ddn1/vol1/site_scratch/leuven/315/vsc31552/Trinity95.fasta.RSEM FCH7F53BBXX-HKISCsixEAAARAAPEI-208_L7_one died with ret: 65280 at /data/leuven/315/vsc31552/Trinity/trinityrnaseq-2.2.0/util/align_and_estimate_abundance.pl line 743.
My script is the following:
/data/leuven/315/vsc31552/Trinity/trinityrnaseq-2.2.0/util/align_and_estimate_abundance.pl --seqType fq \
--left /scratch/leuven/315/vsc31552/RNAseq/F16FTSEUHT0283ISCsixE/cleanreads/P-02/FCH7F53BBXX-HKISCsixEAABRAAPEI-209_L7_forward_paired.fq.gz --right /scratch/leuven/315/vsc31552/RNAseq/F16FTSEUHT0283ISCsixE/cleanreads/P-02/FCH7F53BBXX-HKISCsixEAABRAAPEI-209_L7_reverse_paired.fq.gz \
--transcripts /scratch/leuven/315/vsc31552/RNAseq/Trinity95.fasta \
--est_method RSEM --aln_method bowtie \
--trinity_mode --prep_reference --output_dir /scratch/leuven/315/vsc31552/RNAseq/F16FTSEUHT0283ISCsixE/cleanreads/P-02/
echo "================="
exit 0
Thanks in advance!
Janne
Is 183617 in your Trinity95.fasta? The wrapper script should have prepped and indexed your reference sequences (given that you used --prep_reference), but 183617 is not a Trinity FASTA header, and you are specifying --trinity_mode. Did your FASTA headers change?
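To check this quickly, something like the sketch below could list the sequence IDs in the filtered FASTA that no longer look like Trinity IDs. The function name non_trinity_headers is hypothetical, and the pattern assumes your original headers have the form TRINITY_DN<n>_c<n>_g<n>_i<n>; adjust it if your Trinity version names contigs differently.

```shell
# Hypothetical helper: print the ID (first word of each header line) for
# every FASTA record whose ID does not match the assumed Trinity pattern
# TRINITY_DN<n>_c<n>_g<n>_i<n>.
non_trinity_headers() {
    # $1: path to a FASTA file
    awk '/^>/ {
        id = substr($1, 2)   # drop the leading ">"
        if (id !~ /^TRINITY_DN[0-9]+_c[0-9]+_g[0-9]+_i[0-9]+$/) print id
    }' "$1"
}
```

Running `non_trinity_headers Trinity95.fasta | head` on the filtered assembly and on the original one should show whether the clustering step rewrote the headers. If the filtered file prints IDs such as 183617 while the original prints nothing, the headers were changed during duplicate removal.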