Question

RNA-SeQC error no output

9.9 years ago

Parimala Devi ▴ 100

I am running RNA-SeQC on RNA-seq data. The following is the command.

Command:

java -jar ~/SPRING-SUMMER_2014/Softwares/RNA-SeQC_v1.1.7.jar \
    -r Saccer3_genome.fa \
    -o ~/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/ \
    -s C1W8_8hr_PE_output_soap_BAM_sorted.bam \
    -t ~/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/Sac_cerevisiae.gtf

Output:

RNA-SeQC v1.1.7 05/14/12
Creating rRNA Interval List based on given GTF annotations
java.lang.ArrayIndexOutOfBoundsException: 1
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics$MetricSample.readInSamples(RNASeqMetrics.java:1369)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.prepareFiles(RNASeqMetrics.java:182)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.execute(RNASeqMetrics.java:165)
    at org.broadinstitute.cga.rnaseq.RNASeqMetrics.main(RNASeqMetrics.java:135)
RNA-SeQC Total Runtime:    0 min

There's no output file.

quality-control RNA-Seq RNA-seQC • 7.6k views

updated 9 months ago by Ram 43k • written 9.9 years ago by Parimala Devi ▴ 100

I have the same issue. How did you resolve it?

written 8.1 years ago by jack1 ▴ 10

I found that it does work if you use the previous release (21) of the annotation file. It is not ideal, but it is still GRCh38.

written 8.1 years ago by ceres.fernandez.rozadilla • 0

Sorry, this is the answer to another issue below.

written 8.1 years ago by ceres.fernandez.rozadilla • 0

Ram · Answer 1 · 2014-06-10

9.9 years ago

Jeremy Leipzig 22k

Are your chromosomes named consistently (1, 2, 3, ...) in all reference files?

written 9.9 years ago by Jeremy Leipzig 22k

They're numbered in Roman numerals.

written 9.9 years ago by Parimala Devi ▴ 100

Looks like some file is looking for a "1".

What is the output of:

head Saccer3_genome.fa ~/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/Sac_cerevisiae.gtf
samtools view -H C1W8_8hr_PE_output_soap_BAM_sorted.bam

updated 4.3 years ago by Ram 43k • written 9.9 years ago by Jeremy Leipzig 22k

This is the output:

head Saccer3_genome.fa ~/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/Sac_cerevisiae.gtf

==> Saccer3_genome.fa <==
>chrI
CCACACCACACCCACACACCCACACACCACACCACACACCACACCACACC
CACACACACACATCCTAACACTACCCTAACACAGCCCTAATCTAACCCTG
GCCAACCTGTCTCTCAACTTACCCTCCATTACCCTGCCTCCACTCGTTAC
CCTGTCCCATTCAACCATACCACTCCGAACCACCATCCATCCCTCTACTT
ACTACCACTCACCCACCGTTACCCTCCAATTACCCATATCCAACCCACTG
CCACTTACCCTACCATTACCCTACCATCCACCATGACCTACTCACCATAC
TGTTCTTCTACCCACCATATTGAAACGCTAACAAATGATCGTAAATAACA
CACACGTGCTTACCCTACCACTTTATACCACCACCACATGCCATACTCAC
CCTCACTTGTATACTGATTTTACGTACGCACACGGATGCTACAGTATATA

==> /home/bioratcliff/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/Sac_cerevisiae.gtf <==
I    protein_coding    CDS    335    646    .    +    0    exon_number "1"; gene_id "YAL069W"; gene_name "YAL069W"; p_id "P3633"; protein_id "YAL069W"; transcript_id "YAL069W"; transcript_name "YAL069W"; tss_id "TSS1128";
I    protein_coding    exon    335    649    .    +    .    exon_number "1"; gene_id "YAL069W"; gene_name "YAL069W"; p_id "P3633"; seqedit "false"; transcript_id "YAL069W"; transcript_name "YAL069W"; tss_id "TSS1128";
I    protein_coding    start_codon    335    337    .    +    0    exon_number "1"; gene_id "YAL069W"; gene_name "YAL069W"; p_id "P3633"; transcript_id "YAL069W"; transcript_name "YAL069W"; tss_id "TSS1128";
I    protein_coding    CDS    538    789    .    +    0    exon_number "1"; gene_id "YAL068W-A"; gene_name "YAL068W-A"; p_id "P5377"; protein_id "YAL068W-A"; transcript_id "YAL068W-A"; transcript_name "YAL068W-A"; tss_id "TSS5439";
I    protein_coding    exon    538    792    .    +    .    exon_number "1"; gene_id "YAL068W-A"; gene_name "YAL068W-A"; p_id "P5377"; seqedit "false"; transcript_id "YAL068W-A"; transcript_name "YAL068W-A"; tss_id "TSS5439";
I    protein_coding    start_codon    538    540    .    +    0    exon_number "1"; gene_id "YAL068W-A"; gene_name "YAL068W-A"; p_id "P5377"; transcript_id "YAL068W-A"; transcript_name "YAL068W-A"; tss_id "TSS5439";
I    protein_coding    stop_codon    647    649    .    +    0    exon_number "1"; gene_id "YAL069W"; gene_name "YAL069W"; p_id "P3633"; transcript_id "YAL069W"; transcript_name "YAL069W"; tss_id "TSS1128";
I    protein_coding    stop_codon    790    792    .    +    0    exon_number "1"; gene_id "YAL068W-A"; gene_name "YAL068W-A"; p_id "P5377"; transcript_id "YAL068W-A"; transcript_name "YAL068W-A"; tss_id "TSS5439";
I    protein_coding    exon    1807    2169    .    -    .    exon_number "1"; gene_id "YAL068C"; gene_name "PAU8"; p_id "P6023"; seqedit "false"; transcript_id "YAL068C"; transcript_name "PAU8"; tss_id "TSS249";
I    protein_coding    stop_codon    1807    1809    .    -    0    exon_number "1"; gene_id "YAL068C"; gene_name "PAU8"; p_id "P6023"; transcript_id "YAL068C"; transcript_name "PAU8"; tss_id "TSS249";

Output for

samtools view -H C1W8_8hr_PE_output_soap_BAM_sorted.bam

@HD    VN:1.3    SO:coordinate
@SQ    SN:Y55.chr10    LN:770597
@SQ    SN:Y55.chrm    LN:107061
@SQ    SN:Y55.chr01    LN:248261
@SQ    SN:Y55.chr11    LN:686124
@SQ    SN:Y55.scplasm1    LN:7602
@SQ    SN:Y55.chr02    LN:800992
@SQ    SN:Y55.chr12    LN:1067059
@SQ    SN:Y55.chr03    LN:321691
@SQ    SN:Y55.chr13    LN:923317
@SQ    SN:Y55.chr04    LN:1522688
@SQ    SN:Y55.chr14    LN:781629
@SQ    SN:Y55.chr05    LN:577152
@SQ    SN:Y55.chr15    LN:1105914
@SQ    SN:Y55.chr06    LN:273660
@SQ    SN:Y55.chr16    LN:946183
@SQ    SN:Y55.chr07    LN:1113452
@SQ    SN:Y55.chr08    LN:566494
@SQ    SN:Y55.chr09    LN:467776

updated 4.3 years ago by Ram 43k • written 9.9 years ago by Parimala Devi ▴ 100

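See the difference? The three files use three different naming schemes:

chrI (FASTA) != I (GTF) != Y55.chr01 (BAM header)

The reference, the annotation and the BAM need to agree on sequence names.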
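A comparison like this can be done for every sequence name rather than by eye. The following is a minimal sketch, assuming standard FASTA headers, a tab-delimited GTF, and samtools on the PATH for the BAM step. The function names (fasta_contigs, gtf_contigs, bam_contigs, missing_from) are hypothetical helpers, not part of RNA-SeQC or samtools.

```shell
#!/bin/sh
# Hypothetical helpers: each prints the sorted, unique sequence names a file uses.

# FASTA: the name is the first word after '>' on each header line.
fasta_contigs() {
    grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }' | sort -u
}

# GTF: the name is column 1 of every non-comment line (assumes tab-delimited input).
gtf_contigs() {
    grep -v '^#' "$1" | cut -f1 | sort -u
}

# BAM: the name is the SN: field of each @SQ header line (requires samtools).
bam_contigs() {
    samtools view -H "$1" |
        awk -F'\t' '$1 == "@SQ" { for (i = 2; i <= NF; i++) if ($i ~ /^SN:/) print substr($i, 4) }' |
        sort -u
}

# Print the names found in the first list file but not in the second.
# Both files must be sorted, which the helpers above take care of.
missing_from() {
    comm -23 "$1" "$2"
}
```

For example, after `gtf_contigs Sac_cerevisiae.gtf > gtf.names` and `fasta_contigs Saccer3_genome.fa > fa.names`, running `missing_from gtf.names fa.names` lists the GTF names that the FASTA lacks; empty output in both directions suggests the two files agree.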

updated 4.3 years ago by Ram 43k • written 9.9 years ago by Jeremy Leipzig 22k

Ram · Answer 2 · 2014-10-02

9.6 years ago

Tim Amos ▴ 20

The stack trace shows that the failure is in "readInSamples", i.e. while RNA-SeQC reads the sample description passed to -s.

The correct use of the -s option is, according to http://www.broadinstitute.org/cancer/cga/rnaseqc_run :

-s <arg>

Sample File: tab-delimited description of samples and their bams. This file header is:
Sample ID[tab]Bam File[tab]Notes
When running on just one sample, this argument can be a string of the form
"Sample ID|Bam File|Notes", where Bam File is the path to the input file.

i.e.:

-s "C1W8_8hr_PE|C1W8_8hr_PE_output_soap_BAM_sorted.bam|NA"

rather than:

-s C1W8_8hr_PE_output_soap_BAM_sorted.bam
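Put together with the original command, the change might look like the sketch below. It only addresses the -s argument; the sample ID and the "NA" note are the illustrative values used above, and sample_arg is a hypothetical helper that simply joins the three fields with "|".

```shell
#!/bin/sh
# sample_arg is a hypothetical helper: it builds the single-sample string
# "Sample ID|Bam File|Notes" that the -s option accepts.
sample_arg() {
    printf '%s|%s|%s' "$1" "$2" "$3"
}

# The command from the question with only the -s value changed
# (left commented out here because it needs the jar and the input files):
# java -jar ~/SPRING-SUMMER_2014/Softwares/RNA-SeQC_v1.1.7.jar \
#     -r Saccer3_genome.fa \
#     -o ~/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/ \
#     -s "$(sample_arg C1W8_8hr_PE C1W8_8hr_PE_output_soap_BAM_sorted.bam NA)" \
#     -t ~/SPRING-SUMMER_2014/RNA-seq/RNA-seq_C1W8_time_work_bench/Sac_cerevisiae.gtf
```

Quoting the whole value matters, because an unquoted "|" would be interpreted by the shell as a pipe.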

I got this error by accidentally having '\t' in my tab-delimited samples file rather than a literal tab. My mistake was to use:

echo "Sample ID\tBam File\tNotes" > ${OUTDIR}/Samples.txt

rather than including the -e option:

echo -e "Sample ID\tBam File\tNotes" > ${OUTDIR}/Samples.txt
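Since echo treats backslash escapes differently from shell to shell, printf may be a more predictable way to write the file. A minimal sketch, assuming a three-column samples file as described above; write_samples and check_tabs are hypothetical helper names.

```shell
#!/bin/sh
# write_samples is a hypothetical helper: it writes the header and one sample
# line, using printf so that \t in the format string always becomes a real tab.
write_samples() {
    out=$1; sample_id=$2; bam=$3; notes=$4
    printf 'Sample ID\tBam File\tNotes\n' > "$out"
    printf '%s\t%s\t%s\n' "$sample_id" "$bam" "$notes" >> "$out"
}

# check_tabs succeeds only if every line has exactly three tab-separated fields,
# which catches a file that contains the two characters '\' 't' instead of tabs.
check_tabs() {
    awk -F'\t' 'NF != 3 { bad = 1 } END { exit bad + 0 }' "$1"
}
```

For example, `write_samples "${OUTDIR}/Samples.txt" C1W8_8hr_PE C1W8_8hr_PE_output_soap_BAM_sorted.bam NA && check_tabs "${OUTDIR}/Samples.txt"` writes the file and then confirms the separators are real tabs.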

updated 4.3 years ago by Ram 43k • written 9.6 years ago by Tim Amos ▴ 20

Hi Timothy,

It appears that I have a very similar (or the same) problem as the person above. I tried all the answers above and I still can't get my RNA-SeQC run to work. Would you be able to elaborate a bit more on the string passed to the -s flag?

My actual BAM file name is cfDMG2_ACTGAT_sorted_reordered_removed_dups.bam, and this is how I'm passing it to RNA-SeQC:

-s "TestID|cfDMG2_ACTGAT_sorted_reordered_removed_dups.bam|NA"

And I get this error

The required transcript_id attribute was not found on line 1

Here is a verbose look at what I did:

rna-seqc -t HomoSapiensH38.gtf -r HomoSapiensH38.fa -o outDir -s "TestID|cfDMG2_ACTGAT_sorted_reordered_removed_dups.bam|NA"

RNA-SeQC v1.1.8.1 07/11/14
Creating rRNA Interval List based on given GTF annotations
Retriving contig names from reference
         contig names in reference: 194
Loading GTF for Read Counting
The required transcript_id attribute was not found on line 1    havana  gene    11869   14409   .       +       .       gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene";

Thanks,

updated 4.3 years ago by Ram 43k • written 9.0 years ago by Kirill Tsyganov ▴ 370

I think your command line is OK. The problem is that the first record of your GTF file (a gene-level record, as the error shows) has no transcript_id attribute. Try using a GTF from GENCODE.
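If switching annotation sources is not an option, one workaround is to drop the records that lack transcript_id before running RNA-SeQC. This is a minimal sketch, assuming a tab-delimited GTF in which only gene-level records are missing the attribute and that discarding them is acceptable for QC purposes. The function names and the output file name are hypothetical, and whether RNA-SeQC then accepts the filtered file has not been verified here.

```shell
#!/bin/sh
# keep_transcript_records is a hypothetical helper: it prints comment lines and
# every record whose attribute column (column 9) contains a transcript_id.
keep_transcript_records() {
    awk -F'\t' '/^#/ || $9 ~ /transcript_id "/' "$1"
}

# count_without_transcript_id reports how many records the filter would drop,
# so an unexpectedly large number can be spotted before the file is replaced.
count_without_transcript_id() {
    awk -F'\t' '!/^#/ && $9 !~ /transcript_id "/ { n++ } END { print n + 0 }' "$1"
}
```

For example, `count_without_transcript_id HomoSapiensH38.gtf` followed by `keep_transcript_records HomoSapiensH38.gtf > HomoSapiensH38.filtered.gtf` would produce a filtered copy to pass to -t.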

updated 21 months ago by Ram 43k • written 9.0 years ago by yyaobo ▴ 30

Ram · Answer 3 · 2015-04-27

I got it all working by reverting to the older assembly and annotation, Ensembl 37. I did source my FASTA and GTF files from GENCODE, but I think that didn't matter in the end, as long as I used the older assembly rather than the latest one. I'm fairly sure RNA-SeQC breaks when the latest assembly, Ensembl 38, is used: it was built when Ensembl 37 was the latest, and the GTF format has changed in Ensembl 38. I raised an issue on GitHub ("RNA-SeQC error no output"), but it doesn't look like they check that page. I wanted to write to them directly, but couldn't find the best email address to contact.

If you or anyone else knows the best contact email for RNA-SeQC, please let me know. And if anyone has any thoughts on RNA-SeQC not working with the latest assembly, Ensembl 38, please comment on the GitHub issue page or here.

Thanks,
Kirill

score 0 · Answer 4 · 2017-08-30

6.6 years ago

Isaac C. Joseph ▴ 90

I also resolved my issue by making sure my Sample File was in the correct format and referred to .bam files that actually exist. According to the input spec:

-s <arg>

Sample File: tab-delimited description of samples and their bams.

This file header is: Sample ID[tab]Bam File[tab]Notes
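Whether the listed BAM files exist can be checked before launching RNA-SeQC. A minimal sketch, assuming a samples file with the header quoted above and BAM paths in the second column; missing_bams is a hypothetical helper, not part of RNA-SeQC.

```shell
#!/bin/sh
# missing_bams is a hypothetical pre-flight check: it skips the header line,
# reads the BAM path from column 2 of each remaining line, and prints every
# path that does not point to an existing regular file.
missing_bams() {
    awk -F'\t' 'NR > 1 && NF >= 2 { print $2 }' "$1" |
        while IFS= read -r bam; do
            [ -f "$bam" ] || printf '%s\n' "$bam"
        done
}
```

Empty output from `missing_bams Samples.txt` means every listed BAM was found. Relative paths are resolved against the current directory, so the check should be run from the directory where RNA-SeQC will be started.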

written 6.6 years ago by Isaac C. Joseph ▴ 90