HTSeq count error
1
0
Entering edit mode
9.6 years ago
EagleEye 7.5k

I am getting following error while using htseq-count over BAM file.

Command used

htseq-count -f bam -s no -a 15 -t exon -m intersection-strict input.sorted_genome_alignments.bam gencode.gtf > output.sorted_genome_alignments.htseq

Input first line:

INPUT_121:8:2205:3570:86937/2    153    chr1    10534    58    50M    *    0    0    AGTACCACCGAAATCTGTGCAGAGGAGAACGCAGCTCCGCCCTCGCGGTG    CC>2B??DDCCCCEECCDDFHEFHEHJIGGDGEGF@DADHFGFFDFD@@@    RG:Z:110912_UNC15-SN850_0121_AD0E16ABXX_8_    IH:i:7    HI:i:1    NM:i:1

Error as follows

100000 GFF lines processed.
2500000 GFF lines processed.
2536790 GFF lines processed.
Error occured when processing SAM input (record #1 in file input.sorted_genome_alignments.bam):
tid -1 out of range 0<=tid<25
[Exception type: ValueError, raised in csamtools.pyx:897]

Please let me know if there is any solution exists.

htseq-count software-error RNA-Seq • 9.3k views
ADD COMMENT
0
Entering edit mode

Can you post the entire header of that file? Just samtools view -H input.sorted_genome_alignments.bam. The error relates to seeing an alignment to a chromosome that it doesn't think is in the header.

Also, please mention which version of htseq-count

ADD REPLY
0
Entering edit mode

I am using HTSeq version 0.6.1p1.

@HD    VN:1.0    SO:unsorted
@SQ    SN:chr1    LN:249250621
@SQ    SN:chr10    LN:135534747
@SQ    SN:chr11    LN:135006516
@SQ    SN:chr12    LN:133851895
@SQ    SN:chr13    LN:115169878
@SQ    SN:chr14    LN:107349540
@SQ    SN:chr15    LN:102531392
@SQ    SN:chr16    LN:90354753
@SQ    SN:chr17    LN:81195210
@SQ    SN:chr18    LN:78077248
@SQ    SN:chr19    LN:59128983
@SQ    SN:chr2    LN:243199373
@SQ    SN:chr20    LN:63025520
@SQ    SN:chr21    LN:48129895
@SQ    SN:chr22    LN:51304566
@SQ    SN:chr3    LN:198022430
@SQ    SN:chr4    LN:191154276
@SQ    SN:chr5    LN:180915260
@SQ    SN:chr6    LN:171115067
@SQ    SN:chr7    LN:159138663
@SQ    SN:chr8    LN:146364022
@SQ    SN:chr9    LN:141213431
@SQ    SN:chrM_rCRS    LN:16569
@SQ    SN:chrX    LN:155270560
@SQ    SN:chrY    LN:59373566
@RG    ID:110912_UNC15-SN850_0121_AD0E16ABXX_8_    PL:illumina    PU:barcode    LB:TruSeq    SM:110912_UNC15-SN850_0121_AD0E16ABXX_8_
ADD REPLY
0
Entering edit mode

Hello Santhilal Subhash!

It appears that your post has been cross-posted to another site: SeqAnswers.

This is typically not recommended as it runs the risk of annoying people in both communities.

ADD REPLY
0
Entering edit mode

Sorry I have redirected it to biostars.

ADD REPLY
1
Entering edit mode
9.6 years ago

Oh, I missed the -1 part of that error message. htseq-count doesn't play well with unmapped reads. Just filter them out or:

samtools view -F 4 input.sorted_genome_alignments.bam | htseq-count -f bam -s no -a 15 -t exon -m - intersection-strict gencode.gtf > output.sorted_genome_alignments.htseq
ADD COMMENT
1
Entering edit mode

The hyphen u mentions was after '-m -'.... but it should be after '-m intersection-strict -'

you meant like this rite !!!

samtools view -F 4 input.sorted_genome_alignments.bam | htseq-count -f bam -s no -a 15 -t exon -m intersection-strict **-** gencode.gtf > output.sorted_genome_alignments.htseq
ADD REPLY
0
Entering edit mode

I have tried it ... it seems like htseq-count not getting input from pipe.

Error occured when reading beginning of SAM/BAM file.
Argument must be string or unicode.
[Exception type: TypeError, raised in csamtools.pyx:50]
ADD REPLY
0
Entering edit mode

Try adding -h to the samtools view command (I wrote scripts to automate this sort of thing long enough ago that my memory of the exact details of doing this manually is a bit fuzzy).

ADD REPLY
0
Entering edit mode

It gives the same error even after adding '-h'.

ADD REPLY
1
Entering edit mode

That's a first. You can always just

samtools view -bF 4 input.sorted_genome_alignments.bam > filtered.bam

and use the result, though that's annoying. I think featureCounts is OK with the unmapped reads.

ADD REPLY
0
Entering edit mode

I don't wanted to go for it becoz I have more than 1000 BAM files ... is there any other option ?

ADD REPLY
1
Entering edit mode

featureCounts, it's faster anyway.

ADD REPLY
0
Entering edit mode

I have never tried it. Does it give reliable results?? Anyway I will try it. Thanks for your suggestions.

ADD REPLY
0
Entering edit mode

Yes, it's as reliable as htseq-counts and I think many of us switched over to using it already (I've also found it simpler to install on our cluster without root access).

ADD REPLY
0
Entering edit mode

Great, I am running it. Thanks a lot for your valuable time.

ADD REPLY
0
Entering edit mode

Wow it is real faster .... finished running. Just took 4 mins.

ADD REPLY
0
Entering edit mode

Yeah, that's the reason I switched to it :)

ADD REPLY
0
Entering edit mode

Oops, yeah, that was a typo!

ADD REPLY
0
Entering edit mode

ok I will try this.

ADD REPLY
0
Entering edit mode

Keep in mind that htseq-count may still compain (you should only get a warning though). You might also try featureCounts, which is much faster.

ADD REPLY

Login before adding your answer.

Traffic: 1499 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6