HT-Seq Read Count and Strand-Specificity
11.0 years ago

Hi,

I am new to RNA sequencing and a bit confused by the HT-Seq read count options, so I want to check whether I am thinking in the right direction. I have a set of paired-end, strand-specific RNA-Seq reads and I am trying to count the reads in a set of features (genes).

The HT-Seq documentation says that the option "stranded" is set to "yes" by default, which means that HT-Seq assumes the reads to be strand-specific. It also says:

"If your RNA-Seq data has not been made with a strand-specific protocol, this causes half of the reads to be lost. Hence, make sure to set the option --stranded=no unless you have strand-specific data! "

This makes sense: if I use "stranded=yes" for non-strand-specific data, the reads mapping to the opposite strand of the feature will NOT be counted.

However, this makes me wonder: if I use "stranded=no" even for strand-specific data, would my counts be unaffected? With "stranded=no", it does not matter whether a read maps to the same or the opposite strand as the feature. It would be counted as long as it maps to a feature, regardless of the strand.

So a follow-up question comes to mind: why does HT-Seq even have the "stranded=yes" and "stranded=reverse" options?

I am sorry if this is a very naive and incorrect question, but I really need to get the strand-specific concept clear in my mind.

Any help would be much appreciated.

htseq read counts strand rna • 10k views
11.0 years ago
Ido Tamir 5.2k

If the transcripts or genes did not overlap, you would get the same results whether you specify stranded=yes or stranded=no. But sometimes exons overlap (at least in mammals), and they do so on opposite strands, which allows htseq-count to differentiate between the two genes/transcripts if the input was stranded. So you should see a higher rate of ambiguous reads when using stranded=no on such data.

Depending on the protocol, either the sense or the antisense strand gets sequenced, which makes the reverse option necessary. A not completely illuminating figure (a little more colour would have been nice to show which strand gets sequenced):

[Figure: A not completely illuminating figure. Image credit: Zhao Zhang]
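To make the overlap and strand-flip points concrete, here is a minimal toy sketch in Python. It is not htseq-count's actual implementation or API: the gene names, coordinates, reads, and the helper functions `assign` and `count` are all made up for illustration, and it only handles single-end reads with a simple "any overlap" rule.

```python
from collections import Counter

# Hypothetical toy annotation: (gene, start, end, strand), half-open coordinates.
GENES = [
    ("geneA", 100, 200, "+"),
    ("geneB", 150, 250, "-"),  # overlaps geneA, but on the opposite strand
]

FLIP = {"+": "-", "-": "+"}


def assign(read_start, read_end, read_strand, stranded):
    """Return a gene name, 'ambiguous', or 'no_feature' for one single-end read.

    `stranded` mimics the three modes discussed above: "yes", "no", "reverse".
    """
    if stranded == "reverse":
        # Protocols that sequence the antisense strand: flip before comparing.
        read_strand = FLIP[read_strand]
    hits = set()
    for name, start, end, strand in GENES:
        overlaps = read_start < end and start < read_end
        strand_ok = stranded == "no" or strand == read_strand
        if overlaps and strand_ok:
            hits.add(name)
    if not hits:
        return "no_feature"
    if len(hits) > 1:
        return "ambiguous"
    return hits.pop()


# Made-up reads from a library where the read strand equals the transcript strand.
READS = [
    (110, 140, "+"),  # geneA-only region
    (160, 190, "+"),  # overlap region, transcribed from geneA
    (160, 190, "-"),  # overlap region, transcribed from geneB
    (210, 240, "-"),  # geneB-only region
]


def count(stranded):
    """Tally the assignment of every toy read under the given mode."""
    return Counter(assign(s, e, strand, stranded) for s, e, strand in READS)


for mode in ("yes", "no", "reverse"):
    print(mode, dict(count(mode)))
```

In this toy data, "yes" assigns two reads to each gene, "no" still counts the reads outside the overlap but turns the two reads in the overlap region into ambiguous ones, and "reverse" (the wrong choice for this particular made-up library) loses the non-overlapping reads and swaps the overlapping ones between the two genes.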

And no, it's not naive. It is confusing and complicated with all these strands, protocols, etc.
