read length of sequence
7.7 years ago
Bulbul Ahmed ▴ 20

I have found that 30-50 bp is the length of short reads, but I do not know about long reads. What is the typical length of long reads, on average?

RNA-Seq Assembly • 5.1k views

From which sequencing platform? And how did you find that 30 bp is the length of short reads?


30-50 bp is where we started at a decade ago. We are way past that stage now even on the short read technologies (e.g. Illumina, Ion).

With technologies like 10x Genomics/Illumina sequencing, the reads (contigs that come out of the process) can be on the order of hundreds of kb.


A nice graphic indeed.

7.7 years ago

30-50 bp is very short, inappropriately short for most applications. Long reads on the PacBio platform are ~12 kb if I'm not mistaken; reads from Oxford Nanopore sequencers can be tens of kb (easily 20 kb), and the longest mapped read that has been reported is >150 kb.


Most sequencing today is 50bp, which is long enough for most applications.


That does not match my observations... I think it depends on your specific field. I rarely encounter reads <150 bp.


Sure. Specific fields can vary. Some fields are also okay with Sanger. I meant overall.

You can check something like SRA for exact stats.

Or you can do a back of the envelope calculation. Most sequencing is Illumina. Most Illumina sequencing is HiSeq. Most HiSeqs do not support 150bp reads.
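If you would rather measure than estimate, the read lengths in any FASTQ file you have (for example one downloaded from SRA) can be tallied directly. Below is a minimal sketch, assuming standard 4-line FASTQ records with unwrapped sequence lines; the names `open_fastq`, `read_lengths` and `summarize` are made up for this illustration and do not come from any library.

```python
import gzip
from collections import Counter
from statistics import median


def open_fastq(path):
    """Open a plain or gzipped FASTQ file in text mode (hypothetical helper)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    return opener(path, "rt")


def read_lengths(lines):
    """Yield the sequence length of each record from an iterable of FASTQ lines.

    Assumes 4 lines per record (header, sequence, '+', qualities),
    so the sequence is always the second line of each group of four.
    """
    for line_number, line in enumerate(lines):
        if line_number % 4 == 1:
            yield len(line.strip())


def summarize(lengths):
    """Return basic read-length statistics for a non-empty set of lengths."""
    lengths = list(lengths)
    counts = Counter(lengths)
    return {
        "reads": len(lengths),
        "min": min(lengths),
        "median": median(lengths),
        "max": max(lengths),
        "most_common": counts.most_common(1)[0][0],
    }
```

Usage would look something like `with open_fastq("sample.fastq.gz") as fh: print(summarize(read_lengths(fh)))`, where `sample.fastq.gz` is a placeholder file name. Keep in mind that a single file only tells you about one run, not about sequencing overall.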


We usually use MiSeq (either 2x 250bp or 2x 300bp) and NextSeq (2x 150bp). But I guess 2x 75bp is quite common for (RNA-seq on) HiSeq.


50 bp runs are the most economical option at the majority of providers, so they are actually pretty popular with users.


"30-50 bp is very short, inappropriately short for most applications."

Not sure I agree with that... For ChIP-Seq and RNA-Seq for gene expression (no de novo assembly or splicing), I think the advantage of going above 50 bp is quite small. Also for bisulfite sequencing, you don't gain much in mappability above 50 bp. (I'm referring to mouse or human genomes.) Then of course, the longer the better...
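To get a feel for the mappability argument, one crude proxy is the fraction of positions in a sequence whose k-mer occurs exactly once. The sketch below is only an illustration under strong assumptions: it looks at the forward strand only, requires exact matches, and uses a small random toy sequence with one 60 bp repeat rather than a real mouse or human genome. The function names are made up for this example.

```python
import random
from collections import Counter


def unique_kmer_fraction(sequence, k):
    """Fraction of k-mer start positions whose k-mer occurs exactly once.

    Forward strand and exact matches only, so this is a rough proxy
    for mappability, not a real mappability track.
    """
    n_positions = len(sequence) - k + 1
    if n_positions <= 0:
        raise ValueError("k is longer than the sequence")
    counts = Counter(sequence[i:i + k] for i in range(n_positions))
    # A k-mer seen once corresponds to exactly one uniquely placed position.
    unique_positions = sum(1 for c in counts.values() if c == 1)
    return unique_positions / n_positions


def toy_genome(seed=0, repeat_length=60, flank_length=500):
    """Random toy sequence containing two copies of one repeat (assumption)."""
    rng = random.Random(seed)

    def random_dna(n):
        return "".join(rng.choice("ACGT") for _ in range(n))

    repeat = random_dna(repeat_length)
    return (random_dna(flank_length) + repeat + random_dna(flank_length)
            + repeat + random_dna(flank_length))


if __name__ == "__main__":
    genome = toy_genome()
    for k in (25, 50, 75, 100):
        print(k, round(unique_kmer_fraction(genome, k), 4))
```

In this toy setup, reads shorter than the repeat can land entirely inside it and be ambiguous, while reads longer than the repeat always reach into a flank. Real genomes have repeats of many sizes, so how quickly the gain flattens out beyond 50 bp has to be checked on the actual reference.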


Well, since TopHat2 is designed for reads starting from 75bp, I wouldn't go much lower than that. I don't have to convince you that longer reads will contribute to better alignment.


Of course, other things being equal, you would go for longer reads. Still, I'm quite convinced that in most differential gene expression experiments 50 bp reads are going to give substantially similar results to 75+ bp reads. The gain in alignment quality is probably minimal. If, for example, money is a limiting factor, I would definitely prefer to sequence shorter reads but do more replicates or more meaningful follow-up experiments. If TopHat2 "refuses" to align shorter reads, which I kind of doubt, just choose a different aligner. Again, I'm just talking in general terms...


I don't think that anyone is arguing that longer is not better. If you pick a random RNA-seq study from the last year, it will be 50bp sequencing if it's basic differential gene expression (not something more complex like splicing or de novo transcriptome assembly).
