How to truncate the short reads to 30x coverage from 60x coverage
7.2 years ago
lys8918 • 0

In the experiments of some papers, the authors reduce the coverage of their short reads to keep running times reasonable. I don't know how to do that, for example how to reduce a short-read dataset from 60x coverage to 30x coverage.

RNA-Seq next-gen sequencing • 1.8k views

To get from 60x to 30x, you just remove half the reads (making sure that, if they are paired, you include or exclude both reads of the pair together). Note that the reads themselves are not shortened/truncated/modified in any way; you just use fewer of them.
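
A minimal sketch of that idea in Python, assuming the two mate files have already been parsed into lists of records in the same order. The function name `downsample_pairs` and the toy records are hypothetical and only illustrate keeping or dropping both mates together; a real pipeline would stream the FASTQ files rather than hold them in memory.

```python
import random

def downsample_pairs(r1_records, r2_records, fraction, seed=0):
    """Keep each read pair with probability `fraction`.

    One random draw is made per pair, so both mates are either
    kept or dropped together. The reads themselves are not modified.
    """
    if len(r1_records) != len(r2_records):
        raise ValueError("R1 and R2 must contain the same number of records")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed makes the subsample reproducible
    kept_r1, kept_r2 = [], []
    for rec1, rec2 in zip(r1_records, r2_records):
        if rng.random() < fraction:
            kept_r1.append(rec1)
            kept_r2.append(rec2)
    return kept_r1, kept_r2

# Toy data: (name, sequence) tuples standing in for parsed FASTQ records.
r1 = [(f"read{i}/1", "ACGT") for i in range(1000)]
r2 = [(f"read{i}/2", "TGCA") for i in range(1000)]

# 60x -> 30x means keeping roughly half of the pairs.
sub_r1, sub_r2 = downsample_pairs(r1, r2, fraction=30 / 60, seed=1)
print(len(sub_r1), len(sub_r2))  # the two counts are always equal
```

Because each pair is kept independently, the result is approximately, not exactly, half of the input; for typical read counts the difference is negligible.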

Edit: technically you could get to 30x from 60x by halving the read length, but I've never seen a paper that does that - they've all just downsampled the input.


I can't see a situation where shortening your already short reads would be a good idea!

Also, you wouldn't necessarily halve the coverage depth by halving the read length if the reads overlap significantly (which you would hope they do if you want to assemble well).


Shortening reads won't change your physical coverage (the number of fragments remains unchanged), but you will halve your sequenced coverage since you have half the number of bases sequenced. If your pipeline involves merging overlapping reads then yes, your post-merge coverage might be more than half, but at the cost of a higher error rate due to fewer overlapping bases.

Regardless, changing your read length is almost always^ going to give you worse results than downsampling.

^ Exceptions are read-pair-based structural variant calling, and libraries in which the median fragment length is already comparable to the read length.
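
The difference between sequenced and physical coverage can be checked with some quick arithmetic. This is only a sketch using the usual definitions (sequenced coverage = bases sequenced / genome size, physical coverage = fragment bases spanned / genome size); the genome size, read length, fragment length, and pair count below are made-up numbers chosen to give 60x.

```python
def sequenced_coverage(n_pairs, read_length, genome_size):
    """Bases actually sequenced (two mates per pair) divided by genome size."""
    return n_pairs * 2 * read_length / genome_size

def physical_coverage(n_pairs, fragment_length, genome_size):
    """Bases spanned by the fragments divided by genome size."""
    return n_pairs * fragment_length / genome_size

# Hypothetical library: 1 Mb genome, 2 x 100 bp reads, 400 bp fragments.
genome_size = 1_000_000
n_pairs = 300_000
read_length = 100
fragment_length = 400

# Starting point
print(sequenced_coverage(n_pairs, read_length, genome_size))        # 60.0
print(physical_coverage(n_pairs, fragment_length, genome_size))     # 120.0

# Option A: halve the read length, keep every fragment
print(sequenced_coverage(n_pairs, read_length // 2, genome_size))   # 30.0
print(physical_coverage(n_pairs, fragment_length, genome_size))     # 120.0

# Option B: downsample to half the pairs, keep the read length
print(sequenced_coverage(n_pairs // 2, read_length, genome_size))   # 30.0
print(physical_coverage(n_pairs // 2, fragment_length, genome_size))  # 60.0
```

Both options reach 30x sequenced coverage, but only downsampling also halves the physical coverage; trimming leaves the number of fragments untouched.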

7.2 years ago
Joe 21k

This is super common. There should be loads of answers (and many ways of doing it) on Google, e.g.:

http://ivory.idyll.org/blog/2014-downsample-to-given-coverage.html


Thanks very much. I think I was searching with the wrong keywords.


In this instance, "downsample" is the magical search keyword. Depending on your pipeline, you'll probably want to downsample the BAM, or the FASTQ.

7.2 years ago

This Bioconductor package might be of interest: subSeq

7.2 years ago
d-cameron ★ 2.9k

Both samtools and Picard tools support downsampling of SAM/BAM/CRAM files.
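
As a rough sketch of what that looks like, the snippet below works out the fraction to keep and builds (but does not run) the two command lines. It assumes samtools' `view -s SEED.FRACTION` option, where the integer part is the random seed and the decimal part is the fraction of templates kept, and Picard's `DownsampleSam` with the `P=` probability argument; check the documentation for your installed versions. The file names and the helper names are hypothetical.

```python
def keep_fraction(current_cov, target_cov):
    """Fraction of reads to keep to go from current_cov to target_cov."""
    if not 0 < target_cov < current_cov:
        raise ValueError("target coverage must be positive and below current coverage")
    return target_cov / current_cov

def samtools_s_value(fraction, seed=42):
    """Format the SEED.FRACTION value expected by `samtools view -s`."""
    frac_str = f"{fraction:.4f}"      # e.g. "0.5000"
    if not frac_str.startswith("0."):
        raise ValueError("fraction rounds to 1; nothing to downsample")
    return str(seed) + frac_str[1:]   # e.g. "42" + ".5000"

fraction = keep_fraction(current_cov=60, target_cov=30)

# Hypothetical input/output file names.
samtools_cmd = ["samtools", "view", "-b",
                "-s", samtools_s_value(fraction, seed=42),
                "-o", "sample.30x.bam", "sample.60x.bam"]
picard_cmd = ["java", "-jar", "picard.jar", "DownsampleSam",
              "I=sample.60x.bam", "O=sample.30x.bam", f"P={fraction}"]

print(" ".join(samtools_cmd))
print(" ".join(picard_cmd))
```

Either list could be passed to `subprocess.run` once the paths are real. Both tools make their keep/drop decision per read name, so mates stay together.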
