extract aligned reads to a sequence part and reset indices
2.3 years ago

Hello,

I have NGS reads mapped to a genome. Now I need to extract only the reads mapping to a certain region, which I can do easily with:

samtools view my_bam.bam GENOME:1000-2000 > my_region.sam

However, the aligned reads still carry their original coordinates on the genome (1000 to 2000 in this example). I need them numbered from 1, as if the requested region were a new reference sequence.

1) Is there any tool (or samtools/sambamba setting) that can do this?

2) Sure, I can process the file manually and subtract the offset from the position. Is this the way to go, and are there any gotchas regarding the SAM format, e.g. offsets for reads mapping to the reverse strand? (I know I will also need to replace the sequence ID in the file.)

P.S. I wouldn't do this, but the tool I want to use requires such input :(.

samtools sam bam • 454 views

Use awk to subtract the offset from POS and the mate POS (PNEXT)?
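
For anyone who prefers a script over awk, here is a minimal Python sketch of the same idea. It is only an illustration under a few assumptions, not a tested tool: the names `rebase_record`, `region_header` and the new reference name `region` are made up for this example. In SAM, POS is the 1-based leftmost mapped coordinate regardless of strand, so reverse-strand reads need no special handling; the offset to subtract is `start - 1`. The gotchas are reads that overhang the region edges (`samtools view` returns every read overlapping the region, so some would end up with POS below 1 or run past the new reference end) and mates that lie outside the region. The sketch simply drops overhanging reads and blanks the mate fields when the mate starts outside the region; it does not touch FLAG, and it does not check whether the mate itself was dropped, so treat both as simplifications.

```python
import re

# CIGAR operations that consume reference bases: M, D, N, = and X.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def reference_span(cigar):
    """Number of reference bases covered by a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def region_header(start, end, new_name="region"):
    """@SQ line describing the extracted region as a new reference."""
    return f"@SQ\tSN:{new_name}\tLN:{end - start + 1}"


def rebase_record(line, start, end, new_name="region"):
    """Shift one SAM alignment line so that `start` becomes position 1.

    Returns the rewritten line, or None when the read is not fully
    contained in start..end (1-based, inclusive).
    """
    f = line.rstrip("\n").split("\t")
    offset = start - 1
    pos = int(f[3])
    span = reference_span(f[5])
    if pos < start or pos + span - 1 > end:
        return None  # read overhangs the region (or is unplaced): drop it
    old_name = f[2]
    f[2] = new_name
    f[3] = str(pos - offset)
    pnext = int(f[7])
    mate_here = f[6] in ("=", old_name) and start <= pnext <= end
    if mate_here:
        f[6] = "="
        f[7] = str(pnext - offset)
    else:
        # Simplification: forget the mate position; FLAG is left untouched.
        f[6], f[7], f[8] = "*", "0", "0"
    return "\t".join(f)


# Hypothetical reverse-strand read at 1200 with its mate at 1100.
example = "r1\t83\tGENOME\t1200\t60\t50M\t=\t1100\t-150\t*\t*"
print(region_header(1000, 2000))
print(rebase_record(example, 1000, 2000))
```

Note that `samtools view` without `-h` writes no header, so the output of the command in the question has no `@SQ` line to edit; a new one has to be written, which is what `region_header` is for.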
