Extract Alignment By Read Id From A Sam File
11.1 years ago

Hi,

Is there a fast way to extract alignments from a SAM file by read ID (about 100 read IDs on average)? If the read IDs are in a file (one per line), I could do:

cat in.sam | grep -f idFile.txt > out.sam

but with a big SAM file (~40 GB) this takes a long time. Is there a faster way to extract these alignments?

Thanks,

N.

sam read id • 13k views

Well, it's not really a duplicate: that one was about BAM, not SAM.

11.1 years ago

Faster?

 LC_ALL=C grep -w -F -f idFile.txt  < in.sam > subset.sam

+1 for C locale.


Amazingly simple! Thanks so much.


Thanks so much for this! Any suggestions on how to also keep the SAM header in the output subset.sam file?


Capture the header (lines starting with @) and add it to the new subset.sam file.
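
One possible way to do this in a single pass is sketched below. It is not the grep command from the answer above: it swaps in awk, so that header lines and matching alignments are written together without reading the large file twice. It assumes the IDs in idFile.txt are one per line with no trailing whitespace, and that they should match the first SAM column (QNAME) exactly. The toy in.sam and its read names are made up for illustration only.

```shell
# Toy inputs so the sketch runs on its own; replace with the real files.
printf '@HD\tVN:1.6\tSO:unsorted\n@SQ\tSN:chr1\tLN:1000\n' > in.sam
printf 'read1\t0\tchr1\t1\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> in.sam
printf 'read10\t0\tchr1\t5\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> in.sam
printf 'read2\t0\tchr1\t9\t60\t4M\t*\t0\t0\tACGT\tIIII\n' >> in.sam
printf 'read1\nread2\n' > idFile.txt

# First file (idFile.txt): store each ID as an array key.
# Second file (in.sam): print header lines (starting with '@') and
# alignments whose first column (QNAME) is one of the stored IDs.
LC_ALL=C awk -F '\t' 'NR == FNR { ids[$1]; next } /^@/ || ($1 in ids)' \
    idFile.txt in.sam > subset.sam
```

Because the comparison is on the first column only, an ID that happens to appear elsewhere in a line (for example in an optional tag) does not pull that line in, which grep -w -F would do. If idFile.txt is empty, the NR == FNR test no longer separates the two files, so check for that case first.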
