Extracting Reads from paired end data in SAM file that map to a list of contig IDs
16 months ago
yp19 • 20

I have paired-end data in a SAM file and I want to extract all the reads that map to specific contig IDs. My contig IDs are in a text file, one ID per line.

SAM or BAM output is preferred.

Thanks for your help.


Would the solution in that link also work for paired data? Thanks!


@max_19 why wouldn't it work for paired data?


What if one read in a pair maps to the contig but its mate does not? Is there a way to include that pair even if only one read matches?

16 months ago
France/Nantes/Institut du Thorax - INSE…

Ah, I see. Using awk, keeping every record where either the read or its mate maps to chromosome RF01 or RF02:

 samtools view -h S1.bam | awk -F '\t' 'function fun(C) { return C=="RF01" || C=="RF02";} /^@/ {print;next;} {if(fun($3) || fun($7)) print;}'

or using samjdk: http://lindenb.github.io/jvarkit/SamJdk.html

 java -jar dist/samjdk.jar -e 'Predicate<String> f=C->C.equals("RF01") || C.equals("RF02"); return (!record.getReadUnmappedFlag() && f.test(record.getReferenceName())) || (!record.getMateUnmappedFlag() && f.test(record.getMateReferenceName()));' in.bam
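
Since the original question has the contig IDs in a text file (one per line), the awk filter above can be generalized to load the IDs from that file instead of hardcoding RF01 and RF02. A minimal sketch, assuming a POSIX shell and awk; the function name `filter_sam_by_contigs` and the file names `contigs.txt` and `subset.bam` are made up for illustration:

```shell
# Hypothetical helper: reads SAM text on stdin and keeps the header lines plus
# every record whose own contig (column 3, RNAME) or mate's contig
# (column 7, RNEXT) appears in the ID file given as $1 (one contig ID per line).
filter_sam_by_contigs() {
    awk -F '\t' -v ids="$1" '
        BEGIN {
            # Load the contig IDs into a lookup table, skipping blank lines.
            while ((getline line < ids) > 0)
                if (line != "") keep[line] = 1
            close(ids)
        }
        /^@/ { print; next }            # pass header lines through unchanged
        ($3 in keep) || ($7 in keep)    # read or its mate is on a listed contig
    '
}

# Example usage (file names are placeholders):
# samtools view -h S1.bam | filter_sam_by_contigs contigs.txt | samtools view -b -o subset.bam -
```

As with the one-liner, both reads of a pair are kept when only one of them maps to a listed contig, because each record is checked against its own contig and its mate's. When both mates are on the same contig, column 7 is "=", so the check on column 3 decides.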

That does what I need, thank you!
