Forward Strand Or Reverse Strand
3
5
Entering edit mode
11.3 years ago
siyu ▴ 150

How can I know that a read is mapping to forward strand or reverse strand ?

read strand • 33k views
ADD COMMENT
0
Entering edit mode

Hello everyone,

This post is really interesting, I actually need to analyse (ht-seq counts) .bam files (sorted and indexed) that have been mapped by someone else before me. My files are as this form: - sample.merged.mdup.forward.bam - sample.merged.mdup.reverse.bam.

When looking for ht-seq counts tutorials, there was only one .bam (and the .gff) as input... ->Does someone, working on paired-end mode, have any idea of how can we manipulate the forward.bam and the reverse.bam for counting step

Many thanks in advance

ADD REPLY
15
Entering edit mode
11.3 years ago

A sam record contains a binary flag (aka SAM-FLAG). The bit for 16 is set if the read is mapped on the reverse strand

see the SAM specification http://samtools.sourceforge.net/SAM1.pdf ( section "FLAG: bitwise FLAG")

see also "explain SAM flag " http://picard.sourceforge.net/explain-flags.html

ADD COMMENT
1
Entering edit mode
ADD REPLY
8
Entering edit mode
11.3 years ago

One way to extract stranded reads from a BAM file:

samtools view Reads.bam | gawk '(and(16, $2))' > forwardStrandReads.sam
samtools view Reads.bam | gawk '(! and(16, $2))' > reverseStrandReads.sam
ADD COMMENT
3
Entering edit mode

faster:

samtools view -F 16 Reads.bam 
samtools view -f 16 Reads.bam
ADD REPLY
2
Entering edit mode

Won't this and Alex's solution will only work if you already have filtered away the unmapped reads using the -F 4 option? Otherwise for with the -F 16 option in SAMtools view or 'not containing 16' filtering by gawk you just get everything that isn't mapping to the reverse strand which could be more than those that map to the forward strand.

This post seems to work for single paired reads to sort forward reads without that initial step because it excludes anything that is mapping to reverse strand or unmapped.

ADD REPLY
0
Entering edit mode

Nice, cheers for the suggestion.

ADD REPLY
0
Entering edit mode

Could you give me a detail description about this ? I don't understand why the Flag 16 stands for reverse strand.Thank you !

ADD REPLY
0
Entering edit mode

Read the PDF link in Pierre's answer for an explanation of the flags. Hexadecimal 0x10 is decimal 16.

ADD REPLY
0
Entering edit mode

Is this still a SAM file? can it be converted to BAM? because it is missing the headers

ADD REPLY
0
Entering edit mode

SAM is text-based, so perhaps use samtools to excise the headers to a separate file, then cat the header file and a filtered per-strand SAM file. Then convert the per-strand, headered SAM back to BAM.

ADD REPLY
0
Entering edit mode
7.0 years ago
liangqinsi ▴ 40

Pierre and Alex's answers are great.

But flag :

16 or 0x10 means SEQ being reverse complemented

Which means it should be:

samtools view Reads.bam | gawk '(and(16, $2))' > reverseStrandReads.sam

samtools view Reads.bam | gawk '(! and(16,$2))' > forwardStrandReads.sam

------

I found it's strange when I grep primer in forward and reverse sam ( most forward primer in reverse sam, and vice versa ), it comes out to be misunderstanding of the flag.

wc -l *sam.out

2551 ATACTGGTATGTATTTAACCATGCAGATCC.forwardStrandReads.sam.out

1001 TTTCAGTTGATTTGCTTGAGATCAAGATTG.reverseStrandReads.sam.out

ADD COMMENT

Login before adding your answer.

Traffic: 2168 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6