Biostar Beta. Not for public use.
How To Get Junctions File Using Gsnap
2
Entering edit mode
14 months ago
Varun Gupta ♦ 1.1k
United States

Hi Everyone

So i am working on rna seq data. I am using gsnap for my analysis(though i have also used tophat). With tophat , a junctions.bed file is produced which actually tells us about the splicing patterns going on.

Using gsnap i am able to produce sam/bam files. Is there a way i can convert these sam/bam files into something similar to junctions.bed files.

My main goal is to see what is the difference between junctions produced from tophat and gsnap. Tophat produces junctions.bed file but gsnap doesnt. So is there a way i can get a junctions file from sam/bam output of gsnap.

Hope to hear from you guys soon

Regards

Varun

file • 2.6k views
ADD COMMENTlink
0
Entering edit mode

Are you comfortable writing perl scripts?

ADD REPLYlink
0
Entering edit mode

yes i am comfortable

ADD REPLYlink
0
Entering edit mode

great, then it is relatively straightforward to obtain the junctions given a SAM/BAM file. All you need is the start position of a read and its CIGAR. Suppose a read starts at 10000 and cigar is 15M20N75M, then the read is present from 10000-10014 and 10035-10109. So, the read's junction coordinates are 10015-10034. It should be a bit more involved to take into account the effect of Insertions and deletions. But this is the basic principle. Let me know if you run into problems.

ADD REPLYlink
0
Entering edit mode

Hey Arun

What if the CIGAR string is something like this 1S14M20N75M

ADD REPLYlink
0
Entering edit mode
15 months ago
Malcolm.Cook ♦ 1.0k
kansas, usa

If you're comfortable with R/BioConductor

and have a current installation

then

you can the the approach given in my recent update to my answer to a nearly identical question

ADD COMMENTlink

Login before adding your answer.

Similar Posts
Loading Similar Posts
Powered by the version 2.3