Filtering Bed Files By Using Bedops
2
1
Entering edit mode
10.8 years ago
Raghav ▴ 100

hello every one,

I have paired end illumina reads, R1.fastq and R2.fastq and I have mapped them as single-end reads by using bowtie2 default parameters, I performed further downstream analysis by using samtools and bedops, and now I have R1.bed and R2. bed I made two sets, one of them have R1_uniquely_mapped.bed, R2_uniquely_mapped.bed and second of them R1_mapped_more_than_1.bed , R2_mapped_more_than_1.bed.

because R1 and R2 belongs paired end reads, and my restriction library has maximum 2KB size, then R1 and R2 pairs must be present in less than 2 kb territory of chromosome

theoretically I am assuming, in R1.bed format,

chr1  100   180    @R1_read1______1 .................  
chr1   1000  1090 @R1_read2______1................

In R2.bed format,

chr1 2100   2180 @R2_read1_____2............. ## I just add 2KB length with respect to R1.bed###
chr1 2500 2590    @R2_read______2......... ## I just add 1.5KB [1500nts] with respect to R1.bed, because my library is >= 2KB.

How can I customize downstream tools like BEDOPS or bedtool which can capture such type of reads or alignment????? How can I filter this type of infromation by using bedops tool????

all suggestions and comments are most welcome,

bedtools bed bowtie2 • 5.7k views
ADD COMMENT
2
Entering edit mode

I really think you should format your question in a way people can understand. Maybe I am slow this morning but I do not understand the problem at hand. Also asking a question with "????????" is adding 10 superfluous question marks, one suffices to ask a question.

ADD REPLY
2
Entering edit mode

If you have paired end reads, I don't understand why you're mapping them as single end? Bowtie2 can map paired end reads: http://bowtie-bio.sourceforge.net/bowtie2/manual.shtml#paired-end-example

ADD REPLY
0
Entering edit mode

Dear Jelena, I tried botie2 for paired end also and I ran it over more than 20 times for paired end reads by adjusting -I and -X parameters, also with different parameters, and got four type of mapped results, aligned exactly 1 time, aligned >1 times, aligned concordantly exactly 1 time, aligned concordantly >1 times, aligned discordantly 1 time, it was very difficult for me to choose which one is better for my analysis, and it was very complicated to go with all simultaneously. because, I did not find any significant difference between single end and paired end mapping [in both cases alignment rate is 70%] this force me to go through single end mapping, Here might be I am missing technicality of paired end mapping, but if mapping percent is equal, it is my belief that, in further downstream analysis I would be able to customize tool according to me.

comments of technicality of paired-end mapping are always welcome.

ADD REPLY
0
Entering edit mode

how can I extract chromosome name, chromosome start position and chromosome end position of those reads in R1.bed and R2.bed which are paired-ends?

ADD REPLY
0
Entering edit mode

I don't get it, you have a tab delimited file and you want to extract a given column ?

ADD REPLY
2
Entering edit mode
10.8 years ago
Jordan ★ 1.3k

Like Jelena said, you should be using paired end option while mapping using Bowtie2.

Regarding the question, I think you can use range in bedops to identify regions which are with in a range of 2000bp.

First you need to sort the bed file:

sort-bed R1.bed > R1.sorted.bed
sort-bed R2.bed > R2.sorted.bed

Now, we can use the range operator to find regions in R1.sorted.bed which are as far as 2000bp away from R2.sorted.bed.

bedops --range 2000 -e -1 R1.sorted.bed R2.sorted.bed

It's more detailed here: bedops wiki

EDIT: Edited as said by Alex Reynolds.

ADD COMMENT
1
Entering edit mode

Using the --everything operator reports and pads everything from R1 and R2 - this is equivalent to a multiset union operation.

You might perhaps instead replace that operator with --element-of -1 to report regions from R1.sorted.bed that overlap R2.sorted.bed by one or more bases (along with --range 2000 to look for 2000-base-padded elements in R1 that overlap R2 elements).

In other words:

$ bedops --range 2000 --element-of -1 R1.sorted.bed R2.sorted.bed > answer.bed
ADD REPLY
1
Entering edit mode

Ah, my bad! Corrected it.

ADD REPLY
0
Entering edit mode
10.8 years ago
JC 13k

A third vote to repeat your mapping USING the pair-end information, first, this will improve the mapping because one pair can confirm or reject alternative alignments.

ADD COMMENT

Login before adding your answer.

Traffic: 2036 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6