Why does BedTools Map operation produce all dots as output?
9.4 years ago
Davide Chicco ▴ 120

I am using the BedTools map operation to map the DNase I signal of a cell type onto some chromosome regions, computing the mean of the fourth column (the signal value).

The command I use is the following:

$ bedtools map -a inputFile1.bed -b inputFile2.bedgraph -c 4 -o mean 1> outputFile

In the output file, I get real values for chr1 through chr9, but strangely I get only dots for the regions on the other chromosomes:

chr1    66660   66810   0.849999999999999977796
chr1    87640   87790   0.0500000000000000027756
chr1    96520   96670   0
chr1    115600  115750  115.527272727272702468
chr1    118840  118990  3.10000000000000008882
chr1    125340  125490  0
chr1    136280  136430  .
chr1    136960  137110  .
chr1    235600  235750  39.0559633027522963289
chr1    237020  237170  1.59999999999999986677
....     ....     ....     ....    
....     ....     ....     ....    
....     ....     ....     ....    
chr10   134874600       134874750       .
chr10   134876820       134876970       .
chr10   134877940       134878090       .
chr10   134878160       134878310       .
chr10   134879420       134879570       .
chr10   134897500       134897650       .
chr10   134907140       134907290       .
chr10   134915640       134915790       .
chr10   134939120       134939270       .
chr10   134939280       134939430       .
chr10   134940860       134941010       .
....     ....     ....     ....     ....     ....

Do you know why this strange behavior happens?

Why don't I have all the values for chrom10...19, too?

genomics map bedtools • 3.7k views

sounds like a sorting problem. what are the outputs of:

cut -f 1 inputFile1.bed | uniq -c
cut -f 1 inputFile2.bedgraph | uniq -c

you may need to sort one of them.
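For anyone who wants to automate that check, here is a minimal sketch. It assumes tab-separated files; the helper name chrom_order and the tiny demo files are made up for illustration and merely stand in for inputFile1.bed and inputFile2.bedgraph. It lists the chromosomes of each file in order of appearance and compares the order of the names the two files share.

```shell
# Print the chromosome names of a tab-separated file in order of first appearance.
# (chrom_order is a hypothetical helper name, not part of bedtools.)
chrom_order() {
    cut -f 1 "$1" | uniq
}

# Made-up demo files standing in for inputFile1.bed and inputFile2.bedgraph.
demo_dir=$(mktemp -d)
printf 'chr1\t100\t200\nchr2\t100\t200\nchr10\t100\t200\n' > "$demo_dir/a.bed"
printf 'chr1\t100\t200\t0.5\nchr10\t100\t200\t1.5\nchr2\t100\t200\t2.5\n' > "$demo_dir/b.bedgraph"

chrom_order "$demo_dir/a.bed" > "$demo_dir/a.order"
chrom_order "$demo_dir/b.bedgraph" > "$demo_dir/b.order"

# Keep only the names found in both files, preserving each file's own order.
grep -Fxf "$demo_dir/b.order" "$demo_dir/a.order" > "$demo_dir/a.shared"
grep -Fxf "$demo_dir/a.order" "$demo_dir/b.order" > "$demo_dir/b.shared"

# If the shared names appear in a different order, the files follow different sort rules.
if cmp -s "$demo_dir/a.shared" "$demo_dir/b.shared"; then
    order_status="same order"
else
    order_status="different order"
fi
echo "$order_status"
```

On real data, replace the two demo files with your own; a "different order" result suggests that at least one file needs re-sorting.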


Output of the first:

71485 chr1
56771 chr2
46916 chr3
34197 chr4
39869 chr5
38117 chr6
37795 chr7
34040 chr8
28966 chr9
32310 chr10
41346 chr11
24212 chr12
16072 chr13
9869 chr14
9376 chr15
13554 chr16
20892 chr17
5369 chr18
14898 chr19
20618 chr20
10253 chr21
15227 chr22
18857 chrX
1214 chrY

Output of the second:

13545206 chr1
6057074 chr10
6891539 chr11
6386478 chr12
3187543 chr13
3284847 chr14
3873336 chr15
3957177 chr16
4492389 chr17
3169684 chr18
3796102 chr19
10769055 chr2
2781156 chr20
1850148 chr21
2028433 chr22
8502537 chr3
7832694 chr4
8125616 chr5
9465221 chr6
8314745 chr7
6241766 chr8
4687957 chr9
14655 chrM
4504505 chrX

Any clue?


yes, a sorting problem. sort both with sort -k1,1 -k2,2n $bed

9.4 years ago

See option -null http://bedtools.readthedocs.org/en/latest/content/tools/map.html

-null      The value to print if no overlaps are found for an A interval. Default: "."

Thanks Pierre, but that is not the case here. Overlaps are present, at least for some regions. It just puts dots for ALL of them, and I cannot understand why.

9.4 years ago
Davide Chicco ▴ 120

Thanks to Brentp, who suggested running the command cut -f 1 fileName1.bed | uniq -c, I noticed that the two chromosome region files were sorted in different orders.

The first file was sorted in natural (numeric) order: chr1, chr2, chr3, ..., chr9, chr10, chr11, ...

Conversely, the second file was sorted alphabetically: chr1, chr10, chr11, ..., chr19, chr2, chr20, chr21, ...

This inconsistency between the two files broke the BedTools map operation, which expects both inputs to be sorted the same way. To solve it, I simply sorted the first file alphabetically as well, using the Linux command: sort -k1,1 -k2n file > sortedFile
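To make the fix concrete, here is a small sketch of the whole procedure on made-up data. The file names, the sort_bed helper, and the use of LC_ALL=C (to force plain byte-order comparison regardless of locale) are my own assumptions, not something from the thread; the sort keys follow Brentp's -k1,1 -k2,2n suggestion. The bedtools step only runs if bedtools is installed.

```shell
work_dir=$(mktemp -d)

# Made-up stand-ins for inputFile1.bed (natural order) and inputFile2.bedgraph.
printf 'chr1\t300\t400\nchr1\t100\t200\nchr2\t100\t200\nchr10\t100\t200\n' > "$work_dir/regions.bed"
printf 'chr1\t100\t200\t0.5\nchr10\t100\t200\t1.5\nchr2\t100\t200\t2.5\n' > "$work_dir/signal.bedgraph"

# Sort by chromosome name (byte order), then numerically by start position.
# (sort_bed is a hypothetical helper name.)
sort_bed() {
    LC_ALL=C sort -k1,1 -k2,2n "$1"
}

sort_bed "$work_dir/regions.bed" > "$work_dir/regions.sorted.bed"
sort_bed "$work_dir/signal.bedgraph" > "$work_dir/signal.sorted.bedgraph"

# Both files should now list chromosomes in the same order: chr1, chr10, chr2.
cut -f 1 "$work_dir/regions.sorted.bed" | uniq

# Re-run the original command on the sorted files, if bedtools is available.
if command -v bedtools >/dev/null 2>&1; then
    bedtools map -a "$work_dir/regions.sorted.bed" -b "$work_dir/signal.sorted.bedgraph" -c 4 -o mean
fi
```

Sorting both files with the same command, rather than only the one that looks wrong, avoids having to reason about which ordering each file started in.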

Thanks, Pierre!


I am glad you were able to sort this out. We are working on enhancements that will (in most cases, not all) detect inconsistent sorting rules and throw an error to alert the user. Hoping to have it out in the next release.
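Until such a check exists in the tool itself, a user-side pre-flight test is possible with the standard sort -c option, which reports whether a file is already in the requested order without rewriting it. This is only a sketch under my own assumptions (the is_bed_sorted helper, the demo files, and LC_ALL=C are made up for illustration) and is not how bedtools detects the problem.

```shell
# Succeed if the file is already sorted by chromosome name, then by start position.
# (is_bed_sorted is a hypothetical helper name.)
is_bed_sorted() {
    LC_ALL=C sort -c -k1,1 -k2,2n "$1" 2>/dev/null
}

check_dir=$(mktemp -d)
# Made-up demo files: one in natural order, one in byte (alphabetical) order.
printf 'chr1\t100\t200\nchr2\t100\t200\nchr10\t100\t200\n' > "$check_dir/natural.bed"
printf 'chr1\t100\t200\nchr10\t100\t200\nchr2\t100\t200\n' > "$check_dir/lexical.bed"

for f in "$check_dir/natural.bed" "$check_dir/lexical.bed"; do
    if is_bed_sorted "$f"; then
        echo "$(basename "$f"): sorted"
    else
        echo "$(basename "$f"): NOT sorted, re-sort before running bedtools map"
    fi
done
```

Running the same check on both inputs before bedtools map would have flagged the naturally ordered file in this thread.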


This helped me even after 5 years. I tried various ways of sorting, but this one works perfectly. Thanks!

sort -k1,1 -k2n file > sortedFile

Works great.
