How to crop GFF features into pieces when an overlap is found
6.2 years ago

I have a GFF file from which I have already removed nested features, like the 1st, 2nd, 4th and 5th BED features shown in the figure:

[Figure: BED feature cases 1 to 5]

Now I want to resolve partial overlaps like case 3. For each overlapping pair, I want to keep the leftmost unique region of the first feature and the rightmost unique region of the second feature, and assign the overlapping region to the longer of the two features.

Example data:

scaffold1   RepeatMasker    similarity  1627986 1629296 11.5    +   .   Clust2783_Helitron
scaffold1   RepeatMasker    similarity  1628280 1638525 0   +   .   Clust1896_LTRRT
scaffold1   RepeatMasker    similarity  1634325 1644243 0   +   .   Clust1098_Helitron
scaffold1   RepeatMasker    similarity  1643445 1644561 2.3 +   .   Clust305_Helitron

Output:

scaffold1   RepeatMasker    similarity  1627986 1628279 11.5    +   .   Clust2783_Helitron
scaffold1   RepeatMasker    similarity  1628280 1638525 0   +   .   Clust1896_LTRRT
scaffold1   RepeatMasker    similarity  1638526 1644243 0   +   .   Clust1098_Helitron
scaffold1   RepeatMasker    similarity  1644244 1644561 2.3 +   .   Clust305_Helitron

Is there a simple way to do this? Please let me know if you have any other suggestions for removing such overlaps.

Thanks
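
One possible approach, sketched in Python under a few assumptions that are not stated in the question: the file is tab-separated with the nine standard GFF columns, coordinates are 1-based and inclusive, nested features are already gone, and only consecutive features (after sorting by start) overlap each other. The function name `trim_overlaps` is hypothetical. The original, untrimmed lengths decide which feature keeps the overlap, and ties go to the upstream feature; both choices are guesses at the intended rule.

```python
def trim_overlaps(rows):
    """Trim overlaps between consecutive features on the same sequence.

    rows: list of GFF rows, each a list of 9 column strings.
    Returns new rows sorted by (seqid, start, end); the input is not modified.
    """
    # Convert start/end to integers so they can be compared and adjusted.
    feats = [r[:3] + [int(r[3]), int(r[4])] + r[5:] for r in rows]
    feats.sort(key=lambda f: (f[0], f[3], f[4]))

    # Original lengths (assumption: these, not the trimmed lengths,
    # decide which feature keeps the overlapping region).
    lengths = [f[4] - f[3] + 1 for f in feats]

    for i in range(1, len(feats)):
        prev, cur = feats[i - 1], feats[i]
        if prev[0] != cur[0] or cur[3] > prev[4]:
            continue  # different sequence, or no overlap
        if lengths[i - 1] >= lengths[i]:
            cur[3] = prev[4] + 1  # upstream feature keeps the overlap
        else:
            prev[4] = cur[3] - 1  # downstream feature keeps the overlap

    return [f[:3] + [str(f[3]), str(f[4])] + f[5:] for f in feats]


# The four example rows from the question, already split into columns.
EXAMPLE = [
    ["scaffold1", "RepeatMasker", "similarity", "1627986", "1629296", "11.5", "+", ".", "Clust2783_Helitron"],
    ["scaffold1", "RepeatMasker", "similarity", "1628280", "1638525", "0", "+", ".", "Clust1896_LTRRT"],
    ["scaffold1", "RepeatMasker", "similarity", "1634325", "1644243", "0", "+", ".", "Clust1098_Helitron"],
    ["scaffold1", "RepeatMasker", "similarity", "1643445", "1644561", "2.3", "+", ".", "Clust305_Helitron"],
]

for row in trim_overlaps(EXAMPLE):
    print("\t".join(row))
```

On the four example rows this should give the coordinates listed under "Output" above. If a feature overlaps more than one neighbour, a single pass like this may leave residual overlaps or produce a feature whose start exceeds its end, so the result would need checking in that case.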

gff bedtools annotation genome bedops