How to obtain and compare genomic coordinates from RepeatMasker output?
8.5 years ago
gbdias ▴ 150

Hello folks!

I want to mask a genome with a particular repeat library using RepeatMasker.

Then I want to intersect the repeat coordinates with those of the gene annotations to find overlaps and study associations between them.

I'm only starting to consider feasible ways to do that so any input would be great.

Thanks!

repeats repeatmasker genome browser gbrowse • 3.0k views
8.5 years ago
Jon ▴ 360

Sounds like a job for BEDTools. You should be able to make a GFF file from the RepeatMasker output and use bedtools intersect to find overlaps with your gene annotations.
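
To make concrete what an intersect reports, here is a minimal pure-Python sketch of the overlap test on 1-based, inclusive coordinates (the GFF convention). The function name `find_overlaps`, the tuple layout, and the toy coordinates are all made up for illustration. This is not how BEDTools is implemented, and the nested loop would be far too slow for a whole genome.

```python
from typing import List, Tuple

# (seqid, start, end, name); 1-based inclusive coordinates, as in GFF
Interval = Tuple[str, int, int, str]


def find_overlaps(genes: List[Interval], repeats: List[Interval]):
    """Return (gene_name, repeat_name, overlap_bp) for every overlapping pair."""
    hits = []
    for g_seq, g_start, g_end, g_name in genes:
        for r_seq, r_start, r_end, r_name in repeats:
            # Features on different sequences can never overlap
            if g_seq != r_seq:
                continue
            ov_start = max(g_start, r_start)
            ov_end = min(g_end, r_end)
            # With inclusive coordinates, sharing a single base counts as overlap
            if ov_start <= ov_end:
                hits.append((g_name, r_name, ov_end - ov_start + 1))
    return hits


# Toy, made-up coordinates
genes = [("chr1", 1000, 2000, "geneA"), ("chr1", 5000, 6000, "geneB")]
repeats = [
    ("chr1", 1900, 2100, "LINE1"),
    ("chr1", 3000, 3500, "SINE1"),
    ("chr2", 1000, 2000, "LTR1"),
]

for hit in find_overlaps(genes, repeats):
    print(hit)
```

On real files the equivalent call would be along the lines of bedtools intersect -a genes.gff -b my_repeatmasker.gff -wa -wb, where genes.gff is a placeholder for your annotation file; -wa and -wb write the original entry from each file for every overlapping pair.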

Thank you very much! I am reading the BEDTools documentation and it really seems to be the right tool for this task.

8.5 years ago
SES 8.6k

The answer by Jon is a good way to get the overlaps, and I can tell you how to make the GFF. In the 'util' directory of the RepeatMasker distribution there is a script called 'rmOutToGFF3.pl' that will do the conversion. The usage is pretty simple: it writes to stdout, so you just redirect the output to a file:

perl rmOutToGFF3.pl my_repeatmasker.out > my_repeatmasker.gff
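
If you want to see where the coordinates live in the .out file, here is a rough Python sketch that pulls them into BED6 rows. It is not the logic of rmOutToGFF3.pl; the function name `rm_out_to_bed`, the name|class label, and the example records are assumptions made up for illustration. It relies only on the standard whitespace-separated .out columns (SW score, three percentages, query sequence, begin, end, (left), strand, repeat name, class/family, ...).

```python
def rm_out_to_bed(lines):
    """Convert RepeatMasker .out lines to BED6 rows (tab-separated strings)."""
    rows = []
    for line in lines:
        fields = line.split()
        # Skip blank lines and the header lines, whose first field is not an integer score
        if len(fields) < 11 or not fields[0].isdigit():
            continue
        score, seqid = fields[0], fields[4]
        start, end = int(fields[5]), int(fields[6])  # 1-based, inclusive in .out
        strand = "-" if fields[8] == "C" else "+"    # RepeatMasker writes C for complement
        name = f"{fields[9]}|{fields[10]}"           # repeat name | class/family
        # BED is 0-based and half-open, so only the start shifts by one.
        # BED's score column is nominally 0-1000; the SW score is kept as-is here.
        rows.append("\t".join([seqid, str(start - 1), str(end), name, score, strand]))
    return rows


# Made-up records laid out like a RepeatMasker .out file
example = """\
   SW   perc perc perc  query     position in query    matching  repeat       position in repeat
score   div. del. ins.  sequence  begin end   (left)   repeat    class/family begin  end    (left)  ID

  463   1.3  0.6  1.7  chr1      1001  1468  (8532)  +  (CA)n      Simple_repeat     1   468    (0)    1
 1200  10.5  2.1  0.4  chr1      3000  3500  (6500)  C  L1_Example LINE/L1        (20)   900    400    2
"""

for row in rm_out_to_bed(example.splitlines()):
    print(row)
```

BED and GFF differ in their coordinate conventions (0-based half-open versus 1-based inclusive), so it is worth checking which format each file is in before intersecting them.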

Thank you very much! This will help a lot.
