Closed: Extracting intergenic region coordinates and type
8.8 years ago
timjoncooper ▴ 320

Is there a way to extract the coordinates of all intergenic regions from the S. cerevisiae S288c reference genome, along with their type as defined by the orientation of the flanking genes, i.e. convergent, divergent or tandem? To get intergenic region coordinates, I know that I could pull down all annotations from the UCSC Table Browser and create a merged .BED file, but I do not know how to assign types.

EDIT:

I have a file containing the coordinates of all genes and their respective strands:

https://www.dropbox.com/s/sjwtrkj41b8l0o8/Annotation.txt?dl=0

The + or - strand designation reveals the direction of the gene, e.g. on chrI, the first four genes are marked +/+/-/+ and look like this:

https://www.dropbox.com/s/levlucc5aj0we6x/Screen%20Shot%202015-06-16%20at%2021.18.43.png?dl=0

As the first two genes overlap, there is no intergenic region here. The first intergenic region occurs between the 2nd and 3rd genes and it is convergent (the genes point toward one another). The next is divergent (the genes point away from one another).

I basically require a script or method to automate this designation, one that also takes overlapping genes into consideration. The output would ideally contain the coordinates of each intergenic region, the chromosome it resides on and its type.
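
To make the desired behaviour concrete, here is a minimal Python sketch of the designation described above. It is only an illustration under some assumptions: genes are supplied as (chromosome, start, end, strand) tuples with 1-based inclusive coordinates, the function names `classify` and `intergenic_regions` are hypothetical, and the example coordinates are invented to mimic the +/+/-/+ pattern on chrI (they are not taken from the annotation file). A same-strand pair is called tandem, a +/- pair convergent and a -/+ pair divergent. Overlapping genes are handled by tracking the gene that extends furthest to the right, so no region is reported where genes overlap or abut.

```python
from collections import defaultdict


def classify(left_strand, right_strand):
    """Name an intergenic region from the strands of its flanking genes."""
    if left_strand == right_strand:
        return "tandem"
    if left_strand == "+" and right_strand == "-":
        return "convergent"  # genes point toward one another
    return "divergent"       # genes point away from one another


def intergenic_regions(genes):
    """Return (chrom, start, end, type) for every gap between genes.

    Assumes `genes` is an iterable of (chrom, start, end, strand) tuples
    with 1-based inclusive coordinates (an assumption, not a known format).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, strand in genes:
        by_chrom[chrom].append((start, end, strand))

    regions = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom])  # sort by start coordinate
        # Track the gene that reaches furthest right so far; it is the
        # left flank of the next gap even when earlier genes overlap.
        left_end, left_strand = ordered[0][1], ordered[0][2]
        for start, end, strand in ordered[1:]:
            if start > left_end + 1:  # at least one uncovered base
                regions.append(
                    (chrom, left_end + 1, start - 1, classify(left_strand, strand))
                )
            if end > left_end:
                left_end, left_strand = end, strand
    return regions


# Invented coordinates following the +/+/-/+ pattern described above.
example = [
    ("chrI", 100, 500, "+"),
    ("chrI", 400, 800, "+"),    # overlaps the first gene: no gap
    ("chrI", 1000, 1500, "-"),
    ("chrI", 2000, 2500, "+"),
]

for region in intergenic_regions(example):
    print(*region, sep="\t")
```

With this input the sketch reports two regions, the first convergent and the second divergent, which matches the manual designation given for chrI above.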

genome gene • 408 views