Assessing genomic segment enrichment with GenomicRanges (R package)
5.0 years ago
moithuti • 0

I am using GenomicRanges to analyse genomic segment enrichment. My dataset is tab-separated: the first three columns are chr, start and end, followed by three additional metadata columns.

I have successfully run subsetByOverlaps(cases, controls, type="within", invert=TRUE). According to here, my output should be the genomic segments that lie within my chromosome start and end points and are exclusive to my cases. Conversely, I also ran subsetByOverlaps(controls, cases, type="within", invert=TRUE) to look for segments exclusive to controls. I then looked for segments found in both by removing the invert option. In one instance my queryLength was approximately 4000 segments and my subjectLength was 200-odd segments. Given that query size, when I run subsetByOverlaps(cases, controls, type="within") I get more than 200 segments in the resulting GRanges object. Am I missing something about the behaviour of the function? I expected fewer than 200 segments in the output, assuming the segments are treated as sets.
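To make the size question concrete, here is a minimal sketch on toy data (the coordinates are made up for illustration and are not from my dataset). It shows that subsetByOverlaps() returns elements of its first argument, so the result is bounded by the number of query ranges rather than the number of subject ranges: several query segments can sit inside a single subject segment.

```r
library(GenomicRanges)

# Hypothetical toy data: four case segments, one large control segment
cases <- GRanges("chr1", IRanges(start = c(10, 20, 30, 200),
                                 end   = c(15, 25, 35, 210)))
controls <- GRanges("chr1", IRanges(start = 1, end = 100))

# Case segments lying entirely within some control segment
shared <- subsetByOverlaps(cases, controls, type = "within")

# Case segments NOT lying within any control segment
only_cases <- subsetByOverlaps(cases, controls, type = "within", invert = TRUE)

length(controls)    # 1 subject range
length(shared)      # 3 query ranges, all inside the single control range
length(only_cases)  # 1 query range (200-210)
```

In this sketch the two results partition the query: length(shared) + length(only_cases) equals length(cases).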

My second question: if I then swap cases and controls and run subsetByOverlaps(controls, cases, type="within"), how can I combine the data from the two runs? Finally, am I correct to assume that combining the two results in a data frame would give me the equivalent of the union of the genomic segments found within my cases and controls? If not, is there a way to use GenomicRanges to obtain that union without doing it in two steps?
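For reference, a minimal sketch of the options I am weighing, again on made-up coordinates. Which of these counts as "the union" depends on what is meant, so treat the interpretation as an assumption: c() simply concatenates the two results, reduce() merges overlapping or adjacent ranges, and union() is the set union of all positions covered by either object (it returns ranges only, without metadata columns).

```r
library(GenomicRanges)

# Hypothetical toy data
cases <- GRanges("chr1", IRanges(start = c(10, 50, 300),
                                 end   = c(20, 120, 310)))
controls <- GRanges("chr1", IRanges(start = c(1, 60, 500),
                                    end   = c(30, 70, 510)))

# The two "within" runs from the question
a <- subsetByOverlaps(cases, controls, type = "within")   # case 10-20 sits in control 1-30
b <- subsetByOverlaps(controls, cases, type = "within")   # control 60-70 sits in case 50-120

# Option 1: concatenate both results into one GRanges object
combined <- c(a, b)

# Option 2: concatenate, then merge any overlapping or adjacent ranges
merged <- reduce(combined)

# Option 3: one-step set union of everything covered by cases or controls
all_regions <- union(cases, controls)

length(combined)     # 2
length(all_regions)  # 4
```

Note that options 1 and 2 only cover segments nested inside a segment of the other group, whereas option 3 covers every segment in either group, so they answer different questions.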

R genome segment enrichment GenomicRanges • 960 views