Finding out if certain SNP positions fall into certain gene regions
13 months ago
SGMS • 60
European Union

Hi all,

I have a list of SNPs (chromosome and position), and I need to check whether these SNPs fall within a second list of gene regions (chr:start_position-end_position).

I was previously able to find overlaps between ranges using findOverlaps in R, and I also saw that something similar can be done with bedtools intersect. However, those tools seem to be meant for overlaps between chr:start-end ranges, whereas I want to check whether single SNP positions fall within my gene regions of interest.

Any suggestions would be greatly appreciated.

Thank you!


It seems there was already a discussion about this: A: How To Intersect A Range With Single Positions


Thanks, my search didn't turn that one up. I will go with Pierre's suggestion: turn each SNP position into a SNP region whose start and end coordinates are the same, and then do the overlap.


Good description of the data. It would help if you could post some example input data and the expected output, SGMS.


"But it seems like those tools are all for finding chr:start-end overlaps, whereas I want to see whether my SNP positions fall into my gene regions of interest."

That is not clear. What is the difference?


For the SNPs I only have a single position, whereas for the regions I have chr:start-end. Do you think findOverlaps would work in this case too?


Why don't you convert your positions into 1-base intervals?


I thought so. You mean for example:

1:169549811-169549811

right?


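It would be 1:169549810-169549811, because the BED format uses 0-based, half-open coordinates: the start is 0-based and the end is exclusive, so a single base at position p spans p-1 to p.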
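
For what it's worth, here is a minimal sketch of that conversion with awk. It assumes the SNPs arrive as tab-separated lines with the chromosome in column 1 and the 1-based position in column 2; the function name snps_to_bed is made up for illustration and is not part of any tool mentioned in this thread.

```shell
# snps_to_bed (hypothetical helper): read "chrom<TAB>pos" lines with 1-based
# positions on stdin and print 1-base BED intervals on stdout.
snps_to_bed() {
    # BED start is 0-based and the end is exclusive, so position p -> [p-1, p).
    awk 'BEGIN { FS = "\t" } { printf "%s\t%d\t%d\n", $1, $2 - 1, $2 }'
}

printf '1\t169549811\n' | snps_to_bed
# prints: 1<TAB>169549810<TAB>169549811
```

The resulting lines can be redirected to a file and used like any other BED input.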


Thanks. If I want to do that in R, though, the positions will stay the same, right?


Yes, the overlap functions from IRanges/GenomicRanges assume 1-based, closed (inclusive) coordinates, so 1:169549811-169549811 is correct there.
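
To illustrate what "falls into a region" means in 1-based, inclusive coordinates (start <= pos <= end), here is a rough awk sketch that does the check directly, without any coordinate shift. It is not how findOverlaps works internally, just the same test written out by brute force. The function name snps_in_regions and both input layouts (tab-separated chrom/pos for SNPs, chrom/start/end for regions) are assumptions made for the example.

```shell
# snps_in_regions REGIONS_FILE (hypothetical helper)
# stdin:        "chrom<TAB>pos" lines, 1-based SNP positions
# REGIONS_FILE: "chrom<TAB>start<TAB>end" lines, 1-based inclusive regions
# stdout:       one line per (SNP, containing region) pair
snps_in_regions() {
    awk -v regions="$1" '
        BEGIN {
            FS = OFS = "\t"
            # Load every region into memory before reading the SNPs.
            while ((getline line < regions) > 0) {
                split(line, f, "\t")
                n++; chr[n] = f[1]; lo[n] = f[2]; hi[n] = f[3]
            }
        }
        {
            pos = $2 + 0
            # Compare the SNP against each region; both ends are inclusive.
            for (i = 1; i <= n; i++)
                if ($1 == chr[i] && pos >= lo[i] + 0 && pos <= hi[i] + 0)
                    print $1, $2, chr[i] ":" lo[i] "-" hi[i]
        }'
}

# Usage with made-up coordinates:
regions_file=$(mktemp)
printf '1\t169549000\t169550000\n' > "$regions_file"
printf '1\t169549811\n1\t169560000\n' | snps_in_regions "$regions_file"
# prints: 1<TAB>169549811<TAB>1:169549000-169550000
rm -f "$regions_file"
```

This loops over every region for every SNP, so it is only reasonable for small lists; for anything large, the interval tools discussed in this thread are the better choice.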

15 months ago
Seattle, WA USA
$ vcf2bed < snps.vcf > snps.bed
$ gff2bed < genes.gff > genes.bed
$ bedmap --echo --echo-map-id genes.bed snps.bed > answer.bed

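The convert2bed binary (which vcf2bed and gff2bed wrap) takes care of the indexing, converting the 1-based VCF and GFF coordinates to 0-based BED coordinates. You can replace gff2bed with gtf2bed if you have GTF-formatted input. Each line of answer.bed is a gene from genes.bed, followed by a "|" and the semicolon-separated IDs of the SNPs that overlap it.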
