Finding the genes that contain given SNP IDs
7.2 years ago
mms140130 ▴ 60

I have the following data set of SNP IDs:

CHROM  POS ID   
chr7    78599583    rs987435
chr15   33395779    rs345783
chr1    189807684   rs955894
chr20   33907909    rs6088791
chr12   75664046    rs11180435
chr1    218890658   rs17571465
chr4    127630276   rs17011450
chr6    90919465    rs6919430

and a gene reference file:

genename    name    chrom   strand  txstart txend
CDK1    NM_001786   chr10   +   62208217    62224616
CALB2   NM_001740   chr16   +   69950116    69981843
STK38   NM_007271   chr6    -   36569637    36623271
YWHAE   NM_006761   chr17   -   1194583 1250306
SYT1    NM_005639   chr12   +   77782579    78369919
ARHGAP22    NM_001347736    chr10   -   49452323    49534316
PRMT2   NM_001535   chr21   +   46879934    46909464
CELSR3  NM_001407   chr3    -   48648899    48675352

I'm trying to match the SNPs to genes by location, i.e. keep each SNP that lies on the same chromosome as a gene and has

position >= txstart and position <= txend

For example, I want a data set with the following columns:

genename SNPID chrom position txstart txend

gene R SNP • 1.3k views
7.2 years ago

For this type of genome annotation task, GRanges (from the Bioconductor GenomicRanges package) is great.
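
As a rough illustration of the GRanges approach, here is a minimal sketch, assuming the two tables from the question have been loaded as data frames (the names `snps`, `genes`, `snp_gr`, `gene_gr` and `result` are my own). None of the eight SNPs in the question actually falls inside any of the eight listed genes, so one made-up SNP (`rs_hypothetical`, chr12:78000000, inside SYT1) is added purely to show a match. The sketch treats txstart and txend as inclusive 1-based coordinates, as the condition in the question does; if the gene file uses 0-based starts, that would need adjusting.

```r
library(GenomicRanges)

# SNP table from the question, plus one HYPOTHETICAL row (rs_hypothetical)
# added only so that the example produces a match.
snps <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
CHROM POS ID
chr7 78599583 rs987435
chr15 33395779 rs345783
chr1 189807684 rs955894
chr20 33907909 rs6088791
chr12 75664046 rs11180435
chr1 218890658 rs17571465
chr4 127630276 rs17011450
chr6 90919465 rs6919430
chr12 78000000 rs_hypothetical
")

# Gene reference table from the question.
genes <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
genename name chrom strand txstart txend
CDK1 NM_001786 chr10 + 62208217 62224616
CALB2 NM_001740 chr16 + 69950116 69981843
STK38 NM_007271 chr6 - 36569637 36623271
YWHAE NM_006761 chr17 - 1194583 1250306
SYT1 NM_005639 chr12 + 77782579 78369919
ARHGAP22 NM_001347736 chr10 - 49452323 49534316
PRMT2 NM_001535 chr21 + 46879934 46909464
CELSR3 NM_001407 chr3 - 48648899 48675352
")

# Each SNP becomes a width-1 range; each gene spans txstart..txend.
snp_gr  <- GRanges(seqnames = snps$CHROM,
                   ranges = IRanges(start = snps$POS, width = 1))
gene_gr <- GRanges(seqnames = genes$chrom,
                   ranges = IRanges(start = genes$txstart, end = genes$txend),
                   strand = genes$strand)

# The two objects do not share all chromosomes, which can trigger a
# harmless warning about sequence levels; it is suppressed here.
hits <- suppressWarnings(findOverlaps(snp_gr, gene_gr, ignore.strand = TRUE))

# Build the table with the columns requested in the question.
result <- data.frame(
  genename = genes$genename[subjectHits(hits)],
  SNPID    = snps$ID[queryHits(hits)],
  chrom    = snps$CHROM[queryHits(hits)],
  position = snps$POS[queryHits(hits)],
  txstart  = genes$txstart[subjectHits(hits)],
  txend    = genes$txend[subjectHits(hits)]
)
print(result)
```

A SNP that sits inside several overlapping transcripts yields one row per transcript, which is usually what you want for this kind of annotation.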


So how can I use GRanges? Do you have code you can share?


The text in blue (GRanges) is a hyperlink; if you click on it, voila! It takes you to the magic code repository!


If you follow the link and look at the PDF documents ('GenomicRanges HOWTOs') halfway down the page, all will be revealed! This does require some knowledge of R, but a little time spent now is, in my honest opinion, well worth it.
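
For anyone who would rather see the logic without Bioconductor first, the condition from the question can also be sketched in base R with `merge` followed by a filter. This is only an illustration under assumptions: the data frame names are my own, only a few rows from the question are used, and `rs_hypothetical` is a made-up SNP added so that something matches. It pairs every SNP with every gene on the same chromosome before filtering, so it will not scale to genome-wide tables the way an interval overlap does.

```r
# A few rows from the question, plus one HYPOTHETICAL SNP (rs_hypothetical)
# that lies inside SYT1, added only to demonstrate a match.
snps <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
CHROM POS ID
chr12 75664046 rs11180435
chr6 90919465 rs6919430
chr12 78000000 rs_hypothetical
")

genes <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
genename name chrom strand txstart txend
STK38 NM_007271 chr6 - 36569637 36623271
SYT1 NM_005639 chr12 + 77782579 78369919
")

# Pair each SNP with every gene on the same chromosome.
merged <- merge(snps, genes, by.x = "CHROM", by.y = "chrom")

# Keep pairs where the SNP position lies within the gene (inclusive).
keep <- merged$POS >= merged$txstart & merged$POS <= merged$txend

# Select and rename the columns requested in the question.
result <- merged[keep, c("genename", "ID", "CHROM", "POS", "txstart", "txend")]
names(result) <- c("genename", "SNPID", "chrom", "position", "txstart", "txend")
print(result)
```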
