How do I extract genes from a KEGG pathway?
5.1 years ago
ammasakshay ▴ 60

Using R, I want to generate a list of gene symbols only (without the accompanying text) from a pathway.

For example, if the input pathway is KEGG prostate cancer (hsa05215), I want my output to be a .csv list of the genes in that pathway. I tried:

library("KEGGREST")

keggGet("hsa05215")[[1]]$GENE

but that returns the gene number and the gene description along with each gene symbol. I want a list consisting of the gene symbols alone.

How do I get this?

Thank you.

gene R bioconductor • 6.3k views

Append all the hsa IDs to the URL below and fetch the result. You can then parse the output.

Ex: http://rest.kegg.jp/get/hsa:04140+hsa:04510+hsa:04919

Hope this solves your problem.


I did that here, http://rest.kegg.jp/get/hsa05215, and the gene list is similar to the one I got through R: each line has a number, a gene symbol, and a description. I want the gene symbol alone. The output under GENE looks like this:

1027 CDKN1B; cyclin dependent kinase inhibitor 1B [KO:K06624]
1017 CDK2; cyclin dependent kinase 2 [KO:K02206] [EC:2.7.11.22]
898 CCNE1; cyclin E1 [KO:K06626]

I want something that looks like this:

CDKN1B
CDK2
CCNE1
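As a rough illustration of the transformation being asked for, here is a minimal sketch that works on the three lines quoted above as plain text. The variable names are made up for the example, and it assumes every line has the shape "number SYMBOL; description"; lines that deviate from that shape would need extra handling.

```r
# Sample lines copied from the GENE output quoted above
gene_lines <- c(
  "1027 CDKN1B; cyclin dependent kinase inhibitor 1B [KO:K06624]",
  "1017 CDK2; cyclin dependent kinase 2 [KO:K02206] [EC:2.7.11.22]",
  "898 CCNE1; cyclin E1 [KO:K06626]"
)

# Inner sub(): drop the leading numeric ID and the whitespace after it.
# Outer sub(): drop everything from the first ";" to the end of the line.
symbols <- sub(";.*$", "", sub("^\\s*[0-9]+\\s+", "", gene_lines))

print(symbols)
```

Both steps are ordinary base R regular expressions, so no extra package is needed for the text clean-up itself.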


This doesn't help. That URL form looks up individual gene entries and does not give the gene list of a pathway. Try your solution with the hsa05215 pathway and you will see that it does not return the list of 90+ genes.

5.1 years ago
ammasakshay ▴ 60

I ended up solving it myself. Hopefully this helps anyone who has a similar need.

library("KEGGREST")

# Get the GENE field: a character vector that alternates between
# gene numbers and "SYMBOL; description" strings
genes <- keggGet("hsa05215")[[1]]$GENE
# Keep every second element (the "SYMBOL; description" strings), dropping the gene numbers
described <- genes[seq(2, length(genes), by = 2)]
# Delete everything from the ";" onward on each line (this removes the gene description)
symbols <- sub(";.*", "", described)
# Export the vector as a CSV file
write.csv(symbols, file = "hsa05215.csv", quote = FALSE, row.names = FALSE)
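The approach above relies on the symbols sitting at every second position. A possible variant, sketched here under the assumption that only the "SYMBOL; description" entries contain a semicolon, selects entries by pattern instead of by position. `extract_symbols` is a hypothetical helper written for this example, not a KEGGREST function, and the sample vector is built from the entries quoted earlier in the thread so the code runs without a network connection.

```r
# Hypothetical helper (not part of KEGGREST): pull gene symbols out of a
# GENE vector that mixes gene numbers with "SYMBOL; description" strings.
extract_symbols <- function(gene_field) {
  # Keep only the entries that contain a ";" (assumed to be the described ones)
  described <- grep(";", gene_field, fixed = TRUE, value = TRUE)
  # Remove the description, trim stray whitespace, and drop duplicates
  unique(trimws(sub(";.*$", "", described)))
}

# Sample input in the alternating layout, using entries quoted in this thread
gene_field <- c(
  "1027", "CDKN1B; cyclin dependent kinase inhibitor 1B [KO:K06624]",
  "1017", "CDK2; cyclin dependent kinase 2 [KO:K02206] [EC:2.7.11.22]",
  "898",  "CCNE1; cyclin E1 [KO:K06626]"
)

symbols <- extract_symbols(gene_field)
print(symbols)
```

With KEGGREST loaded, the same helper could be applied to `keggGet("hsa05215")[[1]]$GENE`. Note that any entry without a semicolon would be dropped silently, which may or may not be what you want.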

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.


Thank you. I'm a little new to this interface.
