Get pathway information from KEGG using Entry id
6.5 years ago

Hi, I have a list of KEGG entry IDs. Is there any way to get the corresponding pathway information for all of them in a single search?

The KEGG IDs are: stm:STM2395, sau:SA0132, efa:EF2297

Expected output: stm:STM2395 : Pathway : Cationic antimicrobial peptide (CAMP) resistance

Thanks in advance

> library(KEGGREST)
> kegg_ids=read.csv("kegg_ids", header = F, stringsAsFactors = F)
> kegg_ids
            V1
1 stm:STM2395 
2 sau:SA0132
3 efa:EF2297

> data.frame(Pathway=unlist(sapply(keggGet(kegg_ids[,1]), "[[", "PATHWAY")),stringsAsFactors = F)
                                                      Pathway
    stm01503 Cationic antimicrobial peptide (CAMP) resistance
    stm02020                             Two-component system
    efa00550                       Peptidoglycan biosynthesis
    efa01100                               Metabolic pathways
    efa01502                            Vancomycin resistance
    efa02020                             Two-component system
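
One caveat if the list grows: keggGet() is documented as accepting at most 10 identifiers per call, and the first ID in the file above carries a trailing space. Below is a minimal sketch that trims the IDs and splits them into batches of 10. The helper chunk_ids() is hypothetical (it is not part of KEGGREST), and the network call is left as a comment.

```r
# Hypothetical helper (not part of KEGGREST): clean IDs and split into batches.
chunk_ids <- function(ids, size = 10) {
  ids <- trimws(ids)          # drop stray leading/trailing whitespace
  ids <- ids[nzchar(ids)]     # drop empty strings
  split(ids, ceiling(seq_along(ids) / size))
}

ids <- c("stm:STM2395 ", "sau:SA0132", "efa:EF2297")
chunks <- chunk_ids(ids)

# With network access and KEGGREST installed, each batch could be fetched
# and the results concatenated into one list:
# entries <- unlist(lapply(unname(chunks), KEGGREST::keggGet), recursive = FALSE)
```

With only three IDs this yields a single batch; the batching matters once the input file has more than 10 rows.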

Thank you, but this gives only the pathway ID. How do we know which KEGG ID each pathway corresponds to?

I want to get the kegg_id along with the result. Are there any options?

library(KEGGREST)
library(purrr)
library(magrittr)


kegg_ids=read.csv("kegg_ids", header = F, stringsAsFactors = F)
kegg_ids

kegg_pathways=data.frame(Pathway=unlist(sapply(keggGet(kegg_ids[,1]), "[[", c("PATHWAY"))),stringsAsFactors = F)
kegg_list=map(keggGet(kegg_ids[,1]), extract, c("ENTRY", "PATHWAY"))
library(dplyr)
kegg_df=bind_rows(lapply(kegg_list, function (x) data.frame(t(unlist(x)),stringsAsFactors = F)))
library(tidyr)
kegg_df1=na.omit(gather(kegg_df,key = "", value = Pathway, -ENTRY.CDS))[,c(1,3)]

Note: be careful with the order in which libraries are loaded. Some libraries mask functions from others, which can cause execution errors.

Output:

> na.omit(gather(kegg_df,key = "", value = Pathway, -ENTRY.CDS))[,c(1,3)]
   ENTRY.CDS                                          Pathway
1    STM2395 Cationic antimicrobial peptide (CAMP) resistance
4    STM2395                             Two-component system
9     EF2297                       Peptidoglycan biosynthesis
12    EF2297                               Metabolic pathways
15    EF2297                            Vancomycin resistance
18    EF2297                             Two-component system

Error in UseMethod("extract_") : no applicable method for 'extract_' applied to an object of class "list"

This happens because one library masks a function from another; the note in the answer above warns about this. Here is the updated code with the libraries loaded in the correct order:

library(KEGGREST)
kegg_ids=read.csv("test.txt", header = F, stringsAsFactors = F)
library(purrr)
library(magrittr)
kegg_list=map(keggGet(kegg_ids[,1]), extract, c("ENTRY", "PATHWAY"))
library(dplyr)
kegg_df=bind_rows(lapply(kegg_list, function (x) data.frame(t(unlist(x)),stringsAsFactors = F)))
library(tidyr)
kegg_df1=na.omit(gather(kegg_df,key = "", value = Pathway, -ENTRY.CDS))[,c(1,3)]
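
One way to sidestep the masking problem regardless of load order is to avoid the bare name extract. magrittr's extract is an alias for base R's `[`, so plain subsetting does the same job. The sketch below runs offline on mock entries. Their shape (ENTRY as a named vector, PATHWAY as a vector named by pathway ID) is an assumption based on the output shown in this thread, and the sau entry is given no PATHWAY field to mirror its absence from the results above.

```r
# Mock entries standing in for keggGet() results (assumed structure).
entries <- list(
  list(ENTRY = c(CDS = "STM2395"),
       PATHWAY = c(stm01503 = "Cationic antimicrobial peptide (CAMP) resistance",
                   stm02020 = "Two-component system")),
  list(ENTRY = c(CDS = "SA0132"))  # no PATHWAY field
)

# Base-R subsetting; cannot be masked by tidyr::extract.
picked <- lapply(entries, `[`, c("ENTRY", "PATHWAY"))

# Equivalent with an explicit namespace, if purrr and magrittr are installed:
# picked <- purrr::map(entries, magrittr::extract, c("ENTRY", "PATHWAY"))
```

Writing magrittr::extract (or `[`) explicitly means the call no longer depends on which package was attached last.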

Thank you very much

I used GENE NAME instead of PATHWAY to get the corresponding gene name, but it returns no output. Is there another keyword for fetching the gene name of a KEGG ID?

Use NAME for the gene name:

library(KEGGREST)
library(purrr)
library(magrittr)

kegg_ids=read.csv("test.txt", header = F, stringsAsFactors = F)

kegg_pathways=data.frame(Pathway=unlist(sapply(keggGet(kegg_ids[,1]), "[[", c("PATHWAY"))),stringsAsFactors = F)
kegg_list=map(keggGet(kegg_ids[,1]), extract, c("ENTRY", "PATHWAY", "NAME"))
library(dplyr)
kegg_df=bind_rows(lapply(kegg_list, function (x) data.frame(t(unlist(x)),stringsAsFactors = F)))
library(tidyr)
na.omit(gather(kegg_df,key = "", value = Pathway, -c(ENTRY.CDS,NAME)))[,c(1,2,4)]

output:

> na.omit(gather(kegg_df,key = "", value = Pathway, -c(ENTRY.CDS,NAME)))[,c(1,2,4)]
   ENTRY.CDS  NAME                                          Pathway
1    STM2395  pgtE Cationic antimicrobial peptide (CAMP) resistance
4    STM2395  pgtE                             Two-component system
9     EF2297 vanYB                       Peptidoglycan biosynthesis
12    EF2297 vanYB                               Metabolic pathways
15    EF2297 vanYB                            Vancomycin resistance
18    EF2297 vanYB                             Two-component system

Thanks! Very useful. What if I want to list all the results, including entries that have no pathways but still have a name or definition?
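
One possible approach is to build the table by hand and fill missing pathways with NA instead of dropping those rows with na.omit(). The following is a rough sketch, not a tested recipe against live KEGG data: entries_to_table() is a hypothetical helper, the entry structure is assumed from the output shown earlier in this thread, and the NAME for SA0132 is a placeholder rather than a real KEGG value.

```r
# Hypothetical helper: one row per entry/pathway pair, NA when no pathway exists.
entries_to_table <- function(entries) {
  first_or_na <- function(x) {
    if (length(x) > 0) unname(x[1]) else NA_character_
  }
  rows <- lapply(entries, function(e) {
    pw <- e[["PATHWAY"]]
    has_pw <- length(pw) > 0
    pw_id <- if (has_pw && !is.null(names(pw))) names(pw) else NA_character_
    pw_desc <- if (has_pw) unname(pw) else NA_character_
    data.frame(
      entry = first_or_na(e[["ENTRY"]]),
      name = first_or_na(e[["NAME"]]),
      pathway_id = pw_id,
      pathway = pw_desc,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Mock entries standing in for keggGet() results (assumed structure).
mock_entries <- list(
  list(ENTRY = c(CDS = "STM2395"), NAME = "pgtE",
       PATHWAY = c(stm01503 = "Cationic antimicrobial peptide (CAMP) resistance",
                   stm02020 = "Two-component system")),
  list(ENTRY = c(CDS = "SA0132"), NAME = "placeholder_name")  # no PATHWAY
)

tbl <- entries_to_table(mock_entries)
print(tbl)
```

With real data, the same function would be applied to the list returned by keggGet(). A DEFINITION column could be added the same way if that field is present in the entries.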

6.5 years ago
EagleEye 7.5k

You can download all KEGG pathways with their IDs, descriptions, and the corresponding genes as a simple table (plain text file) using GeneSCF.
