Question: Best Way To Do Pathway Analysis Of A Set Of Genes?

What is the best way to do computational pathway analysis for a set of genes or proteins of interest? Specifically, I am trying to identify common functions or pathways in a set of genes mutated in cancer samples. I know I could look at GO terms and use tools like DAVID. Does anyone have other good techniques for this?

7.9 years ago Wayne • 1000 • updated 7.9 years ago Guangchuang Yu ♦ 2.2k

ConsensusPathDB (CPDB) is a meta-search engine for pathway analysis. It incorporates all or most of the reputable public-access pathway databases out there.

One major source outside of CPDB is Ingenuity IPA. This is proprietary software and, in addition to public-access database information, has a manually curated database of millions of pathway "associations" mined from academic papers.

Between these two, I think you can capture most compiled pathway information.

7.9 years ago Occam • 380

+1 for CPDB. Useful resource.

7.9 years ago Khader Shameer

Can anyone tell me how to use IPA? I have a list of differentially expressed genes and want to view their pathways in IPA. Can anyone guide me?

6.5 years ago ♦ 4.8k

Yes, CPDB was incredibly useful. This database deserves to be better known. Reactome and DAVID also worked well for me.

2.4 years ago • 10

There are a lot of posts here and elsewhere about pathway analysis. How you go about it depends on what data you have and what you want to see. This post and the review it refers to are good places to start:

7.9 years ago Gareth Morgan • 310

To begin with, there is no single best method. It always depends on the data you have in hand.

Also remember

"Gene Ontology enrichment analysis != Pathway analysis"

For a detailed explanation of GO term enrichment see this previous discussion at Biostars.

You mentioned that

"I am trying to identify common functions or pathways in a set of genes mutated in cancer samples."

I assume your data came from a genome/exome/transcriptome analysis workflow. If your list of genes is from an exome or genome workflow, the approaches discussed in the previous answers will be enough, but you need to take care of a few important things.

To do a pathway analysis you primarily need:

  • A list of background genes
  • A list of perturbed genes
  • An annotation file that maps each gene to a pathway

Be very careful when you define your background. If your data is from a tumor-normal pair, your background should contain only the genes that are specific to the cell line or tissue of interest. Consult databases like HPRD or the Human Protein Atlas to find cell- or tissue-specific genes. Once you have these files, you can perform enrichment analysis (a standard statistical test followed by multiple-testing correction) in R to find significant pathways. Use external tools only if they let you supply a user-defined or experimental-platform-specific background.
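As a rough illustration of the three inputs and the "standard statistical test followed by multiple testing correction" step, here is a minimal sketch. It is written in Python rather than R, uses only the standard library, and assumes a one-sided hypergeometric test with Benjamini-Hochberg correction; the answer does not say which test or correction to use, so both are my choice. The function names and all gene and pathway names are made up for the example.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the pathway."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrich(perturbed, background, pathways):
    """Over-representation test per pathway against a user-defined background."""
    background = set(background)
    hits = set(perturbed) & background  # genes outside the background are ignored
    rows = []
    for name, members in pathways.items():
        in_bg = set(members) & background
        overlap = len(hits & in_bg)
        p = hypergeom_upper_tail(overlap, len(in_bg), len(hits), len(background))
        rows.append((name, overlap, len(in_bg), p))
    adj = benjamini_hochberg([r[3] for r in rows])
    merged = [(name, k, size, p, q) for (name, k, size, p), q in zip(rows, adj)]
    return sorted(merged, key=lambda r: r[3])  # most significant first


# Toy data: every gene and pathway name here is hypothetical.
background = [f"G{i}" for i in range(1, 21)]
perturbed = ["G1", "G2", "G3", "G4", "G15"]
pathways = {
    "pathway_A": ["G1", "G2", "G3", "G4", "G5"],
    "pathway_B": ["G10", "G11", "G12", "G15"],
}
results = enrich(perturbed, background, pathways)
for name, k, size, p, q in results:
    print(f"{name}\toverlap={k}/{size}\tp={p:.4g}\tBH={q:.4g}")
```

Shrinking or enlarging `background` changes both the pathway sizes and the total gene count in the test, which is why the choice of background matters so much for which pathways come out as significant.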

If your data is from a transcriptome/RNA-Seq experiment, you may use GOSeq. It uses a statistical approach developed specifically for RNA-seq data that can incorporate length bias or total-count bias into gene set tests.

If you are working with a whole-genome background, you can use web-based tools like:

  • Panther Pathways
  • Reactome Pathways
  • KEGG Pathway analysis using SubPathwayMiner or other R/BioC packages

You may also refer to a previous post here

7.9 years ago Khader Shameer 18k

For gene ontology analysis, is it necessary to correct for length bias when using RNA-seq data? Even if, for example, I do differential expression in a count-based manner?

5.9 years ago • 50

There are many, many potential methods here:

Getting GO terms is a good start, but even here the level of curation is mixed.

Always treat pathway analyses with caution, and have a plan for how to biologically validate your results if you plan to publish. Most publicly available analysis algorithms work from publicly available data, and these data are just not complete for most genes of interest. This is true for online web tools such as STRING and GeneMANIA, but if you filter with the most stringent search criteria, interesting connections can be found. Also take a look at the NCI Pathway Interaction Database.

Do you have questions about how to approach specific hypotheses through pathway analysis?

7.9 years ago Alex Paciorkowski 3.3k • updated 7.9 years ago Istvan Albert 80k

You can use my package for Reactome pathway analysis.

7.3 years ago Guangchuang Yu ♦ 2.2k
