Gene set enrichment analyses of different samples of the same species
6.3 years ago
Mehmet ▴ 820

Dear All,

I want to perform gene set enrichment analysis for a species using specific gene families. I have a peptidase gene list that contains different peptidase families (aspartic, metallo, serine, etc.), and these genes have FPKM values at different life stages of the same organism. Some genes have FPKM values of zero, while others have values greater than zero.

Besides, every peptidase family has the same number of genes at each life stage, since this is the same organism. For instance, there are 100 aspartic peptidases at each life stage.

My question is: how can I perform gene set enrichment analysis, and which tools or web servers can I use?

Best

RNA-Seq genome gene • 1.3k views

To do an enrichment test on your data you need to define two things: 1) Which conditions do you want to compare? 2) Which subset of genes do you want to test for enrichment of different gene sets (and compared to what)? If you could elaborate on both of these, I can easily help you out :-)


Thank you for your reply.

  1. Instead of conditions, I have life stages from egg to adult (8 stages in total).
  2. I have a peptidase gene set that has 5 subfamilies like aspartic, metallo, ... A total of 800 genes are peptidases and belong to these 5 subfamilies. These genes have FPKM values. I want to know which peptidase family is enriched at which life stage.

All genes are present in all stages; I can assure you the genome is not changing. What do you think "enriched" means?

6.3 years ago

It sounds like you want to select an FPKM threshold to establish which genes are highly expressed at each life stage. This would give you one gene list per life stage, on which you could perform gene set enrichment analysis with the full peptidase gene list as the background set, using a tool such as DAVID (https://david.ncifcrf.gov/). You would run the analysis separately for each life stage and would then need to account for multiple comparisons.
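As a rough illustration of the test described above, here is a minimal Python sketch that checks each peptidase family for over-representation among the highly expressed genes of each life stage, using the full peptidase list as the background, and then applies a Benjamini-Hochberg correction across all stage/family pairs. This is not what DAVID does internally; the one-sided hypergeometric test, the choice of Benjamini-Hochberg, the function names and the toy gene IDs are all assumptions made for the example.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k): chance of seeing at least k family members among n
    highly expressed genes drawn from N background genes, K of which
    belong to the family."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / total


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def enrichment_by_stage(family_of, high_by_stage):
    """Return (stage, family, hits, p, q) rows for every stage/family pair.

    family_of:     {gene: family}, the full peptidase list (background)
    high_by_stage: {stage: set of genes called highly expressed}
    """
    background = set(family_of)
    N = len(background)
    families = sorted(set(family_of.values()))
    rows = []
    for stage in sorted(high_by_stage):
        high = high_by_stage[stage] & background
        n = len(high)
        for fam in families:
            K = sum(1 for g in background if family_of[g] == fam)
            k = sum(1 for g in high if family_of[g] == fam)
            rows.append((stage, fam, k, hypergeom_sf(k, N, K, n)))
    qvalues = benjamini_hochberg([r[3] for r in rows])
    return [r + (q,) for r, q in zip(rows, qvalues)]


# Toy data (hypothetical gene IDs and family assignments).
family_of = {
    "g1": "aspartic", "g2": "aspartic",
    "g3": "serine", "g4": "serine",
    "g5": "metallo", "g6": "metallo",
}
high_by_stage = {"egg": {"g1", "g2"}, "adult": {"g3", "g5"}}

results = enrichment_by_stage(family_of, high_by_stage)
for row in results:
    print(row)
```

With 8 life stages and 5 families this gives 40 tests, which is why the correction step matters.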

Selecting the FPKM threshold may be arbitrary, though. You could either set a global threshold based on the distribution of all the FPKM values or set one on a per-gene basis (i.e., for each gene, if FPKM(life_stageX) > median(FPKM of that gene across all life stages), then that gene is considered highly expressed at life stage X).
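The two thresholding options can be sketched in Python as follows. This is only an illustration: the nested-dictionary layout of the FPKM table, the function names, the toy values, and the use of the overall median as the "global" cutoff are assumptions, not a recommendation for a specific cutoff.

```python
from statistics import median


def high_by_gene_median(fpkm):
    """Per-gene rule: a gene is 'high' at a stage if its FPKM there
    exceeds that gene's median FPKM across all stages.

    fpkm: {gene: {stage: FPKM value}}
    Returns {stage: set of highly expressed genes}.
    """
    high = {}
    for gene, by_stage in fpkm.items():
        cutoff = median(by_stage.values())
        for stage, value in by_stage.items():
            high.setdefault(stage, set())
            if value > cutoff:
                high[stage].add(gene)
    return high


def high_by_global_cutoff(fpkm, cutoff):
    """Global rule: a gene is 'high' at a stage if its FPKM exceeds
    one cutoff shared by all genes and stages."""
    high = {}
    for gene, by_stage in fpkm.items():
        for stage, value in by_stage.items():
            high.setdefault(stage, set())
            if value > cutoff:
                high[stage].add(gene)
    return high


# Toy FPKM table (hypothetical values).
fpkm = {
    "g1": {"egg": 0.0, "larva": 5.0, "adult": 20.0},
    "g2": {"egg": 8.0, "larva": 8.0, "adult": 1.0},
}

per_gene = high_by_gene_median(fpkm)

# One possible global cutoff: the median of every FPKM value in the table.
all_values = [v for by_stage in fpkm.values() for v in by_stage.values()]
global_high = high_by_global_cutoff(fpkm, median(all_values))

print(per_gene)
print(global_high)
```

Because the comparison is strict, a gene with zero FPKM at every stage has a median of zero and is never called highly expressed under the per-gene rule.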

Either way, the choice of FPKM cutoff would require some justification on your end.
