select DE genes
7.0 years ago
mms140130 ▴ 60

Hello,

I have normalized gene expression data for 1095 patients with breast cancer. Part of the data is as follows:

    gene      patient1    patient2    patient3    patient4    patient5    patient6
    AASS      80.8588     135.3218    158.7152    20.7441     187.836     126.2016
    AATF      2012.3344   990.1661    727.4445    1498.5344   1329.6371
    AATK      179.534     209.8278    35.4275     13.5558     99.1263     51.2694
    ABAT      2086.3408   77.9285     600.3779    101.0147    1564.1801   1439.816
    ABCA11P   79.5614     91.9899     152.9806    65.3844     36.7641     85.9551
    ABCA12    43.8556     1.0531      37.317      10.823      82.3253     4.9298
    ABCA13    21.9278     2.6327      0.9447      0.9019      0.672       0

Can I use this data to find differentially expressed genes? Which R package can I use to get the DE genes?

R gene rna-seq • 1.8k views

So you want to find differentially expressed genes, but you only have one group. Do you even know what differential expression means?

In a differential expression analysis you want to compare the expression of one group (e.g. patients) with another group (e.g. healthy controls). You want to find out which genes are differentially expressed (over- or underexpressed) in patients versus the control group.
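To make the two-group idea concrete, here is a minimal sketch in plain Python. It is for illustration only: a real analysis would use a dedicated package that also models variance and corrects for multiple testing. The expression values and the `log2_fold_change` helper are made up for this example, and the pseudocount of 1 is an assumption to avoid dividing by zero.

```python
import math
from statistics import mean


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """Return log2 of the ratio of mean expression in group_a over group_b."""
    return math.log2((mean(group_a) + pseudocount) / (mean(group_b) + pseudocount))


# Hypothetical expression values for ONE gene; not taken from the data in this thread.
tumor = [80.0, 135.0, 158.0, 20.0]
normal = [10.0, 12.0, 9.0, 11.0]

lfc = log2_fold_change(tumor, normal)
# A positive value means higher mean expression in tumor than in normal.
print(f"log2 fold change (tumor vs normal): {lfc:.2f}")
```

With only one group (all patients), there is nothing to put in the second argument, which is why a control group is needed.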


Sorry, but I'm still learning how to analyze genomic data. I now understand that DE should be between 2 groups (normal, tumor). So I guess my question should be: how do I visualize the distribution of 20,000 genes in breast cancer patients?


What is the aim of your analysis? What is the biological question you are trying to solve?


I'm trying to see if there is an association between gene expression and genotype SNP data. One assumption in a regression is normality, so I'm trying to find a way to visualize the distribution of the gene expression data.


"association between gene expression and genotype SNP data"

That would be an eQTL analysis. You may want to have a look at this tutorial.
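At its core, an eQTL test of this kind regresses a gene's expression on the genotype at a SNP. Below is a minimal sketch in plain Python, assuming genotypes are coded as 0/1/2 copies of the minor allele. The data and the `ols_slope_intercept` helper are hypothetical, and a real eQTL package would also handle covariates, p-values, and multiple testing.

```python
from statistics import fmean


def ols_slope_intercept(x, y):
    """Fit y = intercept + slope * x by ordinary least squares."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data: genotype dosage per patient and log-scale expression of one gene.
genotype = [0, 0, 1, 1, 2, 2]
expression = [5.1, 4.9, 6.0, 6.2, 7.1, 6.9]

slope, intercept = ols_slope_intercept(genotype, expression)
# The slope estimates the change in expression per additional copy of the allele.
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```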


Does the package do the normalizing transform (log2(x+1)), or do I have to do that before running the eQTL analysis? It uses regression, so we have to validate the assumptions.
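Whether a given eQTL package applies this transform internally is not stated in this thread, so check its documentation. Applying log2(x+1) yourself is simple, though. A minimal sketch in plain Python, using the ABCA13 row from the question; the `log2p1` and `skewness` helpers are names made up here, and six values are far too few to judge normality, so treat this purely as an illustration of the mechanics:

```python
import math
from statistics import fmean


def log2p1(values):
    """Apply the log2(x + 1) transform to each value."""
    return [math.log2(v + 1.0) for v in values]


def skewness(values):
    """Moment-based skewness: 0 for symmetric data, > 0 for a long right tail."""
    m = fmean(values)
    m2 = fmean([(v - m) ** 2 for v in values])
    m3 = fmean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5


# ABCA13 values for the six patients shown in the question.
abca13 = [21.9278, 2.6327, 0.9447, 0.9019, 0.672, 0.0]

print("skewness before transform:", skewness(abca13))
print("skewness after transform: ", skewness(log2p1(abca13)))
```

A histogram or Q-Q plot of the transformed values per gene is the usual way to look at the distribution visually.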


Asking for diff expressed genes was probably not the right question here.

Are you interested in classifying these breast cancer samples into sub-types? Like in this paper?


DEGs between individual patients or groups (cancer vs normal)?


What I want is to find the DE genes and visualize their distribution, so I think it should be between phenotypes, but I'm not sure.

What is the difference between DE genes between individual samples and DE genes between groups (cancer, normal)?


I think you need to think about a good research question first, before you can get some good answers.


Well, I'm new to biology and genetics, and I'm trying my best.


I wasn't trying to dis you or put you down or anything. It's just that I see many scientists, new to bioinformatics, expecting to get answers without formulating a research question first. Bioinformatics is just like any other science, so you need to define a research question first.

What is your goal with your data set? Do you want to know the differences between subsets of cancer? Or the difference between cancer and healthy? Etc.


Are these FPKM values?


They are TPM values

7.0 years ago
TriS ★ 4.7k

Once you have a clear idea of how you want to compare your groups and have done a good Google search on how to do it, this is one of my favorite tutorials from Bioconductor:

https://www.bioconductor.org/help/workflows/RNAseq123/

However, it requires some knowledge of R.

Also, don't forget to look at the BioStar Handbook.

7.0 years ago
mbk0asis ▴ 680

If all of your samples have the same phenotype (cancer for example), why would you want to find DEGs?

You should have at least one control sample to compare with.

Or do you want to see the overall expression pattern of your data?

In that case, you are going to need a lot of computer power.

Data with ~1,000 samples x ~30,000 genes is too big to run on an ordinary PC.


I was trying to get DE genes because I need to visualize the distribution of gene expression, and there are 20,000 genes for 1095 cancer patients. So maybe I can reduce the number of genes to get a good plot. Is my thinking correct? Please let me know.
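One common way to reduce the number of genes without a control group is to keep only the most variable ones. This is a different technique from DE analysis, since it needs no group labels. Below is a minimal sketch in plain Python, using the rows from the question that have all six values. The `top_variable_genes` helper is a name made up here, and ranking by the variance of log2(x+1) values is an assumption, one reasonable choice among several.

```python
import math
from statistics import pvariance


def top_variable_genes(expr, k):
    """Return the k gene names with the largest variance of log2(x + 1) values."""
    def log_variance(values):
        return pvariance([math.log2(v + 1.0) for v in values])

    return sorted(expr, key=lambda gene: log_variance(expr[gene]), reverse=True)[:k]


# Rows copied from the question (only genes with all six values shown).
expr = {
    "AASS": [80.8588, 135.3218, 158.7152, 20.7441, 187.836, 126.2016],
    "AATK": [179.534, 209.8278, 35.4275, 13.5558, 99.1263, 51.2694],
    "ABAT": [2086.3408, 77.9285, 600.3779, 101.0147, 1564.1801, 1439.816],
    "ABCA11P": [79.5614, 91.9899, 152.9806, 65.3844, 36.7641, 85.9551],
}

# Keep the two most variable genes for plotting.
print(top_variable_genes(expr, 2))
```

On the full data set you would pick a larger cutoff (for example a few hundred or a few thousand genes) and then plot a heatmap or a PCA of the selected genes.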
