How many genes are expressed in each condition: control / treatment group
1
0
Entering edit mode
4.0 years ago
ek699 ▴ 10

Hi I am working on RNA-seq differential expression analysis using DESeq2 in R. My code looks like:

countMatrix2 = as.matrix(read.table("countMatrixName.txt", header = TRUE, row.names = 1))
condition2 <- factor(rep(c("Treatment", "Control"), each=4))
colData2 <- data.frame(condition2)
rownames(colData2)<- colnames(countMatrix2)

dds2<-DESeqDataSetFromMatrix(countMatrix2, colData2, ~ condition2)

# Pre-filtering 
dds2 <- dds2[rowSums(counts(dds2))>=10,]
# Setting the factor levels
dds2$condition2 <- relevel(dds2$condition2, ref = "Control")

# Differential Expression Analysis
dds_des2 <- DESeq(dds2)
# Getting result table with alpha = 0.05
res0052 <- results(dds_des2, alpha = 0.05)

From here, I want to know how many genes are expressed (for example simply count > 0) separately in Control group and Treatment group. I am confused because the result table only shows Control vs Treatment information like log2foldchange, p-value, etc. How can I separately observe each condition: Control group and Treatment group? This is my first time to use deseq2, and any thoughts are greatly helpful. Thank you!

RNA-Seq DESeq2 gene expression • 801 views
ADD COMMENT
0
Entering edit mode
4.0 years ago
ATpoint 82k

For expression levels you have to check the counts itself, so counts(dds2) or counts(dds2, normalized=TRUE). The results only contain the pairwise comparison, so fold changes and significances, average expression.

ADD COMMENT
0
Entering edit mode

Thanks for the information! But, when I use counts(dds2, normalized=TRUE) then it might still contain the data that will be removed after I apply the function DESeq() and remove NAs. Is it still a valid result? Or, do I need to remove the NAs only for the differentially expressed genes? And the number of expressed genes for each group: Condition and Treatment, should I just do counts(dds2[,1:4], normalized=TRUE) and counts(dds2[,5:8], normalized=TRUE) ??

ADD REPLY
0
Entering edit mode

What exactly do you want to do? The counts are based on all genes that were in your reference that you quantified against. DESeq2 will remove genes during its independent filtering. What do you mean by valid? Considering every gene that has > 1 count as expressed in inaccurate since there is background transcription that is not biologically meaningful, alignment errors, sequencing noise etc. Which question do you want to answer?

ADD REPLY
0
Entering edit mode

As I addressed in my question above, I literally want to know how many genes are expressed separately in Control group and Treatment group from my code above. I also need to define the threshold for "expressed genes", for example, any gene with count > 0 or any gene with count > 0 in more than one sample. When I did results(dds_des2, alpha = 0.05), I was still able to see there are a few rows with adjusted p-value = 0, which means, even after DESeq2 applies the independent filtering, it still keeps the rows having a low mean normalized count by setting its adjusted p-value = 0.

ADD REPLY
0
Entering edit mode

cutoff for expressed genes in DESeq2 plus many other threads via the search function.

ADD REPLY

Login before adding your answer.

Traffic: 2568 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6