Comparing multiple conditions and understanding rlog/resLFC
3.8 years ago
ccha97 ▴ 60

Hello, I'm new to R and I'm having trouble understanding some elements of the DESeq2 package. I'm an undergrad student who had never used R before this project, so any help would be appreciated.

For context, I have three conditions, A, B and C (A = acute model, B = chronic model, C = a deletion in the chronic model B). I want to compare A vs B as well as B vs C, but I'm not sure how to go about it. I was originally using the contrast argument of results():

AB <- results(dds, contrast = c("condition", "A", "B"), alpha = 0.05)

BC <- results(dds, contrast = c("condition", "B", "C"), alpha = 0.05)

My current end goal is to use k-means clustering and form a heatmap. Based on the tutorials, I understand that the rlog function is used when visualising data.

pheatmap(assay(rld)[sigGenesAB,], cluster_rows=FALSE, show_rownames=FALSE,
         cluster_cols=FALSE, annotation_col = as.data.frame(cdata), row.names=rownames(cdata))

In this case [sigGenesAB,] selects the differentially expressed genes with padj < 0.05. However, the heatmap this generates also includes condition 'C', and I don't know what to make of that. I'm also unable to use the rlog function on AB, as it fails with this error:

rldAB <- rlog(AB)
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function ‘sizeFactors’ for signature ‘"DESeqResults"’
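
The error arises because results() returns a DESeqResults table of statistics, whereas rlog() expects the dataset itself (for example dds). One way to keep condition C out of the heatmap is to subset the columns of the already transformed matrix. The sketch below is only an illustration: the toy matrix `mat` stands in for assay(rld), and the sample names, `cond` and `sigGenes` are all hypothetical.

```r
# Toy stand-in for assay(rld): 5 genes x 6 samples (values are made up).
set.seed(1)
mat <- matrix(rnorm(30), nrow = 5,
              dimnames = list(paste0("gene", 1:5),
                              c("A1", "A2", "B1", "B2", "C1", "C2")))
cond <- factor(c("A", "A", "B", "B", "C", "C"))  # stand-in for rld$condition
sigGenes <- c("gene2", "gene4")                  # stand-in for sigGenesAB

# Keep only the significant genes (rows) and the A or B samples (columns).
keep <- cond %in% c("A", "B")
matAB <- mat[sigGenes, keep, drop = FALSE]
colnames(matAB)
```

With the real objects this would presumably correspond to assay(rld)[sigGenesAB, rld$condition %in% c("A", "B")], assuming the sample table has a condition column. Any annotation data frame passed to pheatmap would need the same column subsetting.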

My supervisor suggested using factor levels. His code is similar to the following, which produces a matrix of zeroes and ones that includes the intercept:

dds$condition <- factor(dds$condition, levels = c("A","B", "C"))    
condition <- factor(rep(c("A","B","C"))) 
model.matrix(~ condition)
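
To make the model matrix idea concrete, here is a minimal base-R sketch. It assumes two replicates per condition (an illustrative choice, not the actual design) and uses a hypothetical `condition` vector rather than the real dds.

```r
# Hypothetical design: two replicates per condition (assumed for illustration).
condition <- factor(rep(c("A", "B", "C"), each = 2), levels = c("A", "B", "C"))

# One row per sample; the first factor level ("A") becomes the reference.
mm <- model.matrix(~ condition)
print(mm)

# "(Intercept)" is 1 for every sample and represents the baseline level A.
# "conditionB" and "conditionC" are 0/1 indicators marking the samples that
# belong to B or C, so their coefficients are the B-vs-A and C-vs-A differences.
colnames(mm)
```

Read this way, each sample's expected expression is the intercept plus whichever indicator columns are switched on for that sample, which is why the order of the factor levels determines the baseline.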

I am aware of the "Analyzing RNA-seq data with DESeq2" tutorial and have read through the relevant sections (his code seems to be related to log fold change shrinkage with lfcShrink), but I'm still having trouble understanding things. Should I be using rlog or lfcShrink to generate a heatmap? Ultimately, I want to do k-means clustering and generate a heatmap, and then investigate the resulting clusters using GO-term analysis.

I've thought about making two different data sets (e.g. one with just the counts of A+B, and the other with just B+C) and doing a separate DESeq analysis for each, but it also means I'll have a lot of different variables which will probably get confusing downstream. I'd appreciate any help in understanding some of these concepts, as well as any recommendations regarding how I should approach my data.

R RNA-Seq rna-seq DESeq2 heatmap • 1.3k views

EDIT: I've added the first line: dds$condition <- factor( c("A","B", "C")). I'm not sure whether that will change my contrast results. Could someone explain the idea of a model matrix to me? I also have the code for lfcShrink:

ABresLFC <- lfcShrink(dds, coef="A_vs_B", type="apeglm")
BCresLFC <- lfcShrink(dds, coef="B_vs_C", type="apeglm")
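
One thing worth checking with calls like these: the coef value has to match one of the names listed by resultsNames(dds), and those names depend on the reference level of condition. The base-R sketch below, on a hypothetical `condition` vector with two replicates per group, shows how the reference level decides which comparisons appear as model columns. With A as the reference there is no B-versus-C column until the factor is releveled.

```r
# Hypothetical design: two replicates per condition (assumed for illustration).
condition <- factor(rep(c("A", "B", "C"), each = 2), levels = c("A", "B", "C"))

# Reference level A: the columns compare B and C against A only.
cols_refA <- colnames(model.matrix(~ condition))
cols_refA

# Relevel so that B is the reference: the columns now compare A and C against B.
condition_refB <- relevel(condition, ref = "B")
cols_refB <- colnames(model.matrix(~ condition_refB))
cols_refB
```

In DESeq2 the analogous step would be releveling dds$condition and refitting the model before the new coefficient shows up in resultsNames(dds). That is a sketch of the general idea rather than a tested recipe for this dataset.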

I also still want to make a heatmap of the differentially expressed genes. I've already stored the gene names in variables (sigGenesAB, sigGenesBC) and just need help with the code for the heatmap.
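
For the k-means step, a minimal base-R sketch might look like the following. The toy matrix `mat` stands in for the transformed counts of the significant genes, and the choice of k = 3 is arbitrary; both are assumptions for illustration only.

```r
# Toy stand-in for assay(rld)[sigGenes, ]: 20 genes x 6 samples (made up).
set.seed(42)
mat <- matrix(rnorm(120), nrow = 20,
              dimnames = list(paste0("gene", 1:20),
                              c("A1", "A2", "B1", "B2", "C1", "C2")))

# Row-wise z-scores so clusters reflect expression pattern, not magnitude.
z <- t(scale(t(mat)))

# k = 3 is an arbitrary choice for illustration; nstart repeats the random
# initialisation and keeps the best solution.
km <- kmeans(z, centers = 3, nstart = 25)

# Order genes by cluster so a heatmap drawn without row clustering shows
# each cluster as a contiguous block.
ord <- order(km$cluster)
z_ordered <- z[ord, ]

# Gene names per cluster, e.g. as input for a GO-term analysis.
genes_by_cluster <- split(rownames(z), km$cluster)
```

The ordered matrix could then be passed to pheatmap(z_ordered, cluster_rows=FALSE, cluster_cols=FALSE), matching the options used in the heatmap call earlier in the post.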
