Hierarchical Clustering in single-channel agilent microarray experiment
2
1
Entering edit mode
6.4 years ago
Leite ★ 1.3k

Hello everyone,

I'm trying to put a Hierarchical Clustering and boxplot in this code but always from error, what would be the best package to do this?

#Load limma
library(limma)

#Set-up
targetinfo <- readTargets("Targets.txt",row.names="FileName",sep="")

#Read files
project <- read.maimages(targetinfo,source="agilent", green.only=TRUE)

#Background correction
project.bgc <- backgroundCorrect(project, method="normexp", offset=16)

#Normalize the data with the 'quantile' method for 1-color
project.NormData <-normalizeBetweenArrays(project.bgc,method="quantile")

# load colour libraries
library(RColorBrewer)
# set colour palette
cols <- brewer.pal(8, "Set1")

#Histogram of non-normalized
plotDensities(project.bgc, col=cols, legend=FALSE)

#Histogram of normalized
plotDensities(project.NormData, col=cols, legend=FALSE)

#Create the study design and comparison model
design <- paste(targetinfo$Target, sep="")
design <- factor(design)
comparisonmodel <- model.matrix(~0+design)
colnames(comparisonmodel) <- levels(design)
#Checking the experimental design
design
comparisonmodel

project.fit <- lmFit(project.NormData, comparisonmodel)
project.fit <- lmFit(project.NormData,comparisonmodel)

#Applying the empirical Bayes method
project.fit.eBayes <- eBayes(project.fit)
names(project.fit.eBayes)

#Make individual contrasts and fit the new model
CaseControl <- makeContrasts(CaseControl="D0A-Control", levels=comparisonmodel)
CaseControl.fitmodel <- contrasts.fit(project.fit.eBayes, CaseControl)
CaseControl.fitmodel.eBayes <- eBayes(CaseControl.fitmodel)

#Filtering Results
 nrow(topTable(CaseControl.fitmodel.eBayes, coef="CaseControl", number=99999, lfc=2))
 probeset.list <- topTable(CaseControl.fitmodel.eBayes, "CaseControl", number=99999, adjust.method="BH", sort.by="P", lfc=2)

#To save results
write.table(probeset.list, "results.txt", sep="\t", quote=FALSE)

Best regards,

Leite

R Hierarchical clustering agilent • 5.2k views
ADD COMMENT
1
Entering edit mode

Hi Leite,

I assume your question is about microarray data, but you never specified that. I adapted your title to clarify this.

but always from error

It's very hard for us to figure out what's going wrong if you don't show us the error message.

Cheers,
Wouter

ADD REPLY
0
Entering edit mode

Hey WouterDeCoster,

Sorry for the lack of information, you're correct, its about microarray data analysis.

I've tryed this code for bloxplot:

boxplot(project.NormData, col=cols)
Error in sort.int(x, na.last = na.last, decreasing = decreasing, ...) : 
'x' must be atomic

And this code for Hierarchical Clustering:

plot(project.NormData)
Error in as.matrix(E$E) : object 'E' not found

Thank you so much,

Leite

ADD REPLY
14
Entering edit mode
6.4 years ago

The expression values for a 2- or single-colour Agilent array are stored in the 'M' or 'E' variable, respectively, i.e., project.NormData$M or project.NormData$E ( C: Single-color Agilent array analyzing in R )

For each of the following functions, I encourage you to devote a full day to understanding what each and every parameter is doing. That is the best way for you to learn.

-------------------------------------------

Box-and-whisker plot

par(mar=c(8,8,5,5), cex=1.0, cex.axis=1.4, cex.lab=1.4)
boxplot(project.NormData$E,
  main="Box-and-whisker plot",
  xlab="", ylab=bquote(~Log[2]~expression),
  names=paste("Sample", c(1:ncol(project.NormData$E))),
  col="skyblue",
  las=2,
  outline=FALSE)

box


Violin plot (Wouter likes violin plots - maybe he plays a violin)

require(reshape2)
violinMatrix <- reshape2::melt(project.NormData$E)
colnames(violinMatrix) <- c("Gene","Sample","Expression")

library(ggplot2)
ggplot(violinMatrix, aes(x=Sample, y=Expression)) +
  geom_violin() +
  theme(axis.text.x = element_text(angle=45, hjust=1))

violin


Hierarchical clustering (unsupervised on entire dataset - very CPU and memory intensive)

For a simple dendrogram or circular dendrogram, take a look at my threads:

dend


heatmap.2 (hierarchical clustering dendrogram with heatmap)

For the heatmaps, you usually want to filter your expression matrix for genes that are differentially expressed. You appear to have just fitered out probes that are greater than absolute log (base 2) fold change 2, stored in your probeset.list object

#Filter the expression matrix to include only differentially expressed genes
sigmatrix <- project.NormData$E[probeset.list,]

#Scale the filtered expression matrix (convert to Z scale)
heat <- t(scale(t(sigmatrix)))

#Set colour
require(RColorBrewer)
myCol <- colorRampPalette(c("violet", "black", "springgreen"))(100)
myBreaks <- seq(-3, 3, length.out=101)

require("gplots")

#Euclidean distance; Ward's linkage
par(mar=c(1,1,1,1), cex=1.0)
heatmap.2(heat,
  col=myCol,
  breaks=myBreaks,
  main="",
  key=T, key.xlab="Expresssion\nZ-score", keysize=1.0,
  scale="none",
  ColSideColors=condition,
  density.info="none",
  reorderfun=function(d,w) reorder(d, w, agglo.FUN=mean), 
  trace="none",
  cexRow=1.0, cexCol=1.0,
  distfun=function(x) dist(x, method="euclidean"),
  hclustfun=function(x) hclust(x, method="ward.D2"),
  margins=c(6, 6))
legend("top",
  bty="n",
  cex=1.0,
  title="Condition",
  c("Wild-type", "Knock-out"), fill=c("yellow", "royalblue"),
  horiz=TRUE)

#1 - Pearson correlation distance; Ward's linkage
par(mar=c(1,1,1,1), cex=1.0)
heatmap.2(heat,
  col=myCol,
  breaks=myBreaks,
  main="",
  key=T, key.xlab="Expresssion\nZ-score", keysize=1.0,
  scale="none",
  ColSideColors=condition,
  density.info="none",
  reorderfun=function(d,w) reorder(d, w, agglo.FUN=mean),
  trace="none",
  cexRow=1.0, cexCol=1.0,
  distfun=function(x) as.dist(1-cor(t(x))),
  hclustfun=function(x) hclust(x, method="ward.D2"),
  margins=c(6, 6))
legend("top",
  bty="n",
  cex=1.0,
  title="Condition",
  c("Wild-type", "Knock-out"), fill=c("yellow", "royalblue"),
  horiz=TRUE)

heatmap

ComplexHeatmap

For ComplexHeatmap, see my recent post here: C: how to cluster genes in heatmap

I have also posted code in a comment following this answer.

ADD COMMENT
2
Entering edit mode

ComplexHeatmap

  ColAnn <- data.frame(condition)
  colnames(ColAnn) <- c("Condition")
  ColAnn <- HeatmapAnnotation(df=ColAnn,
    which="col",
    col=list(Condition=c("Wild-type"="yellow", "Knock-out"="royalblue")))

  #Boxplots for rows (genes) and columns (samples)
  boxAnnCol <- HeatmapAnnotation(
    boxplot=anno_boxplot(heat,
      border=FALSE,
      gp=gpar(fill="#CCCCCC"),
      lim=NULL,
      pch=".",
      size=unit(2, "mm"),
      axis=FALSE,
      axis_side=NULL,
      axis_gp=gpar(fontsize=12)),
    annotation_width=unit(c(1, 7.5), "cm"))
  boxAnnRow <- rowAnnotation(
    boxplot=row_anno_boxplot(heat,
      border=FALSE,
      gp=gpar(fill="#CCCCCC"),
      lim=NULL,
      pch=".",
      size=unit(3, "cm"),
      axis=FALSE,
      axis_side="top",
      axis_gp=gpar(fontsize=12)),
    annotation_width=unit(c(3), "cm"))

  myCol <- colorRampPalette(c("violet", "black", "springgreen"))(100)
  myBreaks <- seq(-3, 3, length.out=100)

  hmap <- Heatmap(heat,
            name="Expression Z-score",
            col=colorRamp2(myBreaks, myCol),
            heatmap_legend_param=list(
              color_bar="continuous",
              legend_direction="horizontal",
              legend_width=unit(5,"cm"),
              title_position="topcenter",
              title_gp=gpar(fontsize=15, fontface="bold")),

            #Row annotation configurations
            cluster_rows=TRUE,
            show_row_dend=TRUE,
            row_title="Gene",
            row_title_side="left",
            row_title_gp=gpar(fontsize=16, fontface="bold"),
            row_title_rot=90,
            show_row_names=TRUE,
            row_names_side="left",
            row_names_gp=gpar(fontsize=10),

            #Column annotation configuratiions
            cluster_columns=TRUE,
            show_column_dend=TRUE,
            column_title="Samples",
            column_title_side="top",
            column_title_gp=gpar(fontsize=16, fontface="bold"),
            column_title_rot=0,
            show_column_names=TRUE,
            column_names_gp=gpar(fontsize=14),

            #Dendrogram configurations: columns
            clustering_distance_columns=function(x) as.dist(1-cor(t(x))),
            clustering_method_columns="ward.D2",
            column_dend_height=unit(30,"mm"),

            #Dendrogram configurations: rows
            clustering_distance_rows=function(x) as.dist(1-cor(t(x))),
            clustering_method_rows="ward.D2",
            row_dend_width=unit(30,"mm"),

            #Annotations (row annotation must be added with 'draw' function, below)
            top_annotation_height=unit(0.5,"cm"),
            top_annotation=ColAnn,

            bottom_annotation_height=unit(3, "cm"),
            bottom_annotation=boxAnnCol)

    draw(hmap + boxAnnRow, heatmap_legend_side="left", annotation_legend_side="right")

complexheatmap

ADD REPLY
1
Entering edit mode
6.4 years ago
Hussain Ather ▴ 990

Check out clustermap

ADD COMMENT

Login before adding your answer.

Traffic: 2420 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6