Using Circos plot to visualize RNA-seq data
5.7 years ago
▴ 210

Hello,

I recently started using Circos. It is an amazing tool for visualizing data. However, I am a bit lost and need some help understanding how to proceed. I have RNA-seq data for differentially expressed genes (log2 fold change) and I would like to plot them with Circos.

I would like to have a plot like the following figure if possible.

Circos Plot

What type of data representation should I choose in Circos to obtain such a plot?

Thank you in advance.

Circos RNA-Seq visualization • 7.8k views
ADD COMMENT

Without wishing to rain on your parade too much, I've never really seen a Circos visualisation that was actually useful.

They look good on posters and stuff maybe, but if you're actually trying to show something, consider other alternatives like labelled volcano plots or something.

ADD REPLY

@badredda if you are considering a volcano plot, then this is an elegant way to show one

ADD REPLY

Hello,

Thanks everyone for your feedback.

@toralmanvar, I have edited the image. I hope it is showing now. @krushnach80, yes, I checked the Circos documentation and tried to produce a plot similar to the one in the picture, but I couldn't find any tutorial that plots something like the example above. @jrj.healey, yes, I agree that a Circos plot may not be easy to interpret, but when you are working with big data it is nice to have a plot that shows how all your samples behave.

Thanks all.

ADD REPLY

Hello badredda, you have not attached the image properly. Please refer to this post on adding images to a Biostars post.

ADD REPLY

Have you seen the Circos manual? It has proper documentation. You have to format your data files as shown in the tutorial, with the chromosome number.

ADD REPLY

Maybe it's obvious to others, but what do the different tracks in the figure represent? e.g. What are the dots, and why are they different colours? What are the peaks? What format is your data in?

ADD REPLY

@badredda In addition, please add some example data.

ADD REPLY

@Russ, in the plot, the outermost ring shows genes that are over-expressed and under-expressed, placed according to their position on the Circos plot. The inner rings show the density of the points.
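For anyone wanting to reproduce a density ring, one way to get the numbers is to count genes per fixed-size genomic window. The following is only a base-R sketch under assumptions: the `genes` table, its column names, and the 1 Mb window size are made up for illustration and are not taken from the figure.

```r
# Hypothetical gene positions; column names are assumptions for illustration.
genes <- data.frame(
  chr   = c("chr1", "chr1", "chr1", "chr2"),
  start = c(100, 900000, 1500000, 200)
)

# Count genes per fixed-size window on each chromosome.
gene_density <- function(genes, window = 1e6) {
  bin <- floor(genes$start / window)          # window index for each gene
  counts <- aggregate(list(n = rep(1L, nrow(genes))),
                      by = list(chr = genes$chr, bin = bin), FUN = sum)
  counts$start <- counts$bin * window         # window start coordinate
  counts$end   <- counts$start + window       # window end coordinate
  counts[order(counts$chr, counts$start), c("chr", "start", "end", "n")]
}

density <- gene_density(genes)
print(density)
```

The resulting table (chromosome, start, end, count) has the shape that histogram-style tracks generally expect, so it could be fed to a histogram or heatmap ring.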

ADD REPLY

You're trying to draw a scatter plot of log2 values of gene expression, correct? Have you looked at this tutorial? It explicitly defines the format your data should be in.

To change colours based on criteria (ie over- or under-expressed) you can create rules, explained here.
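As a rough illustration of the data format, a Circos 2D track file has one line per point in the form `chr start end value [options]`. The base-R sketch below writes such a file from a differential expression table. The table, its column names, and the cutoff of 1 are hypothetical; it assumes the `hs1`, `hs2`, ... chromosome names used by the human karyotype files that ship with Circos. Instead of rules in the configuration file, it sets the colour per point through the optional `color=` field, which is a swapped-in alternative to the rules approach linked above.

```r
# Hypothetical DE results; column names are assumptions for illustration.
de <- data.frame(
  chr   = c("1", "1", "X"),
  start = c(1000, 50000, 2000),
  end   = c(2000, 51000, 3000),
  log2FoldChange = c(2.5, -1.8, 0.2)
)

# Build Circos data lines: "chr start end value options".
# Points beyond +/- lfc_cutoff are coloured red (up) or blue (down).
to_circos_scatter <- function(de, lfc_cutoff = 1) {
  colour <- ifelse(de$log2FoldChange >=  lfc_cutoff, "red",
            ifelse(de$log2FoldChange <= -lfc_cutoff, "blue", "grey"))
  sprintf("hs%s %d %d %.3f color=%s",
          de$chr, as.integer(de$start), as.integer(de$end),
          de$log2FoldChange, colour)
}

circos_lines <- to_circos_scatter(de)
out_file <- file.path(tempdir(), "lfc_scatter.txt")  # hypothetical file name
writeLines(circos_lines, out_file)
```

The written file would then be referenced from a scatter `<plot>` block in the Circos configuration.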

ADD REPLY

@Russ, thanks for the link, I will check it out and come back later with updates.

@cpad0112, sorry for the late reply, my data looks like the following table:

example

And some downloadable data: https://goo.gl/pBvd3F

Thanks !

ADD REPLY

@badredda

I am not sure about a density plot inside Circos, as I do not see multiple points at any given coordinate (like SNPs or any other quantifiable information). However, histogram and line plots are possible. P-values cannot be drawn unless they are transformed. With the example data and RCircos: Rplot

## Load libraries
suppressPackageStartupMessages(library(RCircos)) ## the RCircos.* functions used below come from RCircos
suppressPackageStartupMessages(library(biomaRt))
suppressPackageStartupMessages(library(naturalsort))
## Set options
options(stringsAsFactors = FALSE)
options(scipen = 999)

## Read example data
data=read.csv("Example.tsv", sep="\t")
data$padj=round(data$padj,6) ## round adjusted p-values to 6 digits
data$pvalue=round(data$pvalue,6) ## round p-values to 6 digits

## Load cytoband data. Cytoband data downloaded from UCSC for hg38.
hg38=read.csv("../reference/hg38/hg38.cyto", strip.white = T, sep="\t", header=F)
names(hg38)=c("chrom","chromStart","chromEnd","name","gieStain")
# Order the chromosomes in human readable format
hg38_ordered=hg38[naturalorder(hg38$chrom),]

## Use biomart to download the coordinates for genes using gene symbols in user data
mart=useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
coords=getBM(attributes = c("chromosome_name","start_position","end_position","hgnc_symbol"), filters="hgnc_symbol", values=data$external_gene_name, mart = mart)
## Remove non-standard chromosomes from biomart data
coords_pruned=coords[!grepl("CHR_", coords$chromosome_name),] ## grepl keeps all rows when nothing matches

## Merge user data and ensembl data output
merged_data=merge(data, coords_pruned, by.x="external_gene_name", by.y="hgnc_symbol")
## Extract the columns of interest
final_data=merged_data[,c(9,12,13,4,8,1)]
names(final_data)[1:3]=c("chr","start","end")

## Sort and append "chr" to the chromosome columns
final_data_sorted=final_data[naturalorder(final_data$chr),]
final_data_sorted$chr=paste0("chr",final_data_sorted$chr)
## Plot circos
#png(paste0("circos.png"), width = 1200, height = 1200, units = "px",res = 80, type ="cairo")
chr.exclude <- "chrM"
cyto.info <- hg38_ordered
tracks.inside <- 10
tracks.outside <- 3
RCircos.Set.Core.Components(cyto.info, chr.exclude, tracks.inside, tracks.outside)
RCircos.Set.Plot.Area()
RCircos.Chromosome.Ideogram.Plot()
RCircos.Gene.Name.Plot(final_data_sorted, name.col = 6, track.num=2, side="in")
RCircos.Gene.Connector.Plot(final_data_sorted,  track.num = 1, side="in")
RCircos.Histogram.Plot(final_data_sorted,data.col=4, track.num=5, side="in")
RCircos.Line.Plot(final_data_sorted,data.col=4, track.num=4, side="in")

#dev.off()
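
On the point that p-values need transforming before they can be drawn: a common choice is -log10, so that smaller p-values give taller bars. This is only a sketch under assumptions. The `padj` vector is made up, and replacing zeros (which can appear after rounding) with the smallest non-zero value is one possible convention, not something the code above does.

```r
# Hypothetical adjusted p-values; zeros can arise after rounding.
padj <- c(0.05, 0.000001, 0, NA)

# Transform p-values to -log10 scale, keeping the result finite.
neg_log10_padj <- function(p) {
  floor_p <- min(p[!is.na(p) & p > 0])         # smallest non-zero p-value seen
  p <- ifelse(!is.na(p) & p == 0, floor_p, p)  # avoid -log10(0) = Inf
  -log10(p)
}

scores <- neg_log10_padj(padj)
print(scores)
```

The transformed values could be appended as an extra column of the plotting table and selected with `data.col` in a histogram or line track.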
ADD REPLY
5.7 years ago
Andrewoods ▴ 110

Maybe you want to try shinyCircos. It's an interactive app. https://github.com/venyao/shinyCircos http://shinycircos.ncpgr.cn/

ADD COMMENT

Please note that further spamming will get your account suspended.

ADD REPLY

ywhzau: Please create a single new post under the tool category to describe your application. Posting the same content in multiple threads is not the way to announce a tool you have developed.

ADD REPLY
5.7 years ago
bernatgel ★ 3.4k

As @jrj.healey said, Circos plots are pretty but they are not always your best bet, especially with lots of data. If you want a whole-genome view of your data and position on the genome is important to you, you can use karyoploteR to create non-circular genome plots.

There's a specific example on the karyoploteR examples page on visualizing the results of a differential gene expression analysis, which creates a plot like the one below. You can easily zoom in on specific regions if you need a detailed view.

enter image description here

ADD COMMENT
