Question

Tools for visualization of CNVs called by XHMM

2

Entering edit mode

6.1 years ago

nkausthu ▴ 30

Hi,

I have a list of CNVs called using XHMM from exomes. I would like to know the best way to visualize these CNVs. I hope to get a reply soon.

CNV Visualization XHMM • 3.2k views

ADD COMMENT • link updated 6.1 years ago by bernatgel ★ 3.4k • written 6.1 years ago by nkausthu ▴ 30

score 1 · Answer 1 · 2018-03-20

Hi @nkausthu,

I've not used XHMM myself, but if you can load the data into R, you can plot the data using karyoploteR.

I've created a small simulated results using the .xcnv as decribed in the XHMM tutorial using the createRandomRegions function from regioneR with the following code:

library(karyoploteR)
set.seed(1231)

#Simulate output
xhmm.out <- data.frame()
for(nsample in 1:10) {
  dd <- toDataframe(createRandomRegions(nregions=10, length.mean = 20e6, length.sd=5e6, mask=NA, non.overlapping = TRUE))
  intervals <- paste0(dd[,1], ":", dd[,2], "-", dd[,3])
  xhmm.out <- rbind(xhmm.out, data.frame(SAMPLE=paste0("Sample", nsample),
                                   CNV=rep(c("DEL", "DUP"), 5),
                                   INTERVAL=intervals,  stringsAsFactors=FALSE))
}

That creates a data.frame with the columns we need to plot

> head(xhmm.out)
   SAMPLE CNV                 INTERVAL
1 Sample1 DEL  chr18:29077044-48674484
2 Sample1 DUP chr2:168227823-191704249
3 Sample1 DEL chr5:163255214-174457397
4 Sample1 DUP  chr20:14003146-35810966
5 Sample1 DEL  chr12:29226733-59642338
6 Sample1 DUP chr1:160156036-178393920

And we can start plotting. In this case we'll plot all chromosomes in a single line and represent each sample independently with gains in red and losses in green.

samples <- as.character(unique(xhmm.out$SAMPLE))
num.samples <- length(samples)

kp <- plotKaryotype(plot.type = 4, ideogram.plotter = NULL, labels.plotter = NULL)
kpAddChromosomeNames(kp, srt=45)
kpAddCytobandsAsLine(kp)

for(nsample in seq_len(num.samples)) {
  s <- samples[nsample]

  #Extract and prepare the data of the current sample
  sample.regions <- xhmm.out[xhmm.out$SAMPLE==s,]
  regs <- data.frame(do.call(rbind, strsplit(x = sample.regions$INTERVAL, split = c(":|-"))), stringsAsFactors=FALSE)
  regs[,2] <- as.numeric(regs[,2])
  regs[,3] <- as.numeric(regs[,3])
  regs <- toGRanges(regs)

  #And plot
    #Define the vertical space for the sample "track"
    r0 <- (nsample-1)/num.samples
    r1 <- (nsample)/num.samples - 0.05 #this 0.05 is a small margin between samples
    #Add the labels, etc...
    kpAddLabels(kp, r0=r0, r1=r1, labels = s)
    kpAbline(kp, h=0.5, r0=r0, r1=r1, col="#888888")
    #And plot the regions DUP and DEL
    kpSegments(kp, data=regs[sample.regions$CNV=="DUP"], y0 = 0.5, y1=0.5, r0=r0, r1=r1, col="red", lwd=5)
    kpSegments(kp, data=regs[sample.regions$CNV=="DEL"], y0 = 0.5, y1=0.5, r0=r0, r1=r1, col="green", lwd=5)
}

And you'll get something like this

enter image description here

You can change the chromosome arrangement and the representation of the DUPs and DELs using any of the available plotting functions you can fin in the karyoploteR tutorial and examples page.

score 0 · Answer 2 · 2018-03-15

0

Entering edit mode

6.1 years ago

h.mon 35k

The XHMM manual has a section titled "Visualizing XHMM results with included R scripts". Also, if you output the results to vcf, you can explore them with IGV.

ADD COMMENT • link 6.1 years ago by h.mon 35k

score 0 · Answer 3 · 2018-03-19

Thank you so much for the reply. I had tried the XHMM tutorial but an error as follows. ./pseq --group refseq --locdb locdb --group refseq --file /mnt/exome/Softwares/HG19/nexterarapidcapture_exome_target_region.bed --out annotated_targets.refseq

||||||||||||||||||||||||||| PSEQ (v0.10; 14-Jul-14) |||||||||||||||||||||||||||

pseq error : command refseq not recognised

Also I tried to load VCF file generated by XHMM in IGV but again I get an error.

Error loading DATA.vcf: The provided VCF file is malformed at approximately line number 34: Symbolic alleles not allowed as reference allele: , for input source: DATA.vcf