cn.mops - Error in if (n > 512) n <- 2^ceiling(log2(n))
1
1
Entering edit mode
9.9 years ago
emmasreid ▴ 20

Hi all,

Firstly, apologies if this is a daft/duplicated question - I am very new to this forum and data analysis using R in general.

I am trying to analyse some gene panel data using the cn.mops package in R using the code provided under Section 6 of the package vignette ("Exome sequencing data"). The first part of the code seems to work with no errors:

> BAMfiles <-c("~/Alignments/Filename.bam","~/Alignments/Filename2.bam","~/Alignments/Filename3.bam","~/Alignments/Filename4.bam","~/Alignments/Filename5.bam","~/Alignments/Filename6.bam","~/Alignments/Filename7.bam")
> segments <- read.table("~/Alignments/NeuroMetabolicCodingExons.bed",sep="\t",as.is=TRUE)
> gr <- GRanges(segments[,1],IRanges(segments[,2],segments[,3]))
> X <- getSegmentReadCountsFromBAM(BAMfiles,GR=gr,mode="unpaired")

However, when I then enter the following command as instructed:

> resCNMOPS <- exomecn.mops(X)

It comes up with the following error:

Normalizing...
Error in if (n > 512) n <- 2^ceiling(log2(n)) : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
In normalizeChromosomes(X, chr = chr, normType = normType, qu = normQu) :
  There exists a reference sequence with zero reads for some samples.

If anyone could enlighten me as to what this error means/how to fix it I would be eternally grateful.

Many thanks.

ngs R next-generation-sequencing cn.mops • 5.3k views
ADD COMMENT
1
Entering edit mode

Hello,

Sorry for the inconveniences with the package. Indeed, the problem comes from chromosomes that have predominantly zeros in the read count matrix. Typically the median read count is used to calculate normalization factors...

I am now catching this case and I am returning an error message saying that people should remove this samples or chromosomes (usually, chromosome "Y").

For copy number detection on chromosomes X and Y, I suggest to run cn.mops separately for males an females. Please do not hesitate to contact me, if you encounter problems with the package.

The updated version should be on Bioconductor in a few days.

Regards,

Günter Klambauer

ADD REPLY
0
Entering edit mode

Thanks for updating the package!

ADD REPLY
0
Entering edit mode

I am sorry but it still gives the same error. The worst thing is that you have to make all the calculations again after trying to exclude some scaffolods with near zero alignments. Could you make it to write error log and automatically exclude bad chromosomes while working on all others without stopping?

Thank you very much!

ADD REPLY
0
Entering edit mode
9.9 years ago

The warning message is actually the cause of the error. In short, exomecn.mops() calls normalizeChromosomes(), which in turn iterates over all of the chromosomes, of which one seems to lack reads. Thus, there's a step where a matrix is subset by chromosome, producing a 0-dimensional matrix. Various things are done to that (e.g., median), which all produce NA. The actual error ends up being caused by normalizeChromosomes() calling density() with n=NA (since n depends on the aforementioned matrix). I would suspect that fixing that warning message will solve the error.

ADD COMMENT
0
Entering edit mode

Thank you so much for your swift reply! I think I understand your explanation of the problem - any ideas on the command(s) I should use instead to circumvent this problem?

(Apologies again for my ignorance - I've been struggling with this for days now...!)

ADD REPLY
0
Entering edit mode

You should be able to use seqlevels(X) to both see and set the chromosomes that will be looked at. If you table(seqlevels(X)), you'll see which chromsome has no alignments, so just remove that one.

ADD REPLY
0
Entering edit mode

Thank you! So I would use that code to determine which chromosome needs to be removed, and then I assume this line of code would need to be inserted somewhere:

> normalizeChromosomes(X, chr, normType = "poisson", qu = 0.25, ploidy)

Any chance you could hazard a guess as to where I should put this (or a similar) line? From what I've read, neither the exomecn.mops() nor the cn.mops() command seem to have the ability to specify particular chromosomal regions. From what I understand, these are the lines which determine the regions of interest:

> segments <- read.table("~/Alignments/NeuroMetabolicCodingExons.bed",sep="\t",as.is=TRUE)
> gr <- GRanges(segments[,1],IRanges(segments[,2],segments[,3]))

But I think I would still need to specify the .bed file, and I can't seem to see an easy way to do this and also exclude certain regions. Hope that made sense?

Thanks again for all your help.

ADD REPLY
0
Entering edit mode

exomecn.mops() calls normalizeChromosomes(), so I don't think you'll need to add that line anywhere. The list of all chromosomes is held in seqlevels(X), so you can just change that:

table(seqlevels(X)) #Just look at which one has 0 reads. Supposing it's the second one...
seqlevels(X) <- seqlevels(X)[-2]
ADD REPLY
1
Entering edit mode

Hello! Just an update - I did as you suggested and did table(seqlevels(X)). This gave a value of '1' for all chromosome:

chr1 chr10 chr11 ...etc.
    1     1     1      ...etc.

I then tried the head(X[,1:3]) command to see if the counts are even being called correctly at all and got this:

GRanges with 6 ranges and 3 metadata columns:
      seqnames               ranges strand | Sample_1.sorted.bam Sample_2.sorted.bam
         <Rle>            <IRanges>  <Rle> |                  <integer>                  <integer>
  [1]     chr1 [76190472, 76190502]      * |                          0                          0
  [2]     chr1 [76194085, 76194173]      * |                          0                          0
  [3]     chr1 [76198328, 76198426]      * |                          0                          0
  [4]     chr1 [76198537, 76198607]      * |                          0                          0
  [5]     chr1 [76199212, 76199313]      * |                          0                          0
  [6]     chr1 [76200475, 76200556]      * |                          0                          0

chr1  chr2  chr3  chr4  chr5  chr6  chr7  chr8  chr9 ... chr14 chr15 chr16 chr17 chr18 chr19 chr20 chr21 chr22
      NA    NA    NA    NA    NA    NA    NA    NA    NA ...    NA    NA    NA    NA    NA    NA    NA    NA    NA

So clearly something has gone very wrong somewhere...! It seems to be choosing the correct regions from my .bed file but clearly not getting any counts - would you think this is a problem with my .bam files or the code?

ADD REPLY
1
Entering edit mode

Hi emmasreid,

I was struggling with the very same problem for a few days now and I think I came over it a few hours ago... simply switch to mode="paired" in getSegmentReadCountsFromBAM and you will eventually get read count values in "X" for your samples. A look into the Table S3 of this paper lead me to this as they also use mode="paired" on 16 apparently unrelated Exomes.

Hope that helps!

cheers Max

ADD REPLY
0
Entering edit mode

This worked perfectly - thank you so much Max! Finally manged to pick up some CNVs after days of struggling. Thank you again for solving my problem!

Cheers,
Emma

ADD REPLY
1
Entering edit mode

I used mode="paired" in getSegmentReadCountsFromBAM and still am getting this error.

This is how my seqlevels look like:

> seqlevels(X)
 [1] "chr1"  "chr10" "chr11" "chr12" "chr13" "chr14" "chr15" "chr16" "chr17" "chr18" "chr19" "chr2"  "chr20" "chr21" "chr22" "chr3"  "chr4"  "chr5"  "chr6"  "chr7"  "chr8"  "chr9"  "chrX"  "chrY"
> table(seqlevels(X))

chr1 chr10 chr11 chr12 chr13 chr14 chr15 chr16 chr17 chr18 chr19  chr2 chr20 chr21 chr22  chr3  chr4  chr5  chr6  chr7  chr8  chr9  chrX  chrY
    1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1     1

Can someone please help me solve this error?

ADD REPLY
1
Entering edit mode

Try changing the normalization method, i.e.:

res <- cn.mops(X, normType="mean")

Default is normType="poisson"

Hope that helps.

ADD REPLY
0
Entering edit mode

Ah, that's informative. My guess would be that the chromosome names in the BAM and BED files are different (e.g., one uses Ensemble chromosome names and the other UCSC names). Have a quick look at a few reads in the BAM file and some positions in the BED file to check this. If that's not the problem, have a look at both files together in IGV or another browser.

ADD REPLY

Login before adding your answer.

Traffic: 1706 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6