Missing value imputation for beta values of Methylation data
8.1 years ago

I have downloaded beta values for methylation data from GEO for several samples. Values are missing for a number of probes. Instead of deleting those probes from my data, I want to impute the missing values.

impute package and problem

I tried the impute package, although it was developed for microarray data. With it, the computation time is long and I run into errors about infinite recursion.

Details of the problems:

1. Problem:

I get an error when I run impute.knn on my data. The following sample data and code reproduce it:

## Load data 
mdata <- as.matrix(read.table('https://gubox.box.com/shared/static/qh4spcxe2ba5ymzjs0ynh8n8s08af7m0.txt', header = TRUE, check.names = FALSE, sep = '\t')) 

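## Install and load library
## (biocLite() is no longer supported by current Bioconductor releases; BiocManager replaces it)
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("impute")
library(impute)

## Set a limit on the number of nested expressions
options(expressions = 500000)

## Apply k-nearest neighbors for missing value imputation
res <- impute.knn(mdata)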

Error: protect(): protection stack overflow
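
Before retrying, it may help to check how much of each probe (row) and sample (column) is missing, because impute.knn handles heavily missing rows and columns specially through its rowmax and colmax arguments (defaults 0.5 and 0.8). Below is a minimal sketch in base R. The matrix `toy` is made-up data standing in for `mdata`, and the helper name `missing_summary` is hypothetical. Whether all-NA probes contribute to the error above is not established here; this only makes the missingness visible.

```r
# Summarise missingness per row (probe) and per column (sample).
# `missing_summary` is a hypothetical helper, not part of the impute package.
missing_summary <- function(mat) {
  stopifnot(is.matrix(mat))
  list(
    row_frac    = rowMeans(is.na(mat)),           # fraction missing per probe
    col_frac    = colMeans(is.na(mat)),           # fraction missing per sample
    all_na_rows = which(rowSums(!is.na(mat)) == 0) # probes with no observed value
  )
}

# Made-up beta values, only for illustration
toy <- matrix(
  c(0.1, NA,  0.3,
    NA,  NA,  NA,
    0.8, 0.9, NA,
    0.5, 0.4, 0.6),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("probe", 1:4), paste0("sample", 1:3))
)

ms <- missing_summary(toy)
ms$row_frac                              # missing fraction for each probe
ms$col_frac                              # missing fraction for each sample
high_missing <- names(which(ms$row_frac > 0.5))  # probes above the default rowmax
high_missing
```

On real data, replacing `toy` with `mdata` shows which probes exceed the rowmax threshold and whether any sample comes close to colmax.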

2. Problem:

Data : https://gubox.box.com/shared/static/kynad5ajjpqelncdn6djaw7ga35lkvd6.rdata [Note : Big file, 190MB]

library(impute) 
if(exists(".Random.seed")) rm(.Random.seed) 
imputedData <- impute.knn(as.matrix(exp_data))

Error: evaluation nested too deeply: infinite recursion / options(expressions=)?

I would appreciate it if anybody could:

  1. help with the problems with the impute package.
  2. suggest a better, computationally faster way (package or method) to impute beta values.

Thanks!!

methylation R imputation Bioconductor • 6.6k views
8.1 years ago

My initial solution for the "infinite recursion" error of the impute package:

I split the input matrix by row into smaller matrices, decreasing the number of rows per part until the "infinite recursion" error no longer appeared. I then merged the imputed matrices of all the parts.

R code:

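myimpute <- function(data, clus = 2000) {  # clus = number of rows in each split matrix
  library(impute)
  data <- as.matrix(data)
  row_data <- nrow(data)
  # Number of parts; any remainder rows are folded into the last part
  n_parts <- max(1, row_data %/% clus)
  # Part index for every row (the original used nrow(mm), but mm is undefined here)
  ind <- pmin(ceiling(seq_len(row_data) / clus), n_parts)
  # Split the row indices, so row and column names of the matrix are kept
  rows <- split(seq_len(row_data), ind)
  # Impute each part separately and keep only the imputed matrix
  res <- lapply(rows, function(i) impute.knn(data[i, , drop = FALSE])$data)
  # Merge the imputed parts back together in the original row order
  res <- do.call(rbind, res)
  res
}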
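
The split, impute and merge idea described above can be checked without the impute package. The following is a minimal sketch, assuming a simple per-row mean fill as a stand-in for the imputation step; it is not k-nearest neighbors. The names `chunk_rows`, `apply_by_chunk` and `fill_row_mean` are hypothetical, and `toy` is made-up data. The point is only to confirm that chunking covers every row once and that the merged result keeps the original row order and names.

```r
# Return a list of row-index vectors; remainder rows join the last chunk.
chunk_rows <- function(n, size) {
  stopifnot(n >= 1, size >= 1)
  n_chunks <- max(1, n %/% size)
  grp <- pmin(ceiling(seq_len(n) / size), n_chunks)
  split(seq_len(n), grp)
}

# Apply `fun` to each block of rows and stack the results in the original order.
apply_by_chunk <- function(mat, size, fun) {
  blocks <- lapply(chunk_rows(nrow(mat), size),
                   function(i) fun(mat[i, , drop = FALSE]))
  do.call(rbind, blocks)
}

# Stand-in imputer (assumption: NOT kNN): fill each NA with its row mean.
fill_row_mean <- function(x) {
  mu  <- rowMeans(x, na.rm = TRUE)
  idx <- which(is.na(x), arr.ind = TRUE)
  x[idx] <- mu[idx[, "row"]]
  x
}

# Made-up matrix with two missing values
toy <- matrix(as.numeric(1:21), nrow = 7,
              dimnames = list(paste0("p", 1:7), c("a", "b", "c")))
toy[2, 1] <- NA
toy[7, 3] <- NA

chunk_sizes <- lengths(chunk_rows(7, 3))   # rows per chunk
out <- apply_by_chunk(toy, size = 3, fun = fill_row_mean)
out
```

With the impute package, the same wrapper could be called with `fun = function(x) impute.knn(x)$data`. Note that a chunked kNN run only searches for neighbors inside each chunk, so its result can differ from imputing the full matrix in one call.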

Does anyone have any thoughts on this? Please share.
