DEXseq Error - SummarizedExperiment
1
1
Entering edit mode
4.9 years ago
BM ▴ 70

I have an error when performing a DEXseq count for exon usuage from RNA-seq data.

DEXSeqDataSetFromSE( SE, design= ~ sample + exon + condition:exon )

I receive the error: Error in DEXSeqDataSetFromSE(SE, design = ~sample + exon + condition:exon) : 
make sure your SummarizedExperiment object contain the columns gene_id, tx_name and exonic_part

Any help would be appreciated.

This is the full code

library(DEXSeq)
library(GenomicRanges)
library(GenomicFeatures)
library(GenomicAlignments)
library(biomaRt)
hse = makeTxDbFromBiomart( biomart="ensembl", dataset="mmusculus_gene_ensembl", host="ensembl.org")
library(txdb.mmusculus.ucsc.mm10.knowngene)
exonicParts = exonicParts( hse, linked.to.single.gene.only = TRUE )
list.files(pattern=".bam$")
fls = list.files(pattern="bam$", full=TRUE )
bamlst = BamFileList( fls, index=character(), yieldSize=100000, obeyQname=TRUE )
SE = summarizeOverlaps( exonicParts, bamlst, mode="Union", singleEnd=FALSE, ignore.strand=TRUE, inter.feature=FALSE, fragments=TRUE )
DEXSeqDataSetFromSE( SE, design= ~ sample + exon + condition:exon )
Error in DEXSeqDataSetFromSE(SE, design = ~sample + exon + condition:exon) : 
make sure your SummarizedExperiment object contain the columns gene_id, tx_name and exonic_part

**Session Info:**

> sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] biomaRt_2.38.0              GenomicAlignments_1.18.1    Rsamtools_1.34.1           
 [4] Biostrings_2.50.2           XVector_0.22.0              GenomicFeatures_1.34.8     
 [7] DEXSeq_1.28.3               RColorBrewer_1.1-2          AnnotationDbi_1.44.0       
[10] DESeq2_1.22.2               SummarizedExperiment_1.12.0 DelayedArray_0.8.0         
[13] matrixStats_0.54.0          GenomicRanges_1.34.0        GenomeInfoDb_1.18.2        
[16] IRanges_2.16.0              S4Vectors_0.20.1            Biobase_2.42.0             
[19] BiocGenerics_0.28.0         BiocParallel_1.16.6        

loaded via a namespace (and not attached):
 [1] httr_1.4.0             bit64_0.9-7            splines_3.5.2          assertthat_0.2.1      
 [5] Formula_1.2-3          statmod_1.4.30         BiocManager_1.30.4     latticeExtra_0.6-28   
 [9] blob_1.1.1             GenomeInfoDbData_1.2.0 progress_1.2.1         pillar_1.4.0          
[13] RSQLite_2.1.1          backports_1.1.4        lattice_0.20-38        digest_0.6.18         
[17] checkmate_1.9.3        colorspace_1.4-1       htmltools_0.3.6        Matrix_1.2-17         
[21] plyr_1.8.4             XML_3.98-1.19          pkgconfig_2.0.2        genefilter_1.64.0     
[25] zlibbioc_1.28.0        xtable_1.8-4           snow_0.4-3             scales_1.0.0          
[29] htmlTable_1.13.1       tibble_2.1.1           annotate_1.60.1        ggplot2_3.1.1         
[33] nnet_7.3-12            lazyeval_0.2.2         survival_2.44-1.1      magrittr_1.5          
[37] crayon_1.3.4           memoise_1.1.0          hwriter_1.3.2          foreign_0.8-71        
[41] prettyunits_1.0.2      tools_3.5.2            data.table_1.12.2      hms_0.4.2             
[45] stringr_1.4.0          munsell_0.5.0          locfit_1.5-9.1         cluster_2.0.9         
[49] compiler_3.5.2         rlang_0.3.4            grid_3.5.2             RCurl_1.95-4.12       
[53] rstudioapi_0.10        htmlwidgets_1.3        bitops_1.0-6           base64enc_0.1-3       
[57] gtable_0.3.0           curl_3.3               DBI_1.0.0              R6_2.4.0              
[61] gridExtra_2.3          rtracklayer_1.42.2     knitr_1.22             bit_1.1-14            
[65] Hmisc_4.2-0            stringi_1.4.3          Rcpp_1.0.1             geneplotter_1.60.0    
[69] rpart_4.1-15           acepack_1.4.1          xfun_0.7
RNA-Seq DEXseq SummarizedExperiment DESeq2 • 1.8k views
ADD COMMENT
0
Entering edit mode

make sure your SummarizedExperiment object contain the columns gene_id, tx_name and exonic_part**

Did you do that? Check the DEXseq manual on #Requirements on GTF files.

ADD REPLY
0
Entering edit mode
4.9 years ago

This is a Bioconductor issue it seems... In the past it looks like DEXSeq took an "exonic parts" object created through a disjointExons call despite the current manual suggesting that users use an exonicParts call.

If it is no problem for you to reorder your ranges then a simple fix is the following:

orderByGeneName <- order(unlist(exonicParts$gene_id, use.names=FALSE))
exonic_rle <- runLength(Rle(unlist(exonicParts$gene_id[orderByGeneName],use.names=FALSE))) 
exonicParts <- exonicParts[orderByGeneName]
exonic_part <- unlist(lapply(exonic_rle, seq_len), use.names=FALSE)
exonicParts$exonic_part <- exonic_part

If you already have your ranges and want to keep them in the same order since you already made your SE object then use the following (simply uses a temporary variable):

orderByGeneName <- order(unlist(exonicParts$gene_id, use.names=FALSE))
exonic_rle <- runLength(Rle(unlist(exonicParts$gene_id[orderByGeneName],use.names=FALSE))) 
exonicParts2 <- exonicParts[orderByGeneName]
exonic_part <- unlist(lapply(exonic_rle, seq_len), use.names=FALSE)
exonicParts2$exonic_part <- exonic_part
exonicParts$exonic_part <- exonicParts2$exonic_part[match(as.character(exonicParts),as.character(exonicParts2))]
rm(exonicParts2)
## double check that the following is TRUE
identical(rowRanges(SE),exonicParts)
rowData(SE)$exonic_part <-  exonicParts$exonic_part

Although I did notice some minor discrepancies (~3-7%) between the disjointExons and exonicParts; which I think relates to associations between genes and exons.

ADD COMMENT
0
Entering edit mode

I tried what you suggested using option 1,now I have the following error:

converting counts to integer mode Error in DESeqDataSet(rse, design, ignoreRank = TRUE) : all samples have 0 counts for all genes. check the counting script.

ADD REPLY
0
Entering edit mode

I assume you checked that your count table actually contains non-zero counts?

Ultimately, this should only provide some meta-data to the final object... the counts should remain unchanged.

ADD REPLY

Login before adding your answer.

Traffic: 2581 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6