Question

DEXseq Error - SummarizedExperiment

1

Entering edit mode

4.9 years ago

BM ▴ 70

I have an error when performing a DEXseq count for exon usuage from RNA-seq data.

DEXSeqDataSetFromSE( SE, design= ~ sample + exon + condition:exon )

I receive the error: Error in DEXSeqDataSetFromSE(SE, design = ~sample + exon + condition:exon) : 
make sure your SummarizedExperiment object contain the columns gene_id, tx_name and exonic_part

Any help would be appreciated.

This is the full code

library(DEXSeq)
library(GenomicRanges)
library(GenomicFeatures)
library(GenomicAlignments)
library(biomaRt)
hse = makeTxDbFromBiomart( biomart="ensembl", dataset="mmusculus_gene_ensembl", host="ensembl.org")
library(txdb.mmusculus.ucsc.mm10.knowngene)
exonicParts = exonicParts( hse, linked.to.single.gene.only = TRUE )
list.files(pattern=".bam$")
fls = list.files(pattern="bam$", full=TRUE )
bamlst = BamFileList( fls, index=character(), yieldSize=100000, obeyQname=TRUE )
SE = summarizeOverlaps( exonicParts, bamlst, mode="Union", singleEnd=FALSE, ignore.strand=TRUE, inter.feature=FALSE, fragments=TRUE )
DEXSeqDataSetFromSE( SE, design= ~ sample + exon + condition:exon )
Error in DEXSeqDataSetFromSE(SE, design = ~sample + exon + condition:exon) : 
make sure your SummarizedExperiment object contain the columns gene_id, tx_name and exonic_part

**Session Info:**

> sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] biomaRt_2.38.0              GenomicAlignments_1.18.1    Rsamtools_1.34.1           
 [4] Biostrings_2.50.2           XVector_0.22.0              GenomicFeatures_1.34.8     
 [7] DEXSeq_1.28.3               RColorBrewer_1.1-2          AnnotationDbi_1.44.0       
[10] DESeq2_1.22.2               SummarizedExperiment_1.12.0 DelayedArray_0.8.0         
[13] matrixStats_0.54.0          GenomicRanges_1.34.0        GenomeInfoDb_1.18.2        
[16] IRanges_2.16.0              S4Vectors_0.20.1            Biobase_2.42.0             
[19] BiocGenerics_0.28.0         BiocParallel_1.16.6        

loaded via a namespace (and not attached):
 [1] httr_1.4.0             bit64_0.9-7            splines_3.5.2          assertthat_0.2.1      
 [5] Formula_1.2-3          statmod_1.4.30         BiocManager_1.30.4     latticeExtra_0.6-28   
 [9] blob_1.1.1             GenomeInfoDbData_1.2.0 progress_1.2.1         pillar_1.4.0          
[13] RSQLite_2.1.1          backports_1.1.4        lattice_0.20-38        digest_0.6.18         
[17] checkmate_1.9.3        colorspace_1.4-1       htmltools_0.3.6        Matrix_1.2-17         
[21] plyr_1.8.4             XML_3.98-1.19          pkgconfig_2.0.2        genefilter_1.64.0     
[25] zlibbioc_1.28.0        xtable_1.8-4           snow_0.4-3             scales_1.0.0          
[29] htmlTable_1.13.1       tibble_2.1.1           annotate_1.60.1        ggplot2_3.1.1         
[33] nnet_7.3-12            lazyeval_0.2.2         survival_2.44-1.1      magrittr_1.5          
[37] crayon_1.3.4           memoise_1.1.0          hwriter_1.3.2          foreign_0.8-71        
[41] prettyunits_1.0.2      tools_3.5.2            data.table_1.12.2      hms_0.4.2             
[45] stringr_1.4.0          munsell_0.5.0          locfit_1.5-9.1         cluster_2.0.9         
[49] compiler_3.5.2         rlang_0.3.4            grid_3.5.2             RCurl_1.95-4.12       
[53] rstudioapi_0.10        htmlwidgets_1.3        bitops_1.0-6           base64enc_0.1-3       
[57] gtable_0.3.0           curl_3.3               DBI_1.0.0              R6_2.4.0              
[61] gridExtra_2.3          rtracklayer_1.42.2     knitr_1.22             bit_1.1-14            
[65] Hmisc_4.2-0            stringi_1.4.3          Rcpp_1.0.1             geneplotter_1.60.0    
[69] rpart_4.1-15           acepack_1.4.1          xfun_0.7

RNA-Seq DEXseq SummarizedExperiment DESeq2 • 1.8k views

ADD COMMENT • link updated 4.9 years ago by benformatics 3.9k • written 4.9 years ago by BM ▴ 70

0

Entering edit mode

make sure your SummarizedExperiment object contain the columns gene_id, tx_name and exonic_part**

Did you do that? Check the DEXseq manual on #Requirements on GTF files.

ADD REPLY • link 4.9 years ago by ATpoint 81k

score 0 · Answer 1 · 2019-05-22

This is a Bioconductor issue it seems... In the past it looks like DEXSeq took an "exonic parts" object created through a disjointExons call despite the current manual suggesting that users use an exonicParts call.

If it is no problem for you to reorder your ranges then a simple fix is the following:

orderByGeneName <- order(unlist(exonicParts$gene_id, use.names=FALSE))
exonic_rle <- runLength(Rle(unlist(exonicParts$gene_id[orderByGeneName],use.names=FALSE))) 
exonicParts <- exonicParts[orderByGeneName]
exonic_part <- unlist(lapply(exonic_rle, seq_len), use.names=FALSE)
exonicParts$exonic_part <- exonic_part

If you already have your ranges and want to keep them in the same order since you already made your SE object then use the following (simply uses a temporary variable):

orderByGeneName <- order(unlist(exonicParts$gene_id, use.names=FALSE))
exonic_rle <- runLength(Rle(unlist(exonicParts$gene_id[orderByGeneName],use.names=FALSE))) 
exonicParts2 <- exonicParts[orderByGeneName]
exonic_part <- unlist(lapply(exonic_rle, seq_len), use.names=FALSE)
exonicParts2$exonic_part <- exonic_part
exonicParts$exonic_part <- exonicParts2$exonic_part[match(as.character(exonicParts),as.character(exonicParts2))]
rm(exonicParts2)
## double check that the following is TRUE
identical(rowRanges(SE),exonicParts)
rowData(SE)$exonic_part <-  exonicParts$exonic_part

Although I did notice some minor discrepancies (~3-7%) between the disjointExons and exonicParts; which I think relates to associations between genes and exons.