How can I turn a skinny/long data frame into a Seurat object?
2.9 years ago
glocke01 ▴ 190

I'm trying to turn a long-format tibble/data frame into a Seurat object. My input has one row per cell per gene:

library(tibble)
library(dplyr)   # for %>%
library(tidyr)   # for pivot_wider()
library(Seurat)
n_rows <- 10000
cnt_frame <- tibble(cell_id = sprintf("sample%02d_cell%03d", 
                                      sample(1:10, n_rows, replace=T), 
                                      sample(1:1000, n_rows, replace=T)),
                    gene_id = sprintf("TLA%03d", sample(1:500, n_rows, replace=T)),
                    read_count = rnbinom(n_rows, size=10, mu=10))
head(cnt_frame)
# A tibble: 6 x 3
  cell_id          gene_id read_count
  <chr>            <chr>   <dbl>
1 sample08_cell813 TLA092      6
2 sample10_cell481 TLA167     10
3 sample03_cell173 TLA029     14
4 sample07_cell140 TLA358     10
5 sample03_cell021 TLA314      9
6 sample08_cell091 TLA228      8

I now wish to turn this into a Seurat object.

cnt_mat <- cnt_frame %>%
  pivot_wider(names_from=cell_id, values_from=read_count, values_fill=0) %>% ## how to do without pivot?
  column_to_rownames("gene_id") %>%
  Seurat::as.sparse()
## my_metadata: per-cell metadata data frame (not shown here)
seur <- CreateSeuratObject(cnt_mat, meta.data=my_metadata)

My question is how to do this without pivot_wider. My actual data frame is much bigger than this toy example, and the pivot step hits an integer overflow. Given that a sparse matrix is, in principle, quite close to a long table, pivoting to a full matrix seems like it should be unnecessary, but I can't figure out how to get from here to there.

sparse-matrix seurat scrna-seq • 1.6k views
2.9 years ago

Slightly modified example data.

cnt_mat <- structure(list(cell_id = c("sample08_cell813", "sample10_cell481",
"sample03_cell173", "sample07_cell140", "sample03_cell021", "sample08_cell091",
"sample08_cell091"), gene_id = c("TLA092", "TLA167", "TLA029",
"TLA358", "TLA314", "TLA228", "TLA029"), read_count = c(6, 10,
14, 10, 9, 8, 2)), row.names = c(NA, -7L), class = "data.frame")

> cnt_mat
           cell_id gene_id read_count
1 sample08_cell813  TLA092          6
2 sample10_cell481  TLA167         10
3 sample03_cell173  TLA029         14
4 sample07_cell140  TLA358         10
5 sample03_cell021  TLA314          9
6 sample08_cell091  TLA228          8
7 sample08_cell091  TLA029          2

You can construct the sparse matrix directly from the long table, using match() to turn each gene and cell ID into a row and column index. This assumes the long data contain no explicit zero counts, which is worth double-checking.

library("Matrix")

sparse_mat <- sparseMatrix(
  i=match(cnt_mat$gene_id, unique(cnt_mat$gene_id)),
  j=match(cnt_mat$cell_id, unique(cnt_mat$cell_id)),
  x=cnt_mat$read_count,
  dims=c(length(unique(cnt_mat$gene_id)), length(unique(cnt_mat$cell_id))),
  dimnames=list(unique(cnt_mat$gene_id), unique(cnt_mat$cell_id))
)

> sparse_mat
6 x 6 sparse Matrix of class "dgCMatrix"
       sample08_cell813 sample10_cell481 sample03_cell173 sample07_cell140
TLA092                6                .                .                .
TLA167                .               10                .                .
TLA029                .                .               14                .
TLA358                .                .                .               10
TLA314                .                .                .                .
TLA228                .                .                .                .
       sample03_cell021 sample08_cell091
TLA092                .                .
TLA167                .                .
TLA029                .                2
TLA358                .                .
TLA314                9                .
TLA228                .                8
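
A variant of the same idea, as a minimal sketch rather than a definitive implementation: converting the ID columns to factors yields the integer indices and the dimnames in one step, so unique() is computed only once per column. The helper name long_to_sparse is made up for illustration, and the column names are those of the toy data. Two things to keep in mind: sparseMatrix() sums the values of repeated (gene, cell) pairs by default, so check for duplicates if that is not what you want, and any explicit zeros in the long data can be removed afterwards with drop0().

```r
library(Matrix)

# Hypothetical helper (not part of Matrix or Seurat):
# long table with gene_id / cell_id / read_count -> genes x cells dgCMatrix
long_to_sparse <- function(df) {
  # Levels follow order of first appearance, mirroring unique() in the answer above
  gene <- factor(df$gene_id, levels = unique(df$gene_id))
  cell <- factor(df$cell_id, levels = unique(df$cell_id))
  m <- sparseMatrix(
    i = as.integer(gene),
    j = as.integer(cell),
    x = df$read_count,
    dims = c(nlevels(gene), nlevels(cell)),
    dimnames = list(levels(gene), levels(cell))
  )
  # Remove explicitly stored zeros, keeping the matrix dimensions unchanged
  drop0(m)
}

# Small made-up input, including one explicit zero count
toy <- data.frame(
  cell_id = c("cellA", "cellB", "cellB", "cellC"),
  gene_id = c("gene1", "gene1", "gene2", "gene3"),
  read_count = c(6, 10, 2, 0)
)

# Duplicated (gene, cell) pairs would be summed, so check before building
stopifnot(anyDuplicated(toy[c("gene_id", "cell_id")]) == 0)

m <- long_to_sparse(toy)
```

Calling summary(m) on the result returns the non-zero entries as (i, j, x) triplets, which is essentially the long table again.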

Thanks. CreateSeuratObject(sparse_mat) seems to be working as expected. I'll report back once I've tried it on my non-toy data.


This is terribly obvious in retrospect. The effort of not slapping my forehead is good practice in self-compassion.


No worries! It's not immediately obvious how to generate a sparse matrix by hand like this, so don't beat yourself up.


Yeah, actually, I'm not sure I've ever used match() before.
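
For anyone else new to it, here is a tiny illustration of what match() does in the answer above. The variable names are made up for the example: for each element of its first argument, match() returns the position of the first matching element in the second, which is exactly the row (or column) index that sparseMatrix() needs.

```r
# Gene IDs as they appear in the long table, one repeated
genes <- c("TLA092", "TLA167", "TLA029", "TLA029")

# Lookup table of distinct IDs, in order of first appearance
lookup <- unique(genes)

# Position of each gene in the lookup table; repeated IDs get the same index
idx <- match(genes, lookup)
print(idx)  # 1 2 3 3
```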
