5.0 years ago
zizigolu
Hi,
I have this matrix of raw read counts from HTSeq:
> head(mat[,1:4])
TCGA-L5-A4OG-11A-12R-A260-31 TCGA-IC-A6RE-11A-12R-A336-31 TCGA-L5-A4OJ-11A-12R-A260-31
ENSG00000000003 1818 4596 2732
ENSG00000000005 0 3 6
ENSG00000000419 1436 751 1500
ENSG00000000457 1175 840 992
ENSG00000000460 242 205 256
ENSG00000000938 536 253 331
TCGA-L5-A4OO-11A-12R-A260-31
ENSG00000000003 1075
ENSG00000000005 3
ENSG00000000419 1139
ENSG00000000457 726
ENSG00000000460 123
ENSG00000000938 372
>
> dim(mat)
[1] 56925 11
>
I want to summarize it by gene name and reduce the matrix to 35000 rows, but I don't know how; @Love says I cannot use tximport.
Any help please?
I guess @Love is Mike Love? That means you posted that somewhere before. Please provide links and quotes to what he said. Probably he gave a reason why.
https://github.com/mikelove/tximport/issues/26
He says
A: R org.Hs.eg.db matching ensembl gene ids with gene symbol
Thanks, now I have
Now I want to extract from mat the read counts for only the 56720 matched gene symbols.
Then I suggest you use your years of experience in the field to find ways to accomplish that rather than asking for spoon-feeding.
How sad that there isn't any emoji here to show my face right now!
No you don't, but if you did then you'd want to split by gene name and sum across rows. That you can figure out.
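For anyone who does want to collapse the rows, here is a minimal base R sketch of the "split by gene name and sum" idea, using `rowsum()`. The toy matrix, the `ENSG_A`-style IDs and the `symbols` lookup vector are all made up for illustration. With real data the lookup would presumably come from something like `AnnotationDbi::mapIds(org.Hs.eg.db, keys = rownames(mat), keytype = "ENSEMBL", column = "SYMBOL")`, which I have not run against the matrix in this thread.

```r
# Toy count matrix (made-up values): rows are Ensembl-style IDs, columns are samples.
mat <- matrix(
  c(10, 0, 5, 7,    # sample1
    20, 3, 1, 2),   # sample2
  nrow = 4,
  dimnames = list(
    c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D"),
    c("sample1", "sample2")
  )
)

# Hypothetical ID-to-symbol lookup: two IDs share a symbol, one has no match (NA).
symbols <- c(ENSG_A = "GENE1", ENSG_B = "GENE1", ENSG_C = "GENE2", ENSG_D = NA)

# Align the symbols with the matrix rows and drop rows without a matched symbol.
sym  <- symbols[rownames(mat)]
keep <- !is.na(sym)

# Sum the counts of all rows that share a gene symbol, column by column.
by_symbol <- rowsum(mat[keep, , drop = FALSE], group = sym[keep])

print(by_symbol)
```

The `keep` step is also how you would extract only the rows with a matched symbol. Note that `rowsum()` returns the groups sorted by name unless you pass `reorder = FALSE`.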