What happened to my mapping IDs in the count matrix?
6.2 years ago
Pin.Bioinf ▴ 340

Hello, I built a SummarizedExperiment with summarizeOverlaps() using the UCSC annotation, and got the object se.

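    library(GenomicAlignments)  # provides summarizeOverlaps()
    library(TxDb.Hsapiens.UCSC.hg38.knownGene)

    # Exons grouped by gene
    ebg <- exonsBy(TxDb.Hsapiens.UCSC.hg38.knownGene, by="gene")

    # bamfiles is assumed to be defined earlier (not shown)
    se <- summarizeOverlaps(features=ebg, reads=bamfiles,
                            mode="Union",
                            singleEnd=TRUE,
                            ignore.strand=FALSE,
                            fragments=FALSE)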

But when I print it, I get this:

 > assay(se)

                 [,1]  [,2]   [,3]   [,4]  [,5]  [,6]  [,7]   [,8]   [,9]  [,10]  [,11]  [,12]  [,13]
1             425   293   1273   1531   878   142   153    597    266   3929   1499   1655    751
10            73    50    127    118   115    82    65    194     73    311    153    671    561
100            5     1      5     15    10    16    17     41     42     27     14      4     12
1000         134    95    105    139    95   176   110    243    140    219     96    130     81
10000         26    23      1      4     2    12     9     32     25     16     12     17     12
100008587      0     0      0      0     0     0     0      0      0      0      0      0      0
100008589      0     0      0      0     0     0     0      0      0      0      0      0      0
100009613      0     0      0      0     0     0     0      0      0      0      0      0      0
100009676      8    10     11     14     8    26    30     89     55     22     20     23      4
10001         49    41     31     52    24    79    73    154    136     74    104    171    175
10002          1     0      0      0     0     0     0      1      0      0      1      0      0
10003          0     0      1      0     0     0     0      0      0      0      0      0      0
100033413      0     0      0      0     0     0     0      0      0      0      0      0      0
100033414      3     4      3      3     0     2     2      2      1      0      0      0      3
100033415      0     0      0      0     0     0     0      0      0      0      0      0      0
100033416      0     0      0      0     0     0     0      0      0      0      0      0      0
100033417      0     0      0      0     0     0     0      0      0      0      0      0      0
100033420      1     0      1      0     1     0     0      0      0      0      0      0      0

Did I do something wrong? What are those IDs (100033413, 10002, 10001, ...)? What happened to my UCSC IDs, and which database do these IDs belong to? How can I annotate these genes once I finish the DE analysis?

Thank you very much.
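
For reference, here is a minimal sketch of how row names like these could be annotated. It assumes the IDs are Entrez Gene IDs, which is what the UCSC knownGene TxDb packages use as gene identifiers when exons are grouped with by="gene". It also assumes the org.Hs.eg.db annotation package is installed, which is not part of the code above. The variable names entrez_ids and annotation are made up for illustration, and the keys are a few row names copied from the printed matrix.

```r
library(AnnotationDbi)
library(org.Hs.eg.db)  # assumption: human annotation package is installed

# A few row names copied from the assay(se) output above
entrez_ids <- c("1", "10", "100", "1000", "10000")

# Look up one gene symbol per Entrez Gene ID;
# IDs without a match come back as NA
symbols <- mapIds(org.Hs.eg.db,
                  keys = entrez_ids,
                  column = "SYMBOL",
                  keytype = "ENTREZID",
                  multiVals = "first")

# Collect the result in a small lookup table
annotation <- data.frame(entrez_id = entrez_ids,
                         symbol = unname(symbols),
                         stringsAsFactors = FALSE)
print(annotation)
```

With the real object, keys = rownames(se) would be the natural input, and the same lookup could be run on the row names of the DE results table instead.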

RNA-Seq R mapping quantification