Question about Isoform expression (miRNA) data from TCGA
4.6 years ago
Vasu ▴ 770

I have downloaded the miRNA Isoform expression quantification data (mirbase21.isoforms.quantification.txt) from TCGA. The data looks like below:

miRNA_ID    read_count  miRNA_region
hsa-let-7b       2       precursor
hsa-let-7b       1       precursor
hsa-let-7b       1       precursor
hsa-let-7b       58     MIMAT0000063
hsa-let-7b      173     MIMAT0000063
hsa-let-7b      5723    MIMAT0000063
hsa-let-7b     26947    MIMAT0000063
hsa-let-7b       1        stemloop
hsa-let-7b       1      MIMAT0004482
hsa-let-7b       2      MIMAT0004482
hsa-let-7b      129     MIMAT0004482
hsa-let-7b      401     MIMAT0004482

Based on miRBaseConverter and information from miRTarBase, I have the mature 5p and 3p names with their accessions, like below:

miRNAName_v21     Accession
hsa-let-7b-5p    MIMAT0000063
hsa-let-7b-3p    MIMAT0004482

So, based on the above information, I can sum the counts for the mature 3p and 5p. I can do the same for the main precursor, but what are stemloops? Should I include them with the precursor or exclude them from the analysis?

Thank you.

mirna mirnaseq mirbase mirtarbase tcga • 1.6k views
4.6 years ago

The maturation of a miRNA is a multi-step process. The extra material at the 3' and 5' ends is removed first, and then the loop, which, according to the TCGA pipeline description, is defined for the purposes of that pipeline as:

  1. stemloop, from 1 to 6 bases outside the mature strand, between the mature and star strand

In the TCGA pipeline, reads are classified hierarchically. Priority is given to reads mapping to the mature miRNA, so if a read aligns to something else as well as the mature, it will be counted as mature (that actually isn't the way I would do it, but never mind). Next come reads mapping to the precursor, and then those mapping to the stemloop.
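To make the priority rule concrete, here is a minimal sketch of that kind of hierarchical assignment. It is not the TCGA code: the function name `classify_read` and the idea of passing in a set of overlapping feature types are assumptions for illustration, and only the three categories mentioned in this answer are included.

```python
# Priority order as described in this answer: mature first, then precursor,
# then stemloop. The real pipeline may define more categories than these.
PRIORITY = ["mature", "precursor", "stemloop"]


def classify_read(overlapping_features):
    """Return the highest-priority feature type a read overlaps, or None.

    `overlapping_features` is a hypothetical set of feature types that the
    read's alignment overlaps, e.g. {"precursor", "mature"}.
    """
    for feature in PRIORITY:
        if feature in overlapping_features:
            return feature
    return None


# A read overlapping both the mature strand and the precursor is counted
# once, as mature, and never as precursor.
print(classify_read({"precursor", "mature"}))
```

Because each read is counted once under its highest-priority category, reads that fall inside the mature sequence never contribute to the precursor count.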

This explains the high counts in the mature but not the precursor, despite the mature being contained within the precursor.

As you are probably interested in the biologically active molecule, I would ignore everything other than the mature.
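As a rough sketch of what that looks like in practice, the snippet below sums read counts per mature accession and skips the precursor and stemloop rows. It uses the rows and the accession-to-name table exactly as posted in the question. The function name `sum_mature_counts` is made up for illustration, and the handling of a comma-prefixed region value (taking the last comma-separated token) is an assumption in case the real file annotates the region that way.

```python
from collections import defaultdict

# Rows as (miRNA_ID, read_count, miRNA_region), copied from the question.
rows = [
    ("hsa-let-7b", 2, "precursor"),
    ("hsa-let-7b", 1, "precursor"),
    ("hsa-let-7b", 1, "precursor"),
    ("hsa-let-7b", 58, "MIMAT0000063"),
    ("hsa-let-7b", 173, "MIMAT0000063"),
    ("hsa-let-7b", 5723, "MIMAT0000063"),
    ("hsa-let-7b", 26947, "MIMAT0000063"),
    ("hsa-let-7b", 1, "stemloop"),
    ("hsa-let-7b", 1, "MIMAT0004482"),
    ("hsa-let-7b", 2, "MIMAT0004482"),
    ("hsa-let-7b", 129, "MIMAT0004482"),
    ("hsa-let-7b", 401, "MIMAT0004482"),
]

# Accession -> mature name, from the miRBaseConverter table in the question.
mature_names = {
    "MIMAT0000063": "hsa-let-7b-5p",
    "MIMAT0004482": "hsa-let-7b-3p",
}


def sum_mature_counts(rows, mature_names):
    """Sum read counts per mature miRNA, ignoring precursor and stemloop."""
    totals = defaultdict(int)
    for _mirna_id, read_count, region in rows:
        # Assumption: the accession is the last comma-separated token, so a
        # bare "MIMAT0000063" and a prefixed "mature,MIMAT0000063" both work.
        accession = region.split(",")[-1].strip()
        if not accession.startswith("MIMAT"):
            continue  # precursor, stemloop, or anything else non-mature
        # Fall back to the accession itself if no name is known for it.
        totals[mature_names.get(accession, accession)] += read_count
    return dict(totals)


mature_totals = sum_mature_counts(rows, mature_names)
print(mature_totals)
```

For the let-7b rows in the question this gives 32901 reads for hsa-let-7b-5p and 533 for hsa-let-7b-3p.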


Thanks a lot sudbery. So, now I will sum all the mature 3p and also sum the 5p and exclude precursor and stemloop for the analysis.


If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.
