Questions about quantifying mRNA data
6.0 years ago
kamel ▴ 70

Hello biostar,

I am currently working on differential expression analysis of mRNA data. I have two questions and would appreciate your help and expertise.

I mapped the reads to the reference genome with STAR, and I also need to quantify the multi-mapping reads.

My first question: do you think featureCounts is reliable for quantifying multi-mapping reads with the -M option?

My second question: what do you think about quantifying with -g exon_id instead of -g gene_id (since I am only interested in the exons)? I tried both settings and noticed a 2% decrease in the number of quantified reads with -g exon_id.

Thank you, Kamel

RNA-Seq gene count expression • 1.7k views
6.0 years ago
h.mon 35k

Using featureCounts -M is not the best way to account for multi-mapping reads. By default, -M counts a multi-mapping read once for every reported alignment, which is rather crude. With -M --fraction, each reported alignment carries a count of 1/x, where x is the number of alignments reported for that read, which is a rather naive way of splitting the counts. If you really need to quantify multi-mappers, use RSEM, Salmon, or kallisto.
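To make the difference between the two featureCounts schemes concrete, here is a minimal Python sketch on toy data. It is not featureCounts itself: the function name `count_alignments`, the `read_hits` mapping, and the gene names are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def count_alignments(read_hits, fractional=False):
    """Toy model of counting multi-mapping reads.

    read_hits maps a read name to the list of genes hit by each of its
    reported alignments (hypothetical data structure).
    fractional=False mimics -M: every alignment adds 1.
    fractional=True mimics -M --fraction: every alignment adds 1/x,
    where x is the number of alignments reported for the read.
    """
    counts = defaultdict(float)
    for genes in read_hits.values():
        weight = 1.0 / len(genes) if fractional else 1.0
        for gene in genes:
            counts[gene] += weight
    return dict(counts)


# Hypothetical reads: read1 maps uniquely, read2 and read3 map twice.
read_hits = {
    "read1": ["geneA"],
    "read2": ["geneA", "geneB"],
    "read3": ["geneB", "geneC"],
}

full = count_alignments(read_hits)
split = count_alignments(read_hits, fractional=True)
print(full)   # three reads yield five counts in total
print(split)  # three reads yield three counts in total, split evenly
```

In the fractional case the split is always even, whatever the uniquely mapping reads suggest about which gene is actually expressed. RSEM, Salmon, and kallisto instead estimate how to distribute such reads from the data.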

With RSEM and Salmon, you may align with STAR using --quantMode TranscriptomeSAM --quantTranscriptomeBan IndelSoftclipSingleend, quantify the resulting transcriptome alignments, and then summarize the transcript counts to gene counts with tximport.

With Salmon and kallisto, you can quantify directly against a reference transcriptome and then summarize the transcript counts to gene counts with tximport.
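The summarization step is conceptually simple. tximport is an R/Bioconductor package, so the Python below is only a sketch of the idea (summing transcript-level estimates per gene through a transcript-to-gene map), not its implementation. The function name `summarize_to_genes` and all identifiers and numbers are made up for illustration.

```python
from collections import defaultdict


def summarize_to_genes(transcript_counts, tx2gene):
    """Sum estimated transcript counts into gene-level counts.

    transcript_counts: transcript ID -> estimated count (may be fractional)
    tx2gene: transcript ID -> gene ID
    """
    gene_counts = defaultdict(float)
    for transcript, count in transcript_counts.items():
        gene_counts[tx2gene[transcript]] += count
    return dict(gene_counts)


# Hypothetical quantification output and annotation.
transcript_counts = {"tx1": 120.5, "tx2": 30.0, "tx3": 49.5}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}

gene_counts = summarize_to_genes(transcript_counts, tx2gene)
print(gene_counts)
```

tximport does more than this plain sum (for example, it can also carry transcript length information forward for downstream differential expression tools), so treat the sketch as the core idea only.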

what do you think about quantifying with -g exon_id instead of -g gene_id (since I am only interested in the exons)?

Do you want to summarize read counts per exon? Or do you want to count only reads overlapping exons, but summarize the counts per gene? Why only exons? If only exons interest you, then it is fine to use -g exon_id; I just don't understand why you need it.
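One possible reason for fewer assigned reads at the exon level, offered as an assumption rather than a diagnosis of this dataset: by default featureCounts does not assign a read that overlaps more than one meta-feature (unless -O is given). When exons are the meta-features, a spliced read spanning two exons of the same gene becomes ambiguous, whereas at the gene level it is assigned normally. A toy Python sketch of that rule, with hypothetical names and data:

```python
def assign_reads(read_overlaps):
    """Count reads per meta-feature, leaving ambiguous reads unassigned.

    read_overlaps maps a read name to the meta-features it overlaps
    (assumed to contain at least one entry per read).
    """
    counts, ambiguous = {}, 0
    for features in read_overlaps.values():
        unique = set(features)
        if len(unique) != 1:
            # Overlaps several meta-features: not counted.
            ambiguous += 1
            continue
        (feature,) = unique
        counts[feature] = counts.get(feature, 0) + 1
    return counts, ambiguous


# Hypothetical reads as (gene, exon) overlaps; read2 is spliced across two exons.
reads = {
    "read1": [("geneA", "exon1")],
    "read2": [("geneA", "exon1"), ("geneA", "exon2")],
    "read3": [("geneA", "exon2")],
}

by_gene = {name: [gene for gene, _ in hits] for name, hits in reads.items()}
by_exon = {name: [exon for _, exon in hits] for name, hits in reads.items()}

gene_counts, gene_ambiguous = assign_reads(by_gene)
exon_counts, exon_ambiguous = assign_reads(by_exon)
print(gene_counts, gene_ambiguous)
print(exon_counts, exon_ambiguous)
```

The featureCounts summary file reports how many reads were left unassigned for ambiguity, so comparing that number between the two runs is a quick way to check whether this is what happened.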

I already used --quantMode with STAR. It directly gives quantification at the gene_id level, but I'm not sure whether it counts ambiguous reads. As for the other software you mentioned, I can't use it because I want to map to the genome, not to the transcriptome.

Could you tell me whether I can use the featureCounts -M option, and answer my second question if you can? Thank you.

I updated my answer.

No, STAR with --quantMode GeneCounts does not count multi-mapping reads.

You are mapping to the genome, but summarizing over genes, so mapping / counting over the transcriptome and summarizing over genes with tximport would result in exactly the same genes being summarized.
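On the GeneCounts point: the ReadsPerGene.out.tab file that STAR writes starts with four summary rows (N_unmapped, N_multimapping, N_noFeature, N_ambiguous) before the per-gene rows, so you can see how many multi-mappers were left out. A minimal Python sketch for separating the two parts follows; the function name `parse_reads_per_gene` and the numbers in the example are invented for illustration.

```python
import csv
import io


def parse_reads_per_gene(handle, column=1):
    """Split a STAR ReadsPerGene.out.tab table into summary and gene rows.

    column=1 reads the unstranded counts; columns 2 and 3 hold the counts
    for the two stranded protocols.
    """
    summary, genes = {}, {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue
        target = summary if row[0].startswith("N_") else genes
        target[row[0]] = int(row[column])
    return summary, genes


# Hypothetical file content, inlined so the sketch runs without STAR output.
example = io.StringIO(
    "N_unmapped\t2500\t2500\t2500\n"
    "N_multimapping\t1200\t1200\t1200\n"
    "N_noFeature\t300\t5000\t400\n"
    "N_ambiguous\t50\t10\t20\n"
    "geneA\t100\t2\t98\n"
    "geneB\t40\t1\t39\n"
)

summary, genes = parse_reads_per_gene(example)
print(summary["N_multimapping"])  # multi-mappers, reported but not assigned to genes
print(genes)
```

For a real run you would pass an open file handle instead of the inline string.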

Thank you for your answer, I will try them. What do you think about the software mmquant?

It just so happens we had the very same question earlier today. Here is the answer:

C: using two read counting methods with RNA-seq
