Is StringTie a suitable tool for both gene and transcript quantification?
3.9 years ago
tianshenbio ▴ 170

I am thinking about generating read count matrices at both the gene level and the transcript (isoform) level.

According to previous posts:

Why run FeatureCounts after Stringtie? (Galaxy recommends!)

How to get read counts on transcript level using featurecounts?

It seems that I can use featureCounts for gene quantification and StringTie for transcript/isoform quantification. Is that right?

Since transcripts overlap heavily, featureCounts cannot properly assign reads that map to an exon shared by several transcripts, so it is not suitable for counting transcripts/isoforms. How does StringTie overcome this? Are shared reads properly assigned by StringTie?

Many people have suggested Salmon, an alignment-free tool, for transcript quantification. Since I am interested in finding both DE genes and DE transcripts/isoforms in my DE analysis, I assume StringTie would be the handier option, because I can get both gene and transcript counts in one run.

My question, therefore, is whether gene/transcript quantification with StringTie is reliable. How does it distribute reads shared by multiple isoforms, which is the main problem in isoform quantification?

I have read the original papers and related posts here on Biostars, but I am still not sure. I would appreciate it if someone could clarify this for me.

stringtie RNA-Seq sequencing featurecounts gene • 3.6k views
3.9 years ago

Every benchmark I have seen (as well as my own experience) shows that StringTie is less accurate than Salmon/kallisto/RSEM. I don't actually know what model is at the heart of StringTie's quantification. Salmon/kallisto/RSEM all use some variation of EM (expectation-maximization) to distribute reads between transcripts.
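To make the EM idea concrete, here is a minimal toy sketch in Python. It is not the actual implementation of Salmon, kallisto, or RSEM, which also model fragment lengths, biases, and much more. The function name `em_abundances`, the transcript names, and the read numbers are all made up for illustration. Each read is reduced to the set of transcripts it is compatible with, and ambiguous reads are split in proportion to the current abundance estimates.

```python
def em_abundances(compat, lengths, n_iter=200):
    """Toy EM for assigning ambiguous reads to transcripts (illustrative only).

    compat  : list of tuples, the transcript ids each read is compatible with
    lengths : dict mapping transcript id -> effective length
    Returns a dict of expected read counts per transcript.
    """
    ids = sorted(lengths)
    n_reads = len(compat)
    # theta[t] = estimated fraction of reads that originate from transcript t
    theta = {t: 1.0 / len(ids) for t in ids}
    counts = {t: 0.0 for t in ids}
    for _ in range(n_iter):
        counts = {t: 0.0 for t in ids}
        # E-step: split each read among its compatible transcripts,
        # weighting by current abundance divided by transcript length
        for read in compat:
            weights = {t: theta[t] / lengths[t] for t in read}
            total = sum(weights.values())
            for t, w in weights.items():
                counts[t] += w / total
        # M-step: re-estimate read fractions from the expected counts
        theta = {t: counts[t] / n_reads for t in ids}
    return counts


# Hypothetical example: two isoforms of equal length that share an exon.
# 60 reads fall in the shared exon, 30 are unique to A, 10 are unique to B.
reads = [("A", "B")] * 60 + [("A",)] * 30 + [("B",)] * 10
counts = em_abundances(reads, {"A": 1000, "B": 1000})
```

In this toy case the 60 shared reads end up split roughly 3:1 in favour of A, because the uniquely mapping reads indicate that A is about three times as abundant as B. A plain overlap counter has no way to make that split.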

As @ATpoint points out, it's fairly easy to calculate gene expression from transcript expression.


Thank you for your response. Will give it a try!


Thanks for this comment. Could you point to some example benchmarks you are referring to?

3.9 years ago
ATpoint 82k

You can easily aggregate the Salmon transcript-level abundance estimates to the gene level with the tximport package from Bioconductor. I would definitely go with Salmon. From what I understand, StringTie is mainly used to assemble reads into a transcriptome, and I would probably only use it for that.
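The core of that aggregation is just summing transcript-level estimates per gene. Below is a minimal Python sketch of the idea only. It is not tximport itself, which is an R package and can additionally account for transcript-length differences between samples. The function name `summarize_to_gene`, the ids, and the numbers are hypothetical.

```python
from collections import defaultdict


def summarize_to_gene(tx_counts, tx2gene):
    """Sum transcript-level estimated counts into gene-level counts.

    tx_counts : dict mapping transcript id -> estimated count
    tx2gene   : dict mapping transcript id -> gene id
    """
    gene_counts = defaultdict(float)
    for tx, count in tx_counts.items():
        # Raises KeyError if a transcript is missing from the mapping,
        # so an incomplete annotation is not silently ignored
        gene_counts[tx2gene[tx]] += count
    return dict(gene_counts)


# Hypothetical transcript-level estimates for a single sample
tx_counts = {"tx1": 75.0, "tx2": 25.0, "tx3": 12.5}
tx2gene = {"tx1": "geneA", "tx2": "geneA", "tx3": "geneB"}
gene_counts = summarize_to_gene(tx_counts, tx2gene)
```

For a real analysis, tximport is the safer route, since it handles the quantifier output files and the transcript-to-gene table for you.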


Thank you for your reply. Yes, since I will not perform transcriptome assembly, Salmon might be the better choice.
