Question

Read counts from STAR quantMode and StringTie

3 months ago

giacuong171 ▴ 10

Hi

I want to compare the number of reads per gene between the output of STAR (with --quantMode) and StringTie (combined with the prepDE.py script). Because transcripts and exons can overlap, I restricted the comparison to genes with a single transcript and a single exon. The total read count from STAR is four times higher than from StringTie, the correlation is 0.95, and there are a few genes with only a few reads in STAR but many reads in StringTie. Does anyone have an idea how to explain that? I expected the read counts to be the same, or at least the correlation to be 1.

Thanks in advance.
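For reference, the comparison described above can be sketched roughly as follows. This is only a minimal illustration: the function names (pearson, compare_counts), the fold threshold, and the counts are made up for the example and are not my real data. It assumes both tools' counts have already been loaded into dictionaries keyed by gene ID and filtered to single-exon, single-transcript genes.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def compare_counts(star, stringtie, fold=4.0):
    """Summarise agreement between two gene -> count tables (hypothetical helper)."""
    genes = sorted(set(star) & set(stringtie))  # only genes present in both
    x = [star[g] for g in genes]
    y = [stringtie[g] for g in genes]
    return {
        "n_genes": len(genes),
        # ratio of total counts, STAR over StringTie
        "total_ratio": sum(x) / sum(y),
        "pearson_raw": pearson(x, y),
        # log2(count + 1) keeps a few highly expressed genes from dominating r
        "pearson_log2": pearson([math.log2(v + 1) for v in x],
                                [math.log2(v + 1) for v in y]),
        # genes with far more reads in StringTie than in STAR
        "outliers": [g for g, a, b in zip(genes, x, y)
                     if (b + 1) / (a + 1) > fold],
    }

# Toy, made-up counts purely for illustration
star = {"geneA": 400, "geneB": 80, "geneC": 5, "geneD": 1200}
stringtie = {"geneA": 100, "geneB": 20, "geneC": 60, "geneD": 300}

result = compare_counts(star, stringtie)
print(result["n_genes"], round(result["total_ratio"], 2), result["outliers"])
```

Looking at the correlation on both the raw and the log scale, and listing the outlier genes explicitly, makes it easier to see whether the disagreement is a constant scaling factor or is driven by a handful of genes.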

STAR stringtie • 393 views

updated 3 months ago by ATpoint 82k • written 3 months ago by giacuong171 ▴ 10

These comparisons take time and effort. Unless you really need to know, you could just use STAR followed by featureCounts, like the rest of the world, and call it a day, focusing on the actual analysis.

3 months ago by ATpoint 82k

Thanks for your reply. After a long time digging into the problem, I still can't explain it clearly. One possible explanation is that StringTie estimates the coverage of each transcript by solving a maximum-flow problem that determines the maximum number of fragments that can be associated with that transcript. That affects the number of reads and the coverage assigned to each region. STAR, on the other hand, only counts the reads that fall within the region. https://www.nature.com/articles/nbt.3122#Sec2:~:text=Second%2C%20StringTie%20estimates,in%20the%20ASG.
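To make the difference concrete: as far as I understand the StringTie manual, prepDE.py does not count alignments directly. It converts each transcript's estimated coverage back into a read count as coverage * transcript_len / read_len. Below is a minimal sketch of that conversion. The function name coverage_to_count is hypothetical, the rounding up reflects my reading of the script, and the default read length of 75 is my assumption about the script's default (the -l option), so please check it against your version.

```python
import math

def coverage_to_count(coverage, transcript_len, read_len=75):
    """Approximate read count derived from average per-base coverage.

    Mirrors the formula described in the StringTie manual:
    reads_per_transcript = coverage * transcript_len / read_len.
    The default read_len of 75 is an assumption; set it to the real read length.
    """
    return int(math.ceil(coverage * transcript_len / read_len))

# Same transcript and coverage, two different read-length settings
count_75 = coverage_to_count(10.0, 1500, read_len=75)
count_150 = coverage_to_count(10.0, 1500, read_len=150)
print(count_75, count_150)
```

Because the count is derived from coverage, the read-length setting rescales every gene by the same factor, and the coverage itself is an estimate rather than a tally of aligned reads. Either could contribute to totals that differ from STAR's counts, although I have not verified that this accounts for the fourfold difference.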