featureCounts reports more fragments than input read pairs
5.7 years ago
ewre ▴ 250

Hi all, I have a question about the output of featureCounts (from the Subread package). The total number of input read pairs is 47168870 (reported by FastQC and STAR):

> Number of input reads | 47168870
> Average input read length | 152
> UNIQUE READS:
> Uniquely mapped reads number | 37604677
> Uniquely mapped reads % | 79.72%
> Average mapped length | 149.39
> Number of splices: Total | 16519867
> Number of splices: Annotated (sjdb) | 0

But the total number of fragments reported by featureCounts is 82845035, almost twice the number of input read pairs. The number of SAM alignment pairs reported by htseq-count is 81037190.

> (featurecount): Total fragments : 82845035                            
>   Successfully assigned fragments : 32386307 (39.1%)

(htseqcount)

> 81000000 SAM alignment record pairs processed.
> Warning: Mate pairing was ambiguous for 965089 records; mate key for first such record:
> 81037190 SAM alignment pairs processed.

The annotation file I used for counting is gencode.v24.annotation.gff3.

My question: what is the definition of a fragment in the featureCounts report, and why are there more fragments than input read pairs? In my understanding, each read pair represents one fragment, so the total number of fragments should equal the total number of read pairs.

Thank you for your time in advance.

ewre

featurecounts RNA-Seq pair-end htseq-count • 2.9k views

Did you use the -p option to count fragments instead of reads?


Yes: featureCounts -p -s 2 -T 5 -a gencode.v24.annotation.gff3 -t exon -g gene_id -o sample.out sample.bam


Ewre, I have the same situation as you. If you found the answer to your question, could you kindly share it?


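If you extract the names of all the reads that have mapped and run them through sort | uniq, how many distinct names do you get? Is there a chance that multi-position alignment (multi-mapping) is happening, so that the same read pair appears in the BAM more than once?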
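
The check suggested here can be sketched in Python. This is only a minimal illustration, not part of featureCounts or htseq-count: the function name `summarize_read_names` and the returned keys are hypothetical, and it assumes the alignments are supplied as plain SAM text lines (for example, the output of `samtools view sample.bam`). It counts mapped alignment records and distinct read names, so the two numbers can be compared with the input read pair count.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize_read_names(sam_lines: Iterable[str]) -> Dict[str, int]:
    """Count mapped alignment records and distinct read names in SAM text.

    Hypothetical helper for illustration only. Both mates of a pair share
    one read name, so a pair aligned to a single location contributes two
    records under one name.
    """
    records_per_name: Counter = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines (header lines start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        # FLAG bit 0x4 marks an unmapped read; ignore those records.
        if flag & 0x4:
            continue
        records_per_name[name] += 1

    return {
        # Every mapped record, including secondary/supplementary ones.
        "alignment_records": sum(records_per_name.values()),
        # Equivalent to the `sort | uniq` count on mapped read names.
        "distinct_read_names": len(records_per_name),
        # Names with more than two records: candidates for multi-mapping
        # (or supplementary alignments), since one location gives at most two.
        "names_with_extra_records": sum(
            1 for n in records_per_name.values() if n > 2
        ),
    }
```

One possible way to use it is to save the block as a script that calls `summarize_read_names(sys.stdin)` and pipe `samtools view sample.bam` into it. If `distinct_read_names` is close to the number of mapped pairs while `alignment_records` is much larger than twice that, the extra fragments are likely repeated alignments of the same pairs rather than additional input reads.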
