Can featureCounts use sortedbycoordinate files?
3.3 years ago
tryseq ▴ 10

Hi, I am a beginner at RNA-seq analysis! I have paired-end, reverse-stranded data. I used STAR to align my data and have SAM files that are sorted by coordinate.

Now I want to get the counts using featureCounts. I have seen in some places that featureCounts may not be able to use SAM files that are sorted by coordinate and needs them sorted by name. Other documentation I have found says this is not an issue.

Can I get some clarity? Can I use the STAR-sorted files as input, or do I need to use samtools to re-sort them?

RNA-Seq

featureCounts can use coordinate-sorted SAM files.
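
For the paired-end, reverse-stranded case described in the question, a call might be assembled roughly as in the sketch below. The file names are hypothetical, and `--countReadPairs` is assumed to be available (it was introduced in Subread 2.0.2; older releases count fragments with `-p` alone and will not recognise it), so check the options against your installed version.

```python
def featurecounts_cmd(alignments, annotation_gtf, out_path, threads=4):
    """Build a featureCounts argument list for paired-end, reverse-stranded data."""
    return [
        "featureCounts",
        "-p",                # input contains paired-end reads
        "--countReadPairs",  # count fragments, not reads (assumes Subread >= 2.0.2)
        "-s", "2",           # strandedness: 2 = reversely stranded
        "-T", str(threads),  # number of threads
        "-a", annotation_gtf,
        "-o", out_path,
        *alignments,         # coordinate-sorted SAM/BAM files can be passed directly
    ]


# Hypothetical file names, for illustration only.
cmd = featurecounts_cmd(["sampleA.sam"], "genes.gtf", "counts.txt")
print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` once the paths point at real files.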


It will re-sort the files on the fly. From the manual:

Automatically sort paired-end reads. Users can provide either location-sorted or name-sorted bams files to featureCounts. Read sorting is implemented on the fly and it only incurs minimal time cost.
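
If you are unsure how a file is sorted, the `SO` tag on the `@HD` header line records the declared order (`coordinate`, `queryname`, `unsorted` or `unknown` in the SAM specification). The sketch below reads that tag from the header of a text SAM file; the function name and example header are made up for illustration, and it only reports what the header claims, not how featureCounts itself decides.

```python
def declared_sort_order(header_lines):
    """Return the SO value from the @HD header line, or None if it is absent."""
    for line in header_lines:
        if not line.startswith("@"):
            break  # header lines come first; stop at the first alignment record
        if line.startswith("@HD"):
            for field in line.rstrip("\n").split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return None


# Illustrative header, not taken from a real file.
example_header = [
    "@HD\tVN:1.4\tSO:coordinate\n",
    "@SQ\tSN:chr1\tLN:248956422\n",
]
print(declared_sort_order(example_header))  # prints "coordinate"
```

For a BAM file the header can be extracted as text first with `samtools view -H`.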


I was using featureCounts on coordinate-sorted BAM files, and read sorting was incurring a large (>12 hrs) time cost. Apparently, featureCounts only uses one thread for read sorting even when more threads are requested. I pre-sorted by read name with samtools, and the featureCounts runtime was VASTLY quicker.
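
A minimal sketch of that pre-sorting step, wrapped in Python so it can sit in a pipeline script. The file names and thread count are placeholders; `samtools sort -n` sorts by read name and `-@` sets the number of additional sorting threads.

```python
import subprocess


def name_sort_cmd(in_bam, out_bam, threads=8):
    """Build a samtools command that sorts a BAM file by read name."""
    return [
        "samtools", "sort",
        "-n",                # sort by read name instead of coordinate
        "-@", str(threads),  # additional threads used for sorting
        "-o", out_bam,
        in_bam,
    ]


def name_sort(in_bam, out_bam, threads=8):
    """Run the sort; raises CalledProcessError if samtools exits non-zero."""
    subprocess.run(name_sort_cmd(in_bam, out_bam, threads), check=True)


# Placeholder paths; only the command is printed here, nothing is executed.
print(" ".join(name_sort_cmd("sampleA.coord.bam", "sampleA.name.bam")))
```

Whether the extra sort pays off depends on file size and hardware, so it is worth timing on one sample before converting a whole batch.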

3.3 years ago
h.mon 35k

As already stated by rpolicastro and GenoMax, featureCounts not only accepts both name- and coordinate-sorted SAM and BAM files, it automatically detects the sorting type and uses the appropriate algorithm on the fly. What sources of documentation said featureCounts could not handle coordinate-sorted files?

Old versions of HTSeq-count couldn't handle coordinate-sorted BAM files and would crash, but I believe this has been fixed for a while now.

Finally, if you are using STAR as the aligner, you can use --quantMode GeneCounts to get count estimates similar to those of featureCounts.
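
With `--quantMode GeneCounts`, STAR writes a `ReadsPerGene.out.tab` file per sample. As described in the STAR manual, it has four tab-separated columns: gene ID, counts for unstranded data, counts for reads matching the first-read strand, and counts for reads matching the second-read strand, with four `N_` summary rows at the top. For reverse-stranded libraries like the one in the question, the fourth column is the relevant one. The sketch below parses such a file; the example rows and gene IDs are invented for illustration.

```python
def parse_reads_per_gene(lines, column=3):
    """Split a STAR ReadsPerGene.out.tab file into summary rows and gene counts.

    column is the 0-based column index: 1 = unstranded, 2 = first-read strand,
    3 = second-read strand (the choice for reverse-stranded libraries).
    """
    summary, counts = {}, {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        target = summary if fields[0].startswith("N_") else counts
        target[fields[0]] = int(fields[column])
    return summary, counts


# Invented example rows, not real data.
example_rows = [
    "N_unmapped\t100\t100\t100\n",
    "N_multimapping\t50\t50\t50\n",
    "N_noFeature\t30\t400\t35\n",
    "N_ambiguous\t5\t1\t4\n",
    "GENE_A\t120\t3\t117\n",
    "GENE_B\t40\t0\t40\n",
]
summary, counts = parse_reads_per_gene(example_rows)
print(counts)
```

With real data, pass an open file handle instead of the example list, for instance `parse_reads_per_gene(open(path))`.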


I saw it mentioned during a tutorial on RNA-seq data analysis, and when I looked into it further, I found older forum threads saying the data needed to be sorted in a particular way. Glad to know that everything is figured out already.

I found out today that STAR can do the counts haha. Would have been nice to know last week when I was doing alignments.

Thank you everyone for your help!
