SAM file size after STAR alignment
13 months ago
xqyjxau • 20

My RNA-Seq data is in FASTQ format (decompressed from fastq.gz). I used STAR 2.5.3a to map the reads against an already indexed reference genome, and the run seemed to go well. However, the size of the generated SAM files looks strange: my input FASTQ files are about 1.3-1.5 GB each, but the SAM files range from 3.8 GB to 4.5 GB. Is that normal? If not, what could be wrong?

RNA-Seq alignment • 1.4k views

Have you checked the STAR logs to see whether any errors were reported and what the alignment percentages look like? If there are no errors, the resulting SAM file should be fine.
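If it helps, here is a minimal Python sketch for pulling the alignment percentages out of STAR's `Log.final.out` summary. It assumes the usual `name | value` layout of that file; the helper names (`parse_star_final_log`, `percent`) and the sample values are made up for illustration, so check the field names against your own log.

```python
def parse_star_final_log(text):
    """Return {metric name: value string} from the text of a STAR Log.final.out."""
    metrics = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers such as "UNIQUE READS:" have no separator
        key, _, value = line.partition("|")
        metrics[key.strip()] = value.strip()
    return metrics


def percent(metrics, key):
    """Convert a value such as '63.50%' to the float 63.5."""
    return float(metrics[key].rstrip("%"))


# Illustrative excerpt only; the numbers are invented, not taken from a real run.
sample = """\
                          Number of input reads |\t1000000
                        Uniquely mapped reads % |\t63.50%
             % of reads mapped to multiple loci |\t29.00%
                 % of reads unmapped: too short |\t6.50%
"""

metrics = parse_star_final_log(sample)
print(percent(metrics, "Uniquely mapped reads %"))
print(percent(metrics, "% of reads mapped to multiple loci"))
```

For a real run you would read the file instead, e.g. `parse_star_final_log(open("Log.final.out").read())`, adjusting the path to whatever output prefix you gave STAR.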


Uniquely mapped reads are 63%-64%, and multi-mapped reads are 27% to 32%. Is this OK?


There are so many variables here that it's impossible to say. If you didn't get an error then presumably you're fine. The file size increase is perfectly normal. But by the sounds of things, you really should look into pairing up with someone who knows what is going on to teach you the ropes :)

9 weeks ago
University Park, USA

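A SAM file will typically be larger than the FASTQ file it was made from because, in general, it contains all the information in the FASTQ (read name, sequence, and base qualities) plus extra fields for each alignment, such as the reference name, position, mapping quality, CIGAR string, and optional tags.

In addition, each FASTQ record may produce more than one alignment, and a multi-mapped read is written once per reported alignment, so the SAM file can easily grow to be much larger than the original data.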
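To make the two effects above concrete, here is a rough Python sketch that compares the bytes taken by one FASTQ record with the bytes taken by one SAM line for the same read. Everything specific in it is an assumption for illustration: the read name, the 100 bp read length, the `chr1` coordinate, and the optional tags (chosen to resemble the NH/HI/AS/nM attributes STAR writes by default). It is a back-of-the-envelope estimate, not a model of any particular data set.

```python
def fastq_record_bytes(name, seq, qual):
    """Bytes in one FASTQ record: @name, sequence, '+', qualities."""
    return len(f"@{name}\n{seq}\n+\n{qual}\n".encode())


def sam_record_bytes(name, seq, qual, chrom="chr1", pos=1234567, mapq=255,
                     tags=("NH:i:1", "HI:i:1", "AS:i:98", "nM:i:0")):
    """Bytes in one SAM alignment line for the same read (hypothetical values)."""
    cigar = f"{len(seq)}M"  # assume a simple full-length match
    fields = [name, "0", chrom, str(pos), str(mapq), cigar,
              "*", "0", "0", seq, qual, *tags]
    return len(("\t".join(fields) + "\n").encode())


def size_ratio(name, seq, qual, alignments_per_read=1.0):
    """Estimated SAM/FASTQ size ratio given the mean alignments written per read."""
    sam = sam_record_bytes(name, seq, qual) * alignments_per_read
    return sam / fastq_record_bytes(name, seq, qual)


# Hypothetical 100 bp read with a short name.
name, seq, qual = "SRR000001.1", "ACGT" * 25, "I" * 100

print(fastq_record_bytes(name, seq, qual))  # bytes per FASTQ record
print(sam_record_bytes(name, seq, qual))    # bytes per SAM line
print(round(size_ratio(name, seq, qual, alignments_per_read=2.0), 2))
```

The per-line overhead alone gives a modest increase; the ratio climbs quickly once the average number of alignments written per read goes above one, and shorter reads make the fixed per-alignment fields a larger share of every line.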
