Question: HISAT2 output direct to bam

Hello,

HISAT2 can only output SAM files, which can be quite large. I found an option to output only the reads that mapped to the reference, but the files can still be hundreds of GB.

Is there any issue with converting directly to BAM, like this:

${hisat2}/hisat2 -p 4 --rg-id=${4} -x $l --dta ${strand} --no-unal -1 $2 -2 $3 -U $4 | samtools view -Sbh > hisat2_output.bam

I am specifically asking about piping stdout to samtools view to convert to bam.

Thanks.

9 months ago by Adrian Pelin ♦ 2.3k • updated 9 months ago by ATpoint 17k

That will work fine, and it is even the recommended workflow, since SAM files are uncompressed and only take up space. You can make it more efficient still:

hisat2 (options)... | \
  tee >(samtools flagstat - > hisat2_output.flagstat) | \
  samtools sort -O BAM | \
  tee hisat2_output.bam | \
  samtools index - hisat2_output.bam.bai

That gives you the sorted BAM, the flagstat summary statistics, and the BAM index, all from the hisat2 stdout. tee duplicates the stream it reads from stdin; combined with process substitution (the >(...) syntax, available in bash and similar shells), it lets you run several commands on what is essentially the same file.
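
To see what tee is doing without running an aligner, here is a minimal sketch with standard tools standing in for the real ones: printf plays hisat2, wc -l plays samtools flagstat, and sort plays samtools sort (all of these stand-ins are assumptions for illustration only). It also swaps the >(...) process substitution for a named pipe, which does the same job in plain POSIX sh and gives the side branch a PID the script can wait on.

```shell
#!/bin/sh
# Stand-in demo with standard tools only (no hisat2 or samtools needed).
# printf = hisat2, "wc -l" = samtools flagstat, sort = samtools sort.
set -eu

workdir=$(mktemp -d)
mkfifo "$workdir/side"

# Side branch: reads its copy of the stream from the named pipe.
wc -l < "$workdir/side" > "$workdir/count.txt" &
side_pid=$!

# Main branch: tee writes one copy into the named pipe and passes
# the other copy down the pipeline.
printf '%s\n' read3 read1 read2 |
  tee "$workdir/side" |
  sort > "$workdir/sorted.txt"

# Make sure the side branch has finished writing before using its output.
wait "$side_pid"

cat "$workdir/count.txt"
cat "$workdir/sorted.txt"
```

One thing to keep in mind with the >(...) form: the side command runs asynchronously and the shell does not wait for it, so its output file may still be incomplete for a moment after the pipeline itself returns.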

9 months ago by ATpoint 17k

OK, it makes sense to pipe to sort, as swbarnes2 and you suggested. What is the impact on RAM usage? Converting to unsorted BAM can be done on the fly. Does sorting need the whole output loaded into memory?

9 months ago by Adrian Pelin ♦ 2.3k • updated 9 months ago by ATpoint 17k

samtools sort has a -m option that sets how much memory it uses per thread before it spills intermediate temporary files to disk. The default is 768 MiB per thread, I think.
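
Since -m applies per thread, the total memory for the sort is roughly the -m value times the thread count. Here is a small sketch of budgeting that, assuming a hypothetical 8 GiB budget and 4 sort threads. It is a dry run that only prints the resulting command; sort_tmp and the output file name are placeholders.

```shell
#!/bin/sh
# Dry run: prints a samtools sort command instead of executing it.
set -eu

threads=4         # hypothetical: value passed to -@
budget_mb=8192    # hypothetical: RAM you are willing to give the sort

# -m is a per-thread limit, so split the budget across the sort threads.
per_thread_mb=$(( budget_mb / threads ))

# -T sets the prefix for the temporary files written once -m is exceeded;
# the trailing "-" makes samtools sort read from stdin.
sort_cmd="samtools sort -@ ${threads} -m ${per_thread_mb}M -T sort_tmp -o hisat2_output.bam -"
echo "$sort_cmd"
```

You would then put the printed command after the pipe from hisat2. Anything beyond the per-thread limit goes to temporary files, so the sort does not need the whole alignment in memory at once.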

9 months ago by ATpoint 17k

That should work. Better still, pipe straight to samtools sort.

9 months ago by swbarnes2 5.7k
