Question

Hisat2 multiple paired end reads

6.3 years ago

sicat.paolo20 ▴ 30

For a single paired-end read the command is:

hisat2 -x /path/to/hg19/indices -1 sample_1.fq.gz -2 sample_2.fq.gz | samtools view -Sbo sample.bam -

Our queue is full, so I'm not able to test this, but is it possible to align multiple paired-end reads in one go, and how would I go about it?

hisat2 RNA-Seq allignment • 6.3k views

6.3 years ago

lakhujanivijay 5.8k

If the idea is to run hisat2 on multiple paired-end (PE) samples, then this may help:

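# Collect the .gz files in the current directory. The glob is sorted,
# so each *_R1* file is immediately followed by its *_R2* mate.
arr=( *.gz )
total_files=${#arr[@]}
# genome_name must already hold the basename of the HISAT2 index

# Alignment: step through the list two files at a time (R1, then R2)
for ((i=0; i<total_files; i+=2))
do
    # Sample name is everything before "_R1" in the R1 file name
    sample_name=${arr[$i]%%_R1*}
    echo "[Hisat mapping running for sample] $sample_name"
    # -p 60 requests 60 threads; adjust to the cores available
    date && time /opt/app/hisat2-2.0.5/hisat2 -p 60 -x "$genome_name" -1 "${arr[$i]}" -2 "${arr[$i+1]}" -S "$sample_name.sam"
    printf "\n\n"
done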
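
The loop above depends on the file listing alternating strictly between R1 and R2. A slightly more defensive sketch pairs files by name instead and skips any sample whose mate is missing. The `_R1`/`_R2` naming, the helper name `mate_of`, and the default index path (taken from the question) are assumptions for illustration; the hisat2 command is only echoed, so nothing runs until the `echo` is removed.

```shell
#!/usr/bin/env bash
# Sketch: pair files by name rather than by position in the listing.
# Assumes mates are named <sample>_R1<rest>.gz and <sample>_R2<rest>.gz.

# mate_of (hypothetical helper): print the R2 file name for a given R1 file name
mate_of() {
    local r1=$1
    printf '%s\n' "${r1/_R1/_R2}"
}

# genome_name is assumed to hold the HISAT2 index basename
genome_name=${genome_name:-/path/to/hg19/indices}

for r1 in *_R1*.gz; do
    [ -e "$r1" ] || continue            # the glob matched nothing
    r2=$(mate_of "$r1")
    if [ ! -e "$r2" ]; then
        echo "Skipping $r1: mate $r2 not found" >&2
        continue
    fi
    sample_name=${r1%%_R1*}
    # echo prints the command instead of running it; drop 'echo' to align
    echo hisat2 -x "$genome_name" -1 "$r1" -2 "$r2" -S "$sample_name.sam"
done
```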

Thanks for the help. I will look into this.

Reply • 6.3 years ago by sicat.paolo20 ▴ 30

score 3 · Accepted Answer · 2018-01-12

6.3 years ago

h.mon 35k

Probably your sample_1.fq.gz and sample_2.fq.gz files already contain multiple reads inside. I guess you want to align multiple files, right?

But do you want the output in a single file, or in multiple files? For the former, you can pass a comma-separated list of files to hisat2 (see -1 and -2 in the hisat2 manual). For the latter, there are several options, such as a bash loop, GNU Parallel, a make script, or job arrays (if your queue is managed with SLURM / Torque / SGE), among others.
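
For the single-output case, a minimal sketch of building the comma-separated lists in bash could look like this. The `_1.fq.gz`/`_2.fq.gz` suffixes and the index path mirror the question, while `join_by_comma` and the output name `all_samples.sam` are hypothetical. The command is only echoed so it can be inspected before running.

```shell
#!/usr/bin/env bash
# join_by_comma (hypothetical helper): join all arguments with commas
join_by_comma() {
    local IFS=,
    printf '%s\n' "$*"
}

shopt -s nullglob                  # an unmatched glob expands to nothing
r1_files=( *_1.fq.gz )

if (( ${#r1_files[@]} > 0 )); then
    # Derive mate-2 names from mate-1 names so both lists keep the same order
    r2_files=( "${r1_files[@]/%_1.fq.gz/_2.fq.gz}" )
    r1_list=$(join_by_comma "${r1_files[@]}")
    r2_list=$(join_by_comma "${r2_files[@]}")
    # Prints the command; remove 'echo' to run it
    echo hisat2 -x /path/to/hg19/indices -1 "$r1_list" -2 "$r2_list" -S all_samples.sam
fi
```

With this approach all alignments land in one file, so it only suits cases where the input files belong to the same sample (for example, several lanes of one library).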