Hisat2 multiple paired end reads
6.3 years ago

For a single paired-end read, the command is:

hisat2 -x /path/to/hg19/indices -1 sample_1.fq.gz -2 sample_2.fq.gz | samtools view -Sbo sample.bam -

Our queue is full, so I'm not able to test this, but is it possible to align multiple paired-end reads in one go, and how would I go about it?

hisat2 RNA-Seq alignment • 6.3k views
6.3 years ago
h.mon 35k

Your sample_1.fq.gz and sample_2.fq.gz files probably already contain many reads each. I guess you want to align multiple files, right?

But do you want the output in a single file or in multiple files? For the former, you can pass a comma-separated list of files to hisat2 (see -1 and -2 in the hisat2 manual). For the latter, there are several options, such as a bash loop, GNU Parallel, a make script, or job arrays (if your queue is managed with SLURM / Torque / SGE), among others.
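For the single-output case, the comma-separated lists can be built from the files in a directory rather than typed by hand. The following is only a sketch: it assumes files are named <sample>_1.fq.gz and <sample>_2.fq.gz as in the question, and the helper names join_by_comma and build_hisat2_cmd are made up for illustration. It prints the command instead of running it, so it can be checked while the queue is busy.

```shell
#!/usr/bin/env bash
# Join all arguments with commas: join_by_comma a b c -> a,b,c
join_by_comma() {
    local IFS=,
    printf '%s\n' "$*"
}

# Print (do not run) one hisat2 command covering every pair in a directory.
# Assumption: mates are named <sample>_1.fq.gz and <sample>_2.fq.gz.
build_hisat2_cmd() {
    local dir=$1 index=$2 out=$3
    local m1=() m2=() r1
    for r1 in "$dir"/*_1.fq.gz; do
        [ -e "$r1" ] || continue            # glob matched nothing
        m1+=("$r1")
        m2+=("${r1%_1.fq.gz}_2.fq.gz")      # mate 2 derived from the mate 1 name
    done
    if [ "${#m1[@]}" -eq 0 ]; then
        echo "no *_1.fq.gz files found in $dir" >&2
        return 1
    fi
    printf 'hisat2 -x %s -1 %s -2 %s | samtools view -b -o %s -\n' \
        "$index" "$(join_by_comma "${m1[@]}")" "$(join_by_comma "${m2[@]}")" "$out"
}
```

Calling build_hisat2_cmd . /path/to/hg19/indices merged.bam prints the full command line. Note that with this approach the reads from all input files end up in the same BAM file.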


Thank you for the help. I'll be trying to do the latter.

6.3 years ago

If the idea is to run HISAT2 on multiple paired-end (PE) data sets, this may help:

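# HISAT2 index prefix; set this to your own index,
# e.g. the /path/to/hg19/indices used in the question.
genome_name=/path/to/hg19/indices

# Alignment: loop over the R1 files and derive each R2 mate from the R1 name,
# so pairing does not depend on the order of a directory listing.
for r1 in *_R1*.gz
do
    r2=${r1/_R1/_R2}              # matching mate 2 file
    sample_name=${r1%%_R1*}       # everything before the first "_R1"
    echo "[Hisat mapping running for sample] $sample_name"
    date && time /opt/app/hisat2-2.0.5/hisat2 -p 60 -x "$genome_name" -1 "$r1" -2 "$r2" -S "$sample_name.sam"
    printf "\n\n"
done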
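If the queue is managed by SLURM, the same per-sample idea can be run as a job array, with each task aligning one pair. This is a rough sketch, not a tested pipeline: it assumes the _R1/_R2 naming used in the script above and that genome_name is set to the HISAT2 index prefix, and the helper name pair_for_task is hypothetical. SLURM_ARRAY_TASK_ID is the variable SLURM sets for each array task.

```shell
#!/usr/bin/env bash
# Print "R1<TAB>R2<TAB>sample" for the Nth (0-based) pair in a directory.
pair_for_task() {
    local dir=$1 task=$2
    local r1_files=("$dir"/*_R1*.gz)
    [ -e "${r1_files[0]}" ] || { echo "no *_R1*.gz files in $dir" >&2; return 1; }
    (( task < ${#r1_files[@]} )) || { echo "task $task out of range" >&2; return 1; }
    local r1=${r1_files[$task]}
    local name=${r1##*/}                      # file name without the directory
    local r2=${r1%/*}/${name/_R1/_R2}         # mate 2 in the same directory
    [ -e "$r2" ] || { echo "missing mate for $r1" >&2; return 1; }
    printf '%s\t%s\t%s\n' "$r1" "$r2" "${name%%_R1*}"
}

# Runs only inside an array job, e.g. one submitted with: sbatch --array=0-9 script.sh
if [ -n "${SLURM_ARRAY_TASK_ID:-}" ]; then
    IFS=$'\t' read -r r1 r2 sample < <(pair_for_task . "$SLURM_ARRAY_TASK_ID") &&
        hisat2 -x "${genome_name:?set genome_name to the HISAT2 index prefix}" \
            -1 "$r1" -2 "$r2" -S "$sample.sam"
fi
```

The array range has to match the number of pairs (0 to N-1 for N pairs), and each task writes its own SAM file, so the tasks do not interfere with each other.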

Thanks for the help. I will look into this.
