Alignment with bowtie2 but empty .fastq files
1
0
Entering edit mode
5.6 years ago
luzglongoria ▴ 50

Hi,

I am trying to align my .fq reads (end-pair reads) to my reference genome (a "close" related species) with this command:

bowtie2 --threads 4 --local --no-unal \
    -x /home/luz_garcia_longoria/workspace/reference_genomes/parasitereference.fasta \
    -q -k 1 --al aligned_reads.fastq \
    -1 `/home/luz_garcia_longoria/workspace/s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq \
    -2 /home/luz_garcia_longoria/workspace/s21_2.fq,s22_2.fq,s23_2.fq,s24_2.fq,s25_2.fq,s31_2.fq,s32_2.fq,s33_2.fq,s34_2.fq,s35_2.fq \
    > aligned_host_parasite.sam |\
    samtools view -b -o aligned_host_parasite.bam`

BUT if I do ll :

-rw-rw-r--. 1 luz_garcia_longoria luz_garcia_longoria          62 Sep 24 17:42 aligned_host_parasite.bam
-rw-rw-r--. 1 luz_garcia_longoria luz_garcia_longoria 30494171957 Sep 25 00:07 aligned_host_parasite.sam
-rw-rw-r--. 1 luz_garcia_longoria luz_garcia_longoria           0 Sep 24 17:42 aligned_reads.fastq

As you can see the .fastq file is completly empty and it's the one that I want to use for for doing my next step.

What am I missing?

RNA-Seq Assembly • 2.9k views
ADD COMMENT
0
Entering edit mode

Your fastq is empty, so it seems you somehow corrupted your data after the alignment was performed, since that file seems pretty big. Can you regenerate the fastq file?

You can also convert the bam back to fastq using samtools fastq myaln.bam > myreads.fastq

ADD REPLY
0
Entering edit mode

Ok. I though also to do this but I wonder if I am going to alter somehow the BAM files and then the fastq file will be the "real" one...

ADD REPLY
0
Entering edit mode

Is your command pasted here properly? The ` begins after the -1 but does not end at the right place.

ADD REPLY
0
Entering edit mode

-1 `/home/luz_garcia_longoria/workspace/s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq

You have to re-write the path for all your fq or launch your script from your workspace

Same for option -2

ADD REPLY
0
Entering edit mode

What if OP used shell expansion?

-1 /home/luz_garcia_longoria/workspace/{s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq}
ADD REPLY
0
Entering edit mode

Ofc it is an elegant solution but brackets are missing from OP command line

ADD REPLY
0
Entering edit mode

the command is

bowtie2 --threads 4 --local --no-unal -x /home/luz_garcia_longoria/workspace/reference_genomes/parasitereference.fasta \
-q -k 1 --al aligned_reads.fastq -1 
/home/luz_garcia_longoria/workspace/s21_1.fq,s22_1.fq,s23_1.fq,s24_1.fq,s25_1.fq,s31_1.fq,s32_1.fq,s33_1.fq,s34_1.fq,s35_1.fq
-2
/home/luz_garcia_longoria/workspace/s21_2.fq,s22_2.fq,s23_2.fq,s24_2.fq,s25_2.fq,s31_2.fq,s32_2.fq,s33_2.fq,s34_2.fq,s35_2.fq 
> aligned_host_parasite.sam \
 | samtools view -b -o aligned_host_parasite.bam
ADD REPLY
0
Entering edit mode

And I am running this command in workspace :)

ADD REPLY
0
Entering edit mode

If you're running your command from /home/luz_garcia_longoria/workspace/ you don't need it in option -1 and -2

ADD REPLY
3
Entering edit mode
5.6 years ago
ATpoint 81k

Ok, there is apparently some confusion here:

1) You are right now saving your alignment to > aligned_host_parasite.sam. The SAM format is the format that alignments are by default stored in. Using > redirects the stream from bowtie2 to the specified file (aligned_host_parasite.sam). If you do that, you (probably) cannot pipe it into samtools view, therefore the BAM is empty.

2) aligned_reads.fastq being empty is not surprising. From the bowtie2 manual:

--al <path>        write unpaired reads that aligned at least once to <path>

That means it will contain only files that are unpaired but aligned in any fashion. It being empty means you have no unpaired reads that aligned, probably because all reads were paired, which is expected from paired-end fastq files.

The correct command would be (removing the quotes around -1 and -2, which I think are not necessary):

$WORK_DIR="/home/luz_garcia_longoria/workspace"
bowtie2 --threads 4 --local --no-unal \
        -x ${WORK_DIR}/reference_genomes/parasitereference.fasta \
        -q -k 1 --al aligned_reads.fastq \
        -1 ${WORK_DIR}/s21_1.fq,${WORK_DIR}/s22_1.fq,\
           ${WORK_DIR}/s23_1.fq,${WORK_DIR}/s24_1.fq,\
           ${WORK_DIR}/s25_1.fq,${WORK_DIR}/s31_1.fq,\
           ${WORK_DIR}/s32_1.fq,${WORK_DIR}/s33_1.fq,\
           ${WORK_DIR}/s34_1.fq,${WORK_DIR}/s35_1.fq \
        -2 ${WORK_DIR}/s21_2.fq,${WORK_DIR}/s22_2.fq,\
           ${WORK_DIR}/s23_2.fq,${WORK_DIR}/s24_2.fq,\
           ${WORK_DIR}/s25_2.fq,${WORK_DIR}/s31_2.fq,\
           ${WORK_DIR}/s32_2.fq,${WORK_DIR}/s33_2.fq,\
           ${WORK_DIR}/s34_2.fq,${WORK_DIR}/s35_2.fq | \
        samtools view -b -o aligned_host_parasite.bam

By the way, SAM/BAM are the files you use for downstream analysis. Fastq is only a format for unaligned data. I hope you do not want to use fastq files for any downstream. What is your next step?

ADD COMMENT
0
Entering edit mode

thank you for your useful comment!

My next step is Trinity and I think I need fastq files, isn't it?

As WouterDeCoster said I can also convert my sam files into fastq files but the thing is that I will have two SAM files (one that map with the parasite and other one that won't map either with the parasite or the host). So in order to run Trinity I want these two files together (to merge them) and then run Trinity.

I hope is clear

ADD REPLY
0
Entering edit mode

Regardless of whether you need fastq for trinity, the bowtie command you posted above is wrong.

You cannot redirect to a file and keep on piping to other commands. Try the following stupid example

echo "foobar" > myfile.txt | echo

"foobar" will be in myfile.txt but the final echo doesn't do anything because after > nothing is send to |.

ADD REPLY
0
Entering edit mode

| echo doesn't do much, | cat is a better test.

Also, there are exceptions.

Exception

ADD REPLY
0
Entering edit mode

Using > redirects the stream from bowtie2 to the specified file (aligned_host_parasite.sam). If you do that, you cannot pipe it into samtools view, therefore the BAM is empty.

While this is true in most cases and makes logical sense, there is a particular zsh feature called MULTIOS that makes the shell behave unexpectedly when it encounters a redirect followed by a pipe. It essentially becomes a tee at that point.

ADD REPLY
0
Entering edit mode

Good catch, I added a "probably" to my sentence ;-)

ADD REPLY

Login before adding your answer.

Traffic: 2627 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6