Question: SRA to FASTQ problem

Hi, I downloaded RNA-Seq data from NCBI and converted it to FASTQ with fastq-dump --gzip --skip-technical --readids --read-filter pass --dumpbase --split-3 --clip SRR1043177.sra. However, the headers on the @ and + lines of each record differ:

> zcat SRR1043177_pass_1.fastq.gz | head
@HWI-ST960:133:C1FJJACXX:6:1101:1708:2209/1
TNAAACTTAAAGGAAAAACATGGAATTTGTTTCTATGTTCTGCTTATTTGCGATTGTTTCTTTCTCTCTTCNNNNNNNNNATTCNNNNTNCNNNNTCNTG
+SRR1043177.1.1 HWI-ST960:133:C1FJJACXX:6:1101:1708:2209 length=100
@#1=DDFFHHHHGIIJIJIJJHIIFIIJJCGIJJJJJIIDDGGHIJGIIJFHFBGHFHHFHGIGIGHGJJC#############################

This caused an error in trim_galore:

trim_galore -o /scratch/waterhouse_team/tmp/galore --cores 2 --paired NbSRR_WtR1.fastq.gz NbSRR_WtR2.fastq.gz
...
cutadapt: error: Error in FASTQ file at line 3: Sequence descriptions don't match ('HWI-ST960:133:C1FJJACXX:6:1101:1708:2209/1' != 'SRR1043177.1.1 HWI-ST960:133:C1FJJACXX:6:1101:1708:2209 length=100').
The second sequence description must be either empty or equal to the first description.

How can I fix the FASTQ file?

Thank you in advance,

Asked 8 months ago by m.t.lorenc; updated 8 months ago by ATpoint

You need to pass the --origfmt option to fastq-dump so that it does not append /1 to forward reads and /2 to reverse reads. Instead, it will use the original read names from the sequencer, so the two read names are fully identical.
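
A minimal sketch of two ways to act on this, offered as an illustration under stated assumptions rather than a tested recipe. The first function re-runs the command from the question with --origfmt added; every other flag is copied unchanged, and whether --readids should also be dropped is not settled in this thread. The second handles files that have already been converted: it blanks the description on each + line, which the cutadapt message quoted in the question says is accepted. The function names convert_sra and blank_plus_lines and the output file name are hypothetical.

```shell
# Option 1: re-run the conversion with --origfmt added.
# Every other flag is copied unchanged from the command in the question.
# convert_sra is a hypothetical wrapper name used only in this sketch.
convert_sra() {
    fastq-dump --gzip --skip-technical --readids --read-filter pass \
        --dumpbase --split-3 --clip --origfmt "$1"
}

# Option 2: repair FASTQ text that has already been written.
# Reads FASTQ on stdin and writes it to stdout with every separator line
# (line 3 of each 4-line record) reduced to a bare "+".
# Assumes strict 4-line records with no wrapped sequence or quality lines.
blank_plus_lines() {
    awk 'NR % 4 == 3 { print "+"; next } { print }'
}

# Usage (the output file name is a placeholder):
#   convert_sra SRR1043177.sra
#   gzip -dc SRR1043177_pass_1.fastq.gz | blank_plus_lines | gzip > SRR1043177_pass_1.fixed.fastq.gz
```

The second function keys on line position instead of the leading character because a quality string can itself begin with @ or +, as the record shown in the question does. Apply it to both files of a pair so that they stay in step.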

Answered 8 months ago by ATpoint