Question

fastq dump error

1

7.4 years ago

Satyajeet Khare ★ 1.6k

Hi Biostars

I am trying to convert an SRA file from the PRJNA282735 dataset to FASTQ, and I am getting the following error:

fastq-dump.2.1.7 fatal: SIGNAL - Segmentation fault

My fastq-dump command is

fastq-dump --split-3 SRR2016445.sra -O SRR2016445

I have not been able to find a similar error elsewhere. The ENA page for some samples in this dataset lists three files per SRX experiment (e.g. SRR2016445.fastq, SRR2016445_1.fastq and SRR2016445_2.fastq).

This is unusual for me: I usually get one or two SRR files per experiment (depending on whether the data are single-end or paired-end), but never three. I wonder whether this is the reason for the error.

Anybody with similar experience?

sra fastq-dump sratoolkit • 10k views

ADD COMMENT • link updated 9 months ago by Ram 43k • written 7.4 years ago by Satyajeet Khare ★ 1.6k

1

Are you using the latest sratoolkit? NCBI has moved to HTTPS-only connections. With v. 2.8, I get two files from: fastq-dump --split-3 SRR2016445

ADD REPLY • link 7.4 years ago by GenoMax 141k

4

7.4 years ago

piet ★ 1.8k

Please note that most SRA files are not self-contained: they depend on a reference sequence, which is a separate download. Thus it is not enough to download the SRA file with wget. 'fastq-dump' will try to download the reference sequence behind the scenes before it extracts any reads. The reference sequence for SRR2016445 is https://www.ncbi.nlm.nih.gov/nuccore/149361431.

ADD COMMENT • link 7.4 years ago by piet ★ 1.8k

0

Thanks a lot! I tried the following command:

prefetch SRR2016445

A reference file was downloaded to the ~/public/refseq/ folder and the SRR file to the ~/public/sra/ folder. I could split the SRR file into three fastq files using the fastq-dump2.8.0 command. I guess the small fastq file without the '_1' or '_2' suffix consists of unpaired reads.

For some reason, I am not able to convert the reference file 'NC_000072' from binary to fasta using fastq-dump.

P.S. fastq-dump does not work very well for downloading. It downloads both the SRR file and the reference file, just like the prefetch command, but the files retain the .cache extension, which I believe indicates an incomplete download.

ADD REPLY • link 7.4 years ago by Satyajeet Khare ★ 1.6k

0

OK, vdb-dump helped there. Here is the command:

vdb-dump.2.8.0 -f fasta1 --output-file NC_000072.5.fa NC_000072.5

ADD REPLY • link 7.4 years ago by Satyajeet Khare ★ 1.6k

0

I noticed some time ago that fastq-dump always places the .cache files in your home directory, even when you download to a possibly much larger partition. This can easily fill up your home directory, and the files are not removed afterwards. I would therefore try running fastq-dump like this: HOME=./ fastq-dump --split-3 SRR2016445
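To check whether leftover .cache files are actually piling up, one could list and count them under a chosen directory. This is only a minimal sketch: the helper names `list_cache_files` and `count_cache_files` are made up for illustration, and which directory to scan (your home, or the toolkit's download location) is something you would need to decide yourself.

```shell
#!/bin/sh
# Sketch: find leftover *.cache files under a directory and report their count.
# list_cache_files and count_cache_files are illustrative helper names,
# not part of the SRA Toolkit.

# Print every regular file ending in .cache below directory $1.
list_cache_files() {
    find "$1" -type f -name '*.cache'
}

# Print how many such files exist below directory $1.
count_cache_files() {
    list_cache_files "$1" | wc -l | tr -d ' '
}
```

For example, `count_cache_files "$HOME"` would print the number of .cache files anywhere below the home directory, and `list_cache_files "$HOME"` would show where they are.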

ADD REPLY • link 7.4 years ago by Michael 54k

0

No luck. I still get .cache files. I will try to figure out what's going wrong.

ADD REPLY • link 7.4 years ago by Satyajeet Khare ★ 1.6k

0

Running vdb-config -i allows one to choose the directories that SRAtoolkit will use. This needs to be done only once and will require X-windows (if run with -i). If you want a pure text version, run vdb-config -i --interactive-mode textual.

ADD REPLY • link 7.4 years ago by GenoMax 141k

0

@genomax2,

The configuration looks fine. The default path is ncbi/public. There is no proxy, and the rest of the settings are at their defaults. So why fastq-dump didn't download the SRR file properly (without the .cache extension) and why the reference file was not converted to fasta remain a mystery to me. For now, I can survive with this three-step process (prefetch/fastq-dump/vdb-dump).

P.S. fastq-dump works fine for other datasets that don't have a refseq file.
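The three-step process (prefetch/fastq-dump/vdb-dump) could be wrapped roughly as follows. This is a sketch only: the commands and flags are the ones quoted in this thread, while `run`, `fetch_and_convert`, and the `DRY_RUN` variable are hypothetical helpers added here so the sequence can be previewed without downloading anything.

```shell
#!/bin/sh
# Sketch of the prefetch -> fastq-dump -> vdb-dump sequence from this thread.
# DRY_RUN, run, and fetch_and_convert are illustrative names, not sra-tools features.

# Print the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = run accession, $2 = output directory, $3 = optional reference accession.
# The body runs in a subshell so its variables do not leak into the caller.
fetch_and_convert() (
    acc=$1
    outdir=$2
    ref=${3:-}

    # Step 1: download the run (and, per the thread, the reference it depends on).
    run prefetch "$acc" || exit 1

    # Step 2: split into FASTQ; --split-3 can also emit a file of unpaired reads.
    run fastq-dump --split-3 -O "$outdir" "$acc" || exit 1

    # Step 3 (optional): export the reference as FASTA, as done in the thread.
    if [ -n "$ref" ]; then
        run vdb-dump -f fasta1 --output-file "$ref.fa" "$ref" || exit 1
    fi
)
```

Running `DRY_RUN=1 fetch_and_convert SRR2016445 SRR2016445 NC_000072.5` prints the three commands instead of executing them, which is a cheap way to check the arguments first.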

ADD REPLY • link 7.4 years ago by Satyajeet Khare ★ 1.6k

0

Brave people will just edit '~/.ncbi/user-settings.mkfg' with their favorite text editor. Needing 'vdb-config' to modify a simple config file is over-engineered.
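Before editing that file by hand, it may help to look at what it currently contains. The sketch below only reads the file. `show_root_settings` is a made-up helper name, and matching on the string "/root" is an assumption about how repository locations are named in the file, so adjust the pattern to whatever your copy actually contains.

```shell
#!/bin/sh
# Sketch: show path-like settings from an SRA Toolkit settings file without editing it.
# show_root_settings is an illustrative helper; matching on "/root" is an assumption
# about how repository locations are named in the file.

show_root_settings() {
    # Default to the settings file mentioned in this thread; $1 overrides it.
    settings=${1:-"$HOME/.ncbi/user-settings.mkfg"}
    if [ ! -r "$settings" ]; then
        echo "cannot read $settings" >&2
        return 1
    fi
    # Print matching lines with their line numbers.
    grep -n '/root' "$settings"
}
```

Calling `show_root_settings` with no argument inspects the default file; passing a path inspects a copy, which is safer when experimenting.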

ADD REPLY • link 7.3 years ago by piet ★ 1.8k

score 2 · Accepted Answer · 2016-12-09

2

7.4 years ago

Michael 54k

I think 2.1.7 is fairly old. With a slightly newer version, 2.4.2, I get: fastq-dump.2.4.2 err: error unexpected while resolving tree within virtual file system module - failed to resolve accession 'SRR2016445' - Obsolete software. See https://github.com/ncbi/sra-tools/wiki ( 406 )

The latest release on GitHub is 2.8. You should download and install it as outlined here: https://github.com/ncbi/sra-tools/wiki
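A quick way to tell whether the installed toolkit is older than 2.8 is to compare version strings before running a conversion. The following is a rough sketch, not an official check: `extract_version`, `version_ge`, `check_fastq_dump`, and `MIN_VERSION` are hypothetical names, and it assumes that `fastq-dump --version` prints a line containing an x.y.z version number.

```shell
#!/bin/sh
# Sketch: decide whether an installed fastq-dump is at least a given version.
# MIN_VERSION and the function names are illustrative, not part of sra-tools.

MIN_VERSION="2.8.0"

# Pull the first x.y.z-looking token out of whatever text arrives on stdin.
extract_version() {
    grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -n 1
}

# Exit 0 when version $1 >= version $2, comparing dotted fields numerically.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Report on the fastq-dump found on PATH, if any.
# Assumption: "fastq-dump --version" output contains an x.y.z version number.
check_fastq_dump() {
    if ! command -v fastq-dump >/dev/null 2>&1; then
        echo "fastq-dump not found on PATH"
        return 1
    fi
    installed=$(fastq-dump --version 2>&1 | extract_version)
    if [ -z "$installed" ]; then
        echo "could not parse a version from fastq-dump --version"
        return 1
    fi
    if version_ge "$installed" "$MIN_VERSION"; then
        echo "fastq-dump $installed is recent enough"
    else
        echo "fastq-dump $installed is older than $MIN_VERSION; consider upgrading"
        return 1
    fi
}
```

Calling `check_fastq_dump` after sourcing the snippet reports the installed version. The comparison is numeric per field, so 2.10.0 is treated as newer than 2.8.0, which a plain string comparison would get wrong.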

ADD COMMENT • link 7.4 years ago by Michael 54k

0

@Mike, that was it! I see three fastq files for SRR2016445 as expected using fastq-dump2.8.

@genomax2, the download was not the issue; I used wget over FTP to download the sra file. The conversion was the problem. Your suggestion was right, though.

ADD REPLY • link 7.4 years ago by Satyajeet Khare ★ 1.6k