Converting a bunch of SRA files using fastq-dump --split-files
5.9 years ago
Mitra • 0

I have recently downloaded a bunch of SRA files, and I would like to convert them to paired-read fastq files. It works for a single file when I do this:

./fastq-dump --split-files /Users/medsmit/ncbi/public/sra/SRR3501908.sra

But I need a way to convert them all together.

I was trying this

for i in  `ls /Users/medsmit/ncbi/public/sra/*.sra' ; do ./fastq-dump -- split-files $f; done

But I am definitely making some silly mistake, as it's not working. Can anyone please help me? Thank you, Suparna

SRA split-files fastq-dump ncbi • 4.0k views

Do you have a space between -- and split-files in the loop? I would also use ls -1 so only one file is fed to fastq-dump for each iteration of the loop.


Thanks Genomax. Yes, I do have a space between -- and split-files in the loop. I also tried with ls -l. After I run the code below:

 medsmit$ for i in  `ls -l /Users/medsmit/ncbi/public/sra/*.sra' ; do ./fastq-dump -- split-files $f; done

I only see

>

It looks as if it has entered some kind of interactive prompt. I am not sure what I am doing wrong. Thanks, Suparna


You can't have a space between -- and split-files; it has to be --split-files. Also, that was a 1 (number one), not an l (lower-case L), in the ls command.

And two additional mistakes noted by @jean below.


Alternatively, you can always check the ENA for your files, which are typically mirrored there directly as fastq, or use parallel-fastq-dump (python3) if the sra files are big (tens of Gb).

5.9 years ago
jean.elbers ★ 1.7k
  1. You have ' instead of `
  2. You need $i instead of $f
  3. You need --split-files not -- split-files

    for i in `ls -1 *.sra` ; do ./fastq-dump --split-files $i; done
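
As a variation on the loop above, here is a minimal sketch that lets the shell glob supply the file names instead of parsing ls output, which also copes with spaces in paths. The function name convert_all and the FASTQ_DUMP variable are hypothetical names made up for this example; the sketch assumes fastq-dump is on your PATH unless you point FASTQ_DUMP somewhere else (for example ./fastq-dump).

```shell
#!/usr/bin/env bash
# Sketch only: convert_all and FASTQ_DUMP are hypothetical names for this example.

convert_all() {
    local sra_dir="$1"
    # Assumption: fastq-dump is on PATH; set FASTQ_DUMP=./fastq-dump to override.
    local dumper="${FASTQ_DUMP:-fastq-dump}"
    local f
    for f in "$sra_dir"/*.sra; do
        # If no file matches, the pattern is left as literal text; skip it.
        [ -e "$f" ] || continue
        # Quoting "$f" keeps a path with spaces as a single argument.
        "$dumper" --split-files "$f"
    done
}

# Example call, commented out so that sourcing this file does nothing by itself:
# convert_all /Users/medsmit/ncbi/public/sra
```

Because the glob expands to one path per file, there is no need for ls at all, so the -1 versus -l question does not come up.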


jean.elbers Thanks for pointing out these errors; some of them were only introduced when I wrote this post. Now it seems to be creating fastq files, but unfortunately it is also producing some strange errors:

./fastq-dump : 2.9.0

2018-05-24T12:10:07 fastq-dump.2.9.0 err: item not found while constructing within virtual database module - the path '1' cannot be opened as database or table
2018-05-24T12:10:07 fastq-dump.2.9.0 err: item not found while constructing within virtual database module - the path 'medsmit' cannot be opened as database or table
2018-05-24T12:10:08 fastq-dump.2.9.0 err: item not found while constructing within virtual database module - the path 'staff' cannot be opened as database or table
2018-05-24T12:10:08 fastq-dump.2.9.0 err: item not found while constructing within virtual database module - the path '25754388' cannot be opened as database or table
2018-05-24T12:10:08 fastq-dump.2.9.0 err: item not found while constructing within virtual database module - the path '18' cannot be opened as database or table
2018-05-24T12:10:08 fastq-dump.2.9.0 err: item not found while constructing within virtual database module - the path 'May' cannot be opened as database or table
2018-05-24T12:10:08 fastq-dump.2.9.0 err: item not found while constructing within virtual database module - the path '15:16' cannot be opened as database or table
Read 134818 spots for /Users/medsmit/ncbi/public/sra/SRR3502002.sra
Written 134818 spots for /Users/medsmit/ncbi/public/sra/SRR3502002.sra

I am trying to understand why they are there. Thanks,


What command are you using exactly? It seems to me that fastq-dump is trying to open extra words from the input as individual files. Not sure. It looks like the reads were properly extracted. You can double check by seeing if the number of spots matches the SRA run browser (https://trace.ncbi.nlm.nih.gov/Traces/sra/sra.cgi?view=run_browser&run=SRR3502002). From what I see, the number of spots is correct.
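
If you also want to check the output files themselves, a rough sketch is to count the records in a fastq file and compare that with the spot count. This assumes the usual four lines per record, and count_fastq_records is a hypothetical helper name made up for this example.

```shell
# Sketch only: count_fastq_records is a hypothetical helper name.
count_fastq_records() {
    # At END, NR holds the total number of lines read.
    # Assumption: every record takes exactly 4 lines.
    awk 'END { print NR / 4 }' "$1"
}

# Example call, commented out; with --split-files the first mate is
# written to a file ending in _1.fastq:
# count_fastq_records SRR3502002_1.fastq
```

A result that is not a whole number would suggest the file is truncated or does not follow the four-line layout.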


Yes, the spot count looks correct, so I do get the fastq results. But I am not sure what all these messages are!


Are you still using a l (lower case L) instead of 1 (number one) in your ls command?


Yes, this is the command I am using now:

    for f in `ls -l /Users/medsmit/ncbi/public/sra/*.sra`; do ./fastq-dump --split-files $f; done


You need to use number 1 instead of lower-case letter l.


Thank you Genomax. Sorry I couldn't reply to you yesterday, as Biostars restricted me after I hit my daily comment limit. This time, with your suggestion, it works :) Can you please tell me what the exact difference is between l and 1? I can see from the man page that -l uses a long listing format and -1 lists one file per line. But what I don't understand is why -l wouldn't work. Sorry for asking all these questions; I am a self-learner. Thank you again.


Not a problem. With the long listing (-l, lower-case L) you get additional information about unix permissions, group ownership, file size etc. in the listing. The loop passes each of those words to fastq-dump as if it were a file, which you don't want. Listing just the file paths, one per line, with -1 (number one) is the way to do this.
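
To see the difference for yourself, here is a small sketch that counts how many words a for loop receives from each form of ls. The temporary directory, the file name SRR0000001.sra and the helper count_words are all made up for this demonstration.

```shell
# Sketch only: demo_dir, SRR0000001.sra and count_words are made up for this demo.
demo_dir=$(mktemp -d)
touch "$demo_dir/SRR0000001.sra"

count_words() {
    # Run the given command and count the words a for loop would iterate over.
    # The command substitution is left unquoted on purpose, so the shell
    # splits its output on whitespace exactly as the original loop did.
    local n=0 word
    for word in $("$@"); do
        n=$((n + 1))
    done
    echo "$n"
}

n_short=$(count_words ls -1 "$demo_dir"/SRR0000001.sra)
n_long=$(count_words ls -l "$demo_dir"/SRR0000001.sra)
echo "ls -1 gives the loop $n_short word(s); ls -l gives it $n_long words"
```

With -1 the loop sees only the path. With -l it also sees the permissions, link count, owner, group, size and date fields, each as a separate word, which is where messages about 'medsmit', 'staff' and 'May' come from.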


Great, that is really helpful. I understand now: that is why with -l I was getting results but also additional error messages, since fastq-dump was being handed all that extra information as if it were file names. Thank you very much, S :)
