Question: Download All The Bacterial Genomes of Pseudomonas aeruginosa From NCBI
0
3 months ago
Optimist • 60
India

Dear members of Biostar!!

Greetings

I would like to download all the genomes available for the species Pseudomonas aeruginosa from NCBI.

Kindly let me know how to download all 4761 genomes for this species from NCBI (link).

Thanks

3
3 months ago
vkkodali ♦ 1.1k
United States

Since all the other answers appear to be command-line based, here's a point-and-click method. Follow the link to the NCBI Genomes page you provided in your post (https://www.ncbi.nlm.nih.gov/genome/?term=pseudomonas%20aeruginosa) and click the 'Assembly' link in the 'Related Information' panel on the right-hand side. This takes you to the NCBI Assembly page (https://www.ncbi.nlm.nih.gov/assembly?LinkName=genome_assembly&from_uid=187), where you will find a blue 'Download Assemblies' button. If you like, use the filters on the left-hand side to narrow down the set of assemblies first. Then, from the Download button, choose the source database (RefSeq or GenBank) and the file type you are interested in.

2
3 months ago
SMK ♦ 1.3k
Ghent, Belgium

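You can try it this way:

# Build the list of genomic FASTA URLs from column 20 (the assembly directory).
# The sed pattern does not depend on the URL scheme (ftp:// or https://).
curl -s "ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/assembly_summary.txt" \
  | awk -v FS="\t" '$8~/Pseudomonas aeruginosa/{print $20}' \
  | sed -E 's|^(.+/)(GCA_[^/]+)$|\1\2/\2_genomic.fna.gz|' \
  > asm_list.txt

# Download every URL listed in the file
wget -i asm_list.txt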

Here, asm_list.txt contains the locations of those genomes, one URL per line:

ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/006/765/GCA_000006765.1_ASM676v1/GCA_000006765.1_ASM676v1_genomic.fna.gz
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/014/625/GCA_000014625.1_ASM1462v1/GCA_000014625.1_ASM1462v1_genomic.fna.gz
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/017/205/GCA_000017205.1_ASM1720v1/GCA_000017205.1_ASM1720v1_genomic.fna.gz
......
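
The URL construction can be checked offline before starting a large download. The following is only a sketch: the helper name fasta_url is made up for illustration, and it assumes the FASTA file is always named after the last component of the assembly directory, as in the list above.

```shell
# Hypothetical helper: turn an assembly directory URL into the URL of its
# genomic FASTA file, assuming the file is named <directory name>_genomic.fna.gz.
fasta_url() {
  dir="${1%/}"        # drop a trailing slash, if present
  name="${dir##*/}"   # last path component, e.g. GCA_000006765.1_ASM676v1
  printf '%s/%s_genomic.fna.gz\n' "$dir" "$name"
}

# Check against the first entry of the list shown above
result=$(fasta_url "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/006/765/GCA_000006765.1_ASM676v1")
echo "$result"
```

This uses shell parameter expansion instead of sed, so it can be reused in a loop over the directory paths printed by the awk step.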

To inspect the full metadata for these assemblies, keep the output of:

curl -s "ftp://ftp.ncbi.nlm.nih.gov/genomes/genbank/bacteria/assembly_summary.txt" | awk -v FS="\t" '$8~/Pseudomonas aeruginosa/{print}'
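
If you only want finished assemblies, the same table can be filtered further. This is a minimal sketch, assuming that column 12 of assembly_summary.txt is the assembly_level field (check the header line of the file to confirm). The helper names complete_only and row, and the two demo rows, are made up for illustration and are not real NCBI records.

```shell
# Hypothetical helper: keep P. aeruginosa rows whose assembly_level
# (assumed to be column 12) is "Complete Genome"; print accession and level.
complete_only() {
  awk -F '\t' '$8 ~ /Pseudomonas aeruginosa/ && $12 == "Complete Genome" {print $1 "\t" $12}'
}

# Synthetic 12-column rows for an offline check (placeholders, not NCBI data):
# field 1 = accession, field 8 = organism name, field 12 = assembly level.
row() { printf '%s\t-\t-\t-\t-\t-\t-\t%s\t-\t-\t-\t%s\n' "$1" "$2" "$3"; }

demo=$( { row "GCA_DEMO_1" "Pseudomonas aeruginosa PAO1" "Complete Genome"
          row "GCA_DEMO_2" "Pseudomonas aeruginosa" "Contig"; } | complete_only )
printf '%s\n' "$demo"
```

With real data, pipe the curl output above into complete_only instead of the demo rows.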
1
3 months ago
jrj.healey 12k
United Kingdom

As genomax's comment alluded to, you can use ncbi-genome-download by Kai Blin.

There are a few examples here you can also try: A: Easiest way to download all Enterobacteria
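
For this question, an invocation might look like the sketch below. It assumes ncbi-genome-download is installed and on your PATH, and that you want GenBank FASTA files. The --dry-run flag only lists what would be fetched; remove it to actually download.

```shell
# Arguments for ncbi-genome-download: GenBank section, FASTA format,
# organism name matching "Pseudomonas aeruginosa", bacteria group.
set -- --section genbank --formats fasta \
       --genera "Pseudomonas aeruginosa" --dry-run bacteria

if command -v ncbi-genome-download >/dev/null 2>&1; then
  ncbi-genome-download "$@"
else
  # Tool not installed: just show the arguments that would be used
  echo "ncbi-genome-download not found; arguments would be: $*"
fi
```

Use --section refseq instead if you prefer the RefSeq copies of the assemblies.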


