How to download all genome sequences (draft and complete) of a particular bacterial genus available in NCBI?
5.3 years ago
Kumar ▴ 120

I would like to conduct a comparative genome analysis of xanthomonads. For this, I have been downloading all the available Xanthomonas genomes (draft and complete, as FASTA files) from NCBI for all Xanthomonas strains. So far I have found that more than 2000 genomes are available in NCBI BioProject and the NCBI Sequence Set Browser (https://www.ncbi.nlm.nih.gov/Traces/wgs/?page=1&view=wgs&search=Xanthomonas). Downloading the genomes one by one is too tedious, so please suggest a shortcut or an easier way to do this. So far I have only been able to download the summary of all the available genome sequences in xls or csv format.

Please note that I have already tried the NCBI Assembly and NCBI Genome databases, but I could not find all the draft and complete genome sequences that are listed in the NCBI Sequence Set Browser and NCBI BioProject (see the link below): https://www.ncbi.nlm.nih.gov/Traces/wgs/?page=1&view=wgs&search=Xanthomonas

I have also tried the biomartr R package, in which I could find only the NCBI RefSeq, NCBI nr, NCBI nt, NCBI GenBank, Ensembl and Ensembl Genomes databases. However, I could not find my draft genomes of interest in those databases, although they are listed in NCBI BioProject and the NCBI Sequence Set Browser mentioned above.

I look forward to your suggestions.

genome sequence assembly next-gen • 2.3k views

Dear genomax, I need to download the complete genomes along with the draft genomes. The cited answer is meant for downloading complete genomes.


The tool linked in @jrj.healey's post can be used to download all available genomes with a command like:

ncbi-genome-download --genus "Streptomyces coelicolor" bacteria

Substitute the name shown above with the one you are interested in. Use multiple names separated by commas if you need more than one species.
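
For the original question (draft plus complete Xanthomonas genomes), a minimal sketch could look like the one below. It assumes a recent ncbi-genome-download release, where the options are spelled --section, --genera, --assembly-levels and --formats; older releases used different spellings (such as --genus above), so check `ncbi-genome-download --help` first. The function name `download_genus` and the `RUN_DOWNLOAD` switch are made up for this illustration and are not part of the tool.

```shell
# Sketch: request every assembly level (complete, chromosome, scaffold,
# contig) for one genus from the GenBank section, as FASTA files.
# Option spellings are an assumption about the installed version.
download_genus() {
    genus="$1"   # e.g. Xanthomonas
    # Build the argument list in the function's own positional parameters.
    set -- --section genbank --genera "$genus" \
        --assembly-levels all --formats fasta bacteria
    # Always show the command that would be run.
    echo "ncbi-genome-download $*"
    # Only contact NCBI when explicitly asked to (hypothetical switch).
    if [ "${RUN_DOWNLOAD:-0}" = "1" ]; then
        ncbi-genome-download "$@"
    fi
}

download_genus "Xanthomonas"
```

Run as shown, it only prints the command; setting RUN_DOWNLOAD=1 in the environment starts the actual download. The GenBank section is chosen here because draft assemblies are not necessarily all present in RefSeq, which the tool uses by default.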

5.2 years ago
vmicrobio ▴ 290

I would try something like this:

/../edirect/esearch -db nucleotide \
        -query "Xanthomonas[organism] AND genome[title]" \
    | /../edirect/efetch -format fasta > allXanthomonas.fasta

then

makeblastdb -in allXanthomonas.fasta -parse_seqids -dbtype nucl -title xanthomonas -out xanthomonas
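
Before building the BLAST database, it may be worth confirming that the download actually produced sequences. The sketch below counts FASTA header lines; the helper name `count_fasta_records` is hypothetical, and the file name is the one used in the commands above.

```shell
# Count records in a FASTA file: every record starts with a '>' header line.
count_fasta_records() {
    grep -c '^>' "$1"
}

fasta="allXanthomonas.fasta"
if [ -s "$fasta" ]; then
    echo "$fasta contains $(count_fasta_records "$fasta") sequences"
else
    echo "$fasta is missing or empty; re-run the efetch step"
fi
```

Note that this counts sequence records (contigs, scaffolds, plasmids, chromosomes), not genomes, so for draft assemblies the number can be much larger than the number of strains.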

good luck


Dear vmicrobio, thank you for your answer. However, I could not understand the command given below: /../edirect/esearch -db nucleotide \ -query "Xanthomonas[organism] AND genome[title]" \ | /../edirect/efetch -format fasta > allXanthomonas.fasta Is it a script or a command? If so, where do I have to execute it?

The second command looks like a BLAST command (makeblastdb -in allXanthomonas.fasta -parse_seqids -dbtype nucl -title xanthomonas -out xanthomonas). I need to download all the draft and complete genome sequences from this URL: https://www.ncbi.nlm.nih.gov/Traces/wgs/?page=1&view=wgs&search=Xanthomonas


Could you please elaborate on your answer and give me more details about the FTP site or the programming environment in which to execute those commands?


@vmicrobio's answer makes use of Entrez Direct (EDirect), NCBI's command-line tools. You will need to download that software and run the commands in a Unix shell (terminal).
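
As a rough sketch of how this would be run in a terminal: the snippet below first checks that the esearch and efetch programs are on the PATH, then runs the same pipeline as in the answer. It assumes Entrez Direct has been installed and added to the PATH, so the /../edirect/ prefix is not needed. The helper `missing_tools` and the `RUN_FETCH` switch are hypothetical names used only for this illustration.

```shell
# Print the name of every given command that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing=$(missing_tools esearch efetch)
if [ -n "$missing" ]; then
    # $missing is left unquoted so the names print on one line.
    echo "Install Entrez Direct first; not found on PATH:" $missing
elif [ "${RUN_FETCH:-0}" = "1" ]; then
    # Same query as in the answer above; writes one multi-FASTA file.
    esearch -db nucleotide -query "Xanthomonas[organism] AND genome[title]" \
        | efetch -format fasta > allXanthomonas.fasta
else
    echo "Entrez Direct found; set RUN_FETCH=1 to start the download."
fi
```

Save it as a script or paste it into a bash terminal on Linux or macOS. The download can be large and slow for a query matching thousands of records.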
