Question: How to submit bacterial genomes to the database?

Dear all,

which database is commonly used to submit bacterial genomes? I have a genome in the form of one FASTA file consisting of ~150 sequences (Illumina MiSeq). Some of these sequences are less than 200 nucleotides long; these are mostly homopolymeric DNA stretches. When trying to submit to the NCBI database, I cannot complete the process because sequences <200 nucleotides are not allowed. If I just delete the short sequences, don't I distort the data? As you can easily tell, this is my first time submitting a genome. The NCBI manual does not state how to deal with short sequences. Could anyone please tell me how I should continue and why?

Thank you very much!

2.6 years ago olp123 • 0

Note that submitting low-quality data to public databases makes life harder in perpetuity for everyone using those databases. Please put a lot of effort into curating the data yourself prior to submission to ensure that the genomes are pure (uncontaminated), represent the correct species, and are as complete and contiguous as possible. NCBI has some automated checks to prevent low-quality submissions from degrading the databases, as you can see, but they are not foolproof. I suggest you study the matter a bit more before submitting anything.
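As a rough illustration of the "complete and contiguous" check, here is a minimal Python sketch that summarizes an assembly from its contig lengths (contig count, total bases, longest contig, N50). The function names `n50` and `summarize` are made up for this example, and these numbers only describe contiguity; they say nothing about contamination or species identity, which need separate checks.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


def summarize(lengths):
    """Collect a few basic contiguity statistics (illustrative only)."""
    return {
        "contigs": len(lengths),
        "total_bases": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }
```

Calling `summarize` on the lengths of all sequences in the FASTA file before and after any cleanup gives a quick way to see how much the assembly changed.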

2.6 years ago Brian Bushnell 16k

Very short sequences are not informative and may easily come from contamination (another, unknown organism). That is why these sequences are so short: there was little supporting evidence for their validity. Hence it is a tradeoff between quality and quantity.

It is perfectly fine not to include or make use of these short sequences; it is usually for the better.
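If you decide to drop the short sequences, a minimal Python sketch along these lines could do it. It uses only the standard library; the helper names (`read_fasta`, `filter_by_length`, `write_fasta`) and the file names in the usage comment are hypothetical, and the 200-nucleotide cutoff is simply the limit mentioned in the question.

```python
import sys


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(records, min_len=200):
    """Split records into (kept, dropped) lists by sequence length."""
    kept, dropped = [], []
    for header, seq in records:
        (kept if len(seq) >= min_len else dropped).append((header, seq))
    return kept, dropped


def write_fasta(records, handle, width=60):
    """Write records as FASTA, wrapping sequence lines at `width` characters."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python filter_contigs.py assembly.fasta > filtered.fasta
    with open(sys.argv[1]) as fh:
        kept, dropped = filter_by_length(read_fasta(fh), min_len=200)
    write_fasta(kept, sys.stdout)
    print(f"kept {len(kept)}, dropped {len(dropped)} sequences", file=sys.stderr)
```

The dropped records are returned rather than discarded silently, so you can look at what was removed (for example, to confirm they really are mostly homopolymer stretches) before submitting.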

2.6 years ago Istvan Albert 80k

Which bacterium do you work on? If you have a close reference genome, I would recommend that you build a consensus sequence against it.

2.6 years ago vmicrobio • 240

It's Kocuria. Do you have an easy-to-use program in mind (for a beginner)?

Thank you all. I definitely will have to study more.

2.6 years ago olp123 • 0

You can use a tool such as PAGIT. It is a toolkit that bundles several programs for producing a draft genome: ABACAS, IMAGE, ICORN and RATT. Once you have generated contig sequences, these let you do, respectively, scaffold building, gap closing, iterative mapping and genome annotation. See publication

2.6 years ago vmicrobio • 240
