Adding fields to a GenBank file (translation)
6 weeks ago
Joe 12k
United Kingdom

I have a GenBank file from someone I'm doing some analysis for. Either it was dodgy to start with, or it got borked somewhere along the line.

The gbk is in the correct format and has all the information, except that the /translation qualifiers are missing, like so:

LOCUS       Sakai_contig000001    4952793 bp    DNA     linear   UNC 05-JAN-2016
DEFINITION  [gcode=11] [organism=Escherichia coli] [strain=Sakai].
FEATURES             Location/Qualifiers
     CDS             concatenate_genome:85..6084
                     /inference="ab initio prediction:Prodigal:2.60,protein
                     motif:CLUSTERS:PRK09751"
                     /locus_tag="PROKKA_00001"
                     /product="putative ATP-dependent helicase Lhr"
     CDS             concatenate_genome:6081..8195
                     /EC_number="3.6.4.12"
                     /gene="pcrA"
                     /inference="ab initio prediction:Prodigal:2.60,similar to
                     AA sequence:UniProtKB:P64319"
                     /locus_tag="PROKKA_00002"
                     /product="ATP-dependent DNA helicase PcrA"
     CDS             complement(concatenate_genome:9148..9393)
                     /inference="ab initio prediction:Prodigal:2.60"
                     /locus_tag="PROKKA_00003"
                     /product="hypothetical protein"

Given that I still have the locus tags and the coordinates for each CDS in the file, as well as most header information such as the inferences, does anyone know of a way to read this into a program or script so that it puts the CDSs in the correct positions? I could then write a new GBK that keeps this information and includes the translations as well. So far I've fiddled with CLC and Artemis, but without any luck.

It's important that whatever method is used doesn't alter the locus tags in any way, or it will screw up some RNA-seq analysis I did before discovering this issue.
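For illustration, the core step being asked for (turning a CDS location into a /translation qualifier without touching anything else) could be sketched in plain Python roughly as follows. This is only a sketch under several assumptions: the nucleotide sequence is already available as a string, coordinates are 1-based and inclusive as in the feature table, every CDS is complete, and the 58-character wrap width is a guess at the usual GenBank layout. The function names are made up for the example.

```python
from itertools import product

# Codon -> amino acid mapping shared by NCBI tables 1 and 11
# (codons ordered T, C, A, G at each position).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Start codons accepted by table 11 (gcode=11 in the DEFINITION line).
TABLE11_STARTS = {"ATG", "GTG", "TTG", "CTG", "ATT", "ATC", "ATA"}

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate_cds(genome, start, end, on_complement=False):
    """Translate the CDS at 1-based inclusive coordinates start..end.

    Assumes a complete CDS: a table 11 start codon is written as M and a
    trailing stop codon is dropped. Unknown codons become X.
    """
    cds = genome[start - 1:end].upper()
    if on_complement:
        cds = reverse_complement(cds)
    protein = [CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3)]
    if protein and cds[:3] in TABLE11_STARTS:
        protein[0] = "M"
    if protein and protein[-1] == "*":
        protein.pop()
    return "".join(protein)


def format_translation(protein, indent=21, width=79):
    """Wrap a /translation qualifier the way GenBank qualifier lines are laid out."""
    text = f'/translation="{protein}"'
    chunk = width - indent
    return "\n".join(" " * indent + text[i:i + chunk] for i in range(0, len(text), chunk))


# Toy example: a GTG-initiated CDS at positions 3..11 of a made-up sequence.
toy_genome = "CC" + "GTGAAATAA" + "GG"
print(format_translation(translate_cds(toy_genome, 3, 11)))
```

Because the sketch only produces new qualifier lines, the existing /locus_tag lines would be left exactly as they are.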

16 months ago

I just used this script to add translations to a genbank file and it seems to work perfectly: github.com/thackl/seq-scripts/blob/master/bin/gb-add-trans

It is straightforward to use:

./gb-add-trans genbank_without_translation.gb >genbank_with_translation.gb
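Since the original question stresses that the locus tags must not change, it may be worth checking the output against the input afterwards. A minimal sketch in Python, assuming both files are small enough to read into memory; the helper names are hypothetical and not part of gb-add-trans:

```python
import re

# Matches qualifiers such as /locus_tag="PROKKA_00001"
LOCUS_TAG = re.compile(r'/locus_tag="([^"]+)"')


def locus_tags(genbank_text):
    """Return all locus tags in file order."""
    return LOCUS_TAG.findall(genbank_text)


def tags_unchanged(before_text, after_text):
    """True if both texts carry the same locus tags in the same order."""
    return locus_tags(before_text) == locus_tags(after_text)


# Toy check on two inline snippets rather than real files.
before = '/locus_tag="PROKKA_00001"\n/product="hypothetical protein"\n'
after = before + '/translation="MK"\n'
print(tags_unchanged(before, after))
```

For real files, the two texts would come from reading genbank_without_translation.gb and genbank_with_translation.gb.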

Nice find!

Funnily enough I already follow him on github and never saw this code!
