Mauve Headers / Add DNA sequence to genbank file
8.3 years ago
Cricket ▴ 10

I have several GenBank files that I would like to align using Mauve and then export the ortholog alignments to a file. That file will be analyzed with an in-house script, which expects each header in the following format [>fileNumber:start-stop:Name]:

>0:1483-2550:Campy1147c_20 +
TTATATCACATTGCTGAAAA........

This is no problem for GenBank files that contain the sequence. However, when I use a FASTA file, I get a header like this:

>7
TTATATCACATTGCTGAAAA........

This is also the format I see when the genome in question does not have a particular ortholog:

>7
--------------------------------------------------------------------------------
----------------------------------------------------------------------...
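For reference, the two header shapes shown above could be told apart with a small parser. This is only a minimal sketch based on the examples in this post: the names `HEADER_RE`, `Header` and `parse_header` are hypothetical, and the pattern assumes the strand sign, when present, follows the name after whitespace.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern covering the two header shapes quoted in this post:
#   >0:1483-2550:Campy1147c_20 +   (GenBank input, full annotation)
#   >7                             (FASTA input, or missing ortholog)
HEADER_RE = re.compile(
    r"^>(?P<file>\d+)"
    r"(?::(?P<start>\d+)-(?P<stop>\d+):(?P<name>\S+))?"
    r"(?:\s+(?P<strand>[+-]))?\s*$"
)


class Header(NamedTuple):
    file_number: int
    start: Optional[int]
    stop: Optional[int]
    name: Optional[str]
    strand: Optional[str]


def parse_header(line: str) -> Header:
    """Parse one header line; fields missing from a bare header are None."""
    m = HEADER_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised header: {line!r}")
    start, stop = m.group("start"), m.group("stop")
    return Header(
        int(m.group("file")),
        int(start) if start else None,
        int(stop) if stop else None,
        m.group("name"),
        m.group("strand"),
    )
```

A downstream script could then branch on `header.name is None` to detect the bare form instead of failing on it.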

My problem is that some of the GenBank files (http://www.ncbi.nlm.nih.gov/nuccore/CP006702), for some reason, do not contain any sequence (neither DNA nor translated amino acid). Including them in Mauve throws an error (after all, there is no sequence to align).

There is a FASTA file that I can grab. However, with my limited Mauve experience and as mentioned above, when I export the orthologs (post-alignment) the headers do not include any information other than the file number (e.g. >7).

As I see it, I can either rewrite my code (mild pain) or figure out one of these two items:

  1. In Mauve, is there a way to force headers into the exported ortholog file when using a FASTA file (using the file name or the GI from the FASTA file)?
  2. Is there a way to get the sequence from the FASTA file into the corresponding GenBank file?
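For item 2, one possible approach is to append the FASTA sequence to the GenBank record as an ORIGIN section. The following is a minimal, untested-against-Mauve sketch: the function names are hypothetical, it assumes one record per file and a GenBank file with no existing ORIGIN section, and it leaves every other line (including any CONTIG line) untouched, so whether Mauve accepts the result would still need to be checked.

```python
def read_fasta_sequence(fasta_text: str) -> str:
    """Return the sequence of the first FASTA record, lowercased."""
    seq_lines = []
    seen_header = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            if seen_header:
                break  # only the first record is used
            seen_header = True
        elif seen_header:
            seq_lines.append(line.strip())
    return "".join(seq_lines).lower()


def format_origin(seq: str) -> str:
    """Format a sequence as a GenBank ORIGIN block.

    Each line holds 60 bases in groups of 10, preceded by the 1-based
    position of its first base, right-aligned in 9 columns.
    """
    lines = ["ORIGIN"]
    for i in range(0, len(seq), 60):
        chunk = seq[i:i + 60]
        groups = [chunk[j:j + 10] for j in range(0, len(chunk), 10)]
        lines.append(f"{i + 1:>9} " + " ".join(groups))
    return "\n".join(lines)


def add_sequence(genbank_text: str, fasta_text: str) -> str:
    """Return the GenBank text with an ORIGIN block added before '//'."""
    if "\nORIGIN" in genbank_text:
        raise ValueError("record already has an ORIGIN section")
    body = genbank_text.rstrip()
    if body.endswith("//"):
        body = body[:-2].rstrip()  # drop the record terminator for now
    seq = read_fasta_sequence(fasta_text)
    return body + "\n" + format_origin(seq) + "\n//\n"
```

Usage would be to read both files as text, call `add_sequence(genbank_text, fasta_text)`, and write the result to a new `.gbk` file rather than overwriting the original.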
genbank alignment fasta • 1.8k views