15 months ago
kspata • 50
Chicago

Hi,

I wish to download the GenBank sequence for KRAS with all the exonic regions highlighted. GenBank has a 'Highlight Sequence Feature' option, but it displays exons one at a time. I want the exons (and only the exons) highlighted in the downloaded GenBank file.

https://www.ncbi.nlm.nih.gov/nuccore/NG_007524.1?&feature=any#feature_NG_007524.1_exon_0

Is there a way to do this with NCBI while downloading the sequence? Or do I have to do it manually, which would take time and be more error-prone?

gene Genbank • 384 views

If you are after coding sequences, then the following would work:

esearch -db nuccore -query NG_007524.1 | efetch -format fasta_cds_na


However, as genomax2 mentioned in the comment, it is not possible to 'highlight sequences'.


Sequence is normally stored as plain text, so any annotation (like the highlighting you refer to) is applied on top of it, after the fact.

You may be able to use the UCSC Table Browser, which offers an option to download genomic sequences with exons in upper case and everything else in lower case. That may fit your need to distinguish the exons from the rest of the sequence.
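
If you already have the exon coordinates, the same upper/lower-case convention can be reproduced locally. This is only a minimal sketch: the function name `mask_exons` is made up for illustration, the coordinates are assumed to be 1-based and inclusive (as in a GenBank feature table), and the sequence and intervals below are toy values, not KRAS.

```python
def mask_exons(sequence, exons):
    """Return the sequence in lower case with exon intervals in upper case.

    exons is an iterable of (start, end) pairs, assumed 1-based and inclusive.
    """
    chars = list(sequence.lower())
    for start, end in exons:
        if not 1 <= start <= end <= len(chars):
            raise ValueError(f"exon {start}..{end} is outside the sequence")
        # Convert 1-based inclusive coordinates to a 0-based half-open range.
        for i in range(start - 1, end):
            chars[i] = chars[i].upper()
    return "".join(chars)


# Toy example: two made-up "exons" on a 10-base sequence.
masked = mask_exons("ACGTACGTAC", [(2, 4), (8, 9)])
print(masked)  # aCGTacgTAc
```

The output is still plain text, so it can be saved as FASTA and the exons remain distinguishable without any rich-text formatting.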


Hi genomax,

Thank you for replying. Your suggestion is the closest to what I need. Also, on Ensembl you can download the sequence with exons highlighted in RTF (Rich Text Format).

https://useast.ensembl.org/Homo_sapiens/Gene/Sequence?g=ENSG00000133703;r=12:25204789-25250936

However, as you and Sej mentioned, there is no other way to highlight a sequence in GenBank format.
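
Even though the GenBank flat file cannot carry highlighting, the exon positions are listed in its feature table, so they can be pulled out and used elsewhere. A rough sketch, assuming simple `start..end` locations only (no `complement(...)`, `join(...)`, or partial `<`/`>` markers); the helper name `exon_coordinates` and the toy record are made up for illustration and the coordinates are not real KRAS values:

```python
import re

# Feature keys start in column 6 of a GenBank flat file, so an exon line
# looks like five spaces, "exon", padding, then the location.
EXON_LINE = re.compile(r"^\s{5}exon\s+(\d+)\.\.(\d+)\s*$")


def exon_coordinates(genbank_text):
    """Return a list of (start, end) pairs for simple exon features."""
    coords = []
    for line in genbank_text.splitlines():
        match = EXON_LINE.match(line)
        if match:
            coords.append((int(match.group(1)), int(match.group(2))))
    return coords


# Toy feature-table excerpt built line by line so the spacing is explicit.
pad = " " * 5
toy_record = "\n".join([
    "FEATURES             Location/Qualifiers",
    pad + "gene" + " " * 12 + "1..30",
    " " * 21 + '/gene="TOY"',
    pad + "exon" + " " * 12 + "3..8",
    " " * 21 + "/number=1",
    pad + "exon" + " " * 12 + "15..20",
    " " * 21 + "/number=2",
])

print(exon_coordinates(toy_record))  # [(3, 8), (15, 20)]
```

For real records with more complex locations, a dedicated GenBank parser would be a safer choice than a regular expression.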

19 months ago
National Centre for Cell Science, Pune

You can use efetch from the ncbi-entrez-direct utilities. But first, you need the accessions, along with the start and end coordinates of the sequences you want to extract. Put these in a file and use the following command in a script.

efetch -db nuccore -id NG_007524.1 -format fasta -chr_start start -chr_stop stop


I know you want to help, but please check the requirements in the original question before you provide answers.

"But I want to highlight these sequences (exons only) in the downloaded GenBank file."

Your solution does not satisfy that requirement. It only retrieves the sequence.