Genomic Locus Diagram
10.2 years ago
Pappu ★ 2.1k

I am trying to compare the locus of a homologous gene across a set of EMBL files. Is there an existing tool that takes a set of protein_ids and GenBank files as input and outputs a diagram of the three genes on either side of each protein_id, like this:

--->  <---  --->  ===>  -->  <---  --->

-->  <---  --->  ===>  --->  <---  --->

where ===> is the gene of interest.

python biopython • 2.5k views
10.2 years ago

I'm not sure what you really want (what is the content of the GenBank files?). Here is a simple script that queries the UCSC MySQL server for the gene NOTCH2, finds the 3 transcripts on the left and on the right, and transforms the XML/MySQL output to SVG. ucsc-sql2svg.xsl is available at: https://github.com/lindenb/xslt-sandbox/blob/master/stylesheets/bio/ucsc/ucsc-sql2svg.xsl

MYSQL="mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -N "

${MYSQL} -e 'select K.chrom,K.txStart,K.txEnd from knownGene as K, kgXref as X where K.name=X.kgId and X.geneSymbol="NOTCH2" limit 1' |\
awk '{
    printf("select name from knownGene where chrom=\"%s\" and txEnd<%s order by txEnd desc limit 3;\n",$1,$2);
    printf("select name from knownGene where chrom=\"%s\" and NOT(txStart>=%s or txEnd<%s);\n",$1,$3,$2);
    printf("select name from knownGene where chrom=\"%s\" and txStart>%s order by txStart asc limit 3;\n",$1,$3);}' |\
${MYSQL} | sort | uniq |\
awk 'BEGIN {printf("select * from knownGene where name in(\"xxx\"");} {printf(",\"%s\"",$0);} END {printf(");\n");}' |\
${MYSQL} -X |\
xsltproc ucsc-sql2svg.xsl - > output.svg

Output (SVG image, not reproduced here).
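For the Python side of the question, the same neighbor-selection idea can be sketched without any database. This is only a rough illustration, not an existing tool: the `neighborhood` and `render` helpers and the gene tuples are made up for the example, and it assumes you have already extracted (protein_id, start, end, strand) for each CDS feature, for instance with Biopython's Bio.SeqIO parser.

```python
from typing import List, Tuple

# Hypothetical record layout: (protein_id, start, end, strand), strand is +1 or -1.
Gene = Tuple[str, int, int, int]


def neighborhood(genes: List[Gene], target_id: str, flank: int = 3) -> List[Gene]:
    """Return the target gene plus up to `flank` genes on each side, in genomic order."""
    ordered = sorted(genes, key=lambda g: g[1])  # sort by start coordinate
    idx = next((i for i, g in enumerate(ordered) if g[0] == target_id), None)
    if idx is None:
        raise KeyError(target_id)
    # The slice is clipped at the ends of the contig.
    return ordered[max(0, idx - flank): idx + flank + 1]


def render(window: List[Gene], target_id: str) -> str:
    """Draw one arrow per gene; the gene of interest uses '=' instead of '-'."""
    arrows = []
    for pid, _start, _end, strand in window:
        if pid == target_id:
            arrows.append("===>" if strand >= 0 else "<===")
        else:
            arrows.append("--->" if strand >= 0 else "<---")
    return "  ".join(arrows)


# Made-up coordinates, for illustration only.
genes = [
    ("P1", 100, 400, 1), ("P2", 500, 900, -1), ("P3", 1000, 1300, 1),
    ("P4", 1400, 2000, 1), ("P5", 2100, 2300, 1), ("P6", 2400, 2800, -1),
    ("P7", 2900, 3300, 1), ("P8", 3400, 3600, -1),
]
diagram = render(neighborhood(genes, "P4"), "P4")
print(diagram)
```

Running this once per input file and printing the lines under each other gives a text version of the diagram in the question. Near the start or end of a contig the window simply contains fewer genes.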

Thank you. The exact term would be synteny.
