Determine if a gene is involved in a specific developmental process
9.4 years ago
Robert Sicko ▴ 630

I have a list of ~2000 genes that lie within copy number variations (CNVs). We suspect these CNVs are related to the observed phenotype. I am trying to determine whether, and how many of, the overlapped genes are involved in a particular developmental process.

Currently I'm trying to hack together a Python script that text-mines iHOP (getLatestSymbolInformation) via wget for every gene in my list. I would then search each XML response for processY and output a 1 for every gene where processY was found in the iHOP response and a 0 where it was not. I could run the same script on a list of genes in CNVs from control subjects and see whether more of the case genes are associated with processY than control genes.
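The flagging step described above could be sketched roughly as follows. This is only an illustration in Python 3, not a tested pipeline: the function names `mentions_term` and `flag_genes` are hypothetical, it assumes each response was saved to a file named after its gene symbol, and it makes no assumption about the layout of the iHOP XML (it just does a case-insensitive substring search over the raw text, which is crude).

```python
import os


def mentions_term(text, term):
    """Return True if `term` occurs anywhere in `text`, ignoring case."""
    return term.lower() in text.lower()


def flag_genes(results_dir, term):
    """Map each saved response file (named by gene symbol) to 1 or 0.

    1 means the term was found in that gene's response, 0 means it was not.
    """
    flags = {}
    for gene_symbol in sorted(os.listdir(results_dir)):
        path = os.path.join(results_dir, gene_symbol)
        # errors="replace" keeps the scan going if a response has odd bytes
        with open(path, encoding="utf-8", errors="replace") as handle:
            text = handle.read()
        flags[gene_symbol] = 1 if mentions_term(text, term) else 0
    return flags
```

Calling `sum(flag_genes("iHOP_results", "processY").values())` would then give the number of flagged genes for one list. A plain substring match will miss synonyms and will also match incidental mentions, so the counts are best treated as a rough signal.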

# Get the iHOP XML response for every gene symbol in the list (Python 2)
import subprocess

fname = raw_input('Enter the gene list filename: ')
try:
    fhand = open(fname)
except IOError:
    # Only a failed open is reported here; other errors are no longer hidden
    print 'File cannot be opened:', fname
    exit()

# -p avoids an error if the directory already exists
subprocess.call("mkdir -p iHOP_results", shell = True)
with fhand:
    for line in fhand:
        gene_symbol = line.strip()
        if gene_symbol == "" : continue  # skip blank lines
        iHOP_url = "http://ws.bioinfo.cnio.es/iHOP/cgi-bin/getLatestSymbolInformation?synonym=%s&ncbiTaxId=9606" % gene_symbol
        shell_cmd = "wget -O iHOP_results/%s %s" % (gene_symbol, iHOP_url)
        #print repr(shell_cmd)
        subprocess.call(shell_cmd,shell = True)

This works if my gene file only has a couple of entries, but fails with

unable to resolve host address `ws.bioinfo.cnio.es' failed: Name or service not known.

with a large file.
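One thing that may be worth checking, offered as a guess rather than a diagnosis: the URL contains an unquoted `&`, and a command run with `shell = True` is parsed by the shell, where `&` means "run in the background". Each wget would then be started in the background with the URL cut off before `&ncbiTaxId=9606`, so a long gene list could launch a very large number of simultaneous lookups. Below is a minimal Python 3 sketch that avoids the shell, fetches one gene at a time, and pauses between requests. The function names, the delay, and the retry count are illustrative assumptions, and it assumes the service is still reachable at the URL used in the script.

```python
import os
import time
import urllib.parse
import urllib.request

BASE_URL = "http://ws.bioinfo.cnio.es/iHOP/cgi-bin/getLatestSymbolInformation"


def build_url(gene_symbol, tax_id=9606):
    """Build the query URL, letting urlencode escape the parameters."""
    query = urllib.parse.urlencode({"synonym": gene_symbol, "ncbiTaxId": tax_id})
    return BASE_URL + "?" + query


def fetch_all(gene_symbols, out_dir="iHOP_results", delay=1.0, retries=3):
    """Download one response per gene symbol; return the symbols that failed."""
    os.makedirs(out_dir, exist_ok=True)
    failed = []
    for gene_symbol in gene_symbols:
        for attempt in range(retries):
            try:
                with urllib.request.urlopen(build_url(gene_symbol), timeout=30) as resp:
                    data = resp.read()
            except OSError:
                # URLError and socket timeouts are both OSError subclasses;
                # wait a little longer after each failed attempt
                time.sleep(delay * (attempt + 1))
                continue
            with open(os.path.join(out_dir, gene_symbol), "wb") as out:
                out.write(data)
            break
        else:
            # every attempt failed for this gene
            failed.append(gene_symbol)
        time.sleep(delay)  # be gentle with the server between genes
    return failed
```

The returned list of failed symbols can be passed back to `fetch_all` later, so a transient DNS or network problem does not force a full rerun.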

Does anyone have ideas on 1) how to fix my Python script, or 2) a better method for testing whether the case gene list is more closely associated with a specific developmental process than a control gene list?
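For the case-versus-control comparison, one common choice for a 2x2 table of counts (flagged or not, case or control) is a one-sided Fisher's exact test. The sketch below is a Python 3 illustration using only the standard library; the function name and argument names are hypothetical, and it assumes genes can be treated as independent draws, which may not hold for genes that sit in the same CNV.

```python
from math import comb


def fisher_one_sided(case_hits, case_total, control_hits, control_total):
    """One-sided Fisher's exact test p-value.

    Returns the probability of seeing at least `case_hits` flagged genes in
    the case list if flagged genes were spread at random across both lists
    (hypergeometric null).
    """
    total = case_total + control_total
    hits = case_hits + control_hits
    upper = min(hits, case_total)
    # Sum exact integer counts first, then divide once to limit rounding error
    favourable = sum(
        comb(hits, k) * comb(total - hits, case_total - k)
        for k in range(case_hits, upper + 1)
    )
    return favourable / comb(total, case_total)
```

A small p-value would suggest the case list is enriched for the term relative to the control list. Text-mining counts tend to favour well-studied genes, so the two lists should ideally be comparable in that respect before the result is interpreted.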

text-mining CNV python • 1.7k views