Cannot POST a query to a web server using httplib and urllib in Python
8.9 years ago
mgab ▴ 60

I am trying to POST a query to a web server: http://www.imtech.res.in/raghava/antibp/submit.html

but I get this error:

Traceback (most recent call last):
  File "crawler.py", line 4, in <module>
    conn = httplib.HTTPConnection("http://www.imtech.res.in/raghava/antibp/submit.html")
  File "/usr/lib/python2.7/httplib.py", line 704, in __init__
    self._set_hostport(host, port)
  File "/usr/lib/python2.7/httplib.py", line 732, in _set_hostport
    raise InvalidURL("nonnumeric port: '%s'" % host[i+1:])
httplib.InvalidURL: nonnumeric port: '//www.imtech.res.in/raghava/antibp/submit.html'

The python script is shown below:

import httplib, urllib
params = urllib.urlencode({'seqname':'GICACRRRFCPNSERFSGYCRVNGARYVRCCSRR','format':'Amino acid sequence in single letter code', 'terminus':'N-terminus', 'method':'svm', 'svm_th':'0', 'type': 'Submit'})
headers = {"Content-type": "application/x-www-form-urlencoded", "Accept": "text/plain"}
conn = httplib.HTTPConnection("http://www.imtech.res.in/raghava/antibp/submit.html")
conn.request("POST", "", params, headers)
response = conn.getresponse()
print response.status, response.reason
data = response.read()
conn.close()

What could be the problem? Thank you.

python • 5.4k views

Try omitting the http:// part of the URL you pass to httplib.HTTPConnection(). The constructor seems to split its argument at the : and treat whatever follows as the port number.
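To make the "nonnumeric port" message less mysterious, here is a rough sketch of that kind of host/port split. It only imitates the behaviour the traceback suggests (the text after the last colon is taken as the port); split_host_port is a made-up helper for illustration, not the actual httplib code.

```python
def split_host_port(host, default_port=80):
    """Hypothetical helper: treat whatever follows the last ':' as the port."""
    i = host.rfind(":")
    if i < 0:
        # No colon at all: the whole string is the host name.
        return host, default_port
    candidate = host[i + 1:]
    if not candidate.isdigit():
        raise ValueError("nonnumeric port: '%s'" % candidate)
    return host[:i], int(candidate)


# A bare host name (no scheme, no path) parses cleanly.
bare = split_host_port("www.imtech.res.in")

# With the scheme included, the text after "http:" is mistaken for a port.
try:
    split_host_port("http://www.imtech.res.in/raghava/antibp/submit.html")
    message = None
except ValueError as err:
    message = str(err)
```

With the full URL, message holds the same complaint as the traceback in the question, because everything after "http:" ends up where a port number was expected.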


I have made that change, so the line is now

conn = httplib.HTTPConnection("www.imtech.res.in/raghava/antibp/submit.html")

but I am getting the error:

socket.gaierror: [Errno -2] Name or service not known

Please assist.

8.9 years ago

This is how I would do it, with the disclaimer that I'm no expert in querying web pages and I don't know anything about the server in question:

import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)
br.open("http://www.imtech.res.in/raghava/antibp/submit.html")
br.select_form(nr = 0)

## See what is available on this web page:
for f in br.forms():
    print f

#<POST http://www.imtech.res.in/cgibin/antibp/antibp1.pl multipart/form-data
#  <TextControl(seqname=)>
#  <TextareaControl(seq=)>
#  <FileControl(file=<No files added>)>
#  <SelectControl(format=[*nformat, sformat])>
#  <RadioControl(terminus=[*1, 2, 3])>
#  <RadioControl(method=[*1, 2, 3])>
#  <TextControl(svm_th=0)>
#  <TextControl(ann_th=0.6)>
#  <TextControl(qm_th=-0.2)>
#  <SubmitControl(<None>=Submit) (readonly)>
#  <IgnoreControl(<None>=<None>)>>

## Input your sequence and parameters:
br['seqname']= 'myseq'
br['seq']= 'GICACRRRFCPNSERFSGYCRVNGARYVRCCSRR'
br['format']= ['nformat']
br['terminus']= ['1']
br['svm_th']= '0'

## Submit and collect results:
res= br.submit()
html= res.read()

Now html is a string of HTML that you could parse with an HTML parser or some other tool. The relevant part of html should look like:

<td><font size="4"><b>Antibacterial Activiy</b></font></td></tr><tr>
<td align="CENTER">GICACRRRFCPNSER</td><td align="CENTER">1</td><td align="CENTER">1.975</td><td align="CENTER">YES</td></tr><tr>
<td align="CENTER">GYCRVNGARYVRCCS</td><td align="CENTER">18</td><td align="CENTER">1.051</td><td align="CENTER">YES</td></tr><tr>
<td align="CENTER">ICACRRRFCPNSERF</td><td align="CENTER">2</td><td align="CENTER">1.001</td><td align="CENTER">YES</td></tr><tr>
...
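As one possible way to pull those rows out of html, here is a minimal sketch using the standard-library HTML parser. The column meanings (peptide, start position, score, YES/NO call) are my guess from the rows above, and parse_predictions and CellCollector are names invented for this example; the real page may contain other tables that need extra filtering.

```python
try:
    from HTMLParser import HTMLParser      # Python 2
except ImportError:
    from html.parser import HTMLParser     # Python 3


class CellCollector(HTMLParser):
    """Collect the text of <td> cells, grouped into rows at each </tr>."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.rows = []
        self._row = []
        self._cell = None   # None while we are outside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr":
            if self._row:
                self.rows.append(self._row)
            self._row = []


def parse_predictions(html):
    """Return (peptide, position, score, call) tuples from 4-cell rows."""
    collector = CellCollector()
    collector.feed(html)
    collector.close()
    results = []
    for row in collector.rows:
        if len(row) != 4:
            continue            # skip title rows and anything unexpected
        try:
            results.append((row[0], int(row[1]), float(row[2]), row[3]))
        except ValueError:
            continue            # skip header rows with non-numeric cells
    return results


# Sample copied from the output shown above (first two data rows only).
SAMPLE = (
    '<td><font size="4"><b>Antibacterial Activiy</b></font></td></tr><tr>\n'
    '<td align="CENTER">GICACRRRFCPNSER</td><td align="CENTER">1</td>'
    '<td align="CENTER">1.975</td><td align="CENTER">YES</td></tr><tr>\n'
    '<td align="CENTER">GYCRVNGARYVRCCS</td><td align="CENTER">18</td>'
    '<td align="CENTER">1.051</td><td align="CENTER">YES</td></tr><tr>\n'
)

rows = parse_predictions(SAMPLE)
```

In the script above you would call parse_predictions(html) on the string returned by res.read().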

Excellent Dariober. It is working perfectly. Unbelievable. Thank you very much.

8.9 years ago

Don't use httplib (or the other standard-library HTTP modules) directly, if you want to stay sane, that is.

Have a look at the requests library instead. I bet you will be able to code your request just by reading the first page of its documentation.
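For what it's worth, a sketch of what that might look like. Everything specific here is an assumption taken from the form listing posted earlier in this thread: the action URL, the field names, the default values, and the fact that the form is declared as multipart/form-data. I have not run it against the server, and build_fields and submit are names made up for this example.

```python
# Form action as reported by the mechanize listing earlier in this thread.
FORM_URL = "http://www.imtech.res.in/cgibin/antibp/antibp1.pl"


def build_fields(name, sequence):
    """Return the form fields as (None, value) tuples.

    Passing such tuples through the files= argument makes requests send
    them as plain multipart/form-data parts without a file name.
    """
    values = {
        "seqname": name,
        "seq": sequence,
        "format": "nformat",
        "terminus": "1",
        # The remaining values are the defaults shown in the form listing.
        "method": "1",
        "svm_th": "0",
        "ann_th": "0.6",
        "qm_th": "-0.2",
    }
    return dict((key, (None, value)) for key, value in values.items())


def submit(name, sequence, timeout=30):
    """POST the form and return the response body as text (needs network)."""
    import requests  # third-party package: pip install requests

    response = requests.post(
        FORM_URL, files=build_fields(name, sequence), timeout=timeout
    )
    response.raise_for_status()
    return response.text


fields = build_fields("myseq", "GICACRRRFCPNSERFSGYCRVNGARYVRCCSRR")
```

Calling submit("myseq", "GICACRRRFCPNSERFSGYCRVNGARYVRCCSRR") would then return the result page as a string. If the server turns out to accept ordinary url-encoded forms, passing a plain dict via data= instead of files= is the simpler route.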

8.9 years ago

This is really more of a python question than a bioinformatics one.

Only the server name should be included in HTTPConnection():

conn = httplib.HTTPConnection("www.imtech.res.in")
conn.request("POST", "/raghava/antibp/submit.html", params, headers)

I've not tested that, but it's at least closer to being correct.
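If you'd rather not split the URL by hand, the standard library can do it. This is a small sketch: the host part is what HTTPConnection() wants and the path part is what request() wants. Passing the path as part of the host name makes the name lookup fail, which would explain the "Name or service not known" error reported above. The import fallback is only there so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urlparse import urlsplit        # Python 2
except ImportError:
    from urllib.parse import urlsplit    # Python 3

url = "http://www.imtech.res.in/raghava/antibp/submit.html"
parts = urlsplit(url)

host = parts.netloc   # pass this to HTTPConnection(host)
path = parts.path     # pass this to conn.request("POST", path, ...)
```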


I have done this, but the response is just the submit.html page itself. I have changed the code to

conn = httplib.HTTPConnection("www.imtech.res.in")
conn.request("POST", "/cgibin/antibp/antibp1.pl", params, headers)

but still, it is not working.

  1. "Still not working" isn't something that anyone can help you with.
  2. Try a python forum.