Using the BLAST API from Python (3)
24 months ago
Freek • 10
Netherlands

Hi all,

I'm trying to use the BLAST API (https://blast.ncbi.nlm.nih.gov/Blast.cgi) from Python using the requests module. My goal is to send a sequence and get genomic (Ensembl GRCh38) coordinates back.

request = 'https://blast.ncbi.nlm.nih.gov/Blast.cgi?QUERY=gagtctcctttggaactctgcaggttctatttgctttttcccagatgagctctttttctggtgtttgtct&DATABASE=nt&PROGRAM=blastn&CMD=Put&FORMAT_TYPE=JSON2'

(This sequence is part of the ACTB gene)

I sent it to the server like this:

import requests

response = requests.get(request)
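As a side note, the same URL can be assembled from a dictionary instead of being typed by hand, which makes it easier to change one parameter at a time. A minimal sketch using only the standard library; the helper name `build_blast_url` is made up for illustration:

```python
from urllib.parse import urlencode

BLAST_URL = "https://blast.ncbi.nlm.nih.gov/Blast.cgi"


def build_blast_url(params):
    """Return BLAST_URL with `params` appended as a URL-encoded query string."""
    return BLAST_URL + "?" + urlencode(params)


# Same parameters as in the hand-written request string.
put_params = {
    "QUERY": "gagtctcctttggaactctgcaggttctatttgctttttcccagatgagctctttttctggtgtttgtct",
    "DATABASE": "nt",
    "PROGRAM": "blastn",
    "CMD": "Put",
    "FORMAT_TYPE": "JSON2",
}

request = build_blast_url(put_params)
print(request)
```

With the requests module, passing the dictionary as `requests.get(BLAST_URL, params=put_params)` should have the same effect, since requests does the URL encoding itself.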

The response looks like:

print(response)
<Response [200]>
print(response.headers)
{'Server': 'Apache', 'Set-Cookie': 'BlastCubbyImported=passive; domain=ncbi.nlm.nih.gov, MyBlastUser=1lgZT_2PBCUePBfITK86610D67; domain=.ncbi.nlm.nih.gov; path=/, ncbi_sid=5AAB86A694B876A1_0000SID; domain=.nih.gov; path=/; expires=Fri, 22 Jun 2018 09:01:30 GMT', 'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload', 'Content-Security-Policy': 'upgrade-insecure-requests', 'X-UA-Compatible': 'IE=Edge', 'Cache-Control': 'private', 'Referrer-Policy': 'origin-when-cross-origin', 'NCBI-SID': '5AAB86A694B876A1_0000SID', 'NCBI-PHID': '5AAB86A694B876A10000000000000001.m_1', 'Keep-Alive': 'timeout=1, max=10', 'X-XSS-Protection': '1; mode=block', 'Content-Type': 'text/html', 'Transfer-Encoding': 'chunked', 'Date': 'Thu, 22 Jun 2017 09:01:30 GMT', 'Connection': 'Keep-Alive'}

print(response.content)
b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n<html xmlns="http://www.w3.org/1999/xhtml">\n<head>\n<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>\n<meta name="jig" content="ncbitoggler ncbiautocomplete"/>\n<meta name="ncbi_app" content="static"/>\n<meta name="ncbi_pdid" content="blastformatreq"/>\n<meta name="ncbi_stat" content="false"/>\n<meta name="ncbi_sessionid" content="5AAB86A694B876A1_0000SID"/>\n<meta name="ncbi_phid" content="5AAB86A694B876A10000000000000001"/>\n<title>NCBI Blast</title>\n<link rel="stylesheet" type="text/css" href="css/header.css" media="screen"/>\n<link rel="stylesheet" type="text/css" href="css/google-fonts.css" media="screen"/>\n<link rel="stylesheet" type="text/css" href="css/footer.css" media="screen"/>\n<link rel="stylesheet" type="text/css" href="css/main.css" media="screen"/>\n<link rel="stylesheet" type="text/css" href="css/common.css" media="screen"/>\n<link rel="stylesheet" type="text/css" href="css/blastReq.css" media="screen"/>\n\n<link rel="stylesheet" type="text/css" href="css/print.css" media="print"/>\n\n\n\n<script type="text/javascript" src="/core/jig/1.14.8/js/jig.min.js             "></script>   \n<script type="text/javascript" src="js/utils.js"></script>\n<script type="text/javascript" src="js/blast.js"></script>\n<script type="text/javascript" src="js/format.js"></script>\n\n</head>\n\n<body id="type-a">\n\n
\n\t\t \t\n

Most of it is cut off because of the character limit of this post.

This is unexpected, isn't it? The response is not JSON and is difficult to parse; it looks like I get a web page back somehow.

Any suggestions?

Best regards,

Freek.

3.4 years ago
Bergen

How fixed are you on using JSON format for the response?
Have you considered using the BLAST API from Biopython? http://biopython.org/DIST/docs/api/Bio.Blast-module.html

Its output is very easy to parse!


Hi Gunnar, thanks for your response.

Hmm, before asking to have such packages installed on our compute cluster, I would prefer this minimal approach. I will look into Biopython, but I would still prefer minimal, self-made, flexible code and an easy-to-parse JSON response for portability, if anybody can get it to work :)

I feel I'm missing a very small thing.


By the way, whatever FORMAT_TYPE I use, I get the same HTML page as a response.
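A possible explanation, going by how the NCBI URL API is documented (worth double-checking against the current docs): `CMD=Put` only submits the search. The HTML page that comes back contains a request ID (RID) and an estimated wait time (RTOE) in a `QBlastInfoBegin`/`QBlastInfoEnd` block, and the formatted results have to be fetched afterwards with `CMD=Get` and that RID. That would explain why FORMAT_TYPE seems to have no effect on the Put response. A rough sketch of pulling the RID out of that page; `parse_qblast_info` and the sample text are made up for illustration, and the exact layout of the real page is an assumption:

```python
import re


def parse_qblast_info(page_text):
    """Extract 'KEY = value' pairs from a QBlastInfoBegin/QBlastInfoEnd block.

    Returns an empty dict if no such block is found.
    """
    match = re.search(r"QBlastInfoBegin(.*?)QBlastInfoEnd", page_text, re.DOTALL)
    if match is None:
        return {}
    info = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines without a 'KEY = value' pair
            info[key.strip()] = value.strip()
    return info


# Made-up sample that imitates the info block in the Put response.
sample = """<!--QBlastInfoBegin
    RID = ABC123XYZ
    RTOE = 25
QBlastInfoEnd
-->"""

info = parse_qblast_info(sample)
print(info)

# Parameters for the follow-up request that fetches the results.
get_params = {"CMD": "Get", "RID": info["RID"], "FORMAT_TYPE": "JSON2"}
print(get_params)
```

In a real script the input would be `response.text` from the Put request, and the Get request would probably need to wait (roughly RTOE seconds) and be retried until the search has finished.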


Any tips on using Biopython then?

If I do this:

from Bio.Blast import NCBIWWW

result_handle = NCBIWWW.qblast("blastn", "nt", 'gagtctcctttggaactctgcaggttctatttgctttttcccagatgagctctttttctggtgtttgtct')

I get some XML output, but when I try to print result_handle again, it is empty! How can I save the result, for example?


Never mind, I found this in the Biopython Cookbook:

"We need to be a bit careful since we can use result_handle.read() to read the BLAST output only once – calling result_handle.read() again returns an empty string."

I really don't understand the reason for this; it made me re-run BLAST many, many times, wondering what went wrong in the part of my script after the .read(). Anyway, thanks for the suggestion.
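The reason is probably that the result handle behaves like an open file: it keeps a read position, and after `.read()` that position sits at the end, so a second `.read()` has nothing left to return. A small sketch of that behaviour with `io.StringIO` standing in for the real handle (an assumption: the object returned by `qblast` may differ between Biopython versions), so no BLAST search is needed to try it; the XML string and the file name are placeholders:

```python
import io
import os
import tempfile

# Stand-in for the handle returned by NCBIWWW.qblast (placeholder content).
result_handle = io.StringIO("<BlastOutput>placeholder</BlastOutput>")

first = result_handle.read()   # reads everything; the position moves to the end
second = result_handle.read()  # nothing left to read, so this is ""

# Keep the text in a variable and save it, instead of re-reading the handle.
out_path = os.path.join(tempfile.gettempdir(), "my_blast.xml")
with open(out_path, "w") as out_file:
    out_file.write(first)

# A seekable handle can also be rewound and read again.
result_handle.seek(0)
third = result_handle.read()

print(repr(second), third == first)
```

Reading once into a variable and writing it to disk means the saved file can be parsed as often as needed without sending the search again.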
