Why Are So Many Bioinformatics Tools Implemented In Python?
4
2
10.8 years ago

I know that Java programs generally run faster than Python ones, so why are so many programs written in Python? Is it because developing in Python is faster, or something else?

Or is it because some libraries available for Python are not available for a language like Java?

I'm asking because I'm trying to write a tool that involves BLAST. I care a lot about performance, and I'm competent in both Java and Python.

So which one should I use to get better performance?

python java • 4.2k views
1

If you need to parse BLAST output with Java, just use xjc with the -dtd option and the DTD for BLAST.

0

I keep asking myself this question ... sigh

8
10.8 years ago
KCC ★ 4.1k

I will give a subjective list of advantages of using Python:

  1. Learning Python is quicker than learning Java. What I mean is that if I have a class of students with limited programming experience and I teach 50% of them Java and 50% of them Python, I expect those using Python to be writing more complex programs after a month.

  2. It's faster to write code in Python than in Java.

  3. It's easier to install and run code written in Python. I find that I am always having to wrestle with my version of Java. On my Mac, for instance, Java's security issues mean that it is always being broken by security patches somehow.

  4. Although Java is faster, some of the slowness of Python can be overcome with tools like Cython.

1

I'd also add that far more bioinformatics-related libraries are available for Python than for Java. Biopython is OK and, AFAIK, much better developed and more extensive than BioJava. On the genomics side of things, at least, you have pybedtools, PyVCF, and a whole host of other libraries meant to do common tasks well. Not to mention Python's built-in libraries, like the csv module, which make working with standard bioinformatics and genomics file types and datasets very straightforward.
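As a rough illustration of the csv point, here is a minimal sketch that reads BLAST tabular output (the 12 default columns of `-outfmt 6`) with only the standard library. The function name `parse_blast_tabular`, the `max_evalue` threshold, and the example rows are made up for illustration; they are not from this thread or from any library.

```python
import csv
import io

# Default column names of BLAST+ tabular output (-outfmt 6).
BLAST_TAB_FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def parse_blast_tabular(handle, max_evalue=1e-5):
    """Yield hits as dicts, keeping only those with evalue <= max_evalue.

    Hypothetical helper for illustration; the threshold is an assumption.
    """
    reader = csv.DictReader(handle, fieldnames=BLAST_TAB_FIELDS, delimiter="\t")
    for row in reader:
        # Convert the numeric columns we care about from text to float.
        row["pident"] = float(row["pident"])
        row["evalue"] = float(row["evalue"])
        row["bitscore"] = float(row["bitscore"])
        if row["evalue"] <= max_evalue:
            yield row


# Made-up example rows, not real BLAST results.
EXAMPLE = (
    "query1\tsubjA\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-100\t365\n"
    "query1\tsubjB\t75.00\t80\t20\t0\t10\t89\t1\t80\t0.5\t30.1\n"
)

hits = list(parse_blast_tabular(io.StringIO(EXAMPLE)))
for hit in hits:
    print(hit["qseqid"], hit["sseqid"], hit["pident"], hit["evalue"])
```

With a real results file, the same function should work on an open file handle in place of the `io.StringIO` object.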

2

One of the things I like a lot about Python is that the Pythonic approach is to have one best way of doing things. I think this often extends to a tendency for the Python community to concentrate on making a few good packages: for any given application, the choice is often relatively clear, and the package is usually well supported and actively developed, e.g. numpy, scipy, matplotlib, pandas, rpy/rpy2.

1

Definitely. Also easy_install, pip, and virtualenv make installation of most packages and the construction of specific application and development environments very straightforward.

0

And now we got another "praise the python" post ;)

8
10.8 years ago
Leszek 4.2k

Most so-called bioinformatics programs are just simple wrappers that run some other (usually computation-heavy) tools and process their results. In such cases, what matters is not which language is faster but which is handier to work with. For that, my choice is Python.
That said, I would go for Java if there is a need for a GUI or multi-platform support.
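To make the "simple wrapper" idea concrete, here is a minimal sketch using the standard `subprocess` module. The external tool is replaced by a stand-in command (the Python interpreter printing a few lines) so the example runs anywhere; `run_tool`, `count_records`, and `STAND_IN_CMD` are hypothetical names, and a real wrapper would substitute the actual tool's command line.

```python
import subprocess
import sys


def run_tool(cmd):
    """Run an external command and return its standard output as text.

    check=True raises subprocess.CalledProcessError on a non-zero exit status.
    """
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


def count_records(output):
    """Count non-empty lines that are not '#' comment lines."""
    return sum(
        1
        for line in output.splitlines()
        if line.strip() and not line.startswith("#")
    )


# Stand-in for a real, computation-heavy tool (an assumption for this sketch).
STAND_IN_CMD = [
    sys.executable,
    "-c",
    "print('# header'); print('hit1'); print('hit2')",
]

output = run_tool(STAND_IN_CMD)
print(count_records(output))
```

The wrapper itself does almost no work: nearly all of the runtime belongs to whatever command is passed to `run_tool`.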

6
10.8 years ago
Asaf 10k

My guess is that BLAST will take most of the running time in your case, and "gluing it together" with Python or Java won't make a huge difference, so go with whatever is more convenient for you. I think you should check how easy it will be to parse the data and BLAST results in each language (it should be pretty easy with Python/Biopython; I don't know about Java).
You should probably also consider future and multi-OS compatibility if this tool is meant to be used in the future (in that case I think Java is better).
That's my opinion at least; I hope it helps.

6
10.8 years ago

Without getting into a ridiculous argument over which language is better: in most practical situations, development in Python (or any interpreted language) is just faster.

It also depends on what kind of tool you are developing. If you are writing a simple file converter, the speed difference will probably be negligible. If you are writing a computationally heavy piece of software, why not just do it in C?

As for the tool you are trying to develop that involves BLAST, I recommend first identifying where the bottleneck will be. Most likely it will be the BLAST process. If running BLAST takes 10 minutes and your Python/Java portion takes 10 or 20 seconds, does it really matter?
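One way to check where the time goes is to time each stage separately. Below is a minimal sketch using `time.perf_counter`. The two stages are stand-ins (a short sleep in place of the external search, a small loop in place of the glue code), and the names `timed`, `heavy_step`, and `glue_step` are hypothetical.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


def heavy_step():
    # Stand-in for the external search; a real run might take minutes.
    time.sleep(0.2)


def glue_step():
    # Stand-in for parsing and post-processing in the wrapper language.
    return sum(len(str(i)) for i in range(10_000))


with timed("heavy_step"):
    heavy_step()
with timed("glue_step"):
    glue_step()

total = sum(timings.values())
for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f}s ({100 * seconds / total:.0f}% of total)")
```

If the glue stage turns out to be a small fraction of the total, the choice between Python and Java for that stage is unlikely to change overall runtime much.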
