Bio-Javascript?
7
16
13.2 years ago
Lee Katz ★ 3.1k

I am wondering if there is a bioinformatics JavaScript framework out there?

I use BioPerl almost all the time, and I have realized that there could be a use for at least a rudimentary JavaScript framework. I am aware of SMS2 but it is not object-oriented and is not always easy to use.

We'll never need an extensive Bio-JavaScript but I imagine it would be useful to be able to do basic things: parse sequence files, parse alignment files, etc.
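To make "parse sequence files" concrete, here is a minimal sketch of a string-based FASTA parser. The function name `parseFasta` and the record shape (`id`, `description`, `sequence`) are made up for illustration; they are not taken from bio-js, BioPerl, or any other library.

```javascript
// Minimal FASTA parser sketch (hypothetical helper, not from any existing library).
// Takes the file contents as one string and returns an array of records.
function parseFasta(text) {
  const records = [];
  let current = null;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '') continue; // skip blank lines
    if (line.startsWith('>')) {
      // Header line: first word is the ID, the rest is the description.
      const header = line.slice(1);
      const spaceAt = header.search(/\s/);
      current = {
        id: spaceAt === -1 ? header : header.slice(0, spaceAt),
        description: spaceAt === -1 ? '' : header.slice(spaceAt + 1).trim(),
        sequence: '',
      };
      records.push(current);
    } else if (current !== null) {
      // Sequence data may be wrapped over several lines.
      current.sequence += line;
    }
  }
  return records;
}

const exampleFasta = '>seq1 test record\nACGT\nTTGA\n>seq2\nGGCC\n';
console.log(parseFasta(exampleFasta));
```

It keeps every sequence in memory as a plain string, so it is only suitable for small inputs such as what a user pastes into a web form.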

Edit: I went ahead and spent a couple of days making a bio-js library that relies on strings. I realize it shouldn't be used for any heavy lifting, but it might be useful for the front ends of cloud computing. Let me know if you like it and would like to develop for it.

http://code.google.com/p/bio-js/

The code is accessible through Mercurial: hg clone https://bio-js.googlecode.com/hg/ bio-js

• 13k views
1

I like the idea of a BioJS. I think it lowers the barrier to create fun interactive bioinformatic web apps. How could that be bad?! Also, a project could always add the server side stuff as they progress.

1

We have developed BioJS: an open source library for reusable components. BioJS is a community-driven open source project whose mission is to provide 1) a minimum set of guidelines (or standard) for biological JS development and 2) a library of reusable components for visualisation. The purpose of BioJS is to make it easy to create, share and reuse components for biological visualisation.

BioJS allows individual scientists and organisations to i) reuse components, avoiding reinvention of the wheel every time some visualisation widget is needed for a lab, company or institution, ii) have a central repository in which to easily discover all available functionality, saving the time and effort of searching the Internet for specific functionality, iii) test the functionality of available components without needing to install them, again saving time and effort for prospective users, iv) combine components into more complex lego-like pieces, creating new functionality tailored to the developer's needs, v) extend the functionality of available components in a standard manner - once the programmer learns how to extend a component, extension of other components is done the same way, vi) maintain components, sharing the tasks of bug fixes, support and documentation with the community and vii) develop new functionality following a predefined architecture common to all components - the programmer thus needs to learn how to create components only once.

0

FYI: I started thinking about this after commenting on this thread: Multiple Sequence Alignment In Javascript

15
13.2 years ago

Perhaps the most common use of Javascript with bioinformatics is in genome browsers. The common suspects (GBrowse, xGDB, et al.) use it alongside server side scripts, but don't really make reusable Javascript resources (as far as I'm aware). JBrowse is mostly (or all?) client-based code, but I don't think any of its code base is easily used elsewhere.

A group at Boston College recently released Scribl, a Javascript/HTML5-based library for drawing genome annotations and other genomic data. It's still quite rudimentary at this stage, but I think the idea is on target. If Javascript is to be used in bioinformatics, I can't think of a more appropriate use than visualization. I'm not sure how useful it would be to do intense file processing with Javascript...

2

Scribl is a really neat site with a lot of functionality and potential. Good find.

0

JBrowse requires a server, as I remember. Writing javascript is quite difficult due to the different DOM structures of web browsers. On graphics, the lack of canvas in IE6/7/8 is also a big concern. There is a library to emulate canvas in IE (Scribl uses it), but it is 100X slower than canvas in Chrome/Firefox/Opera/Safari, making it impractical for graphics-intensive applications. For this reason, JBrowse does not rely on canvas at all, but this complicates the implementation.

0

I haven't seen the JBrowse code but it really is an extensive project and is probably along the lines of what I was looking for too. You sir get my checkmark for answering the biostar question!

4
13.2 years ago
lh3 33k

I once wrote a javascript-based treeviewer, purely out of curiosity. I quite like the syntax of javascript and also thought about applying it to bioinformatics. However, the problem is that javascript is too limited for general-purpose programming. For example, the worst thing is that it does not have a concept like "file". A Javascript library is thus only useful for web development, but even in this case, the extreme inefficiency of the IE6/7/8 javascript engines and the diversity of implementations are big concerns. Server-side scripting is better for guaranteeing consistent performance.

The most comprehensive Javascript application in Bioinformatics is (you must know it):

http://www.ualberta.ca/~stothard/javascript/

which has been cited several times in biostar.

0

Sorry, I just realized that SMS2 and the link I gave above are the same thing. In general, I doubt that good programmers are willing to invest time in something limited to a small user base.

0

Your treeviewer looks nice and I must try it when I have a tree on-hand! You also have a really good point about files.

3
13.2 years ago
Dror ▴ 280

I think that a javascript bioinformatic library will be needed:

The move to "cloud"-based environments like Amazon EC2/App Engine, and to simpler operating systems such as Chrome OS and Android/iOS, will completely change the way people think about programming. So I predict that bioinformatics in the future will be carried out on clusters of thousands of computers in the cloud, bundled together on the internet. Bioinformatics programming will be based mainly on web services and web applications. These new concepts will make javascript-based bioinformatics very useful.
I know that it sounds crazy now, but I predict that in 2-3 years most people will want to perform their bioinfo analysis on their iPhone/iPad/netbook computers, doing data-intensive tasks online without buying expensive hardware.

I dream of developing such an environment and will work on it soon: an App Engine/jQuery-based bioinformatics platform.

1

Sounds like you want to use HTML+CSS+JavaScript for a front-end GUI. Sure, that is what these languages are supposed to work well for, but that is a very small part of the program/environment.

1

In that case, GWT plus Google AppEngine is probably the way you want to go: HTML, CSS, and JS on the front end, designed in GWT, with Java and Tomcat or Python scripts on the back end. I don't see any reason to do the computation in javascript though. The javascript will be there to parse the input and organise the output, but not to do the math.

0

This is a bold prediction ;)

2
13.2 years ago
Mitch Skinner ▴ 660

The JBrowse code is maybe 85% JS. One of my longer-term goals is to take any generally-useful JS code in JBrowse and split it out into a library. I think some of the data structure stuff, the NCList code and the trie code, might be useful elsewhere.

That code might need some changes to be useful outside of JBrowse, but given that I have JBrowse deeply grooved into my brain, those changes are hard for me to see. If anyone has a non-JBrowse use case for any of the code in it, it would be nice to talk about how well the APIs work for you.

As for a "biojavascript", I'm occasionally frustrated with how monolithic bioperl is, so I'm not sure if there should be an actual "biojavascript" as in one software package. It would be nice to have some kind of umbrella project to coordinate through, though.

2
9.2 years ago
keithwhor ▴ 60

I recently wrote NtSeq in JavaScript. It's a lightweight, zero-dependency library for nucleotide sequence manipulation (modification, complementation, translation, content determination) and ultra-fast (though CPU-bound) ungapped alignment, available for both node.js and the browser. It's fully documented (with tests and benchmarks!) and provides a very simple abstraction layer that both veteran and novice bioinformaticians can jump into easily.

The goal is to provide a fast, dependable library that larger tools can be built on top of. Yet to be published, but freely available open source (MIT licenced.) :)

0

This is awesome. I've been wanting to see how fast javascript can be compared to python/perl, as v8 is blazing fast. I found v8 to be a lot faster than python in many applications. The only problems I foresee with using nodejs for bioinformatics are its memory limitation and lack of parallelism. I think nodejs is still bound by a ~1.4 gig max memory limit? Do you know if io.js has the same issue?

0

I have written a few javascript-based command-line tools, such as hapdip and the bwa helper scripts. When you have complex nested loops, V8 can be tens of times faster than perl/CPython. However, there are two standing issues that make V8/node.js not a replacement for perl/python. The first is the ~1.7GB memory limit. This is inherited from V8. Every wrapper around V8 (e.g. node.js, io.js, and my own) has this problem. The second is that almost every wrapper lacks usable APIs for synchronous file I/O. Even the most basic task, such as reading a file line by line, turns out to be challenging. It is surprising that the designers of the dart, node.js and CommonJS APIs know so little about system programming. It is also a pity, given that V8 and dart have such huge potential to replace the slow scripting languages we are using every day.

0

Is the problem you are referring to with file IO due to the callback programming style? I found that to be really annoying to work with also.

Most of the cases where I found javascript to be a lot faster than python are recursive functions (I don't think python is tail call optimized).

0

The callback style is the result of asynchronous/non-blocking I/O. Others told me that non-blocking I/O is convenient for server applications. For our daily text processing, though, it is quite clumsy, and even more problematic when we want to read multiple files in parallel (e.g. unix "paste").

0

If async I/O is an issue, you can run synchronous file I/O operations in node without problem. Every file operation has a sister "sync" method that runs synchronously, allowing you to avoid, erm, "callback heck."
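As a rough sketch of the "sync" route, the snippet below reads a tab-separated file with `fs.readFileSync` and loops over its lines. The helper name `readLinesSync` and the demo file are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read a whole file synchronously and return its lines.
function readLinesSync(filePath) {
  const text = fs.readFileSync(filePath, 'utf8');
  const lines = text.split(/\r?\n/);
  // A trailing newline produces one empty element at the end; drop it.
  if (lines.length > 0 && lines[lines.length - 1] === '') lines.pop();
  return lines;
}

// Demo: write a small temporary file, then process it line by line.
const demoPath = path.join(os.tmpdir(), `readlines-demo-${process.pid}.txt`);
fs.writeFileSync(demoPath, 'chr1\t100\nchr2\t200\n');
for (const line of readLinesSync(demoPath)) {
  const [chrom, pos] = line.split('\t');
  console.log(chrom, Number(pos));
}
fs.unlinkSync(demoPath);
```

Note that this loads the entire file into memory before splitting it, so it is not a streaming line reader and will not suit very large files.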

Furthermore, if you do wish to read multiple files in "parallel" you can either use Promises to wait until each of your async operations has finished (though async ops in a single process won't be truly parallel), or use forked child processes.
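A minimal sketch of the Promise approach, applied to the unix "paste" case mentioned earlier in the thread. It assumes a Node version that ships `fs.promises` (much newer than the releases discussed here); `pasteLines` and `pasteFiles` are hypothetical names.

```javascript
const fsp = require('fs').promises;

// Join the i-th line of every input with tabs, like unix "paste".
// Shorter inputs are padded with empty fields.
function pasteLines(texts) {
  const columns = texts.map((t) => t.replace(/\r?\n$/, '').split(/\r?\n/));
  const rowCount = Math.max(...columns.map((c) => c.length));
  const rows = [];
  for (let i = 0; i < rowCount; i++) {
    rows.push(columns.map((c) => (i < c.length ? c[i] : '')).join('\t'));
  }
  return rows;
}

// Start all reads at once; Promise.all resolves once every read has finished.
async function pasteFiles(paths) {
  const texts = await Promise.all(paths.map((p) => fsp.readFile(p, 'utf8')));
  return pasteLines(texts);
}
```

Calling `pasteFiles(['a.txt', 'b.txt']).then((rows) => console.log(rows.join('\n')))` would print the merged lines (the file names are placeholders). Every file is still read fully into memory, so this is concurrency for convenience rather than a streaming solution.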

0

I of course know node.js has synchronous I/O. It is just too badly designed to be useful in our routine text processing. I also know we can implement unix "paste" with async I/O. It is just overcomplicated for what it is worth. Node.js/dart/CommonJS should learn from other languages and provide synchronous streams (i.e. C stdio; C++ iostream; java BufferedReader; builtin line reading functions from perl/python/ruby/lua/...).

0

I'm not entirely certain about the memory limit issues. I looked it up, and you can set the memory limit in MB when starting a node process (here it is set to 3GB): node --max-old-space-size=3072

Running node v.0.12.0, I had no problem allocating that much RAM to a process. (Set three variables using crypto.randomBytes(1024*1024*1024-1).)
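One way to check which heap limit is actually in effect is to ask V8 from inside the process. This is only a sketch, assuming a Node version that provides the built-in `v8` module and its `getHeapStatistics()` function; `toMB` is a made-up helper.

```javascript
const v8 = require('v8');

// Convert a byte count to whole megabytes for display.
const toMB = (bytes) => Math.round(bytes / (1024 * 1024));

const stats = v8.getHeapStatistics();
console.log(`heap size limit: ${toMB(stats.heap_size_limit)} MB`);
console.log(`heap currently used: ${toMB(stats.used_heap_size)} MB`);
```

Running it with and without `--max-old-space-size=3072` should show whether the flag changed the limit. As far as I know, Buffer contents (such as those returned by crypto.randomBytes) are stored outside the V8 heap, so allocating large Buffers may not be a reliable test of this limit.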


Forking processes in node.js (and running parallel operations on multi-core machines) is entirely possible. (http://blog.carbonfive.com/2014/02/28/taking-advantage-of-multi-processor-environments-in-node-js/). Not to mention, due to the asynchronous nature of node it's very easy to design systems that scale. A child process could be launched just as easily on a separate server as it could as a forked process. I haven't played much with forking myself up until this point, but if you do, please let me know how it goes. :)

0

That's strange. I tried allocating 3-4 gigs a couple of days ago with the --max-old-space-size=8192 flag and it kept throwing memory errors at me. I am running 0.10 though, so maybe I just need to upgrade...

0

Some suggested that the 1.7GB limit is applied to one object. If so, you could allocate multiple 1.7GB objects. I have also tried --max-old-space-size=8192 and got an immediate segfault. 4096 worked. I don't know whether this implies that the total memory limit is ~4GB.

1
13.2 years ago
Bio_X2Y ★ 4.4k

Using JavaScript for any heavy lifting in bioinformatics strikes me as something you should never want to do, ever :) And I think anything with the word "file" in it counts as heavy.

The language is imprisoned in a web browser for good reason - it was designed to enhance the functionality of a web page, and should ideally have no interaction with anything else that might be going on in your machine (like biology files). There are plenty of other languages for general programming, and I don't think JavaScript offers any sufficiently magical features to justify its recruitment to bioinformatics.

As lh3 points out, the audience for a library of small javascript functions would probably be tiny. If I ever decide to reverse complement a sequence on the client side (and I don't see that happening), I think I would just write the few lines of code myself rather than learn an API and import a javascript dependency.
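For reference, the "few lines of code" for a client-side reverse complement could look roughly like this. It is a sketch that handles only A/C/G/T/N in either case and passes any other character through unchanged; the names are illustrative.

```javascript
// Complement table for the plain DNA alphabet (no IUPAC ambiguity codes).
const COMPLEMENT = {
  A: 'T', C: 'G', G: 'C', T: 'A', N: 'N',
  a: 't', c: 'g', g: 'c', t: 'a', n: 'n',
};

function reverseComplement(seq) {
  let out = '';
  // Walk the sequence backwards, complementing each base.
  for (let i = seq.length - 1; i >= 0; i--) {
    const base = seq[i];
    out += Object.prototype.hasOwnProperty.call(COMPLEMENT, base)
      ? COMPLEMENT[base]
      : base; // unknown characters are kept as-is
  }
  return out;
}

console.log(reverseComplement('ATGCCN')); // prints "NGGCAT"
```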

3

JS is not imprisoned in the browser anymore: http://nodejs.org/ -- though I'm not saying it's a good idea to port Bio* libraries to JS.

0

The implementation of node.js, at least in some parts, is quite sloppy and substandard. For example, its File interface did not support streamed I/O when I read some of its source code a couple of months ago. Javascript can be extended for general purposes, but node.js is not the right way, although it is rapidly gaining popularity.

0

Node.js supports streams now. Though I am replying to a response that's four years old. :)

http://nodejs.org/api/fs.html
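As an illustration of streamed reading, the sketch below counts FASTA records line by line without loading the whole file. It assumes a reasonably recent Node where `readline` interfaces can be used with `for await`; `countFastaRecords` is a hypothetical name.

```javascript
const fs = require('fs');
const readline = require('readline');

// Count header lines (those starting with ">") while streaming the file.
async function countFastaRecords(filePath) {
  const rl = readline.createInterface({
    input: fs.createReadStream(filePath, { encoding: 'utf8' }),
    crlfDelay: Infinity, // treat \r\n as a single line break
  });
  let records = 0;
  for await (const line of rl) {
    if (line.startsWith('>')) records++;
  }
  return records;
}
```

Usage would be something like `countFastaRecords('reads.fa').then(console.log)`, where `reads.fa` is a placeholder file name. Only one chunk of the file is held in memory at a time, which is the main point of using a stream here.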

1
9.2 years ago

The BioJS project has existed for quite some time now and aims mainly at visualisation of biological data. It already has quite a few components, like an MSA viewer, a tree viewer and a protein feature viewer. I don't know the details, but I suppose they must have developed a few common data structures, so I would say it is a good place to start if one wants to develop more JS bioinformatics tools.

The code is on GitHub.
