Blog: A list of bioinformatics projects for volunteers
8.1 years ago
Zhilong Jia ★ 2.2k

Here is a list of bioinformatics projects for volunteers. I have read several posts saying that many people interested in computational biology would like to participate in a bioinformatics project. I assume they want bioinformatics ideas to get started. Comments and answers are welcome, and I will use them to update the list from time to time.

Mainly for people who can program (R, Python, Java, etc.).

Mentored Projects from Bioconductor

ROSALIND contains bioinformatics problems (By @TriS) [ref: a biostar post]

DREAM Challenges pose fundamental questions about systems biology and translational medicine.

Mainly for people in Medicine or Biology

Stargeo aims to annotate disease samples (such as control and a certain disease) from GEO, enabling powerful meta-analysis of a given disease. (Knowledge of medicine or biology is necessary, especially knowledge related to diseases.)

BD2K-LINCS-DCIC Crowdsourcing Portal includes crowdsourcing projects (lots of microtasks and megatasks) related to drugs, genes and diseases, mainly in the Library of Integrated Cellular Signatures (LINCS) and also in GEO.

ref:

  1. Khare, Ritu, et al. "Crowdsourcing in biomedicine: challenges and opportunities." Briefings in Bioinformatics (2015): bbv021.

In my opinion, one of the main issues for both sides, apart from knowledge, is continuity of contribution. Working for a few days or weeks and then giving up is probably acceptable for crowdsourcing projects, to an extent.

Update (24 March, 2016):

Open Innovation Pavilion: one of its focuses is translational medicine or translational bioinformatics.

nature.com has teamed up with InnoCentive to offer its readers the opportunity to participate in research and development challenges. As a Solver, you can apply your expertise to important problems, stretch your creative boundaries, and win cash awards.

Update (4 June 2016)

NCI up for a challenge


Thank you for these wonderful resources.


Hello, thanks for this nice post. However, one thing probably worth mentioning about Stargeo is that they seem to define expression very incorrectly (at least in their fundamentals video).


Thanks, @Anima. Could you detail the issue? I'm actually involved in this project, and I'm not sure what your point is. Each disease signature is a comparison between the disease samples and control samples.


Gladly. I should specify that this is merely a matter of jargon. Still, I think it is important to use biological terms accurately, in order to avoid confusion.

For instance,

00:18 - "when RNA is used to make protein, it is said to be expressed"

00:13 - "when more protein is made, RNA is said to have a greater expression, and when less protein is made, RNA is said to have a smaller expression"

RNA is not expressed. Genes are, and when their RNA is transcribed, they are said to be expressed regardless of whether they are protein-coding or not. Also, gene expression levels depend solely on the amount of RNA produced.

Another more subtle point is the use you make of the term "expression pattern". You seem to refer to it as if it indicates a particular state of a transcriptome for a given biological sample, but it is actually defined as the spatio-temporal location of the RNA for a given gene through the body of a certain organism. There is no such thing as an expression pattern of a disease.

Of course, this is all meant to be constructive criticism. I hope it helps!


"Merely a matter of jargon" indeed. That "RNA is not expressed" is also a matter of jargon. Large intergenic noncoding RNAs (lincRNAs) have been identified. Is that intergenic RNA not expressed by definition? As the concept of biology changes with our increased understanding of the genome, so does the "jargon" used to describe it.

That the term "expression pattern" subtly indicates a "spatio-temporal location of the RNA" is also striking to me. I would argue most don't share in that jargon. Given the overarching context of the Gene Expression Omnibus (GEO) in the video, I think it's quite obvious that "expression pattern" refers to gene expression patterns. See this NCBI tutorial on how to find "expression patterns" in the context of GEO: http://www.ncbi.nlm.nih.gov/guide/howto/find-exp-pat.

In any case, a medical student without any bioinformatics or technical experience made the video. That is the exact audience STARGEO is targeted towards. In context, I think the video is quite clear in conveying what we are trying to do. Please let me know if you disagree. But we will tighten our technical prose in the next iteration. Thx. :)


Hi Dexter, expression is a prerogative of DNA: once a lincRNA is recognized as having some functional meaning, the genomic region that produces it is a gene by definition; when that lincRNA is found in a cell, it is its DNA template that is being expressed. Talking of RNA expression is improper, to say the least.

You truncated my definition of expression pattern: I defined it as the "spatio-temporal location of the RNA for a given gene through the body of a certain organism", so I was talking about gene expression pattern in the first place. This definition (expression pattern being an attribute of genes) was opposed to the use of the term that is made in the video: e.g. 00:46 - "our expression pattern is not always constant; when we get a disease, our expression pattern changes in such a way that is characteristic of that disease".

In my opinion, an inexperienced author is not a good choice for a tutorial, especially when newbies are the target audience: beginners are of course more prone to being confused or misled, and are understandably less capable of critical analysis of the message provided.


@Anima Thank you. Based on the central dogma proposed by Crick, I think you are right on some points. RNAs are transcribed from genes, while protein is translated from mRNA.

Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product, either RNA or protein (from Wikipedia), as also pointed out by @Dexter below. Generally, gene expression levels mean the amount of expressed RNA measured by microarray or RNA-Seq (as you pointed out), partially due to the poor availability of proteomics data.

Also, the mistake relates to the video, not to the analyses themselves in Stargeo. Based on this discussion, our team will make an improved video that clearly explains what Stargeo does. Thank you for your comments.


While gene expression might ultimately be aimed at the production of a polypeptide, I think you would not disagree that it pivots on the process of transcription; in fact, as you say, gene expression levels are purely a measure of RNA quantities.

I am happy this discussion was helpful and I wish you and Dexter all the best for the future of Stargeo ;).

8.1 years ago
Charles Plessy ★ 2.9k

Everybody is welcome to join the Debian Med project! We package bioinformatics tools in Debian, but we also increasingly work on metadata, regression tests, etc.

7.9 years ago
roma ▴ 120

This list is a great idea, but I wish there were more projects.

For Bioconductor mentored projects, the link https://www.bioconductor.org/developers/mentored-projects/ does not work for me (404); this one works: http://bioconductor.fhcrc.org/developers/mentored-projects/; however, I am not sure if the list is still relevant. E.g. one of the projects listed there is marked with

  • Status: imminent (January 2013)

From what I understand, Rosalind is a collection of learning exercises, as opposed to something a volunteer could contribute to.

DREAM Challenges looks very interesting, though.


I'm not sure what the status of the Bioconductor mentored projects is now. The link has been updated. For people wanting to get into bioinformatics, Rosalind is a good project. Thank you.
