Is Bioinformatics Data Indeed More Open Than Cheminformatics Data?
3
0
13.2 years ago

A running meme is that bioinformatics data is more open than cheminformatics data (e.g. mentioned in this blog). It is my feeling too that this is the case, but a look at this spreadsheet of NAR-listed databases (resulting from this BioStar question) already shows many instances where the data cannot be downloaded. Moreover, the sheet gives no indication of whether I may modify and redistribute the data, two core rights for Open Data. I therefore added two columns to allow annotation of these two aspects. Any non-commercial clause would make it non-open data too. General info can be found at Is It Open Data?

So my question is basically: how Open is bioinformatics data? What percentage of the data is in fact Open?
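To make the question concrete, here is a rough sketch of how such a percentage could be computed once the spreadsheet is annotated. The column names (`downloadable`, `modifiable`, `redistributable`, `non_commercial_clause`) and the sample rows are my own placeholders, not the actual headers or contents of the NAR spreadsheet, and treating a blank annotation as "not open" is an assumption too.

```python
import csv
import io

# Hypothetical column names; the real spreadsheet may label these differently.
REQUIRED_RIGHTS = ("downloadable", "modifiable", "redistributable")


def is_open(row):
    """Count a database as Open only if all three rights are granted
    and no non-commercial clause applies. Anything other than an
    explicit "yes" (including a blank cell) is treated as not granted."""
    grants_all = all(row[col].strip().lower() == "yes" for col in REQUIRED_RIGHTS)
    non_commercial = row["non_commercial_clause"].strip().lower() == "yes"
    return grants_all and not non_commercial


def percent_open(csv_text):
    """Return the percentage of rows in the CSV text that count as Open."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return 0.0
    return 100.0 * sum(is_open(r) for r in rows) / len(rows)


# Made-up placeholder rows, not real annotations of any database.
SAMPLE = """name,downloadable,modifiable,redistributable,non_commercial_clause
db_a,yes,yes,yes,no
db_b,yes,no,no,no
db_c,yes,yes,yes,yes
db_d,no,no,no,no
"""

print(f"{percent_open(SAMPLE):.1f}% open")
```

Note that this only measures openness among the listed databases; it says nothing about data that never made it into the list.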

chemoinformatics comparison • 3.3k views
2

Seeing as you can never quantify how much information is closed, entombed, or siloed away, I'd say this is almost impossible to answer.

0

I guess this NAR database spreadsheet would be a reasonable approximation, no?

0

Daniel is right in principle. The practical comparison we can make is between available but non-open data and real open data, which indeed is what the table is about.

0

I am not sure about your evaluation of a non-commercial clause. Larger databases are expensive structures that are often paid for by community-funded research projects. They need to be maintained after those projects end, which is still expensive. It makes sense that you pay for the maintenance if you make a profit from using the data. Also, it is just not fair to take free open data, wrap it up in a nicely colored website or tool, and sell it. As long as these clauses allow fair usage, I think that is fine.

0

Chris, I understand your arguments (and have heard them many times), but isn't the whole idea of making data freely available that people in fact use it? How does a fancy website make maintenance more difficult or more expensive for you, if others help you share it? What defines a profit? Profit is one of the virtues of western civilization; what's wrong with that? How does it hurt the community if the data becomes more accessible because others start distributing it? I don't understand your point... (If you really just worry about attribution, given you mention 'fair', that's a whole other clause.)

1
13.2 years ago

I work mostly with genomics data (plant genomics to be precise), and I have never had any problems accessing, using, or reusing genomics data. I guess an exception would be when I have been given access to data that is not yet published, but that is quite understandable. In each case when the data was published, everything I had and more became accessible to myself and the general public.

Maybe it's just the nature of the beast: as an academic, you can't spend that much money and effort sequencing, assembling, and annotating a genome unless you plan to make it available as a public information resource.

1

@Egon I think I have to agree with you that (prote|metabol)omics data is much more scarce. Perhaps this scarcity is more of an issue than data licensing?

0

Yes, genomics data is OK... but how many proteomics and metabolomics databases are freely available? The latter is my field, really, and data are pretty scarce...

0

Well, there is plenty of metabolomics data around, just not freely... dunno so much how that applies to proteomics...

1
13.2 years ago

I think that bioinformatics data is more open because you cannot sell this information, whereas in chemoinformatics the data is more expensive.

0

You cannot sell biomarkers? The thing is indeed that chemical structures can be patented... I think we will see biomarkers patented, if that is not already happening... they patent genes too...

0

I just want to say that chemoinformatics data is more valuable than bioinformatics data. Yes, we still cannot deal with the huge amount of biodata; we still need real bioinformatics tools.

Gene and biomarker patenting is a big issue. Gene patents have a vague legal status in Europe and the US: "It argues that isolated and altered DNA should be patentable, whereas DNA that is simply isolated should not be patentable."

Talking about biomarkers: yes, you can sell them, but in most cases this info is really odd. And in my opinion a biomarker patent will also have no legal status.

0
13.2 years ago

I am not sure if this is a fair comparison. Cheminformatics is a much younger field than bioinformatics. Having said that, resources like the Human Metabolome Database (http://www.hmdb.ca), ChEBI (http://www.ebi.ac.uk/chebi/), and KEGG COMPOUND (http://www.genome.jp/kegg/compound/) are open resources.

1

Actually, no. See http://blog.rguha.net/?p=913 - pretty similar lineages
