Does anyone on the list happen to know concrete/citable figures about
how much is spent on building or maintaining biological databases
(either worldwide or by particular countries or organizations)? I
have heard figures quoted of $25+ million for the NIH in the US, or
"billions" in various places without links to sources.
Sorry if this is off topic. I imagine one motivation for using wikis
for biological databases is to cut down this cost. I'd also be very
interested in any figures about the relative costs of wiki vs. other
development models.
Thanks for any pointers,
--James
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
**************************************
Dr. Barend Mons
Scientific Director
Support and external relations
Netherlands Bioinformatics Centre (NBIC)
http://www.nbic.nl
and Biosemantics Group
Leiden University Medical Centre
http://www.biosemantics.org
Mobile: +31-624879779
E-mail: Baren...@nbic.nl
Phone: +31 (0)24 36 19 500
Fax: +31 (0)24 89 01 798
Mail: Netherlands Bioinformatics Centre
260 NBIC
P.O. Box 9101
6500 HB Nijmegen
Visiting address:
LUMC building 2, Einthovenweg 20
2333 ZC Leiden, The Netherlands
> --
> BioWiki mailing list:
> biow...@googlegroups.com
>
> Subscription & archives:
> http://groups.google.com/group/biowiki-l
>
> Unsubscribe:
> biowiki-l+...@googlegroups.com
On Apr 15, 2011, at 7:53 PM, James Cheney wrote:
I know the budget of Wikipedia is on the order of a few million
dollars a year, which obviously pales into insignificance when you
start talking about 'billions'... I would imagine that you can source
this particular figure from the Wikimedia Foundation.
TBH, I think the real advantage of wiki vs. 'conventional biological
database' isn't one of cost. Both are very cheap to maintain compared
to the cost of 'real' biological research. The difference is that wiki
databases are 'owned' by the community in some sense (in principle at
least), whereas traditional biological databases are 'owned' by
particular groups, departments, institutions or universities.
My favourite example is the BIND database of protein-protein
interactions, which essentially died when the funding ran out. That is:
group X fails to get funding for project Y, database Z dies. I think
wiki makes this less likely.
HTH,
Dan.
SRA was useful at the dawn of NGS. Among other things, it was a common
collection of real data from various emerging formats. It was
cited in
http://nar.oxfordjournals.org/content/early/2009/12/16/nar.gkp1137.full
to show how the fastq format had been allowed to diverge. My
impression was that the SRA would also capture the raw data from the
Heliscope, Polonator, Nanopore, Zs, ... and whatever came next.
But for the major platforms it has become so cheap to produce the
data that storage costs are the new bottleneck. Many labs don't
permanently archive all of their own primary data. Providing a hot
backup of every run of every sequencer, everywhere, forever is expensive
and of limited value. They seem to be interested in keeping the
existing content online, but stopping new submissions. Seems pretty
reasonable to me.
I'd like to see them also accept at least 1M sequences from any new
platforms, but I think it is right to close submissions from the
well-established platforms.
--
Mike Cariaso
http://www.cariaso.com
Andrew
On our side the investment so far was about 1.5 full PhD students, meaning 6 years of PhD salary. Counting salary, taxes and overhead, that is around 75,000 euro/year, or 450,000 in total. In the US at least half of that, and probably about the same, was invested. So that brings the total so far to between 700K and 1M, not counting investments by Google through GSoC and people from other groups contributing parts of the code (and not at all counting time invested by people to produce pathways).
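For anyone wanting to reuse these numbers, the arithmetic above can be sketched roughly as follows. This is only an illustration: the variable names are made up, and the 0.5 and 1.0 multipliers are one reading of "at least half of that and probably about the same" for the US share.

```python
# Rough reconstruction of the cost estimate; figures are taken from the message,
# names and the US-share multipliers are illustrative assumptions.
PHD_YEARS = 6                # about 1.5 PhD students, i.e. 6 PhD-years
COST_PER_YEAR_EUR = 75_000   # salary, taxes and overhead per PhD-year


def estimate_total(us_fraction: float) -> int:
    """European investment plus a US share given as a fraction of it."""
    european = PHD_YEARS * COST_PER_YEAR_EUR
    return round(european * (1 + us_fraction))


european_side = PHD_YEARS * COST_PER_YEAR_EUR
low = estimate_total(0.5)    # US invested "at least half of that"
high = estimate_total(1.0)   # US invested "about the same"

print(f"European side: {european_side:,} euro")
print(f"Combined range: {low:,} to {high:,} euro")
```

Under these assumptions the combined range comes out a little below the rounded "between 700K and 1M" quoted above, so the quoted figure should be read as an approximation.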
Best wishes, Chris
The context of this question is that I am trying to find quantitative
evidence (ideally, citable) of the significant costs attached to
biological data and annotation that could be reduced by developing
well-targeted general purpose tools (including wikis, among other
possibilities). I've been finding that many fellow computer
scientists whose work might bear on the problem do not know the
conventional wisdom about the cost of biological data, and I have only
heard secondhand figures, making it hard to persuade others that this
is an important problem.
As other responses have pointed out, there are many different kinds of
data/databases, some of which (e.g. high-throughput sequencing data)
are no longer cost-effective to store given that the data can be
regenerated on demand more cheaply. I'm most interested in those for
which this is not the case (e.g. curated databases containing
community or expert annotations).
--James