The project is beginning somewhat modestly, but we hope to learn a lot
from it. Out of some 14 million prints, photographs and other visual
materials at the Library of Congress, more than 3,000 photos from two
of our most popular collections are being made available on our new
Flickr page, to include only images for which no copyright
restrictions are known to exist.
The real magic comes when the power of the Flickr community takes
over. We want people to tag, comment and make notes on the images,
just like any other Flickr photo, which will benefit not only the
community but also the collections themselves. For instance, many
photos are missing key caption information such as where the photo was
taken and who is pictured. If such information is collected via Flickr
members, it can potentially enhance the quality of the bibliographic
records for the images.
We're also very excited that, as part of this pilot, Flickr has
created a new publication model for publicly held photographic
collections called "The Commons." Flickr hopes—as do we—that the
project will eventually capture the imagination and involvement of
other public institutions, as well.
That's very cool and exciting but I would caution against
the expectation that the initial burst of success will be sustainable
or that the design pattern will be repeatable.
"Magic" in this context means "free labor." There is a lot
less of that available than people seem to think, and that's even
before we get to questions of quality, gaming, etc.
There is also a question of opportunity cost and mission.
An up-front consideration of how best to capture free labor
and what precisely to ask for, rather than taking what Flickr
offers, might have been worth it (even if, no doubt, much
harder to actually raise resources for).
In the open source software world, big names sometimes
do analogous things and achieve some middle of the road
success -- which then becomes the (largely unachievable)
baseline of success for less well known "crowd gatherers".
I'm not criticizing the fact that this project has been started.
It will be an interesting experiment and I thank the good
folks at LoC and Flickr for it.
TANSTAAFL + "measure twice, cut once"
This arrangement gives Yahoo certain commercial
privileges to user contributions. While non-exclusive,
it would be difficult for another firm to gain those
privileges. So, in some sense, this is a big gift to Yahoo.
When LoC catalogs a book, the new database record is uploaded
to the databases of the non-profit, public-interest organization OCLC.
Many other libraries do the same. When any member library of OCLC
needs a catalog entry for that same book, they can download that
record. Other libraries are free to improve the record and those
improvements are shared.
The LoC-on-Flickr records are different. People can add tags and
comments. These contributions are in some sense useful to the public,
sure, but there is a problem:
Contributors who add tags and comments are granting Yahoo a
perpetual, non-exclusive right to exploit the contributions commercially.
That may not sound so bad until you consider what "non-exclusive"
means in this context: each individual tag or comment author is
free to choose (or choose not) to license their contribution to others.
Nobody is free to simply scrape the site for this content and enjoy
it on terms like Yahoo's. Nor is extracting the data particularly
well supported by the APIs offered.
Moreover, the actual meta-data schema that lies behind the comment and
tag infrastructure is neither transparent nor the result of any obvious
application of library science. De facto, Yahoo has been given a role
in *defining* what the form and function of library meta-data will be.
A federally created and funded cultural institution is here enhancing
its collection by soliciting a public volunteer effort to improve the
private property of Yahoo. Yahoo bears no accountability for
the technical format of these improvements and has no obligation
to preserve this information, refrain from altering it, share it on
equal footing with all comers, or even continue to operate the service
with any specific form or function.
How is this in any way proper? Or for that matter legal?
> projects are mushrooming all over the country, but they are still
> time-consuming and for some institutions, cost-prohibitive. The
Flickr is not "the commons". You can't graze your cows in their
back office, so to speak.
> gives a platform to smaller institutions to get their
> collections known. My company was actually hired by the Philadelphia
> Department of Records to create a platform that would do just that
> (=web-based geographic DAM). They have an estimated collection of 2
> million historic photos that no one could access or research before we
> put it online. I just hope that Flickr has thought of the implications
> of enabling multiple collections to be accessed through their
> platform. How will they deal with copyrighted photos? Will photos be
> watermarked to avoid being copied and reproduced? How will
> institutions keep their "branding identity"? In any case, we submitted
> PhillyHistory.org (www.phillyhistory.org) to The
> Commons... and we'll see what happens. I'm very curious...
A friend of mine works in a (state run) research library for a particular
specialty. Some of these photos and their meta-data might make fine
enhancements to their own emerging digital archives and catalogs.
Too bad about that, though.
While this strikes me as a great development, since increasing access to public information should only increase its usefulness and impact, it also raises questions for me.
It strikes me that this kind of cloud computing (which I learned about at Princeton's CITP Cloud Computing event) will start to affect the way we think about what is a public utility. New kinds of relationships will exist between established institutions and new "cloud" service providers, which come with new opportunities for gain, abuse, conflict of interest, unseen liabilities, etc.
For example, I expect that Google will be able to see all sorts of interesting metadata about who links to specific Hubble images, or who queries scientific databases, or how. The question, then, is whether that sort of information will be publicly available (or even if it could be). If not, then Google's benevolence starts to look a lot more like self interest, where they gain not only by becoming the arbiter of the public's access to their information stores, but also by gaining a privileged view of how we relate to our public data.
This isn't an isolated academic question, either. The way research data are cited and linked is itself the subject of scientific inquiry, and will certainly continue to be invaluable.
Perhaps this is gift-horse-mouth looking, and we should be glad that someone wants to provide a free, accessible home to public data. A little cynicism, however, seems in order, and we might have to rethink what it means to provide a free public service.