Don Saklad

Oct 20, 2007, 8:12:49 AM

By Katie Hafner
http://www.iht.com/articles/2007/10/19/business/19library.php
http://www.iht.com/articles/2007/10/19/business/19library.php?page=2

from iht.com
International Herald Tribune

[ photo ]
Bernard Margolis, president of the Boston Public
Library, rejected Google's offer. (Robert Spencer for
The New York Times)

Research libraries close their books to Google and
Microsoft

By Katie Hafner
Published: October 19, 2007

Several major research libraries have rebuffed offers
from Google and Microsoft to scan their books into
computer databases, saying they were put off by
restrictions these companies wanted to place on the
new digital collections.

The research libraries, including a large consortium
in the Boston area, are instead signing on with the
Open Content Alliance, a nonprofit effort to make
digital material as widely accessible as possible.
Libraries that agree to work with Google do so on
Google's terms, which involve access to the material
only through the Google search engine, as well as
restrictions on how much of it can be downloaded.

Google pays to scan the books and does not directly
profit from the resulting Web pages, although the
additional material makes its search offering more
useful and thus more valuable. The libraries are free
to have their books scanned again by another
organization.

There are obvious financial benefits to libraries of
Google's wide-ranging offer, first announced in 2004.
Many prominent libraries have accepted the offer --
including the New York Public Library and libraries
at the University of Michigan, Harvard, Stanford and
Oxford. Google expects to scan 15 million books from
those collections.

But the resistance from some libraries suggests that
many in the academic and nonprofit world are intent
on pursuing a vision of the Web as a global
repository of knowledge that is free of business
interests or restrictions.

"There are two opposed pathways being mapped out,"
said Paul Duguid, an adjunct professor at the School
of Information at the University of California at
Berkeley. "One is shaped by commercial concerns, the
other by a commitment to openness, and which one will
win is not clear."

Last month, the Boston Library Consortium of 19
research and academic libraries throughout New England
announced a plan to work with the Open Content
Alliance to begin digitizing the libraries' 34 million
volumes.

"We understand the commercial value of what Google is
doing, but we want to be able to distribute materials
in a way where everyone benefits from it," said
Bernard Margolis, president of the Boston Public
Library, which has in its collection roughly 3,700
volumes from the personal library of John Adams.

Margolis said his library had spoken with both Google
and Microsoft, and entirely rejected the idea of
working with them. Adam Smith, project management
director of Google Book Search, emphasized that the
company's deals with libraries were not exclusive,
and said the company welcomed other scanning
projects.

Smith said Google was "excited" that the Open Content
Alliance "has signed more libraries, and we hope they
sign many more."

Google executives had hoped that the Library of
Congress would be one of the company's first major
partners when it embarked on its scanning effort.
Google does have a pilot program with the library to
digitize some books.

But last January the Library of Congress announced a
project with a more open approach. With $2 million
from the Sloan Foundation, the library's first mass
digitization effort will scan 136,000 books and make
them accessible to any search engine through the Open
Content Alliance. The library declined to comment on
its future digitization plans. The Open Content
Alliance is the brainchild of Brewster Kahle, founder
of the Internet Archive, which was created in 1996
with the aim of preserving copies of Web sites and
other material. The group includes more than 80
libraries and research institutions and focuses on
works that are out of copyright.

"Google could be privatizing the library system by
offering a large, but private interface to millions
of books," Kahle said. The Open Content Alliance, he
said, "is fundamentally different, coming from a
community project to build joint collections that can
be used by everyone in different ways."

Kahle's group focuses on out-of-copyright books,
mostly those published in 1922 or earlier. Google
scans copyrighted works too, but it does not allow
users to read the full text of those books online,
and it allows publishers to opt out of the program.
Microsoft joined the Open Content Alliance at its
start in 2005, as did Yahoo, which also has a book
search project. That year, Google was also speaking
with Kahle about joining, but they did not reach an
agreement.

A year after joining, Microsoft added a restriction
that prohibits a book it has digitized from being
included in commercial search engines other than
Microsoft's.

"Unlike Google, there are no restrictions on the
distribution of these copies for academic purposes
across institutions," said Jay Girotto, group program
manager for Microsoft's Live Book Search.
Institutions working with Microsoft, he said,
included the University of California, the New York
Public Library, Cornell and the British Library. Some
in the research field view the issue as a matter of
principle.

"You don't want any for-profit company having control
of the world's knowledge," said Doron Weber, a
program director at the Sloan Foundation, which has
made several grants to libraries for digitization.

Weber said many institutions that have been
approached by Google have spoken to his organization
about their reservations. "Many are hedging their
bets," he said, "taking Google money for now while
realizing this is, at best, a short-term bridge to a
truly open universal library of the future."

The University of Michigan, a Google partner since
2004, does not seem to share this view. "We have not
felt particularly restricted by our agreement with
Google," said Jack Bernard, a lawyer at the
university. "We have found Google very good to work
with."

The University of California, which started scanning
books with the Open Content Alliance, Microsoft and
Yahoo in 2005, has added Google as well. Robin
Chandler, director of data acquisitions at the
California Digital Library, the electronic library
for the University of California library system, said
working with everyone helps increase the volume of
the scanning.

But some have found Google to be an inflexible
partner. Tom Garnett, director of the Biodiversity
Heritage Library, a group of 10 prominent natural
history and botanical libraries that have agreed to
digitize their collections, said he had had
discussions with various people at both Google and
Microsoft. "Google had a very restrictive agreement,
and in all our discussions they were unwilling to
yield," he said.

Garnett said the most striking example of this came
when he asked the Google representatives about a
theoretical example.

"We asked, 'Suppose we allowed you to digitize all
our literature, and there was an ant researcher who
wanted to peel off 10,000 pages of ant literature and
load it on his own server and perform advanced
analysis to correlate it with climatological data
over the last 100 years, using software he had
developed to study trends in species research,'"
Garnett recalled.

He said the Google executives told him this would not
be possible. "They said, 'We'd be sympathetic but it
doesn't fit in with our model.'" Smith of Google said
this was not the case. "It's certainly something we
would work with libraries to do," he said.

The Boston Library Consortium's scanning project is
self-funded, with $845,000 for the next two years.
The consortium pays 10 cents a page to the Internet
Archive, which has installed 10 scanners at the
Boston Public Library. Each scanned image will be
stored at the Internet Archive in San Francisco, and
anyone can download the material.

On Wednesday the Open Content Alliance announced,
together with the Boston Public Library and the Woods
Hole library, that it would start scanning
out-of-print but in-copyright works to be distributed
through a digital interlibrary loan system.

"God bless Google and Microsoft, and they'll do what
they do," said Weber of the Sloan Foundation. "But we
need to do the right thing, because we're in the
privileged position of thinking about what's good for
the country and society over the long term."