
Scalable Web Servers and the Inktomi Search Engine


Crystal Williams

Sep 28, 1995

Colloquium
Computer Science Division
UC Berkeley

Weds., Oct. 4, 1995
4-5 p.m.
Soda Hall Auditorium, Room 306

"Scalable Web Servers and the Inktomi Search Engine"
Professor Eric Brewer

Abstract TBA

News Release:

We're happy to announce the world's fastest and largest full-text search
engine for the Web:

http://inktomi.berkeley.edu

It is part of Berkeley's Network of Workstations project (NOW); it currently
uses 4 SparcStation 10s as the server. The crawling, indexing, and merging
software is also parallelized.
--------

9/26/95--Inktomi
Contact: Robert Sanders
(510) 643-6998

Parallel computing brings a faster
and bigger search engine to the
internet -- UC Berkeley's Inktomi

FOR IMMEDIATE RELEASE

Berkeley -- Two computer scientists at UC Berkeley have
introduced parallel computing to the internet to create the fastest
and most comprehensive "engine" now available to search the World Wide
Web.
Called Inktomi, it searches a database of more than 1.3
million documents on the World Wide Web, a network that reaches around
the world to provide ready access to words, pictures, sound and
video. Inktomi is the largest index of Web documents.
Although internet surfers happily skip from Web site to Web
site in search of "cool" links, the internet's true potential will be
felt only when users can quickly search for and find desired sites. As
the number of documents on the Web has skyrocketed, this has become a
technical challenge.
"It's getting increasingly difficult to find things on the
internet," says Eric Brewer, an assistant professor of computer
science at the University of California at Berkeley who developed
Inktomi with graduate student Paul Gauthier. "The problem is, it's
very hard to have a large database and get good performance. With
parallel computing you can have larger databases and high
performance. Because we use commodity workstations, we have a much
cheaper solution than anyone."
Inktomi (pronounced "ink to me"), the name of a
mythological trickster spider of the Plains Indians, can be found at
the web address http://inktomi.berkeley.edu. Brewer and Gauthier
announced the search engine this week, though it has been up and
running since August.
The UC Berkeley scientists are quick to distinguish their
directory of Web documents, which is more like a comprehensive index
of the web, from directories such as the popular Yahoo, which is an
edited list more akin to a table of contents. Yahoo, started a year
and a half ago by two Stanford University graduate students, maintains
addresses for perhaps 50,000 of the most useful documents on the Web.
"With Inktomi you can find a lot more things than with Yahoo,
but both are useful," Brewer says. "We're providing a more
comprehensive search engine for the Web without sacrificing speed."
Comparable search engines are Infoseek, which is as fast as
Inktomi but can accommodate only one fifth the documents, and Lycos,
which indexes slightly more than a million documents but is
significantly slower than Inktomi.
The need for a fast and comprehensive search engine has become
more urgent as the number of documents on the Web has increased to the
point where it is difficult to index every one and time-consuming to
search the index.
The new search engine is one of the first fruits of a
collaborative project at UC Berkeley to tie common desktop computers
or workstations -- just your average PC -- into a powerful network of
workstations. Dubbed NOW, the project hopes to harness the power of
inexpensive PCs into a parallel computer with the capabilities of a
supercomputer -- at a fraction of the cost.
Brewer emphasizes that parallel computing brings a unique
power to search engines of any kind, whether they are searching a
database of WWW addresses or a library catalogue. The major advantage
is "scalability": as the database grows, he merely adds
more inexpensive computers to maintain the system's quick
response.
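
The scalability idea can be illustrated with a minimal sketch. The release
does not describe how Inktomi actually divides its index, so everything
below is an assumption made for illustration: the class names Node and
Cluster, the rule of placing a document by its id modulo the number of
nodes, and the use of threads to stand in for separate workstations. The
sketch shows only the general pattern of spreading documents over several
machines, querying all of them in parallel, and merging the partial
results.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Stands in for one workstation holding part of the index (hypothetical)."""

    def __init__(self):
        # word -> ids of the local documents that contain it
        self.index = defaultdict(set)

    def add(self, doc_id, text):
        for word in text.lower().split():
            self.index[word].add(doc_id)

    def search(self, word):
        return self.index.get(word.lower(), set())


class Cluster:
    """A group of nodes queried together (hypothetical, not Inktomi's design)."""

    def __init__(self, num_nodes):
        self.nodes = [Node() for _ in range(num_nodes)]

    def add(self, doc_id, text):
        # Assumed placement rule: document id modulo the number of nodes.
        self.nodes[doc_id % len(self.nodes)].add(doc_id, text)

    def search(self, word):
        # Ask every node in parallel, then merge the partial answers.
        with ThreadPoolExecutor(max_workers=len(self.nodes)) as pool:
            partials = list(pool.map(lambda node: node.search(word), self.nodes))
        return sorted(set().union(*partials))


if __name__ == "__main__":
    cluster = Cluster(num_nodes=4)
    documents = {
        0: "parallel web search",
        1: "web crawler",
        2: "library catalogue",
        3: "parallel computing",
    }
    for doc_id, text in documents.items():
        cluster.add(doc_id, text)
    print(cluster.search("web"))
```

In a layout like this, each node searches only its own share of the
documents, so a larger database can be handled by constructing the cluster
with more nodes rather than by buying a bigger single machine.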
Gauthier and Brewer built Inktomi using four outdated Sun
workstations, and have designed it so that if three break down
Inktomi continues to have access to the entire index, although at
a reduced rate. This reliability is unmatched by search engines that
operate out of a single computer.
To find and catalogue all the addresses, Brewer and Gauthier
developed a web crawler that periodically looks for new addresses. Here,
too, parallel computing is important. Taking advantage of 32
networked computers within the computer science building, Soda Hall,
they delegate to several at a time the task of discovering new web
sites, often while the computers are being used by others.
Access to Inktomi is supported by the NOW project in the
Division of Computer Science of the UC Berkeley College of
Engineering.


###

Eric Brewer can be reached at (510) 642-8143, or bre...@cs.berkeley.edu.
Paul Gauthier can be reached at (510) 642-9435, or gaut...@cs.berkeley.edu.

Bob Sanders
Public Information Office
University of California
101 Sproul Hall #4202
Berkeley, CA 94720-4202
PHONE: (510) 643-6998
FAX: (510) 643-7461
