Based on the comments I've seen and the two postings from readers on this
subject, there seems to be a consensus on a few issues:
1) The YP architecture as SUN designed it doesn't scale well, and people
are seeing this as a problem.
2) Something capable of supporting faster updates is needed.
I see this as arguing for the following architecture:
The "YP service" should be hierarchical at several levels:
1) Multi-enterprise. At this level the YP services for places like
Cornell, MIT, IBM, DEC talk to each other. It isn't clear what they
would "export", but obviously this will be a small subset of what YP
manages internally to each site.
2) Multi-LAN. At this level, one envisions multiple smaller YP service
groups concerned with providing services local to a small number of
machines.
3) Single YP server group. Any given client is supposed to be "bound" to
a particular YP server in the SUN architecture; rebinding is only done
if that server crashes and restarts. In our architecture we would
want this server to be the main repository for information used almost
exclusively by these clients, and to cache information imported from
other YP server groups.
On the positive side, YP has a nice organization for the information it
manages -- I like the idea of saying that it works with what seem to be
"files" with a query interface on each file. Some comments suggest that
other people see this as overly restrictive, but I need to understand
what alternatives are proposed...
So, getting concrete, we now want to design a YP server group with the
following characteristics:
1) It uses the YP protocols to talk to clients.
2) There are normally 3 or 4 YP servers in a group; they maintain
some "part" of the global YP database.
3) The YP servers at the multi-LAN level know about each other and
cache data for one another; jointly they have a location name like
"cs.cornell.edu" that covers them as a set.
4) YP servers can interact at the multi-enterprise level for queries that
explicitly reference remote data, e.g.
"/etc/services/isis/bc...@cs.cornell.edu"
the default for a given client is to search within his local area...
Does this sound like a promising architecture? What limitations do people
see if we pursue this? Note that the idea is to extend the current YP
name interface to "hide" the locality of a reference in the name, so that
the current name structure won't need much redesign.
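
To make the naming idea concrete, here is a minimal sketch in Python. It is
only an illustration, not anything taken from YP or ISIS: the
"map/key@domain" syntax, the function names, and the example names are all
assumptions, and I assume the key is the last path component. A name with no
"@domain" suffix is treated as a local-area query.

```python
from typing import NamedTuple, Optional

# Assumed name of the client's own server group (taken from the example above).
LOCAL_DOMAIN = "cs.cornell.edu"


class YPName(NamedTuple):
    map_name: str           # the "file" being queried, e.g. /etc/services
    key: str                # the entry to look up within that file
    domain: Optional[str]   # None means "search the local area"


def parse_name(name: str) -> YPName:
    """Split a hypothetical 'map/key[@domain]' name into its parts."""
    path, at, domain = name.partition("@")
    map_name, slash, key = path.rpartition("/")
    if not slash or not map_name or not key:
        raise ValueError(f"expected map/key, got {path!r}")
    if at and not domain:
        raise ValueError("empty domain after '@'")
    return YPName(map_name, key, domain if at else None)


def route(name: str, local_domain: str = LOCAL_DOMAIN) -> str:
    """Decide whether a query stays in the local group or goes remote."""
    parsed = parse_name(name)
    if parsed.domain is None or parsed.domain == local_domain:
        return "local"
    return "remote"
```

With this, route("/etc/passwd/ken") stays local, while a name ending in
"@remote.example.edu" (a made-up domain) would be handed to the
multi-enterprise level. Existing names keep working unchanged, which is the
point of hiding locality in the name.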
-- Ken
KB> Based on the comments I've seen and the two postings from readers on this
KB> subject, there seems to be a consensus on a few issues:
KB> 1) The YP architecture as SUN designed it doesn't scale well, and people
KB> are seeing this as a problem.
KB> [... stuff deleted]
KB> The "YP service" should be hierarchical at several levels:
KB> 1) Multi-enterprise. At this level the YP services for places like
KB> Cornell, MIT, IBM, DEC talk to each other. It isn't clear what they
KB> would "export", but obviously this will be a small subset of what YP
KB> manages internally to each site.
KB> 2) Multi-LAN. At this level, one envisions multiple smaller YP service
KB> groups concerned with providing services local to a small number of
KB> machines.
Let me throw out a stupid idea here. A while ago I looked at improving a
Yellow Pages service using the ISIS reliable and fast protocols
(the job has since returned to ashes :-(, sorry Ken). I
personally think that a faster service than Sun YP could be
implemented using ISIS. But I'm not sure what you mean by "scalability".
It seems to me that a YP-like service is not suited at all to wide area
networks. Work has been done with X500, or X500-like, services. I admit
that X500 is far too complicated for handling "small" databases, but
it may be better suited to a huge volume of entries. I am not promoting
the use of X500 for solving all problems, but one may someday want to
use an ISIS-YP-based service to implement a kind of directory
service, collecting zillions (well, let's say hundreds...) of entries
to implement a "domain name server".
My deep question is: do you think a YP-like service covering a campus-wide
domain HAS to be the same as one covering a US-wide area (this is
what I understand from your "Multi-Enterprise" environment -- unless
the Enterprise covers a galactic-wide domain, in which case you may
require some help from Spock :-)? I doubt it. X500-like techniques seem to
be a better move in this latter direction (cf. the NYSERnet experiments
with White Pages).
KB> Does this sound like a promising architecture? What
KB> limitations do people see if we pursue this? Note that the
KB> idea is to extend the current YP name interface to "hide" the
KB> locality of a reference in the name, so that the current name
KB> structure won't need much redesign.
I clearly see a performance problem if the area covered by the service
gets too wide. The information used by wide-area services may
also be of very different types (not only
/etc/passwd or /etc/services, but more sophisticated records): I don't
think YP is able to handle that correctly.
Sylvain
--
----------------
Sylvain Langlois "Dogmatic attachment to the supposed merits
(syl...@chorus.fr) of a particular structure hinders the search
(sylvain%chor...@mcsun.EU.net) of an appropriate structure" (Robert Fripp)
Let me comment that I have seen 3 comments along these lines; two came
from "people" in "industry" working on YP redesign efforts of one sort
or another, but neither had the appropriate permissions to post anything.
I'll summarize what I understand to be the criticism here in a moment,
but I also want to remind people that the idea here was to come up with
a cute ISIS problem that could be solved in spare time, more or less for
fun, but would also illustrate the power of the underlying system. So,
we aren't really trying to replace YP/X500 with some ISIS service but
rather to ask how far we can get with the least amount of effort possible
using ISIS as a tool in our work!
To summarize the criticisms I am hearing:
1) The scheme I am suggesting won't scale very well.
2) Any commercially viable product these days needs to be multilingual,
e.g. speaking DEC, IBM and Japanese.
3) Why should anyone use ISIS to build a YP server, anyhow? The real issues
are fault-tolerance, reconfiguration, etc, and unless ISIS makes these
easier, it won't address the complex aspect of the problem.
Sylvain only speaks about comment (1); my anonymous contacts raised
(2) and (3) in their messages.
A quick reaction:
1) It isn't clear to me that the solution I suggest won't scale. Recall that
I argue for small (say 3-process) servers that "own" chunks of the YP
data but with an import/export scheme whereby other servers can cache
parts of this information. My reasoning is that 3 is a small enough
set for fast updates, but a large enough one for fault-tolerance. The
cached copies can be updated using a lazy call-back scheme (slow cbcasts).
Viewed in the context of a huge system, this is a primary copy replication
scheme. We would need a clever way to propagate updates, but I see no
inherent reason that this should scale more poorly than other schemes.
One can learn a lot from the papers on the Andrew file system in this
regard -- they do a similar kind of caching.
The choice seems to be between refreshing the remote cached copies
and just invalidating them. Perhaps we can design a mechanism that
mixes these modes depending on the frequency of reference to cached data?
2) Well, this is just a homework problem. I agree, if I were trying to
build a commercial product I might have to worry about these things. But, we
could always jam some sort of gateway into our model after we get the
basic thing straight, so it (again) isn't clear to me that we can't
end up with a heterogeneous solution starting from what I propose.
Who knows, maybe the comp.sys.isis "product" can get to market first
and cover these issues too?
3) This argument was made by someone who actually pointed out how hard
it is to deal with failures, reconfiguration, replication and consistency
and hence said that we _shouldn't_ try to build the thing on ISIS. My
guess is that the writer actually knows a lot about YP and hence realized
that these are important issues, but doesn't know much about ISIS at all
and hence didn't realize that this is what makes the problem attractive
in an ISIS context! (In fact, the comment was prefaced by a remark
that the writer wasn't a regular reader of this newsgroup).
The whole point, in my view, is that our solution will have many of these
properties more or less for free. An "ad-hoc" solution is far less likely
to have these necessary characteristics.
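
Going back to the refresh-versus-invalidate question in point (1), here is a
minimal sketch of one way to mix the two modes. It is plain Python with no
ISIS in it: the callback is a direct, synchronous method call rather than a
lazy cbcast, and the class names, the HOT_THRESHOLD value, and the
"reads since last update" heuristic are all assumptions made for
illustration.

```python
# Assumed cutoff: a cached entry read at least this many times since the
# last update is considered "hot" and gets refreshed instead of invalidated.
HOT_THRESHOLD = 3


class Owner:
    """Primary copy: the small server group that 'owns' a chunk of YP data."""

    def __init__(self):
        self.data = {}
        self.importers = {}  # key -> list of caches currently holding a copy

    def register(self, key, cache):
        holders = self.importers.setdefault(key, [])
        if cache not in holders:
            holders.append(cache)

    def update(self, key, value):
        self.data[key] = value
        # Call back every importer; keep only those that chose to refresh.
        self.importers[key] = [
            c for c in self.importers.get(key, []) if c.callback(key, value)
        ]


class CacheServer:
    """A remote server group caching entries imported from the owner."""

    def __init__(self, owner):
        self.owner = owner
        self.entries = {}
        self.reads = {}  # key -> reads since the last update callback

    def lookup(self, key):
        self.reads[key] = self.reads.get(key, 0) + 1
        if key not in self.entries:  # miss: import from the primary copy
            self.entries[key] = self.owner.data[key]
            self.owner.register(key, self)
        return self.entries[key]

    def callback(self, key, value):
        """Refresh hot entries, invalidate cold ones; True if copy is kept."""
        hot = self.reads.get(key, 0) >= HOT_THRESHOLD
        if hot:
            self.entries[key] = value
        else:
            self.entries.pop(key, None)
        self.reads[key] = 0
        return hot
```

A cache that reads an entry often keeps receiving fresh values, while one
that touched it once simply drops its copy and re-imports on the next lookup.
A real design would also need to decide how the threshold adapts over time.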
I'll add one concern of my own:
4) What about network partitioning? A reasonably scalable YP service
will need to span LANs and hence it isn't reasonable to envision a single
ISIS environment spanning all servers.
My hope is that we can use the ISIS long-haul scheme to get around this.
More comments? Rebuttals from our anonymous readers? (Forward them to
me and, if you so desire, I will be happy to repost without names or
affiliations). Ah, the pleasures of academic life!
Ken