An Update on UNIX*-Related Standards Activities
August, 1990
USENIX Standards Watchdog Committee
Jeffrey S. Haemer <j...@usenix.org>, Report Editor
IEEE 1003.4: Real-time Extensions
Rick Greer <ri...@ism.isc.com> reports on the July 16-20 meeting in
Danvers, Massachusetts:
Most of the action in the July dot four meeting centered around -- you
guessed it -- threads. The current threads draft (1003.4a) came very
close to going to ballot. An overwhelming majority of those present
voted to send the draft to ballot, but there were enough complaints
from the dot fourteen people (that's multiprocessing -- MP) about the
scheduling chapter to hold it back for another three months.
Volunteers from dot fourteen have agreed to work on the scheduling
sections so that the draft can go to ballot after the next meeting, in
October.
Actually, dot four voted to send the draft to ballot after that
meeting in any case, and the resolution was worded in such a way that
a consensus would be required to prevent the draft from going to
ballot in October. Thus, the MP folks have this one final chance to
clean up the stuff that's bothering them -- if it isn't fixed by
October, it will have to be fixed in balloting. Some of us in dot
fourteen felt the best way to fix the thread scheduling stuff was via
ballot objection anyway. Unfortunately, the threads balloting group
is now officially closed, and a number of MP people saw this as their
last chance to make a contribution to the threads effort. The members
of dot fourteen weren't the only ones to be taken by surprise by the
closure of the threads balloting group. It seems that many felt that
it would be a cold day in hell before threads made it to ballot and
weren't following the progress of dot four all that closely. [Editor:
I've bet John Gertwagen a beer that threads will finish balloting
before the rest of dot four. The bet's a long way from being decided,
but I still think I'll get my beer.]
Meanwhile, the ballot resolution process continues for the rest of dot
four, albeit rather slowly. There are a number of problems here, the
biggest being lack of resources. In general, people would much rather
argue about threads than deal with the day-to-day grunt work
associated with the IEEE standards process. [Editor: The meeting will
be in Seattle, Washington. Go. Be a resource.] Many of the technical
reviewers have yet to get started on their sections. Nevertheless,
proposed resolutions to a number of objections were presented and
accepted at the Danvers meeting.
__________
* UNIX(TM) is a Registered Trademark of UNIX System Laboratories in
the United States and other countries.
[Editor: Rick is temporarily unavailable, but Simon Patience of
the OSF has kindly supplied these examples:
The resolved objections were taken from the CRB: replacing the
event mechanism by ``fixed'' signals, replacing the shared memory
mechanism by mmap() and removing semaphore handles from the file
system name space. Replacing events by signals was accepted; no
problem here. Adopting mmap() got a mixed reception, partly
because some people thought you had to take all of mmap(), where
a subset might suffice. The final vote on this was not to ask
the reviewer to adopt mmap(), which may not satisfy the
objectors. I'd guess the balloting group will eventually hold
sway here! Finally, the group accepted abandoning the use of
file descriptors for semaphore handles, but some participants
wanted to keep semaphore names pathnames. The reviewer was sent
off to rethink the implications of this suggestion. ]
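[Editor's sketch, not part of the report: for readers unfamiliar with
the mmap() style of sharing, the following is a minimal illustration
of the small subset under discussion, a shared mapping of one page. It
uses Python's mmap module rather than anything from the 1003.4 draft,
and the function name shared_mapping_demo is made up for this example.
Two mappings of the same file stand in for two cooperating processes.]

```python
import mmap
import tempfile


def shared_mapping_demo(message=b"hello"):
    """Write through one shared mapping and read the bytes back through another."""
    with tempfile.TemporaryFile() as backing:
        # The backing object must be at least as long as the mapping.
        backing.truncate(mmap.PAGESIZE)
        # On Unix the default mapping is shared and read/write.
        writer = mmap.mmap(backing.fileno(), mmap.PAGESIZE)
        reader = mmap.mmap(backing.fileno(), mmap.PAGESIZE)
        try:
            writer[:len(message)] = message
            return bytes(reader[:len(message)])
        finally:
            writer.close()
            reader.close()
```

Nothing beyond "map a region shared" is used here, which is roughly
the point made by those who argued a subset of mmap() might suffice.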
We should be seeing a lot more of this in Seattle. Similar comments
apply to the real-time profile (AEP) work. The AEP group made more
progress over the last three months than the technical reviewers did,
although even that (the AEP progress) was less than I'd hoped. We're
expecting our first official AEP draft in October.
Volume-Number: Volume 21, Number 50
My personal opinion is that *anything* that can go into the file system name
space *should*. That's what makes UNIX UNIX... that it's all visible from the
shell...
---
Peter da Silva. `-_-'
+1 713 274 5180. 'U`
pe...@ferranti.com
Volume-Number: Volume 21, Number 57
> Finally, the group accepted abandoning the use of
> file descriptors for semaphore handles, but some participants
> wanted to keep semaphore names pathnames.
Aargh! Almost everyone realizes that System V IPC is a botch, largely
because it doesn't live in the filesystem. So what does IEEE do?
They take IPC out of the filesystem!
What sane reason could there be to introduce Yet Another Namespace?
--
Chip Salzenberg at Teltronics/TCT <ch...@tct.uucp>, <uunet!pdn!tct!chip>
Volume-Number: Volume 21, Number 65
In article <4...@usenix.ORG> ch...@tct.uucp (Chip Salzenberg) writes:
>From: ch...@tct.uucp (Chip Salzenberg)
>
>> Finally, the group accepted abandoning the use of
>> file descriptors for semaphore handles, but some participants
>> wanted to keep semaphore names pathnames.
>
>Aargh! Almost everyone realizes that System V IPC is a botch, largely
>because it doesn't live in the filesystem. So what does IEEE do?
>They take IPC out of the filesystem!
>
>What sane reason could there be to introduce Yet Another Namespace?
The reason for semaphores not being in the file system is twofold. Some
realtime embedded systems do not have a file system but do want semaphores.
So this allows them to have them without having to bring in the baggage a
file system would entail. Secondly, as far as threads, which are supposed to
be lightweight, are concerned, it allows semaphores to be implemented in user
space rather than forcing them into the kernel for the file system.
A good reason for *not* having IPC handles in the file system is to allow
network IPC to use the same interfaces. If you have IPC handles in the
file system then two machines that have applications trying to communicate
would also have to share at least part of their file system name space.
This is nontrivial to arrange for two machines, so can you imagine the
problem of doing this for 100 (or 1000?) machines?
I am just the messenger for these views and do not necessarily hold them
myself. They were the reasons that came up during the discussion.
Simon.
Simon Patience Phone: (617) 621-8736
Open Software Foundation FAX: (617) 225-2782
11 Cambridge Center Email: s...@osf.org
Cambridge MA 02142 uunet!osf.org!sp
Volume-Number: Volume 21, Number 68
>>>>> On 28 Aug 90 11:58:40 GMT, s...@mysteron.osf.org (Simon Patience) said:
>> Finally, the group accepted abandoning the use of
>> file descriptors for semaphore handles, but some participants
>> wanted to keep semaphore names pathnames.
>>
>Aargh! Almost everyone realizes that System V IPC is a botch, largely
>because it doesn't live in the filesystem. So what does IEEE do?
>They take IPC out of the filesystem!
>
>What sane reason could there be to introduce Yet Another Namespace?
Simon> The reason for semaphores not being in the file system is twofold.
Simon> Some realtime embedded systems do not have a file system but do want
Simon> semaphores...
Simon> A good reason for *not* having IPC handles in the file system is to
Simon> allow network IPC to use the same interfaces.
How about adding non-file-system-based "handles" to an mmap-like interface?
(e.g. shmmap(host,porttype,portnum,addr,len,prot,flags)?) This could
allow the same interface to be used for network and non-network IPC,
without the overhead of a trap for every non-network IPC transaction.
`Scuse me while I don my flame retardant suit... :-)
#include <std/disclaimer.h>
--
Chuck Phillips MS440
NCR Microelectronics Chuck.Phillips%FtCollins.NCR.com
2001 Danfield Ct.
Ft. Collins, CO. 80525 uunet!ncrlnk!ncr-mpd!bach!chuckp
Volume-Number: Volume 21, Number 72
According to s...@mysteron.osf.org (Simon Patience):
>Some realtime embedded systems do not have a file system but do want
>semaphores. So this allows them to have them without having to bring
>in the baggage a file system would entail.
I was under the impression that POSIX was designing a portable Unix
interface. Without a filesystem, you don't have Unix, do you?
Besides, a given embedded system's library could easily emulate a
baby-simple filesystem.
>Secondly, as far as threads, which are supposed to be light weight,
>are concerned it allows semaphores to be implemented in user space
>rather than forcing them into the kernel for the file system.
The desire for user-space support indicates to me that there should be
some provision for non-filesystem (anonymous) IPCs that can be created
and used without kernel intervention. This need does not reduce the
desirability of putting global IPCs in the filesystem.
>A good reason for *not* having IPC handles in the file system is to allow
>network IPC to use the same interfaces.
Filesystem entities can be used to trigger network activity by the
kernel (or its stand-in), even if they do not reside on shared
filesystems.
--
Chip Salzenberg at Teltronics/TCT <ch...@tct.uucp>, <uunet!pdn!tct!chip>
Volume-Number: Volume 21, Number 74
| From: s...@mysteron.osf.org (Simon Patience)
| The reason for semaphores not being in the file system is twofold. Some
| realtime embedded systems do not have a file system but do want semaphores.
| So this allows them to have them without having to bring in the baggage a
| file system would entail.
---
I don't care whether they have something that looks like UNIX filesystem
code or not, or whether they have disk drives or not, but I don't think
it's unreasonable to require them to handle semaphore names as though
they were in a filesystem namespace. Otherwise you're going to end up
with a collection of independent features, each minimally specified so
that it can work without assuming anything else, and anyone with any
sense is going to say "Yuck" and use a real operating system that
provides reasonable integration and a uniform notion of, among other
things, naming.
---
| ... Secondly, as far as threads, which are supposed to
| be light weight, are concerned it allows semaphores to be implemented in user
| space rather than forcing them into the kernel for the file system.
---
Eh? I don't know what the group has proposed since the ballot, but it
would seem that using a filesystem name only makes a difference when you
first specify you're going to be looking at a particular semaphore,
which shouldn't be a critical path event. After that you use a file
descriptor, which I think could be handled in user space about as well
as anything else. In either case you're going to have to go to the
kernel when scheduling is required (when you block or when you release
the semaphore).
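[Editor's sketch, not part of the posting: the claim that the name
matters only at open time can be illustrated with an ordinary file. In
the Python below, the path is resolved once, and every later operation
goes through the descriptor, even after the name has been removed. The
file name "sem" and the function name are assumptions made up for the
example; this is a plain file, not a 1003.4 semaphore.]

```python
import os
import tempfile


def name_then_descriptor():
    """Resolve a name once, then work purely through the descriptor."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "sem")
    # The only step that walks the file system name space.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    # The name can even disappear; the descriptor stays usable.
    os.unlink(path)
    os.rmdir(directory)
    try:
        os.write(fd, b"1")
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 1)
    finally:
        os.close(fd)
```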
---
| A good reason for *not* having IPC handles in the file system is to allow
| network IPC to use the same interfaces. If you have IPC handles in the
| file system then two machines who have applications trying to communicate
| would also have to have at least part of their file system name space to
| be shared. This is non trivial to arrange for two machines so can you
| imagine the problem of doing this for 100 (or 1000?) machines.
---
You're going to have to synchronize *some* namespace anyway, why
shouldn't it be a piece of the filesystem namespace?
A consistent approach to naming and name resolution for ALL global
objects should be one of the basic requirements for any new POSIX (or
UNIX!) functionality. We should have *one* namespace so that we can
write general tools that only need to know about one namespace.
--
scott preece
motorola/mcd urbana design center 1101 e. university, urbana, il 61801
uucp: uunet!uiucuxc!udc!preece, arpa: pre...@urbana.mcd.mot.com
Volume-Number: Volume 21, Number 75
One obvious (if a little wishy-washy) solution is to not specify
whether the namespaces are the same. That is, applications are
required to use a valid path, and have to be prepared for things like
unwritable directories, but implementations are not required to check
for those things.
This makes sense in light of the fact that there seems to be a general
lack of consensus about which is best. Even though there is existing
practice for both ways of doing things, it may be premature to
standardize either behavior now.
Volume-Number: Volume 21, Number 76
1003.13 is working on real-time AEP's, including one for small embedded
real-time systems which does not have a file system. So the POSIX answer
is yes, without the filesystem you still can have a POSIX-compliant
interface.
Doug Jensen
Concurrent Computer Corp.
e...@westford.ccur.com
Volume-Number: Volume 21, Number 78
>>>>> On 24 Aug 90 03:28:06 GMT, pe...@ficc.ferranti.com (peter da silva) said:
peter> My personal opinion is that *anything* that can go into the file system name
peter> space *should*. That's what makes UNIX UNIX... that it's all visible from the
peter> shell...
I'm not sure which Unix you've been running for the past five or more
years, but a lot of stuff doesn't live in the file system name space
under various BSD derived systems, nor do the networking types believe
it belongs there. IMHO neither does a process handle, nor a
semaphore, and don't even talk to me about "named pipes" as an IPC
mechanism.
(Gee, I guess reasonable men might differ on what belongs in the name
space ;-)
Marty
--
Martin Fouts
UUCP: ...!pyramid!garth!fouts (or) uunet!ingr!apd!fouts
ARPA: apd!fo...@ingr.com
PHONE: (415) 852-2310 FAX: (415) 856-9224
MAIL: 2400 Geng Road, Palo Alto, CA, 94303
Moving to Montana; Goin' to be a Dental Floss Tycoon.
- Frank Zappa
Volume-Number: Volume 21, Number 83
In article <4...@usenix.ORG> fo...@bozeman.bozeman.ingr (Martin Fouts) writes:
>I'm not sure which Unix you've been running for the past five or more
>years, but a lot of stuff doesn't live in the file system name space
>under various BSD derived systems, nor do the networking types believe
>it belongs there.
Excuse me, but the "networking types" I talk to believe that sockets
were a botch and that network connections definitely DO belong within
a uniform UNIX "file" name space. Peter was quite right to note that
this is an essential feature of UNIX's design. In fact there are UNIX
implementations that do this right; 4BSD is simply not among them yet.
Volume-Number: Volume 21, Number 85
In article <4...@usenix.ORG> fo...@bozeman.bozeman.ingr (Martin Fouts) writes:
> > My personal opinion is that *anything* that can go into the file system
> > name space *should*. That's what makes UNIX UNIX... that it's all visible
> > from the shell...
> I'm not sure which Unix you've been running for the past five or more
> years, but a lot of stuff doesn't live in the file system name space
> under various BSD derived systems,
Yes, and there's even more stuff in System V that doesn't live in that
name space. In both cases it's *wrong*.
> nor do the networking types believe
> it belongs there.
Some more details on this subject would be advisable. I'm aware that not
everything *can* go in the file system name space, by the way...
> IMHO neither does a process handle, nor a
> semaphore, and don't even talk to me about "named pipes" as an IPC
> mechanism.
An active semaphore can be implemented any way you want, but it should
be represented by an entry in the name space. The same goes for process
handles and so on.
Named pipes are an inadequate mechanism for much IPC, but they work quite
well for many simple cases. If you're looking at them as some sort of
paragon representing the whole concept, you're sadly mistaken.
Anyway... what is it that makes "dev/win" more worthy of having an entry
in "/dev" than "dev/socket"?
--
Peter da Silva. `-_-'
+1 713 274 5180. 'U`
pe...@ferranti.com
Volume-Number: Volume 21, Number 87
According to fo...@bozeman.bozeman.ingr (Martin Fouts):
>I'm not sure which Unix you've been running for the past five or more
>years, but a lot of stuff doesn't live in the file system name space ...
The absence of sockets (except UNIX domain), System V IPC, etc. from
the file system is, in the opinion of many, a bug. It is a result of
Unix being extended by people who do not understand Unix.
Research Unix, which is the result of continued development by the
creators of Unix, did not take things out of the filesystem. To the
contrary, it put *more* things there, including processes (via the
/proc pseudo-directory).
It is true that other operating systems get along without devices,
IPC, etc. in their filesystems. That's fine for them; but it's not
relevant to Unix. Unix programming has a history of relying on the
filesystem to take care of things that other systems handle as special
cases -- devices, for example. The idea that devices can be files but
TCP/IP sockets cannot runs counter to all Unix experience.
The reason why I continue this discussion here, in comp.std.unix, is
that many Unix programmers hope that the people in the standardization
committees have learned from the out-of-filesystem mistake, and will
rectify it.
--
Chip Salzenberg at Teltronics/TCT <ch...@tct.uucp>, <uunet!pdn!tct!chip>
Volume-Number: Volume 21, Number 89
I believe in putting lots of interesting stuff in the file system name
space but I don't believe that semaphores belong there. The reason
I don't want to put semaphores in the name space is the same reason
I don't want to put my program variables in the name space: I want
to have lots of them, I want to create and destroy them very quickly
and I want to operate on them even more quickly. In other words, the
granularity is wrong.
The purpose of a semaphore is to synchronize actions on an object.
What kinds of objects might one want to synchronize? Generally the
objects are either OS supplied like devices or files, or user defined
data structures. The typical way of synchronizing files and devices
is to use advisory locks or the "exclusive use" mode on the device.
The more difficult case and the one for which semaphores were invented,
and later added to Unix, is that of synchronizing user data structures.
In Unix, user data structures may live either in a process's private
memory or in a shared memory segment. In both cases there are probably
many different data structures in that memory and many of these data
structures may need to be synchronized. For maximum concurrency the
programmer may wish to synchronize each data structure with its own
semaphore. In many applications these data structures may come and
go very quickly and the expense of creating a semaphore to synchronize
the data can be an important factor in the performance of the application.
It thus seems more natural to allow semaphores to be efficiently
allocated along with the data that they are designed to synchronize.
That is, allow them to be allocated in a process's private address
space or in a mapped shared memory segment. A shared memory segment
is a much larger grain object, creating, destroying and mapping them
can be much more expensive than creating, destroying or using a
semaphore and these segments are generally important enough to the
application to have sensible names. Thus putting a shared memory
segment in the name space seems reasonable.
For example, a data base library may use a shared memory segment named
/usr/local/lib/dbm/personnel/bufpool to hold the buffer pool for the
personnel department's data base. The data base library would map
the buffer pool into each client's address space allowing many data
base client programs to efficiently access the data base. Each page
in the buffer pool and each transaction would have its own set of
semaphores used to synchronize access to the page in the pool or the
state of a transaction. Giving the buffer pool a name is no problem,
but giving each semaphore a name is much more of a hassle.
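[Editor's sketch, not part of the posting: the granularity argument
can be shown in a few lines. In the Python below the pool carries the
one name, while each page carries its own unnamed lock allocated along
with the page data. The class names Page and BufferPool are assumptions
for illustration, and threads within one process stand in for the
separate client processes sharing a mapped segment.]

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Page:
    """One buffer page with its own unnamed lock stored alongside the data."""
    data: bytearray = field(default_factory=lambda: bytearray(16))
    lock: threading.Lock = field(default_factory=threading.Lock)


class BufferPool:
    """Only the pool as a whole is given a name."""

    def __init__(self, name, npages):
        self.name = name
        self.pages = [Page() for _ in range(npages)]

    def bump(self, pageno):
        """Increment the first byte of a page under that page's own lock."""
        page = self.pages[pageno]
        with page.lock:
            page.data[0] = (page.data[0] + 1) % 256
            return page.data[0]
```

Creating a page creates its lock at no extra naming cost, which is the
property the posting asks for.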
[Aside: Another way of structuring such a data base library is as
an RPC style multi-threaded server. This allows access to the data
base from remote machines and allows easier solutions to the security
and failure problems inherent in the shared memory approach. However
the shared memory approach has a major performance advantage for systems
that do not support ultra-fast RPCs. Another approach is to run the
library in an inner mode. (Unix has one inner mode called the kernel,
VMS has 3, Multics had many.) This solves the security and failure
problems of the shared segments but it is generally difficult for mere
mortals to write their own inner mode libraries.]
One other issue that may cause one to want to unify all objects in
the file system, at least at the level of using file descriptors to
refer to all objects if not going so far as to put all objects in the
name space, is the fact that single threaded programming is much nicer
if there is a single primitive that will wait for ANY event that the
process may be interested in (e.g. the 4.2BSD select call.) This call
is useful if one is to write a single threaded program that doesn't
busy wait when it has nothing to do but also won't block when an event
of interest has occurred. With the advent of multi-threaded programming
the single multi-way wait primitive is no longer needed, since one can
instead create a separate thread for each event of interest, which blocks
until the event occurs and then processes it. Multi-way waiting remains
a problem if single threaded programs are going to get maximum use out
of the facility.
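[Editor's sketch, not part of the posting: a minimal multi-way wait in
the style of the select call, written with Python's select module. Two
pipes stand in for two event sources; the helper names first_ready and
demo are made up for this example.]

```python
import os
import select


def first_ready(read_fds, timeout=1.0):
    """Block until any descriptor is readable; return the ready ones."""
    ready, _, _ = select.select(read_fds, [], [], timeout)
    return ready


def demo():
    """Wait on two pipes when only one of them has data pending."""
    quiet_r, quiet_w = os.pipe()
    busy_r, busy_w = os.pipe()
    try:
        os.write(busy_w, b"event")
        ready = first_ready([quiet_r, busy_r])
        # Report, for each ready descriptor, whether it is the busy pipe.
        return [fd == busy_r for fd in ready]
    finally:
        for fd in (quiet_r, quiet_w, busy_r, busy_w):
            os.close(fd)
```

The single call works only because both event sources are descriptors.
An object that is not a descriptor would need a thread of its own, or
a second kind of wait.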
I've spoken to a number of people in 1003.4 about these ideas. I am
not sure whether they played any part in their decision.
Just to prove that I am a pro-name space kind of guy, I am currently
working on and using an experimental file system called Echo that
integrates the Internet Domain name service for access to global names,
our internal higher performance name service for highly available
naming of arbitrary objects, our experimental fault tolerant, log based,
distributed file service with read/write consistency and universal
write back for file storage, and auto-mounting NFS for accessing other
systems.
Objects that are named in our name space currently include:
hosts, users, groups, network servers, network services (a fault
tolerant network service is generally provided by several servers),
and every version of any source or object file known by our source
code control system
Some of these objects are represented in the name space as a directory
with auxiliary information, mount points or files stored underneath.
This subsumes much of the use of special files like /etc/passwd,
/etc/services and the like in traditional Unix. Processes are not
currently in the name space, but they will/should be. (Just a "simple
matter of programming.")
For example /-/com/dec/src/user/swart/home/.draft/6.draft is the name
of the file I am currently typing, /-/com/dec/src/user/swart/shell
is a symbolic link to my shell, /-/com/dec/prl/perle/nfs/bin/ls is
the name of the "ls" program on a vanilla Ultrix machine at DEC's Paris
Research Lab.
[Yes, I know we are using "/-/" as the name of the super root and not
either "/../" or "//" as POSIX mandates, but those other strings are
so uhhgly and /../ is especially misleading in a system with multiple
levels of super root, e.g. on my machine "cd /; pwd" types
/-/com/dec/src.]
Things that we don't put in the name space are objects that are passed
within or between processes by 'handle' rather than by name. For
example, pipes created with the pipe(2) call, need not be in the name
space. [At a further extreme, pipes for intra-process communication
don't even involve calling the kernel.]
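[Editor's sketch, not part of the posting: the distinction between
"by handle" and "by name" is easy to see with pipes. In the Python
below, a pipe from pipe(2) is reachable only through its descriptors,
while a FIFO from mkfifo has a directory entry that any process can
look up. The function name and the FIFO name "fifo" are assumptions
for the example.]

```python
import os
import stat
import tempfile


def named_vs_anonymous():
    """Return (found by name, found by handle) for a FIFO and a pipe."""
    read_end, write_end = os.pipe()  # no entry in any directory
    try:
        with tempfile.TemporaryDirectory() as directory:
            path = os.path.join(directory, "fifo")
            os.mkfifo(path)  # this one lives in the name space
            by_name = stat.S_ISFIFO(os.stat(path).st_mode)
        by_handle = stat.S_ISFIFO(os.fstat(read_end).st_mode)
        return by_name, by_handle
    finally:
        os.close(read_end)
        os.close(write_end)
```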
I personally don't believe in overloading file system operations on
objects for which the meaning is tenuous (e.g. "unlink" => "kill -TERM"
on objects of type process); we tend to define new operations for
manipulating objects of a new type. But that is even more of a
digression than I wanted to get into!
Sorry for the length of this message, I seem to have gotten carried
away.
Happy trails,
Garret Swart
DEC Systems Research Center
130 Lytton Avenue
Palo Alto, CA 94301
(415) 853-2220
decwrl!swart.UUCP or sw...@src.dec.com
Volume-Number: Volume 21, Number 91
Date: 7 Sep 90 15:23:19 GMT
From: ch...@tct.uucp (Chip Salzenberg)
[Most of quoted message deleted. -mod]
It is true that other operating systems get along without devices,
IPC, etc. in their filesystems. That's fine for them; but it's not
relevant to Unix. Unix programming has a history of relying on the
filesystem to take care of things that other systems handle as special
cases -- devices, for example....
What defines `true Unix?' Don't forget that Multics had all this and
more in the filesystem; this stuff was REMOVED when Unix was written.
Is this `continued development by the creators of Unix' just going
back to what Unix rejected 20 years ago?
Or for a pun for Multics fans: what goes around comes around...
Volume-Number: Volume 21, Number 92
Other operating systems have learned from UNIX in this respect, in fact!
AmigaOS puts all manner of interesting things in the file name space,
including pipes (PIPE:name), windows (CON:Left/Top/Width/Height/Title/Flags),
and the environment (ENV:varname). Other things have been left out but are
being filled in by users (it's relatively easy to write device handlers on
AmigaOS). There are some really odd things like PATH:. This can be opened
as a file and looks like a list of directory names, or used as a directory
in which case it looks like the concatenation of all the named directories.
--
Peter da Silva. `-_-'
+1 713 274 5180. 'U`
pe...@ferranti.com
Volume-Number: Volume 21, Number 93
In article <4...@usenix.ORG> sw...@src.dec.com (Garret Swart) writes:
>I believe in putting lots of interesting stuff in the file system name
>space but I don't believe that semaphores belong there. The reason
>I don't want to put semaphores in the name space is the same reason
>I don't want to put my program variables in the name space: I want
>to have lots of them, I want to create and destroy them very quickly
>and I want to operate on them even more quickly. In other words, the
>granularity is wrong.
There is no requirement that you bind every semaphore handle to
a file system name, only that there be the ability to take a semaphore
handle and create a file system name, or to take a file system name
entry and retrieve a semaphore handle. This would permit you to
rapidly create and destroy semaphores for private use, as well as
provide an external interface for public use.
There is no restriction in either case as to the speed at which you
can perform operations on the handle - file descriptors are
associated with file system name entries in many cases and I've
not seen anyone complain that file descriptors slow the system
down.
--
John F. Haugh II UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832 Domain: j...@rpp386.cactus.org
"SCCS, the source motel! Programs check in and never check out!"
-- Ken Thompson
Volume-Number: Volume 21, Number 96
In article <4...@usenix.ORG>, sw...@src.dec.com (Garret Swart) writes:
> I believe in putting lots of interesting stuff in the file system name
> space but I don't believe that semaphores belong there. The reason
> I don't want to put semaphores in the name space is the same reason
> I don't want to put my program variables in the name space: I want
> to have lots of them, I want to create and destroy them very quickly
> and I want to operate on them even more quickly. In other words, the
> granularity is wrong.
So why not choose a different granularity? Have the thing that goes in
the file system name space be an (extensible) *array* of semaphores.
To specify a semaphore, one would use a (descriptor, index) pair.
To create a semaphore in a semaphore group, just use it.
If you want to have a semaphore associated with a data structure in
mapped memory, just use a lock on the appropriate byte range of the
mapped file.
(Am I hopelessly confused, or aren't advisory record locks *already*
equivalent to binary semaphores? Trying to lock a range of bytes in
a file is just a multi-wait, no? Why do we need two interfaces? (I
can see that two or more _implementations_ behind the interface might
be a good idea, but that's another question.))
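[Editor's sketch, not part of the posting: the (descriptor, index)
idea can be tried today with advisory record locks. The Python below
treats byte N of a file as binary semaphore N, using fcntl.lockf. The
helper names are made up for this example. Because record locks belong
to a process, contention is shown by forking a child that attempts the
same lock.]

```python
import fcntl
import os
import tempfile


def try_lock(fd, index):
    """Non-blocking acquire of 'semaphore' number index: lock one byte."""
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, index, os.SEEK_SET)
        return True
    except OSError:
        return False


def unlock(fd, index):
    """Release 'semaphore' number index."""
    fcntl.lockf(fd, fcntl.LOCK_UN, 1, index, os.SEEK_SET)


def child_can_lock(path, index):
    """Fork a child that tries the lock and report whether it succeeded."""
    pid = os.fork()
    if pid == 0:
        fd = os.open(path, os.O_RDWR)
        # Exit without running the parent's cleanup handlers.
        os._exit(0 if try_lock(fd, index) else 1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


def demo():
    """Parent holds byte 3; a child is refused byte 3 but granted byte 4."""
    with tempfile.NamedTemporaryFile() as f:
        held = try_lock(f.fileno(), 3)
        busy = child_can_lock(f.name, 3)
        free = child_can_lock(f.name, 4)
        unlock(f.fileno(), 3)
        after = child_can_lock(f.name, 3)
        return held, busy, free, after
```

Every acquire and release here is a system call, which is the cost the
earlier posting wanted to avoid for fine-grained semaphores.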
--
Heuer's Law: Any feature is a bug unless it can be turned off.
Volume-Number: Volume 21, Number 97
According to gu...@Cygnus.COM (David Vinayak Wallace):
>Is this `continued development by the creators of Unix' just going
>back to what Unix rejected 20 years ago?
They threw away what wouldn't fit. Then they added features, but
piece by piece, and only as they observed a need.
This cycle has started again with Plan 9, which borrows heavily from
Unix -- almost everything lives in the filesystem -- but which is in
fact a brand new start.
Unix owes much to Multics, and we can learn from it, but we needn't be
driven by it.
--
Chip Salzenberg at Teltronics/TCT <ch...@tct.uucp>, <uunet!pdn!tct!chip>
Volume-Number: Volume 21, Number 102
>>>>> On 7 Sep 90 15:23:19 GMT, ch...@tct.uucp (Chip Salzenberg) said:
Chip> According to fo...@bozeman.bozeman.ingr (Martin Fouts):
>I'm not sure which Unix you've been running for the past five or more
>years, but a lot of stuff doesn't live in the file system name space ...
Chip> The absence of sockets (except UNIX domain), System V IPC, etc. from
Chip> the file system is, in the opinion of many, a bug. It is a result of
Chip> Unix being extended by people who do not understand Unix.
^-------------------------------^
My aren't we superior. (;-) At one time, I believed that sockets
belonged in the filesystem name space. I spent a long time arguing
this point with members of the networking community before they
convinced me that certain transient objects do not belong in that name
space. (See below)
Chip> Research Unix, which is the result of continued development by the
Chip> creators of Unix, did not take things out of the filesystem. To the
Chip> contrary, it put *more* things there, including processes (via the
Chip> /proc pseudo-directory).
The value of proc in the file system is debatable. Certain debugging
tools are easier to hang on an fcntl; certain others are not. However, the
presence of the proc file system is not a strong argument for the
inclusion of other features in the file system.
Chip> It is true that other operating systems get along without devices,
Chip> IPC, etc. in their filesystems. That's fine for them; but it's not
Chip> relevant to Unix. Unix programming has a history of relying on the
Chip> filesystem to take care of things that other systems handle as special
Chip> cases -- devices, for example. The idea that devices can be files but
Chip> TCP/IP sockets cannot runs counter to all Unix experience.
Unix programming has a history of using the filesystem for some things
and not using it for others. For example, I can demonstrate a
semantic under which it is possible to put the time of day clock into
the file system and reference it by opening, e.g., the /dev/timeofday
file. Each time I read from that file, I would get the current time.
Via fcntls, I could extend this to handle timer functions. It wasn't
done in Unix. (I've done similar things in other OSs I've designed,
though.)
The whole point of the response which you partially quoted was to
remind the poster I was responding to that not all functions which
might have been placed in the filesystem automatically have been.
Chip> The reason why I continue this discussion here, in comp.std.unix, is
Chip> that many Unix programmers hope that the people in the standardization
Chip> committees have learned from the out-of-filesystem mistake, and will
Chip> rectify it.
Chip> --
The reason I respond is that it is not automatically safe to assume
that something belongs in the file system because something else is
already there. There is also an explicit problem not mentioned in
this discussion which is the distinction between filesystem name space
and filesystem semantics. Sometimes there are objects which would be
reasonable to treat with filesystem semantics for which there is no
reasonable mechanism for introducing them into the filesystem name
space. Because of the way network connections are made, I have been
convinced by networking experts (who are familiar with the "Unix
style") that the filesystem namespace does not have a good semantic
match for the network name space.
Chip> Chip Salzenberg at Teltronics/TCT <ch...@tct.uucp>, <uunet!pdn!tct!chip>
Chip> Volume-Number: Volume 21, Number 89
Marty
--
Martin Fouts
UUCP: ...!pyramid!garth!fouts (or) uunet!ingr!apd!fouts
ARPA: apd!fo...@ingr.com
PHONE: (415) 852-2310 FAX: (415) 856-9224
MAIL: 2400 Geng Road, Palo Alto, CA, 94303
Moving to Montana; Goin' to be a Dental Floss Tycoon.
- Frank Zappa
Volume-Number: Volume 21, Number 114
In article <5...@usenix.ORG> fo...@bozeman.bozeman.ingr (Martin Fouts) writes:
> At one time, I believed that sockets
> belonged in the filesystem name space. I spent a long time arguing
> this point with members of the networking community before they
> convinced me that certain transient objects do not belong in that name
> space.
In contrast, I've found it quite easy to get people to agree that
practically every object should be usable as an open *file*. The beauty
and power of UNIX is the abstraction of files---not filesystems. I'd say
that the concept of an open file descriptor is one of the most important
reasons that UNIX-style operating systems are taking over the world.
ch...@tct.uucp (Chip Salzenberg) writes:
> The reason why I continue this discussion here, in comp.std.unix, is
> that many Unix programmers hope that the people in the standardization
> committees have learned from the out-of-filesystem mistake, and will
> rectify it.
I am a UNIX programmer who strongly hopes that standards committees will
never make the mistake of putting network objects into the filesystem.
Although the semantics of read() and write() fit network connections
perfectly, the semantics of open() most certainly do not. I will readily
support passing network connections as file descriptors. I will fight
tooth and nail to make sure that they need not be passed as filenames.
---Dan
Volume-Number: Volume 21, Number 115
According to brn...@kramden.acf.nyu.edu (Dan Bernstein):
>The beauty and power of UNIX is the abstraction of files---
>not filesystems.
The filesystem means that anything worth reading or writing can be
accessed by a name in one large hierarchy. It means a consistent
naming scheme. It means that any entity can be opened, listed,
renamed or removed.
Both the filesystem and the file descriptor are powerful abstractions.
Do not make the mistake of minimizing either one's contribution to the
power and beauty of UNIX.
--
Chip Salzenberg at Teltronics/TCT <ch...@tct.uucp>, <uunet!pdn!tct!chip>
Volume-Number: Volume 21, Number 118
According to fo...@bozeman.bozeman.ingr (Martin Fouts):
>According to ch...@tct.uucp (Chip Salzenberg):
>> Research Unix [...] put *more* things [in the filesystem],
>> including processes (via the /proc pseudo-directory).
>
>The value of proc in the file system is debatable. Certain debugging
>tools are easier to hang on an fcntl; certain others are not.
With /proc, some things are much easier. (Getting a list of all
active pids, for example.) Nothing, however, is harder. A big win.
>However, the presence of the proc file system is not a strong argument
>for the inclusion of other features in the file system.
I disagree. I consider it an excellent example of how the designers
of Unix realize that all named objects potentially visible to more
than one process belong in the filesystem namespace.
>Unix programming has a history of using the filesystem for some things
>and not using it for others. For example, I can demonstrate a
>semantic under which it is possible to put the time of day clock into
>the file system ...
Of course. But in the absence of remotely mounted filesystems --
which V7 Unix was not designed to support -- there is only one time of
day, so it needs no name. (I wouldn't be surprised if Plan 9 has a
/dev/timeofday, however.)
>... not all functions which might have been placed in the
>filesystem automatically have.
This observation is correct. But it is clear that the designers of
Research Unix have used the filesystem for everything that needs a
name, and they continue to do so. Their work asks, "Why have multiple
namespaces?" Plan 9 asks the question again, and with a megaphone.
>Because of the way network connections are made, I have been
>convinced by networking experts (who are familiar with the "Unix
>style") that the filesystem namespace does not have a good semantic
>match for the network name space.
Carried to its logical conclusion, this argument would invalidate
special files and named pipes, since they also lack a "good semantic
match" with flat files. In fact, the only entities with a "good
semantic match" for flat files are -- you guessed it -- flat files.
So, how do we program in such a system? We use its elegant interface
-- or should I say "interfaces"? Plain files, devices, IPCs, and
network connections each have a semantically accurate interface, which
unfortunately makes it different from all others.
This is progress? "Forward into the past!"
--
Chip Salzenberg at Teltronics/TCT <ch...@tct.uucp>, <uunet!pdn!tct!chip>
Volume-Number: Volume 21, Number 119
In article <5...@usenix.ORG> fo...@bozeman.bozeman.ingr (Martin Fouts) writes:
> My aren't we superior. (;-) At one time, I believed that sockets
> belonged in the filesystem name space. I spent a long time arguing
> this point with members of the networking community before they
> convinced me that certain transient objects do not belong in that name
> space. (See below)
You mean things that don't operate like a single bidirectional stream, like
pipes? It's funny that the sockets that *do* behave that way are not in the
file system, while UNIX-domain sockets (which have two ends on the local box)
are.
> Unix programming has a history of using the filesystem for some things
> and not using it for others.
UNIX programming has a history of using whatever ad-hoc hacks were needed
to get things working. It's full of evolutionary dead-ends... some of which
have been discarded (multiplexed files) and some of which have been patched
up and overloaded (file protection bits). But where things have moved closer
to the underlying principles (everything is a file, for example) it's become
the better for it.
> Sometimes there are objects which would be
> reasonable to treat with filesystem semantics for which there is no
> reasonable mechanism for introducing them into the filesystem name
> space.
This seems reasonable, but the rest is a pure argument from authority.
Could you repeat these arguments for the benefit of those of us who don't
have the good fortune to know these networking experts you speak of?
[ Everyone involved in this discussion, please try to keep it in a
technical, not a personal, vein. -mod ]
--
Peter da Silva. `-_-'
+1 713 274 5180. 'U`
pe...@ferranti.com
Volume-Number: Volume 21, Number 127
In article <5...@usenix.ORG> ch...@tct.uucp (Chip Salzenberg) writes:
> According to brn...@kramden.acf.nyu.edu (Dan Bernstein):
> >The beauty and power of UNIX is the abstraction of files---
> >not filesystems.
> Both the filesystem and the file descriptor are powerful abstractions.
On the contrary: Given file descriptors, the filesystem is an almost
useless abstraction.
Programs fall into two main classes. Some (such as diff) take a small,
fixed number of filename arguments and treat each one specially. They
become both simpler and more flexible if they instead use file
descriptors. I'll propose multitee as an example of this.
Others (such as sed or compress) take many filenames and perform some
action on each file in turn. They also become both simpler and more
flexible if they instead take input and output from a couple of file
descriptors, perhaps with a simple protocol for indicating file
boundaries. I'll propose the new version of filterfile as a
demonstration of how this can simplify application development.
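One way to picture "a simple protocol for indicating file boundaries"
is length-prefixed framing over a single descriptor. This is only an
illustrative sketch: the names put_file and get_file and the 4-byte
big-endian length header are assumptions made here, and nothing in the
post says that filterfile works this way.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

/* Read up to `len` bytes, stopping early only at end of input.
 * Returns the number of bytes read, or -1 on error. */
static ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, (char *)buf + got, len - got);
        if (n == 0)
            break;
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}

/* Write exactly `len` bytes. Returns 0 on success, -1 on error. */
static int write_full(int fd, const void *buf, size_t len)
{
    size_t put = 0;
    while (put < len) {
        ssize_t n = write(fd, (const char *)buf + put, len - put);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        put += (size_t)n;
    }
    return 0;
}

/* Assumed framing: a 4-byte big-endian length, then that many bytes. */
int put_file(int fd, const void *data, uint32_t len)
{
    unsigned char hdr[4] = {
        (unsigned char)(len >> 24), (unsigned char)(len >> 16),
        (unsigned char)(len >> 8),  (unsigned char)len
    };
    if (write_full(fd, hdr, sizeof hdr) < 0)
        return -1;
    return write_full(fd, data, len);
}

/* Returns 1 and sets *len when a file was read, 0 at a clean end of
 * stream, and -1 on error, truncation, or a file larger than `cap`. */
int get_file(int fd, void *buf, uint32_t cap, uint32_t *len)
{
    unsigned char hdr[4];
    ssize_t n = read_full(fd, hdr, sizeof hdr);
    if (n == 0)
        return 0;
    if (n != (ssize_t)sizeof hdr)
        return -1;
    uint32_t want = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
                    ((uint32_t)hdr[2] << 8)  |  (uint32_t)hdr[3];
    if (want > cap)
        return -1;
    if (read_full(fd, buf, want) != (ssize_t)want)
        return -1;
    *len = want;
    return 1;
}
```

A filter built on these would loop on get_file() over its input
descriptor and call put_file() on its output descriptor, never touching
a filename.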
In both cases, the application need know absolutely nothing about the
filesystem. A few utilities deal with filenames---shell redirection and
cat. A few utilities do the same for network connections---authtcp and
attachport. A few utilities do the same for pipes---the shell's piping.
But beyond these two or three programs per I/O object, the filesystem
contributes *nothing* to the vast majority of applications.
There is one notable exception. Some programs depend on reliable,
static, local or virtually local storage, usually for what amounts to
interprocess communication. (login needs /etc/passwd. cron reads crontab.
And so on.) This is exactly what filesystems were designed for, and a
program that wants reliable, static, local storage is perfectly within
its rights to demand the sensible abstraction we call a filesystem.
Most applications that use input and output, though, don't care that
it's reliable or static or local. For them, the filesystem is pointless.
Many of us are convinced that open() and rename() and unlink() and so on
are an extremely poor match for unreliable or dynamic or remote I/O. We
also see the sheer uselessness of forcing all I/O into the filesystem.
You must convince us that open() makes sense for everything that might
be a file descriptor, and that it provides a real benefit for future
applications, before you destroy what we see as the beauty and power of
UNIX.
---Dan
Volume-Number: Volume 21, Number 128
In article <5...@usenix.ORG> pe...@ficc.ferranti.com (Peter da Silva) writes:
> But where things have moved closer
> to the underlying principles (everything is a file, for example) it's become
> the better for it.
The underlying principle is that everything is a file *descriptor*.
> > Sometimes there are objects which would be
> > reasonable to treat with filesystem semantics for which there is no
> > reasonable mechanism for introducing them into the filesystem name
> > space.
> This seems reasonable, but the rest is a pure argument from authority.
> Could you repeat these arguments for the benefit of those of us who don't
> have the good fortune to know these networking experts you speak of?
The filesystem fails to deal with many (most?) types of I/O that aren't
reliable, static, and local. Here's an example: In reality, you initiate
a network stream connection in two stages. First you send off a request,
which wends its way through the network. *Some time later*, the response
arrives. Even if you aren't doing a three-way handshake, you must wait a
long time (in practice, up to several seconds on the Internet) before
you know whether the open succeeds.
In the filesystem abstraction, you open a filename in one stage. You
can't do anything between initiating the open and finding out whether or
not it succeeds. This just doesn't match reality, and it places a huge
restriction on programs that want to do something else while they
communicate.
You can easily construct other examples, but one should be enough to
convince you that open() just isn't sufficiently general for everything
that you might read() or write().
---Dan
Volume-Number: Volume 21, Number 129
In article <5...@usenix.ORG> ch...@tct.uucp (Chip Salzenberg) writes:
> According to fo...@bozeman.bozeman.ingr (Martin Fouts):
> >However, the presence of the proc file system is not a strong argument
> >for the inclusion of other features in the file system.
> I disagree. I consider it an excellent example of how the designers
> of Unix realize that all named objects potentially visible to more
> than one process belong in the filesystem namespace.
I disagree. I consider it an excellent example of how the designers of
UNIX realize that all *reliable*, *static*, *local* (or virtually local)
I/O objects potentially visible to more than one process belong in the
filesystem namespace.
/dev/proc, for example, is reliable---there's no chance of arbitrary
failure. It's static---processes have inertia, and stick around until
they take the positive action of exit()ing. And it's local---you don't
have an arbitrary delay before seeing the information. So it's a
perfectly fine thing to include in the filesystem without hesitation.
Objects that aren't reliable, or aren't static, or aren't local, also
aren't necessarily sensible targets of an open(). Some of them might fit
well, but each has to be considered on its own merits.
> So, how do we program in such a system? We use its elegant interface
> -- or should I say "interfaces"? Plain files, devices, IPCs, and
> network connections each have a semantically accurate interface, which
> unfortunately makes it different from all others.
The single UNIX interface is the file descriptor. You can read() or
write() reasonable I/O objects through file descriptors. Very few
programs---the shell is a counterexample---need to worry about what it
takes to set up those file descriptors. Very few programs---stty is a
counterexample---need to know the ioctl()s or other functions that
control the I/O more precisely. What is your complaint?
---Dan
Volume-Number: Volume 21, Number 136
In article <5...@usenix.ORG> he...@zoo.toronto.edu (Henry Spencer) writes:
> In article <5...@usenix.ORG> brn...@kramden.acf.nyu.edu (Dan Bernstein) writes:
> >In the filesystem abstraction, you open a filename in one stage. You
> >can't do anything between initiating the open and finding out whether or
> >not it succeeds. This just doesn't match reality, and it places a huge
> >restriction on programs that want to do something else while they
> >communicate.
> Programs that want to do two things at once should use explicit parallelism,
> e.g. some sort of threads facility. In every case I've seen, this yielded
> vastly superior code, with clearer structure and better error handling.
I agree that programs that want to do two things at once should use
threads. However, a program that sends out several connection requests
is *not*, in fact, doing several things at once. open() forces it into
an unrealistic local model; surely you agree that this is not a good
semantic match for what actually goes on.
That example shows what goes wrong when locality disappears. As another
example, NFS (as it is currently implemented) shows what goes wrong when
reliability disappears. Have you ever run ``df'' on a Sun, only to have
it hang and lock up your terminal? Your process is stuck in kernel mode,
waiting for an NFS server that may be flooded with requests or may have
crashed. Programs that use the filesystem for IPC assume that their
files won't just disappear; this isn't true under NFS.
I am not saying that networked filesystems are automatically a bad
thing. Quite the contrary: a distributed filesystem with caching and
other forms of replication can easily be local and reliable, and I'll
gladly see standard UNIX make provisions for it. But something that's
not local, or not reliable, or not static, is also not necessarily
appropriate for the filesystem.
---Dan
Volume-Number: Volume 21, Number 132
In article <5...@usenix.ORG> brn...@kramden.acf.nyu.edu (Dan Bernstein) writes:
>In the filesystem abstraction, you open a filename in one stage. You
>can't do anything between initiating the open and finding out whether or
>not it succeeds. This just doesn't match reality, and it places a huge
>restriction on programs that want to do something else while they
>communicate.
Programs that want to do two things at once should use explicit parallelism,
e.g. some sort of threads facility. In every case I've seen, this yielded
vastly superior code, with clearer structure and better error handling.
--
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday| he...@zoo.toronto.edu utzoo!henry
Volume-Number: Volume 21, Number 131
In article <5...@usenix.ORG> s...@pkmab.se (Kristoffer Eriksson) writes:
> In article <5...@usenix.ORG> brn...@kramden.acf.nyu.edu (Dan Bernstein) writes:
[ file descriptors are general; the filesystem is not ]
> What prevents us from inventing a few additional filesystem operations
> that ARE general enough?
That's a good question. I am willing to believe that a somewhat
different kind of filesystem could sensibly handle I/O objects that are
neither reliable nor local. I find it somewhat harder to believe that
the concept of a filesystem can reasonably reflect dynamic I/O:
information placed into a filesystem should stick around until another
explicit action.
In any case, you'll have to invent those operations first.
> I think the important thing about the filesystem abstraction that is being
> debated here, is the idea of a common name space,
Here's what I thought upon reading this.
First: ``A common name space is irrelevant to the most important
properties of a filesystem.''
Second: ``A common name space is impossible.''
And finally: ``We already have a common name space.''
Let me explain. My first thought was that the basic purpose of a
filesystem---to provide reliable, static, local I/O---didn't require a
common name space. As long as there's *some* way to achieve that goal,
you have a filesystem. UNIX has not only some way, but a uniform,
consistent, powerful way: file descriptors.
But that's dodging your question. Just because a common name space is
irrelevant to I/O doesn't mean that it may not be helpful for some other
reason. My second thought was that the kind of name space you want is
impossible. You want to include network objects, but no system can
possibly keep track of the tens of thousands of ports under dozens of
protocols on hundreds of thousands of computers. It's just too big.
But that's not what you're looking for. Although the name space is huge,
any one computer only looks at a tiny corner of that space. You only
need to see ``current'' names. My third thought: We already have that
common name space! (file,/bin/sh) is in that space. (host,128.122.142.2)
is in that space. (proc,1) is in that space. No system call uses this
common name space, but it's there. Use it at will.
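To show what such (kind,value) names might look like to a program, here
is a small sketch that parses them into a tagged structure. The type
and function names (struct name, parse_name) and the choice of exactly
three kinds are assumptions taken from the three examples above; no
system call or library is claimed to provide this.

```c
#include <stdio.h>
#include <string.h>

/* The three kinds used in the examples above; purely illustrative. */
enum name_kind { NAME_FILE, NAME_HOST, NAME_PROC };

struct name {
    enum name_kind kind;
    char value[256];
};

/* Hypothetical helper: parse text of the form "(kind,value)".
 * Returns 0 on success, -1 if the text is malformed or the kind is
 * not one of the three known here. A value containing ')' is not
 * supported by this sketch. */
int parse_name(const char *text, struct name *out)
{
    char kind[16];
    char close = 0;

    /* %15[^,] reads the kind up to the comma; %255[^)] reads the value
     * up to the closing parenthesis, which %c must then match. */
    if (sscanf(text, "(%15[^,],%255[^)]%c", kind, out->value, &close) != 3
        || close != ')')
        return -1;

    if (strcmp(kind, "file") == 0)
        out->kind = NAME_FILE;
    else if (strcmp(kind, "host") == 0)
        out->kind = NAME_HOST;
    else if (strcmp(kind, "proc") == 0)
        out->kind = NAME_PROC;
    else
        return -1;
    return 0;
}
```

A program could then switch on the kind and use whatever setup call
suits it (open() for files, a connection request for hosts), ending up
with a file descriptor in each case.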
---Dan
Volume-Number: Volume 21, Number 137