[9fans] Ephase question.

Roman V. Shaposhnick

Aug 12, 2002, 9:27:24 PM

Hi everybody,

digging inside 4th edition gave me some very unexpected results
in terms of file access semantics in user space. But let me show
a scenario first:

first-user$ cat > /shared-directory/file
blah-blah-blah

second-user$ rm /shared-directory/file

[first user after hitting <CR> ]
"phase error -- directory entry not allocated"

I was a little bit shocked at first, mainly because I've got so used to
UNIX semantics of "once you get it -- it's yours", that I've been taking
it for granted in Plan9 as well.
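
For comparison, the Unix behaviour described here can be reproduced with a short sketch. It is Python rather than anything from the thread, it assumes a POSIX system with a local file system, and the function name `write_after_unlink` is made up for illustration.

```python
import os
import tempfile

def write_after_unlink(directory):
    """Open a file, remove its name, then keep using the descriptor."""
    path = os.path.join(directory, "file")
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_EXCL, 0o600)
    try:
        os.unlink(path)                     # the "second user" removes the name
        os.write(fd, b"blah-blah-blah\n")   # the "first user" keeps writing
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 100)
        links = os.fstat(fd).st_nlink       # typically 0: no name refers to the file
        visible = os.path.exists(path)
    finally:
        os.close(fd)                        # only now is the storage released
    return data, links, visible

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(write_after_unlink(d))
```

On Unix the write succeeds even though the name is gone; in the Plan 9 scenario above, the file server answers the same write with the phase error.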

Suddenly I can't remember how 3rd and 2nd editions behaved.

Before now I was under the impression that regular unopened fids are mostly
used for reference counting and once you grab a fid nobody can kill the
actual object it refers to, but 4th edition proved me wrong. Even though
I still can't understand why it behaves this way. Could somebody explain
the rationale behind that to me, please? And I'm really curious now about
what obligations a server is supposed to have when it accepts a new fid from
a client for a given object.

Thanks,
Roman.

pres...@plan9.bell-labs.com

Aug 12, 2002, 9:40:17 PM

This isn't new semantics. If you remove a file that someone
else is using, too bad for him. There's nothing sacred about
having a file open. If someone else has permissions to do
nasty and nefarious things to it, they can.

This is very different than Unix.

Roman V. Shaposhnick

Aug 12, 2002, 11:15:22 PM

On Mon, Aug 12, 2002 at 09:39:40PM -0400, pres...@plan9.bell-labs.com wrote:
> This isn't new semantics. If you remove a file that someone
> else is using, too bad for him. There's nothing sacred about
> having a file open.

Indeed. Same applies to any fid, not just opened ones.

> If someone else has permissions to do nasty and nefarious things to it,
> they can.
>
> This is very different than Unix.

I see. But can you give me any insight into why it was implemented this
way. Again, it seems so obvious to use fids for reference counting and it
shouldn't be of a significant overhead. Moreover it's entirely up to
the FileServer to support this feature -- kernel is not supposed to
care. You should've had some reason for not supporting this in all
your FileServers.

Thanks,
Roman.

Russ Cox

Aug 12, 2002, 11:32:19 PM

> I see. But can you give me any insight into why it was implemented this
> way. Again, it seems so obvious to use fids for reference counting and it
> shouldn't be of a significant overhead. Moreover it's entirely up to
> the FileServer to support this feature -- kernel is not supposed to
> care. You should've had some reason for not supporting this in all
> your FileServers.

If someone has a big runaway log file open that is running you out of
disk space, and you remove it, then it goes away. That's a feature.
(Of course, we don't have this problem on the worm drive, but the
point stands.) This way is simpler, and I've yet to see a compelling
argument against it.

You're right that the kernel doesn't care -- the file server is giving
you the phase error. Many of the non-disk file systems do reference
count their files.

Russ

pres...@plan9.bell-labs.com

Aug 12, 2002, 11:34:20 PM

We chose it because it was easier to implement and we couldn't see
that doing so would cause undo hardship. Rsc's observation is good
but wasn't really a design goal.

There are 3 obvious alternatives:
- Unix' solution of making the remove fail with "file busy"; it was
always inconvenient and confusing. They use that one for in use
executables. The fs doesn't really know when a file is executing so that
one isn't really that useful to us.
- Have the remove work but not really remove the file from the directory
till the current opener goes away. That's just too confusing.
- Disassociate the file with the directory, but leave it around for anyone
that has it open to keep playing with. This is easy to do when the
file is really represented by an inode that doesn't have anything to
do with a directory. It's a lot harder without that indirection. We
didn't have inodes. The best we could do is copy it somewhere else
and fudge up pointers to the somewhere else (a special invisible
directory perhaps). It also leads to cleaning up orphaned files
during a reboot of the file server, fsck's job (or one of many) in
Unix. It gets messy quick without inodes being the one true
representation.
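
The difference between removing at once and the inode-style third alternative can be sketched with a toy model. This is not how Ken's file server or any real 9P server is written; the class names (`DirectServer`, `InodeServer`), their methods, and the integer fids are all made up for illustration.

```python
class PhaseError(Exception):
    """Raised when a fid refers to a directory entry that no longer exists."""

class DirectServer:
    """Files live in their directory entries; remove frees them at once."""
    def __init__(self):
        self.entries = {}                 # name -> file contents
        self.fids = {}                    # fid -> name

    def create(self, fid, name):
        self.entries[name] = bytearray()
        self.fids[fid] = name

    def write(self, fid, data):
        name = self.fids[fid]
        if name not in self.entries:
            raise PhaseError("directory entry not allocated")
        self.entries[name] += data

    def remove(self, name):
        del self.entries[name]            # storage is reclaimed immediately

class InodeServer:
    """Names point at reference-counted inodes, as in a Unix file system."""
    def __init__(self):
        self.names = {}                   # name -> inode number
        self.inodes = {}                  # inode number -> [refcount, contents]
        self.fids = {}                    # fid -> inode number
        self.next_ino = 0

    def create(self, fid, name):
        ino = self.next_ino
        self.next_ino += 1
        self.inodes[ino] = [2, bytearray()]   # one ref for the name, one for the fid
        self.names[name] = ino
        self.fids[fid] = ino

    def write(self, fid, data):
        self.inodes[self.fids[fid]][1] += data

    def _unref(self, ino):
        self.inodes[ino][0] -= 1
        if self.inodes[ino][0] == 0:
            del self.inodes[ino]          # last reference gone: reclaim storage

    def remove(self, name):
        self._unref(self.names.pop(name))

    def clunk(self, fid):
        self._unref(self.fids.pop(fid))
```

In the first model a write through a fid fails as soon as someone removes the file. In the second the data outlives its name until the last fid is clunked, which is the extra bookkeeping (and the orphan cleanup after a crash) described above.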

Clearly it's a matter of taste; with enough code you can do most
anything. If it were a goal, Ken would probably have designed
his fs a bit differently. Our taste, like our minds, tends to
favor the simple. Of course, we're gradually losing our sense
of taste due to exigency. About time for a new simple operating
system.

rob pike, esq.

Aug 12, 2002, 11:38:20 PM

> I see. But can you give me any insight into why it was implemented this
> way. Again, it seems so obvious to use fids for reference counting and it
> shouldn't be of a significant overhead. Moreover it's entirely up to
> the FileServer to support this feature -- kernel is not supposed to
> care. You should've had some reason for not supporting this in all
> your FileServers.

There was an implementation reason, which I don't remember. I prefer
this argument:

When you remove a file, it's removed.

That's the definition of remove, as I understand it.

Unix has a weird property that you can remove files and they're still not
removed until some unfindable process dies. We used to run out of disk
space because an editor (mine) unlinked its /tmp file so it wouldn't clutter the
disk if it exited prematurely. If someone edited a big file, /tmp would fill
up but ls /tmp wouldn't tell you anything. Not to mention what happens
if the kernel crashed with a file in that half-made state.

open(ORCLOSE) is a much cleaner solution to the /tmp problem.
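
ORCLOSE is a Plan 9 open/create flag, so it cannot be shown directly on Unix. As a rough analogue only, Python's `tempfile.NamedTemporaryFile` (whose default is `delete=True`) also removes the file when it is closed. The sketch below assumes a POSIX system, and the helper name `scratch_file_visibility` is made up for illustration.

```python
import os
import tempfile

def scratch_file_visibility(directory):
    """Report whether a remove-on-close scratch file is listed while open and after close."""
    with tempfile.NamedTemporaryFile(dir=directory) as f:   # delete=True by default
        f.write(b"x" * 4096)
        f.flush()
        name = os.path.basename(f.name)
        listed_while_open = name in os.listdir(directory)
    # Leaving the with-block closes the file, which also removes it.
    listed_after_close = name in os.listdir(directory)
    return listed_while_open, listed_after_close

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(scratch_file_visibility(d))
```

Unlike the unlink-after-create idiom, the scratch file stays visible to ls for as long as it occupies space.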

But the real argument is that Unix's semantics are an accident of the way
it implemented its file system. Plan 9 has different semantics. Whether
or not it's what you want, it's hard to argue with:

When you remove a file, it's removed.

-rob

Alexander Viro

Aug 13, 2002, 12:11:18 AM

On Mon, 12 Aug 2002 pres...@plan9.bell-labs.com wrote:

> We chose it because it was easier to implement and we couldn't see
> that doing so would cause undo hardship. Rsc's observation is good
> but wasn't really a design goal.

[snip the list of alternatives]

The really interesting question is how much pain did that cause when
porting/rewriting software from Unix. creat()/unlink()/work with fd
you'd got from creat() is definitely a common idiom. OTOH, most of
its uses are for situations when you either want remove-on-close
or are messing with shared directories...

How bad had it actually been?

Russ Cox

Aug 13, 2002, 12:22:21 AM

APE simulates the creat()/unlink()/work idiom,
so APE-ported Unix programs would tend not to notice.
The programs I've natively ported have never cared.

Russ

Ronald G Minnich

Aug 13, 2002, 1:40:19 AM

On Mon, 12 Aug 2002 pres...@plan9.bell-labs.com wrote:

> - Unix' solution of making the remove fail with "file busy"; it was
> always inconvenient and confusing. They use that one for in use
> executables.

I haven't seen a version of unix do this one for a while (as in decades).
The remove succeeds, the file goes away when the last reference does (but
you have to have inodes ...). But maybe there is some version of Unix
you're referencing I'm not familiar with -- there's a lot of possibilities
out there nowadays ...

I just tried this 'cat > /tmp/a' and 'rm' sequence on Linux and it works
as the original poster expected. In the case of executables I think the
principle of least surprise dictates that removing an executable shouldn't
cause running instances of that executable to toss chunks and die. This
kind of behaviour was always a source of major pain for people running
executables from NFS servers.

On the other hand, removing a log file that is growing without bound and
having it just go away is really nice. The old search-and-destroy
technique you have to use on Unix for this type of thing is pretty ugly.

> - Have the remove work but not really remove the file from the directory
> till the current opener goes away. That's just too confusing.

Ick. No argument there.

I wouldn't give up on your current simple OS just yet. Linux is going to
cross the 250-system-call mark soon, so you have a long way to go before
you're that crazy.

ron

Russ Cox

Aug 13, 2002, 1:43:16 AM

> > - Unix' solution of making the remove fail with "file busy"; it was
> > always inconvenient and confusing. They use that one for in use
> > executables.
>
> I haven't seen a version of unix do this one for a while (as in decades).
> The remove succeeds, the file goes away when the last reference does (but
> you have to have inodes ...). But maybe there is some version of Unix
> you're referencing I'm not familiar with -- there's a lot of possibilities
> out there nowadays ...

i've seen it recently on either freebsd or linux,
in the case of trying to remove or perhaps overwrite
binaries that were being executed at the time.
it was definitely a binary rather than a normal file.

Scott Schwartz

Aug 13, 2002, 1:54:17 AM

rsc writes:
| i've seen it recently on either freebsd or linux,
| in the case of trying to remove or perhaps overwrite

It's overwrite. (Solaris, on the other hand, blows away the process.)
So to safely (atomically) install a new executable or shared library,
you have to write it to a temp file, and then rename it into place.
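
A minimal sketch of that write-then-rename install, in Python and assuming a POSIX system. The function name `atomic_install` and the 0755 mode are choices made for this example, not something from the thread.

```python
import os
import tempfile

def atomic_install(new_contents, target):
    """Write new_contents to a temp file beside target, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(target))
    # Same directory means same file system, so the rename cannot turn into a copy.
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(new_contents)
            f.flush()
            os.fsync(f.fileno())
        os.chmod(tmp, 0o755)
        os.replace(tmp, target)      # rename(2): the name switches atomically
    except BaseException:
        if os.path.exists(tmp):      # best-effort cleanup of the temp file
            os.unlink(tmp)
        raise

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        atomic_install(b"#!/bin/sh\necho new\n", os.path.join(d, "prog"))
        print(os.listdir(d))
```

Anything that already has the old file open or mapped keeps the old contents; new opens of the name see the new file, and nobody observes a half-written one.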

Ronald G Minnich

Aug 13, 2002, 2:06:20 AM

On Tue, 13 Aug 2002, Russ Cox wrote:

> > I haven't seen a version of unix do this one for a while (as in decades).
> > The remove succeeds, the file goes away when the last reference does (but
> > you have to have inodes ...). But maybe there is some version of Unix
> > you're referencing I'm not familiar with -- there's a lot of possibilities
> > out there nowadays ...
>
> i've seen it recently on either freebsd or linux,
> in the case of trying to remove or perhaps overwrite
> binaries that were being executed at the time.

overwrite, yeah. That EBUSY will definitely occur for overwrite in
different ways on different unices. Now I see what you meant.

On freebsd and Linux, exec happens via an mmap (more or less). Possibly
the behavioural difference you saw between binary and normal file was due
to how the kernels handle mmap for exec vs. file I/O, not due to it being
a binary vs. normal file.

ron

Alexander Viro

Aug 13, 2002, 2:14:16 AM

On Tue, 13 Aug 2002, Russ Cox wrote:

Overwrite - sure. Remove - nope.

Alexander Viro

Aug 13, 2002, 2:23:15 AM

On Tue, 13 Aug 2002, Ronald G Minnich wrote:

> overwrite, yeah. That EBUSY will definitely occur for overwrite in
> different ways on different unices. Now I see what you meant.
>
> On freebsd and Linux, exec happens via an mmap (more or less). Possibly
> the behavioural difference you saw between binary and normal file was due
> to how the kernels handle mmap for exec vs. file I/O, not due to it being
> a binary vs. normal file.

Yes. Notice that both Linux and FreeBSD have few reasons even for that
protection (and it's not too consistent - e.g. shared libraries are
not protected). It is, indeed, result of mmap() - there is a flag
(MAP_DENYWRITE) used when we map the binary and it prevents opening
file for write (and having file opened for write makes such mmap()
call fail). IIRC, it's a compatibility with fairly old systems and
it could be killed if anyone cared enough.

There's absolutely no protection against unlink()/rename() and there's
nothing to protect against - both would keep file alive until the
final close()/munmap(), so there's nothing to break.

Charles Forsyth

Aug 13, 2002, 2:27:15 AM

so if i completely remove a file that someone is executing, it continues to work,
but if i copy into a file that remains and someone is executing it, it fails.

makes perfect sense to me.

Andrew Lynch

Aug 13, 2002, 2:59:16 AM

On Aug 12, 11:33pm, pres...@plan9.bell-labs.com wrote:
>
> We chose it because it was easier to implement and we couldn't see
> that doing so would cause undo hardship.

So how easily could first-user undo the delete?
Maybe Freud can tell us...

Andrew.

Douglas A. Gwyn

Aug 13, 2002, 5:31:20 AM

"rob pike, esq." wrote:
> Unix's semantics are an accident of the way
> it implemented its file system.

"Accident" in this case need not imply "unintentional".
When Canaday et al. invented the inode-based system,
it is conceivable that they thought the semantics were
just what they wanted.

I actually *like* being able to have scratch files that have
no directory entries, because they are *guaranteed* to go
away when the processes using them all terminate.

> Plan 9 has different semantics.

Indeed.

David Gordon Hogan

Aug 13, 2002, 7:44:19 AM

> On freebsd and Linux, exec happens via an mmap (more or less). Possibly
> the behavioural difference you saw between binary and normal file was due
> to how the kernels handle mmap for exec vs. file I/O, not due to it being
> a binary vs. normal file.

So [gigantic leap here], not only does Linux have ~250 system
calls, but most of them can be emulated with mmap?

pres...@plan9.bell-labs.com

Aug 13, 2002, 8:17:15 AM

> - Unix' solution of making the remove fail with "file busy"; it was
> always inconvenient and confusing. They use that one for in use
> executables.

I'm clearly misremembering. I was thinking of overwrite. Don't know
why, had to remove a file last week on an SGI system to do just that.

> The really interesting question is how much pain did that cause when
> porting/rewriting software from Unix. creat()/unlink()/work with fd
> you'd got from creat() is definitely a common idiom. OTOH, most of
> its uses are for situations when you either want remove-on-close
> or are messing with shared directories...
>
> How bad it had it actually been?

Russ addressed the ape library.

As someone (Rob?) has already mentioned, the remove-on-close
flag (ORCLOSE) on open/create does the same thing as unlinking
immediately after opening.

My biggest headache has been replacing running binaries. Since we
can't remove them or overwrite them without disastrous consequences,
we end up with a 'safeinstall' option in all our mkfiles. The safeinstall
moves the file to an unlikely name (e.g. x -> _x) and copies in the
new file. Of course, since we have dozens of machines all running off
the same file system, something is probably running off the _x that
was there. So we sometimes have to move _x to __x, etc. It's a
royal pain. We often forget and just install with the result that
someone an hour after the fact in some other part of the building
sends you a snap or pointer to a broken process. Since I don't have
to implement the fs, I'd have preferred the Unix semantics in this
case. It's caused me a lot of inconvenience over the last 10+ years.
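
The move-aside scheme can be sketched as follows. This is Python, not the actual mkfile rule; the function names `move_aside` and `safeinstall` are used here only for illustration, and the sketch assumes renaming a file leaves processes already running from it undisturbed.

```python
import os
import shutil
import tempfile

def move_aside(path):
    """Rename x to _x, first shifting any existing _x to __x, and so on."""
    directory, name = os.path.split(path)
    aside = os.path.join(directory, "_" + name)
    if os.path.exists(aside):
        move_aside(aside)            # something may still be running from _x
    os.rename(path, aside)

def safeinstall(new_file, target):
    """Keep the old binary under an unlikely name, then copy the new one in."""
    if os.path.exists(target):
        move_aside(target)
    shutil.copy2(new_file, target)

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "src")
        for version in (b"v1", b"v2", b"v3"):
            with open(src, "wb") as f:
                f.write(version)
            safeinstall(src, os.path.join(d, "x"))
        print(sorted(os.listdir(d)))
```

Each install leaves one more underscore-prefixed copy behind, which is the clutter complained about above; nothing here decides when an old copy is safe to delete.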

> "Accident" in this case need not imply "unintentional".
> When Canaday et al. invented the inode-based system,
> it is conceivable that they thought the semantics were
> just what they wanted.

Since the same 'et al' implemented both the first Unix fs
and the first Plan 9 fs, I'll ask him and see what he says.

r...@vitanuova.com

Aug 13, 2002, 9:07:19 AM

> My biggest headache has been replacing running binaries. Since we
> can't remove them or overwrite them without disastrous consequences,
> we end up with a 'safeinstall' option in all our mkfiles. The safeinstall
> moves the file to an unlikely name (e.g. x -> _x) and copies in the
> new file. Of course, since we have dozens of machines all running off
> the same file system, something is probably running off the _x that
> was there. So we sometimes have to move _x to __x, etc. It's a
> royal pain. We often forget and just install with the result that
> someone an hour after the fact in some other part of the building
> sends you a snap or pointer to a broken process. Since I don't have
> to implement the fs, I'd have preferred the Unix semantics in this
> case. It's caused me a lot of inconvenience over the last 10+ years.

i was going to mention this, but you did it for me.

would it be too nasty to make the fileserver refuse writes
on files that are currently open with OEXEC?

that would alleviate somewhat the most common (and hard to find)
problem: overwriting a running binary.
(which can also be a problem for shellscripts, note)
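
As a toy illustration of the proposal (not code from any Plan 9 file server), a server could count OEXEC opens per file and refuse opens for writing while the count is nonzero. The class `ExecGuard`, the exception `BusyError`, and the choice to refuse at open time rather than at each write are assumptions of this sketch; the mode values are the ones in Plan 9's libc.h.

```python
OREAD, OWRITE, ORDWR, OEXEC = 0, 1, 2, 3   # Plan 9 open modes

class BusyError(Exception):
    """Raised when a write open is refused because the file is open for execution."""

class ExecGuard:
    def __init__(self):
        self.exec_opens = {}               # path -> number of current OEXEC opens

    def open(self, path, mode):
        if mode in (OWRITE, ORDWR) and self.exec_opens.get(path, 0) > 0:
            raise BusyError(path)
        if mode == OEXEC:
            self.exec_opens[path] = self.exec_opens.get(path, 0) + 1
        return (path, mode)                # stands in for a fid

    def clunk(self, handle):
        path, mode = handle
        if mode == OEXEC:
            self.exec_opens[path] -= 1

if __name__ == "__main__":
    guard = ExecGuard()
    running = guard.open("/bin/prog", OEXEC)
    try:
        guard.open("/bin/prog", OWRITE)
    except BusyError as e:
        print("refused:", e)
    guard.clunk(running)
```

The sketch says nothing about shell scripts, which are read rather than opened with OEXEC by whatever interprets them, so it would not catch that case.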

rog.

rob pike, esq.

Aug 13, 2002, 11:44:25 AM

> would it be too nasty to make the fileserver refuse writes
> on files that are currently open with OEXEC?

We talked a lot about this in the early design days. If I recall
right we decided to allow the write, given permission, because
sometimes you really do want to update a binary and it's annoying when
you can't: build scripts fail, installs abort, that sort of thing.
There are times when we install everything and it's nice not to
worry about it. Although it's far from perfect, I'm comfortable
enough with the safeinstall notion to leave things as they are.

I realize that's not much of an argument.

-rob

Ronald G Minnich

Aug 13, 2002, 11:46:18 AM

On Tue, 13 Aug 2002, David Gordon Hogan wrote:

> So [gigantic leap here], not only does Linux have ~250 system
> calls, but most of them can be emulated with mmap?

Hmm, I'm hoping you're joking ... I just woke up but this made me want to
go hide under my bed.

I'm actually thinking I need a slide:

slide 1: linux system call list

Slide 2: plan 9 system call list to the same scale. The problem with slide
2 to the same scale as slide 1 is that it will look like fuzz on the
slide, not a system call list ... to get all those linux calls on 1 slide
I'll need a 1-pt font.

ron

Ronald G Minnich

Aug 13, 2002, 11:54:20 AM

On Tue, 13 Aug 2002 pres...@plan9.bell-labs.com wrote:

> My biggest headache has been replacing running binaries. Since we
> can't remove them or overwrite them without disastrous consequences,
> we end up with a 'safeinstall' option in all our mkfiles. The safeinstall
> moves the file to an unlikely name (e.g. x -> _x) and copies in the
> new file. Of course, since we have dozens of machines all running off
> the same file system, something is probably running off the _x that
> was there. So we sometimes have to move _x to __x, etc. It's a
> royal pain.

yes, NFS has a similar problem with this but in a different place. When
clients remove files, NFS servers don't really remove files, they rename
them. See SILLYRENAME. In NFS, servers never know if someone is using a
file, so they don't want to remove it, hence all that .nfsxxyyzzww crud
you start to see in NFS over time (if the rm happens on client).

Of course if the rm happens on server, and a remote process is exec'ing
the file, bad things happen. This latter problem has led people over the
years to do the same kind of thing you describe above. Or, as in one place
I worked, the sysadmins upgrade executables on the server and then reboot
everybody's machine.

> We often forget and just install with the result that
> someone an hour after the fact in some other part of the building
> sends you a snap or pointer to a broken process. Since I don't have
> to implement the fs, I'd have preferred the Unix semantics in this
> case. It's caused me a lot of inconvenience over the last 10+ years.

I've just got v9fs back and mostly working on Linux (uses 9p still but 3e
... 4e is next). But I'm going to have the Unix semantics. The shared exec
over a network case is too common not to use the Unix semantics.

ron

Russ Cox

Aug 13, 2002, 11:58:21 AM

> I've just got v9fs back and mostly working on Linux (uses 9p still but 3e
> ... 4e is next). But I'm going to have the Unix semantics. The shared exec
> over a network case is too common not to use the Unix semantics.

And arguably you _should_ have the Unix semantics. No one here is
saying that one way is required. 9P doesn't say anything about when
the remove happens. It's a choice made by some file servers.
U9fs just does a Unix remove too, so you'd get Unix semantics there as well.

Russ

Russ Cox

Aug 13, 2002, 12:00:17 PM

I don't know. Plan 9 has 48 system calls these days,
10 of which are deprecated. So 38. That's still a lot.

ano...@cosym.net

Aug 13, 2002, 12:56:20 PM

are these 10 deprecated calls used anywhere? if i'm
reading things right, they're all from the switch to
9p2000 (mm, plus ERRSTR change?), so i gather not.
does any plan exist for removing these?

and on the same topic, were write9p and read9p
eliminated because they were considered redundant
with pwrite and pread?

and, out of curiosity, were there ever syscalls for
slots 48 and 49? if so, what'd they do?

rob pike, esq.

Aug 13, 2002, 1:02:19 PM

> are these 10 deprecated calls used anywhere? if i'm
> reading things right, they're all from the switch to
> 9p2000 (mm, plus ERRSTR change?), so i gather not.
> does any plan exist for removing these?

Some ancient binaries we have still contain them.
They do not appear in the libraries.

> and on the same topic, were write9p and read9p
> eliminated because they were considered redundant
> with pwrite and pread?

They became unnecessary because of the encapsulation
possible with the new 9P, not because of pread and
pwrite.

> and, out of curiosity, were there ever syscalls for
> slots 48 and 49? if so, what'd they do?

I believe a summer student had something planned
there.

-rob

Russ Cox

Aug 13, 2002, 1:29:18 PM

> and, out of curiosity, were there ever syscalls for
> slots 48 and 49? if so, what'd they do?

I think the gap is historical. We split the 9P2000 kernel and libraries
from the main sources and added all the new 9P2000 code
in a separate tree. At some point while the trees were split,
we added pread and pwrite to both systems. In the 9P2000
system, they were 37 and 38 (previously occupied by read9p
and write9p). The calls got added to the 9P1 systems some time
later, because they were too convenient not to have, and
I think 50 and 51 got used just to be out of the way as a temporary
slot until the 9P2000 kernels got installed. (We couldn't replace
read9p and write9p in the 9P1 system, so we needed new numbers.
At the time, the biggest 9P1 syscall number was seek at 39,
while the 9P2000 syscalls were slowly eating the 40s. I think 50
was just a safe bet.)

One compatibility measure taken in the 9P2000 kernels is that
all the old system calls are emulated, so that old binaries (at least
those that don't read directories) continue to work. This implies
that we had to allow 50 and 51 as pread/pwrite, so we just
changed the numbers rather than have two for each.

Hence the gap. There was in fact a summer student project using
40 and 41, but they got recycled for 9P2000 calls.

Pedantically,
Russ

Roman V. Shaposhnick

Aug 13, 2002, 6:08:27 PM

Thanks for the answer (special thanks to Ken for providing historical
background). The only thing I still wish for is a couple of lines in
9P specification or any other document saying that, generally speaking,
Servers have no obligations whatsoever regarding returned fids. Fids can
be dangling.

As for simplicity -- I can't agree more. That's what makes 9P and Plan9 so
appealing to me.

Roman.

r...@vitanuova.com

Aug 13, 2002, 7:43:20 PM

> > would it be too nasty to make the fileserver refuse writes
> > on files that are currently open with OEXEC?

> sometimes you really do want to update a binary and it's annoying when
> you can't

well, safeinstall would still have its place:

cp $prereq $stem || mv $prereq _$stem && cp $prereq $stem

...it would just be less exercised.

round here we tend to

mv /bin/prog /bin/prog.`{date -n}

which leads to less clashes (but does require garbage collection
every so often).

i don't have such a problem doing this with publicly installed
executables; it's when i'm in the usual compile/edit cycle, and
accidentally overwrite a running executable and spend half an hour
looking for the non-existent bug that gets my goat. it also means
that i have developed a tendency to ignore hard-to-explain problems:
"oh, it must have been overwritten".

rog.

Douglas A. Gwyn

Aug 14, 2002, 4:42:44 AM

Russ Cox wrote:
> I don't know. Plan 9 has 48 system calls these days,
> 10 of which are deprecated. So 38. That's still a lot.

It would be an interesting academic exercise to determine the
minimal set. E.g. close() doesn't seem to be needed; whenever
there's no reference to the object, any connection to it can be
cleaned up. At least for Unix devices, it was only the *last*
close that did anything interesting. And I've sometimes had
the feeling that bind and mount aren't dissimilar enough to
require separation. The trick is to combine function in some
natural way, not merely tunnel through the syscall port then
expand again on the other side.

Would it be possible for a 9P-like system to make all service
calls just access a server within the protocol? E.g., instead
of open() there would be a channel that handles 9P-open packets
(as well as other 9P packets). (Apologies if I've garbled the
details; the general idea is all I'm wondering about.)

rob pike, esq.

Aug 14, 2002, 9:20:20 AM

> It would be an interesting academic exercise to determine the
> minimal set. E.g. close() doesn't seem to be needed; whenever
> there's no reference to the object, any connection to it can be
> cleaned up.

What about releasing the resources attached to the file descriptor
/ fid? A long-running command using hundreds of files in
sequence - for instance a network server - would be a resource
hog without close. Of course you could provide a garbage collector
but that opens a whole new world of trouble.
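
[Editorial sketch, not part of the original message.] The resource point
can be seen from user space with a small experiment. This is a rough
Python illustration, assuming a POSIX-style descriptor table in which each
open returns the lowest free descriptor; the helper name open_many is
hypothetical.

```python
import os


def open_many(path, n, close_each):
    """Open path n times and return the descriptor numbers handed out.

    With close_each=True every descriptor is released before the next
    open, so the same slot is reused. With close_each=False the
    descriptors pile up until the loop ends, which is the "resource hog"
    case for a long-running server.
    """
    seen = []
    held = []
    for _ in range(n):
        fd = os.open(path, os.O_RDONLY)
        seen.append(fd)
        if close_each:
            os.close(fd)
        else:
            held.append(fd)
    for fd in held:  # clean up the leaked descriptors at the end
        os.close(fd)
    return seen
```

Without an explicit close, the only way to bound the table is for something
else to decide when a descriptor is no longer needed.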

> At least for Unix devices, it was only the *last*
> close that did anything interesting.

Except for releasing the fd.

> And I've sometimes had
> the feeling that bind and mount aren't dissimilar enough to
> require separation.

The earliest version of Plan 9 had a different setup than bind
and mount; it was mount and fmount. It was a mess. I don't
remember much about it but the current scheme was a huge
improvement - huge. I think for instance exportfs might have
been impossible in the old scheme.

But there may well be a unified approach that we just missed.

In any case, who cares? As you said, it's academic. I can do
everything with a single system call, syscall, that takes as its
first argument the call to make. If you call that cheating, I
respond that rfork is really a set of calls encoded with a bitvector.
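
[Editorial sketch, not part of the original message.] The "set of calls
encoded with a bitvector" remark can be made concrete like this. The flag
names follow Plan 9's rfork, but the bit values and the one-line
descriptions here are assumptions chosen for illustration, not copied from
the system headers or the manual.

```python
from enum import IntFlag


class Rfork(IntFlag):
    # Names modeled on Plan 9's rfork flags; bit positions are illustrative.
    RFNAMEG = 1 << 0
    RFENVG = 1 << 1
    RFFDG = 1 << 2
    RFNOTEG = 1 << 3
    RFPROC = 1 << 4
    RFMEM = 1 << 5
    RFNOWAIT = 1 << 6


# Each bit behaves like a separate request (descriptions are paraphrases).
ACTIONS = {
    Rfork.RFNAMEG: "copy the name space",
    Rfork.RFENVG: "copy the environment",
    Rfork.RFFDG: "copy the file descriptor table",
    Rfork.RFNOTEG: "start a new note group",
    Rfork.RFPROC: "create a new process",
    Rfork.RFMEM: "share data segments with the parent",
    Rfork.RFNOWAIT: "detach the child from the parent's wait",
}


def decode(flags):
    """List the individual requests packed into one rfork-style argument."""
    return [ACTIONS[f] for f in Rfork if flags & f]
```

For example, decode(Rfork.RFPROC | Rfork.RFFDG) lists two separate
requests even though only one call would be made.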

Nobody asks what the minimum libc interface is, or even counts
the calls. For some reason system calls are seen as magical. They're
just one way to implement a library. In Plan 9 in particular, we've
tried to blur the difference between syscall and library, with
things moving back and forth over time. Stat, wait, and read (!) are
no longer system calls.

What matters is expressibility without bloat, not finding the
criteria under which to claim a lower count of functions of type
T, for some T.

-rob

Russ Cox

Aug 14, 2002, 10:13:18 AM

> What about releasing the resources attached to the file descriptor
> / fid? A long-running command using hundreds of files in
> sequence - for instance a network server - would be a resource
> hog without close. Of course you could provide a garbage collector
> but that opens a whole new world of trouble.

It'd be just like NFS!

Douglas A. Gwyn

Aug 15, 2002, 4:59:02 AM

"rob pike, esq." wrote:
> Of course you could provide a garbage collector
> but that opens a whole new world of trouble.

That's the sort of thing I had in mind. Unused resources
can *in principle* be reclaimed without the resource client's
explicit participation. How practical it would be is another
question, and I'm not proposing that real systems be boiled
down to minimal possible facilities that are hard to use.
(Otherwise we'd all be given just gate arrays to program.)

> In any case, who cares? As you said, it's academic.

Doesn't mean that it's not worth thinking about. Sometimes
good ideas emerge that can be adapted into useful practice.

> ... rfork is really a set of calls encoded with a bitvector.

Indeed there have been several composite "spawn" implementations;
while simplicity might be hard to measure accurately, a single
call with lots of parameters doesn't seem as simple as a couple
of calls with very few parameters.

> Nobody asks what the minimum libc interface is, ...

To some degree we do. How much *has* to be provided by each
specific C implementation in order for a user to be able to
provide his own portable implementation of all the standard
library? In the embedded-processor world I find that such
questions can be of urgent practical importance.

A similar question is, what is the minimum synchronization
mechanism needed to coordinate threads? Add "efficiently".
Mutexes seem to be the answer, but if someone knows better
I'd appreciate hearing about it.

> What matters is expressibility without bloat, not finding
> the criteria under which to claim a lower count of
> functions of type T, for some T.

It can depend on one's goal. For example, if a primary goal
is proof of security, it seems intractable unless the number
of primitive functions is fairly small and each has a fairly
clean specification. Reliability and correctness, ditto.
English is very expressive, but problematic for purposes of
precise specification. It would be meaningful and potentially
useful for linguists to consider minimal requirements for an
equally expressive but precise natural language. (Leaving
aside any requirement for intentional ambiguity, which for
the most part is also a working assumption for OS services.)

Ronald G Minnich

Aug 15, 2002, 12:23:23 PM

On Thu, 15 Aug 2002, Douglas A. Gwyn wrote:

> > What matters is expressibility without bloat, not finding
> > the criteria under which to claim a lower count of
> > functions of type T, for some T.
>
> It can depend on one's goal. For example, if a primary goal
> is proof of security, it seems intractable unless the number
> of primitive functions is fairly small and each has a fairly
> clean specification. Reliability and correctness, ditto.

Why I care about "how many system calls": check out the Unix (Linux)
system call list nowadays. There are lots of different resource types
(pathnames, sysctl names, fds, pids, etc. etc) and consequently lots of
different difficulties. Just watching the freebsd 'jail' discussion has
been interesting. How do you ever secure an interface this complicated?
Seems very hard, and has proven to be hard in practice.

It's not just "reduce T for some T". Exploding system call counts can
indicate a problem with the design of the system (see some of the later
Linux system calls ...).

ron

Boyd Roberts

Aug 19, 2002, 12:24:26 PM

Ronald G Minnich wrote:

>I haven't seen a version of unix do this one for a while (as in decades).
>The remove succeeds, the file goes away when the last reference does (but
>you have to have inodes ...).
>

Yes, removal was fixed with BSD so you no longer got ETXTBSY; the
directory entry would disappear but the inode would persist.

>But maybe there is some version of Unix
>you're referencing I'm not familiar with -- there's a lot of possibilities
>out there nowadays ...
>

However, this did not fix the overwrite problem, which most/all Unixes
still suffer from.
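
[Editorial sketch, not part of the original message.] The two Unix
behaviours contrasted above can be observed with a short experiment. This
is a minimal Python sketch assuming a POSIX system with inode-based file
systems; the function names are hypothetical.

```python
import os


def read_after_unlink(path):
    """Open path, remove its directory entry, then read through the open fd.

    Returns (data, gone): on POSIX the name disappears at once, but the
    inode and its data persist until the last open descriptor is closed.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        os.unlink(path)
        gone = not os.path.exists(path)
        data = os.read(fd, 4096)
    finally:
        os.close(fd)
    return data, gone


def read_after_overwrite(path, new_bytes):
    """Open path, let a second opener rewrite it in place, then read.

    The writer truncates and rewrites the same inode, so the earlier
    reader sees the new bytes: having the file open protects nothing.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        with open(path, "wb") as w:  # same inode, contents replaced
            w.write(new_bytes)
        data = os.read(fd, 4096)
    finally:
        os.close(fd)
    return data
```

The first function returns the old contents even though the name is gone;
the second returns the new contents, which is the overwrite problem in
miniature.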