
Non Destructive Version of rm


Mathieu Federspiel

May 2, 1991, 6:35:59 PM
In article <1...@larry.UUCP> stoc...@larry.UUCP (Jeff Stockett) writes:
>
>I'm looking for a version of rm (or a script) that will move deleted files to a
>temporary location like .wastebasket, so that novice users who accidentally
>delete files, can redeem themselves. I've considered writing a script to
>do this, but I thought one might already exist.
>

Following are Bourne shell scripts I implemented on our systems.
I install the scripts in /usr/local/bin, and then give everyone an
alias of "rm" to this script.
What happens is, say, you "rm testfile". The script moves
"testfile" to ".#testfile". You then have a period of time to
"unrm testfile" to get the file back. The period of time is
determined by the system administrator, who sets up a job to run
periodically to remove all files with names starting with ".#".
For this removing process, the administrator must, of course,
warn users not to give files names beginning with ".#".  Since
these are hidden files, there should be no problem.  Note that
this preserves the directory
structure of files, which makes life easier than moving everything
to ".wastebasket". Also note that directories will be moved, and
special handling of directories in your removing job may be
required.
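The reaping job itself can be little more than a find(1) run from
cron.  A sketch (the starting directory and the one-day age are only
examples, and /bin/rm is called directly so the "rm" wrapper is not
invoked; directories may still want the special handling mentioned
above):

#!/bin/sh
# nightly reaper: permanently remove ".#" files (and directories)
# more than one day old under /home
find /home -name '.#*' -mtime +1 -exec /bin/rm -rf {} \;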
Enjoy!

--
Mathieu Federspiel mcf%statwa...@cs.orst.edu
Statware orstcs!statware!mcf
260 SW Madison Avenue, Suite 109 503-753-5382
Corvallis OR 97333 USA 503-758-4666 FAX

#---------------------------------- cut here ----------------------------------
# This is a shell archive. Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by Mathieu Federspiel <mcf@statware> on Tue Jun 12 17:17:26 1990
#
# This archive contains:
# rm#.1 rmrm#.1 rm lsrm rmrm unrm
#
# Modification/access file times will be preserved.
# Error checking via wc(1) will be performed.

unset LANG

echo x - rm\#.1
cat >rm\#.1 <<'@EOF'
.TH RM 1 "LOCAL"
.tr ^
.SH NAME
rm, lsrm, unrm, rmrm \-
temporary file removal system
.SH SYNOPSIS
.B rm
[rm options] files

.B lsrm
[ls options]

.B unrm
files

.B rmrm
directory

.SH DESCRIPTION
The temporary file removal system will use
.B mv(1)
to move the specified \fBfiles\fR to the same name with the
prefix \fB.#\fR.
Files with this prefix are deleted from the system after they are
one day old.
.B Unrm
may be used to restore temporarily removed files, if they have not been
deleted.
.B Lsrm
is used to list files in the current directory which are temporarily removed.
.B Rmrm
is used to remove all files which begin with \fB.#\fR from the
directory which is specified, and all directories thereunder.
.B Find(1)
is used to identify and remove those files.

.SH CAVEATS
The use of options with rm will cause different results.
The rm(1) option -i is recognized and the script will prompt for a "y"
or "Y" response before moving the file specified.
Any other response will not move the file.
Options other than -i are not recognized by rm.
The use of any option other than -i will result in the option and
the list of files being passed to rm(1) unchanged.
Only the first item in the command list is checked for "-i".

While \fBlsrm\fR will correctly use \fBls(1)\fR options,
it will only list
temporarily removed files in the current directory.

.SH "SEE ALSO"
rm(1),
ls(1),
mv(1)

.SH AUTHOR
Mathieu Federspiel, Statware.
@EOF
set `wc -lwc <rm\#.1`
if test $1$2$3 != 572511434
then
echo ERROR: wc results of rm\#.1 are $* should be 57 251 1434
fi

touch -m 1219172089 rm\#.1
touch -a 0605040190 rm\#.1
chmod 664 rm\#.1

echo x - rmrm\#.1
cat >rmrm\#.1 <<'@EOF'
.TH RM 1 "LOCAL"
.tr ^
.SH NAME
rm, lsrm, unrm, rmrm \-
temporary file removal system
.SH SYNOPSIS
.B rm
[rm options] files

.B lsrm
[ls options]

.B unrm
files

.B rmrm
directory

.SH DESCRIPTION
The temporary file removal system will use
.B mv(1)
to move the specified \fBfiles\fR to the same name with the
prefix \fB.#\fR.
Files with this prefix are deleted from the system after they are
one day old.
.B Unrm
may be used to restore temporarily removed files, if they have not been
deleted.
.B Lsrm
is used to list files in the current directory which are temporarily removed.
.B Rmrm
is used to remove all files which begin with \fB.#\fR from the
directory which is specified, and all directories thereunder.
.B Find(1)
is used to identify and remove those files.

.SH CAVEATS
The use of options with rm will cause different results.
The rm(1) option -i is recognized and the script will prompt for a "y"
or "Y" response before moving the file specified.
Any other response will not move the file.
Options other than -i are not recognized by rm.
The use of any option other than -i will result in the option and
the list of files being passed to rm(1) unchanged.
Only the first item in the command list is checked for "-i".

While \fBlsrm\fR will correctly use \fBls(1)\fR options,
it will only list
temporarily removed files in the current directory.

.SH "SEE ALSO"
rm(1),
ls(1),
mv(1)

.SH AUTHOR
Mathieu Federspiel, Statware.
@EOF
set `wc -lwc <rmrm\#.1`
if test $1$2$3 != 572511434
then
echo ERROR: wc results of rmrm\#.1 are $* should be 57 251 1434
fi

touch -m 1219172089 rmrm\#.1
touch -a 0612171790 rmrm\#.1
chmod 664 rmrm\#.1

echo x - rm
cat >rm <<'@EOF'
#!/bin/sh
# rm temporary
# script to do an mv rather than rm to .# file
# By Mathieu Federspiel, 1987
#
# recognizes -i option
#
# Modified to print if file deleted/not with -i. --- MCF, Jan 1989
# Modified to test for write permission. --- MCF, Aug 1989
# Modified to touch saved file. This helps with backups. --- MCF, Aug 1989

case "$1" in
-i) shift
for arg in $*
do
if [ \( -f $arg -o -d $arg \) -a -w $arg ]
then
echo "$arg [yn](n) ? \c"
read yesno
if [ "$yesno" = "y" -o "$yesno" = "Y" ]
then
base=`basename $arg`
dir=`dirname $arg`
mv $arg ${dir}/.#$base && touch ${dir}/.#$base
echo "$arg removed."
else
echo "$arg not removed."
fi
else
echo "$arg: write permission denied"
fi
done ;;
-*) /bin/rm "$@" ;;
*) for arg
do
if [ -w $arg ]
then
base=`basename $arg`
dir=`dirname $arg`
mv $arg ${dir}/.#$base && touch ${dir}/.#$base
else
echo "$arg: write permission denied"
fi
done ;;
esac
@EOF
set `wc -lwc <rm`
if test $1$2$3 != 45169984
then
echo ERROR: wc results of rm are $* should be 45 169 984
fi

touch -m 0824171389 rm
touch -a 0612163190 rm
chmod 775 rm

echo x - lsrm
cat >lsrm <<'@EOF'
ls -a $* .#*
@EOF
set `wc -lwc <lsrm`
if test $1$2$3 != 1413
then
echo ERROR: wc results of lsrm are $* should be 1 4 13
fi

touch -m 0825120387 lsrm
touch -a 0611201390 lsrm
chmod 775 lsrm

echo x - rmrm
cat >rmrm <<'@EOF'
#
# rmrm#: to rm files moved with rm#

USAGE="Usage: $0 <directory>"
case $# in
1 ) ;;
* )
echo $USAGE >&2 ; exit 1 ;;
esac

find $1 -name '.#*' -exec /bin/rm -f {} \;
@EOF
set `wc -lwc <rmrm`
if test $1$2$3 != 1137172
then
echo ERROR: wc results of rmrm are $* should be 11 37 172
fi

touch -m 0415135888 rmrm
touch -a 0604074090 rmrm
chmod 775 rmrm

echo x - unrm
cat >unrm <<'@EOF'
for arg
do
base=`basename $arg`
dir=`dirname $arg`
mv ${dir}/.#$base $arg
done
@EOF
set `wc -lwc <unrm`
if test $1$2$3 != 61196
then
echo ERROR: wc results of unrm are $* should be 6 11 96
fi

touch -m 0825120587 unrm
touch -a 0517152390 unrm
chmod 775 unrm

exit 0
--
Mathieu Federspiel mcf%statwa...@cs.orst.edu
Statware orstcs!statware!mcf
260 SW Madison Avenue, Suite 109 503-753-5382
Corvallis OR 97333 USA 503-758-4666 FAX

Jeff Stockett

Apr 29, 1991, 8:37:47 PM
Greetings!

I'm looking for a version of rm (or a script) that will move deleted files to a
temporary location like .wastebasket, so that novice users who accidentally
delete files, can redeem themselves. I've considered writing a script to
do this, but I thought one might already exist.

Thanks in advance,

Jeffrey M. Stockett
Tensleep Design, Inc.

UUCP: ..cs.utexas.edu!ut-emx!jeanluc!larry!stockett
Internet: stoc...@titan.tsd.arlut.utexas.edu

John 'tms' Navarra

May 3, 1991, 5:26:19 PM
In article <11...@statware.UUCP> m...@statware.UUCP ( Mathieu Federspiel) writes:
>
> Following are Bourne shell scripts I implemented on our systems.
>I install the scripts in /usr/local/bin, and then give everyone an
>alias of "rm" to this script.
> What happens is, say, you "rm testfile". The script moves
>"testfile" to ".#testfile". You then have a period of time to
>"unrm testfile" to get the file back. The period of time is
>determined by the system administrator, who sets up a job to run
>periodically to remove all files with names starting with ".#".
> For this removing process, the administrator must, of course,
>warn users not to name files as ".#". Since this is a hidden file,
>there should be no problem. Note that this preserves the directory
>structure of files, which makes life easier than moving everything
>to ".wastebasket". Also note that directories will be moved, and
>special handling of directories in your removing job may be
>required.
> Enjoy!

I am not too sure about this one. Why would you want to make a script
which does not allow users to name a file .#-something when you can just make
a script to put ALL removed files into a directory /var/preserve/username
and remove all files in that directory older than two days? Then you can tell
users that they can get into that directory and get a copy of the file they
just removed -- no matter what its name is.
Also, whatever script you write that searches thru EVERYONE's dir
looking for files beginning with a .# would be MUCH slower than doing a
find -mtime on a previously specified dir like /var/preserve and then removing
those files older than 2 days.
Also, when you remove a file from, say, your home directory, is there a
file .#file made in your home dir? And if you are in your bin directory, is
there a .#file made there? That means of course that whatever script you write
to remove these files has to traverse EVERY damn directory on the planet
looking for .# files!
Also, when you say hidden, you mean from ls and not ls -las. Well, I do
an ls -las all the time and I wouldn't want a whole bunch of .# files looking
me in the face when I ls my directories.

This is what I do:

I have a program called rm that moves all files I remove into $HOME/tmp. Then
I have a program called night-clean which is run from crontab that looks
SPECIFICALLY in $HOME/tmp and removes files older than 2 days. Night-clean
reports what files it removes to $HOME/adm/rmlog so I can look periodically
at what files crontab has removed in case I forget or something.
Of course, rmlog grows to a considerable size after a while, so I have
another program called skim which I run to make sure it is not too big :-)

Note, though, that this is MUCH more efficient than looking through GOD knows
how many directories for .# files.
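A rough sketch of what I mean (these are not my actual programs; the
names, the two-day window, and the log location just match the
examples above):

#!/bin/sh
# "rm" wrapper: move the named files into $HOME/tmp instead of
# really removing them
test -d $HOME/tmp || mkdir $HOME/tmp
for arg
do
	mv "$arg" $HOME/tmp/
done

#!/bin/sh
# "night-clean", run nightly from crontab: log, then really remove,
# anything in $HOME/tmp older than two days
test -d $HOME/adm || mkdir $HOME/adm
cd $HOME/tmp || exit 1
find . -mtime +2 ! -name . -print >> $HOME/adm/rmlog
find . -mtime +2 ! -name . -exec /bin/rm -rf {} \;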


>
>--
>Mathieu Federspiel mcf%statwa...@cs.orst.edu
>Statware orstcs!statware!mcf
>260 SW Madison Avenue, Suite 109 503-753-5382
>Corvallis OR 97333 USA 503-758-4666 FAX
>

--
From the Lab of the MaD ScIenTiST:

nav...@casbah.acns.nwu.edu

Jonathan I. Kamens

May 6, 1991, 12:15:12 AM
John Navarra suggests a non-destructive version of 'rm' that either
moves the deleted file into a directory such as
/var/preserve/username, which is periodically reaped by the system,
and from which the user can retrieve accidentally deleted files, or
uses a directory $HOME/tmp and does a similar thing.

He points out two drawbacks with the approach of putting the deleted
file in the same directory as before it was deleted. First of all,
this requires that the entire directory tree be searched in order to
reap deleted files, and this is slower than just having to search one
directory. Second, the files show up when the "-a" or "-A" flag to ls
is used to list the files in a directory.

A design similar to his was considered when we set about designing
the non-destructive rm currently in use (as "delete") at Project
Athena and available in the comp.sources.misc archives. There were
several reasons why we chose the approach of leaving files in the same
directory, rather than Navarra's approach. They include:

1. In a distributed computing environment, it is not practical to
assume that a world-writeable directory such as /var/preserve will
exist on all workstations, and be accessible identically from all
workstations (i.e. if I delete a file on one workstation, I must be
able to undelete it on any other workstation; one of the tenets of
Project Athena's services is that, as much as possible, they must
not differ when a user moves from one workstation to another).
Furthermore, the "delete" program cannot run setuid in order to
have access to the directory, both because setuid programs are a
bad idea in general, and because setuid has problems in remote
filesystem environments (such as Athena's). Using $HOME/tmp
alleviates this problem, but there are others....

2. (This is a big one.) We wanted to insure that the interface for
delete would be as close as possible to that of rm, including
recursive deletion and other stuff like that. Furthermore, we
wanted to insure that undelete's interface would be close to
delete's and as functional. If I do "delete -r" on a directory
tree, then "undelete -r" on that same filename should restore it,
as it was, in its original location.

Navarra's scheme cannot do that -- his script stores no information
about where files lived originally, so users must undelete files by
hand. If he were to attempt to modify it to store such
information, he would have to either (a) copy entire directory
trees to other locations in order to store their directory tree
state, or (b) munge the filenames in the deleted file directory in
order to indicate their original location, and search for
appropriate patterns in filenames when undeleting, or (c) keep a
record file in the deleted file directory of where all the files
came from.

Each of these approaches has problems. (a) is slow, and can be
unreliable. (b) might break in the case of funny filenames that
confuse the parser in undelete, and undelete is slow because it has
to do pattern matching on every filename when doing recursive
undeletes, rather than just opening and reading directories. (c)
introduces all kinds of locking problems -- what if two processes
try to delete files at the same time?

3. If all of the deleted files are kept in one directory, the
directory gets very large. This makes searching it slower, and
wastes space (since the directory will not shrink when the files
are reaped from it or undeleted).

4. My home directory is mounted automatically under /mit/jik, but
someone else may choose to mount it on /mnt, or I may choose to do
so. The undeletion process must be independent of mount point, and
therefore storing original paths of filenames when deleting them
will fail if a different mount point is later used. Using the
filesystem hierarchy itself is the only way to insure mount-point
independent operation of the system.

5. It is not expensive to scan the entire tree for deleted files to
reap, since most systems already run such scans every night,
looking for core files, *~ files, etc. In fact, many Unix systems
come bundled with a crontab that searches for # and .# files every
night by default.

6. If I delete a file in our source tree, why should the deleted
version take up space in my home directory, rather than in the
source tree? Furthermore, if the source tree is on a different
filesystem, the file can't simply be rename()d to put it into my
deleted file directory, it has to be copied. That's slow. Again,
using the filesystem hierarchy avoids these problems, since
rename() within a directory always works (although I believe
renaming a non-empty directory might fail on some systems, they
deserve to have their vendors shot :-).

7. Similarly, if I delete a file in a project source tree that many
people work on, then other people should be able to undelete the
file if necessary. If it's been put into my home directory, in a
temporary location which presumably is not world-readable, they
can't. They probably don't even know who deleted it.

Jonathan Kamens USnail:
MIT Project Athena 11 Ashford Terrace
j...@Athena.MIT.EDU Allston, MA 02134
Office: 617-253-8085 Home: 617-782-0710

John 'tms' Navarra

May 6, 1991, 3:24:47 AM


The fact that among Athena's 'tenets' is that of similarity from
workstation to workstation is both good and bad in my opinion. True, it
is reasonable to expect that Unix will behave the same on similar workstations,
but one of the fundamental benefits of Unix is that the user gets to create
his own environment. Thus, we can argue the advantages and disadvantages of
using an undelete utility, but you seem to be of the opinion that non-
standard changes are not beneficial. I argue that most users don't use
a large number of different workstations and that we shouldn't reject a
better method just because it isn't standard.
I don't understand your setuid argument. All you do is have a directory
called /var/preserve/navarra and have each person's directory inaccessible to
others (or possibly have the sticky bit set too) so that only the owner
of the file can undelete it.


>
>2. (This is a big one.) We wanted to insure that the interface for
> delete would be as close as possible to that of rm, including
> recursive deletion and other stuff like that. Furthermore, we
> wanted to insure that undelete's interface would be close to
> delete's and as functional. If I do "delete -r" on a directory
> tree, then "undelete -r" on that same filename should restore it,
> as it was, in its original location.
>
> Navarra's scheme cannot do that -- his script stores no information
> about where files lived originally, so users must undelete files by
> hand. If he were to attempt to modify it to store such
> information, he would have to either (a) copy entire directory
> trees to other locations in order to store their directory tree
> state, or (b) munge the filenames in the deleted file directory in
> order to indicate their original locationa, and search for
> appropriate patterns in filenames when undeleting, or (c) keep a
> record file in the deleted file directory of where all the files
> came from.

Ahh, we can improve that. I can write a program called undelete that
will look at the filename argument and by default undelete it to $HOME,
but can also take a second argument -- a directory -- in which to put the
undeleted material. I am pretty sure I (or some better programmer
than I) could get it to move more than one file at a time, or even be
able to do something like: undelete *.c $HOME/src and move all files
in /var/preserve/username with .c extensions to your src dir.
And if you don't have an src dir -- it will make one for you. Now this,
if done right, shouldn't take much longer than removing a directory
structure. So rm *.c on a dir should be only a tiny bit faster than
undelete *.c $HOME/src. I think the wait is worth it though -- especially
if you consider the consequences of looking thru a tape backup, or gee,
a total loss of your files!
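Something along these lines, say (just a sketch; the preserve
directory, the default of $HOME, and the argument order are only my
suggestions, and a pattern like *.c would need a loop over several
arguments):

#!/bin/sh
# undelete: restore a removed file from the preserve directory
# usage: undelete file [destination-directory]
PRESERVE=/var/preserve/$LOGNAME
dest=${2-$HOME}
test -d "$dest" || mkdir "$dest"	# make the dir if you don't have one
mv "$PRESERVE/$1" "$dest" ||
	echo "undelete: cannot restore $1" >&2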
As far as rm -r and undelete -r go, perhaps the best way to handle
this is that when the -r option is used, the whole dir in which you are
removing files is just moved to /preserve. Then an undelete -r dir dir2,
where dir2 is a destination dir, would restore all those files. However,
you would run into problems if /preserve is not mounted on the same tree
as the dir you wanted to remove. This can be resolved by allowing undelete
to run suid, but I agree that is not wise. You wouldn't want users being
able to mount and unmount filesystems they had remove privileges on --
perhaps there is another solution that I am overlooking, but there are
limits to any program. Just because there might not be any information
about where the files originally were is not good enough reason to axe
its use.

>
> Each of these approaches has problems. (a) is slow, and can be
> unreliable. (b) might break in the case of funny filenames that
> confuse the parser in undelete, and undelete is slow because it has
> to do pattern matching on every filename when doing recursive
> undeletes, rather than just opening and reading directories. (c)
> introduces all kinds of locking problems -- what if two processes
> try to delete files at the same time.

Assuming I can write a program which could look thru this preserve
dir and grab the file(s) that match the argument, undelete would be slow
if there were a vast number of files in there. However, assuming you don't
remove HUGE numbers of files over a two-day period (the period before the
files are deleted for good), I bet that would be faster than undeleting a
file in a number of directories that contain .# files, because many
directories would be bigger than the /preserve dir, in which case you would
have to be digging thru a bigger list of files.
Here are some more problems. Like rm, undelete would operate by looking
thru /preserve. But if rm did not store files in that dir but instead stored
them as .# in the current directory, then undelete would likewise have to
start looking in the current dir and work its way thru the directory structure
looking for .# files that matched a filename argument UNLESS you gave it
a starting directory as an argument, in which case it would start there. That
seems like a lot of hassle to me.
As far as funny filenames and such -- that I am not sure about but
it seems like it could be worked out.


>
>3. If all of the deleted files are kept in one directory, the
> directory gets very large. This makes searching it slower, and
> wastes space (since the directory will not shrink when the files
> are reaped from it or undeleted).

You get a two day grace period -- then they are GONE! This is still faster
than searching thru the current directory (in many cases) looking for .# files
to undelete.

>
>4. My home directory is mounted automatically under /mit/jik. but
> someone else may choose to mount it on /mnt, or I may choose to do
> so. The undeletion process must be independent of mount point, and
> therefore storing original paths of filenames when deleting them
> will fail if a different mount point is later used. Using the
> filesystem hierarchy itself is the only way to insure mount-point
> independent operation of the system.
>
>5. It is not expensive to scan the entire tree for deleted files to
> reap, since most systems already run such scans every night,
> looking for core files *~ files, etc. In fact, many Unix systems
> come bundled with a crontab that searches for # and .# files every
> night by default.

If that is the case -- fine -- you got me there. Do it from crontab
and remove them every few days. I just think it is a waste to infest many
directories with *~ and # and .# files when 99% of the time when someone
does rm filename -- THEY WANT IT REMOVED AND NEVER WANT TO SEE IT AGAIN!
So now when I do an ls -las -- guess what! There they are again! Well,
you tell me "John, don't do an ls -las" -- well, how about having
to wait longer on various ls's because my directory size is bigger now?
Say I did delete a whole mess of files; now I have all those files in
my current dir, and now I want to see all my .files as well. So I do an
ls -las, and when I come back from lunch I might see them -- ever try to
ls -las /dev!?

>
>6. If I delete a file in our source tree, why should the deleted
> version take up space in my home directory, rather than in the
> source tree? Furthermore, if the source tree is on a different
> filesystem, the file can't simply be rename()d to put it into my
> deleted file directory, it has to be copied. That's slow. Again,
> using the filesystem hierarchy avoids these problems, since
> rename() within a directory always works (although I believe
> renaming a non-empty directory might fail on some systems, they
> deserve to have their vendors shot :-).
>
>7. Similarly, if I delete a file in a project source tree that many
> people work on, then other people should be able to undelete the
> file if necessary. If it's been put into my home directory, in a
> temporary location which presumably is not world-readable, they
> can't. They probably don't even know who delete it.

I admit you have pointed out some flaws, some of which can be corrected;
others you just have to live with. I have made a few suggestions to improve
the program. In the end, though, I think the one /preserve directory is
much better. But here is another suggestion which you might like:

Make a shell variable RMPATH and you can set it to whatever path
you want. The default will be /var/preserve, but you can set it to $HOME/tmp,
or perhaps it could work like the PS1 variable and have a $PWD
option, in which case it is set to your current directory. Then when you
rm something or undelete something, RMPATH will be checked.
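In shell terms, the rm side of it might look something like this (a
sketch only; the default of /var/preserve is the one suggested above,
the rest is just illustration):

#!/bin/sh
# rm wrapper honoring RMPATH: removed files go to $RMPATH, or to
# /var/preserve/$LOGNAME if RMPATH is not set
dest=${RMPATH-/var/preserve/$LOGNAME}
test -d "$dest" || mkdir "$dest"
for arg
do
	mv "$arg" "$dest/"
done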

>
>Jonathan Kamens USnail:
>MIT Project Athena 11 Ashford Terrace
>j...@Athena.MIT.EDU Allston, MA 02134
>Office: 617-253-8085 Home: 617-782-0710
>
>

The Grand Master

May 6, 1991, 12:58:26 PM
In article <1991May6.0...@casbah.acns.nwu.edu> nav...@casbah.acns.nwu.edu (John 'tms' Navarra) writes:
}In article <JIK.91Ma...@pit-manager.mit.edu> j...@athena.mit.edu (Jonathan I. Kamens) writes:
}>
[First a brief history]

}> John Navarra suggests a non-destructive version of 'rm' that either
}>moves the deleted file into a directory such as
}>/var/preserve/username, which is periodically reaped by the system, or

}>uses a directory $HOME/tmp and does a similar thing.
}>
}> He points out two drawbacks with the approach of putting the deleted
}>file in the same directory as before it was deleted. First of all,
}>this requires that the entire directory tree be searched in order to
}>reap deleted files, and this is slower than just having to search one
}>directory. Second, the files show up when the "-a" or "A" flag to ls
}>is used to list the files in a directory.
}>
}> A design similar to his was considered when we set about designing
}>the non-destructive rm currently in use (as "delete") at Project Athena
}>1. In a distributed computing environment, it is not practical to
}> assume that a world-writeable directory such as /var/preserve will
}> exist on all workstations, and be accessible identically from all
}> workstations (i.e. if I delete a file on one workstation, I must be
}> able to undelete it on any other workstation; one of the tenet's of
}> Project Athena's services is that, as much as possible, they must
}> not differ when a user moves from one workstation to another).

Explain something to me Jon - first you say that /var/preserve will not
exist on all workstations, then you say you want a non-differing
environment on all workstations. If so, /var/preserve SHOULD
exist on all workstations if it exists on any. Maybe you should make
sure it does.



}> Furthermore, the "delete" program cannot run setuid in order to
}> have access to the directory, both because setuid programs are a
}> bad idea in general, and because setuid has problems in remote
}> filesystem environments (such as Athena's). Using $HOME/tmp
}> alleviates this problem, but there are others....

Doesn't need to run suid. Try this:
$ ls -ld /var/preserve
drwxrwxrwt preserve preserve /var/preserve
$ ls -l /var/preserve
drwx------ navarra navarra /var/preserve/navarra
drwx------ jik jik /var/preserve/jik

hmm, doesn't look like you need anything suid for that!


}
} The fact that among Athena's 'tenets' is that of similarity from
} workstation to workstation is both good and bad in my opinion. True, it
} is reasonable to expect that Unix will behave the same on similar workstations
} but one of the fundamental benifits of Unix is that the user gets to create
} his own environment. Thus, we can argue the advantages and disadvantages of
} using an undelete utililty but you seem to be of the opinion that non-
} standard changes are not beneficial and I argue that most users don't use
} a large number of different workstations and that we shouldn't reject a
} better method just because it isn't standard.


It is bad in no way at all. It is reasonable for me to expect that my
personal environment, and the shared system environment, will be the
same on different workstations. And many users at a university site
use several different workstations (I do). I like to know that I can
do things the same way no matter where I am when I log in.

}>2. (This is a big one.) We wanted to insure that the interface for
}> delete would be as close as possible to that of rm, including
}> recursive deletion and other stuff like that. Furthermore, we
}> wanted to insure that undelete's interface would be close to
}> delete's and as functional. If I do "delete -r" on a directory
}> tree, then "undelete -r" on that same filename should restore it,
}> as it was, in its original location.

There is not a large problem with this either. Info could be added to
the file, or a small record book could be kept. And /userf/jik could
be converted to $HOME in the process to avoid problems with different
mount points.


}>
}> Navarra's scheme cannot do that -- his script stores no information
}> about where files lived originally, so users must undelete files by
}> hand. If he were to attempt to modify it to store such
}> information, he would have to either (a) copy entire directory
}> trees to other locations in order to store their directory tree

What about $HOME/tmp???? - Then you would have only to mv it.


}> state, or (b) munge the filenames in the deleted file directory in
}> order to indicate their original locationa, and search for
}> appropriate patterns in filenames when undeleting, or (c) keep a
}> record file in the deleted file directory of where all the files
}> came from.

Again - these last two are no problem at all.


}
} Ahh, we can improve that. I can write a program called undelete that
} will look at the filename argument and by default undelete it to $HOME
} but can also include a second argument -- a directory -- to move the
} undeleted material. I am pretty sure I could (or some better programmer
} than I) could get it to move more than one file at a time or even be
} able to do something like: undelete *.c $HOME/src and move all files
} in /var/preserve/username with .c extensions to your src dir.
} And if you don't have an src dir -- it will make one for you. Now this
} if done right, shouldn't take much longer than removing a directory
} structure. So rm *.c on a dir should be only a tiny bit faster than
} undelete *.c $HOME/src. I think the wait is worth it though -- esp
} if you consider the consequnces of looking thru a tape backup or gee
} a total loss of your files!

This is not what Jon wants though. He does not want the user to have to
remember where in the directory tree the file was deleted from.
However, what Jon fails to point out is that one must remember
where they deleted a file from with his method too. Say for example I do
the following.
$ cd $HOME/src/zsh2.00/man
$ delete zsh.1
Now later, when I want to retrieve zsh.1 - I MUST CHANGE DIRECTORIES
to $HOME/src/zsh2.00/man. I STILL HAVE TO REMEMBER WHAT DIRECTORY I
DELETED THE FILE FROM!!!! So you gain NOTHING by keeping the file in
the directory it was deleted from. Or does your undelete program also
search the entire damn directory structure of the system?

} As far as rm -r and undelete -r go, perhaps the best way to handle
} this is when the -r option is called, the whole dir in which you are
} removing files is just moved to /preserve. And then an undelete -r dir dir2 where dir2 is a destination dir, would restore all those files. HOwever, you would run into
} problems if /preserve is not mounted on the same tree as the dir you wanted

Again, that is why you should use $HOME/tmp.

}>3. If all of the deleted files are kept in one directory, the
}> directory gets very large. This makes searching it slower, and
}> wastes space (since the directory will not shrink when the files
}> are reaped from it or undeleted).

This is much better than letting EVERY DAMN DIRECTORY ON THE SYSTEM
GET LARGER THAN IT NEEDS TO BE!!

Say I do this
$ ls -las
14055 -rw------- 1 wines 14334432 May 6 11:31 file12.dat
21433 -rw------- 1 wines 21860172 May 6 09:09 file14.dat
$ rm file*.dat
$ cp ~/new_data/file*.dat .
[ note at this point, my directory will probably grow to a bigger
size since there is now a full 70 meg in one directory as opposed
to the 35 meg that should be there using John Navarra's method]
[work deleted]
$ rm file*.dat
(hmm, I want that older file12 back - BUT I CANNOT GET IT!)


}
} You get a two day grace period -- then they are GONE! This is still faster
} than searchin thru the current directory (in many cases) looking for .# files
} to undelete.

You are correct sir.


}>
}>4. My home directory is mounted automatically under /mit/jik. but
}> someone else may choose to mount it on /mnt, or I may choose to do
}> so. The undeletion process must be independent of mount point, and
}> therefore storing original paths of filenames when deleting them
}> will fail if a different mount point is later used. Using the
}> filesystem hierarchy itself is the only way to insure mount-point
}> independent operation of the system.

Well most of us try not to go mounting filesystems all over the place.
Who would be mounting your home dir on /mnt?? AND WHY???


}>
} if that is the case -- fine -- you got me there. Do it from crontab
} and remove them every few days. I just think it is a waste to infest many directories with *~ and # and .# files when 99% of the time when someone
} does rm filename -- THEY WANT IT REMOVED AND NEVER WANT TO SEE IT AGAIN!
} SO now when I do an ls -las -- guess what! There they are again! Well

John, how about trying (you use bash right?) ;-)
bash$ ls() {
> command ls $@ | grep -v \.\#
> }

} you tell me "John, don't do an ls -las" -- well how bout having
} to wait longer on various ls's because my directory size is bigger now.

This point is still valid however, because there will be overhead
associated with piping billions of files starting with .# through grep -v
(as well as the billions of files NOT starting with .# that must be piped
through)


}>
}>6. If I delete a file in our source tree, why should the deleted
}> version take up space in my home directory, rather than in the
}> source tree? Furthermore, if the source tree is on a different
}> filesystem, the file can't simply be rename()d to put it into my
}> deleted file directory, it has to be copied. That's slow. Again,
}> using the filesystem hierarchy avoids these problems, since
}> rename() within a directory always works (although I believe
}> renaming a non-empty directory might fail on some systems, they
}> deserve to have their vendors shot :-).

Is this system source code? If so, I really don't think you should be
deleting it with your own account. But if that is what you wish, how about
a test for whether you are in your own directory? If yes, it moves the
deleted file to $HOME/tmp; if not, it moves it to ./tmp (or ./delete, or
./wastebasket or whatever).


}>
}>7. Similarly, if I delete a file in a project source tree that many
}> people work on, then other people should be able to undelete the
}> file if necessary. If it's been put into my home directory, in a
}> temporary location which presumably is not world-readable, they
}> can't. They probably don't even know who delete it.

Shouldn't need to be world readable (that is assuming that to have
permission to delete source you have to be in a special group - or
can just anyone on your system delete source?)


}
} I admit you have pointed out some flaws. Some of which can be corrected,
} others you just have to live with. I have made a few suggestions to improve
} the program. In the end though, I think the one /preserve directory is
} much better. But here is another suggestion which you might like:
}

}>Jonathan Kamens USnail:


Well Jon, I have a better solution for you - ready?
rm:
# Safe rm script: hand the real removal to at(1), two days from now
echo /bin/rm -f $* | at now + 2 days

That seems to be what you want.

Look - there is no perfect method for doing this. But the best way seems
to me to be the following
1) move files in the $HOME tree to $HOME/tmp
2) Totally delete files in /tmp
3) copy personally owned files from anywhere other than $HOME or /tmp
to $HOME/tmp (with a -r if necessary). Do this in the background.
Then remove them of course (cp -r $dir $HOME/tmp ; rm -r $dir) &
4) If a non-personally owned file is deleted, place it in ./delete,
and place a notification in the file as to who deleted it and when. Then
spawn an at job to delete the file in 2 days, and the notification in
whatever number of days you wish.
an example of 4:
jik> ls -las
drwxrwxr-x source source 1024 .
-rwxrwxr-x source source 5935 fun.c
jik> rm fun.c
jik> ls -las
drwxrwxr-x source source 1024 .
drwxrwxr-x source source 1024 .delete
-rwxrwxr-x source source 69 fun.c
jik> cat fun.c
File: fun.c
Deleted at: Mon May 6 12:41:31 EDT 1991
Deleted by: jik

Another possibility for 4:
I assume that the source tree is all one filesystem, no? If so, then
have files removed in the source tree moved to /src/.delete. Have a
notification then placed in fun.c and spawn an at job to delete it, or
place the notification in fun.c_delete and have the src tree searched
for *_delete files (or whatever you wanna call them).

}From the Lab of the MaD ScIenTiST:
}
}nav...@casbah.acns.nwu.edu

Have fun.
Oh, and by the way - I think doing this with a shell script is a complete
waste of resources. You could easily make mods to the actual code of
rm to do this, or use the PUCC entombing library and not even have to
change the code of rm (just have to link to the aforementioned PUCC entombing
library when compiling rm).
culater
Bruce
Varney

---------
### ##
Courtesy of Bruce Varney ### #
aka -> The Grand Master #
a...@sage.cc.purdue.edu ### ##### #
PUCC ### #
;-) # #
;'> # ##

Jonathan I. Kamens

May 7, 1991, 5:33:46 AM
In article <1991May6.0...@casbah.acns.nwu.edu>, nav...@casbah.acns.nwu.edu (John 'tms' Navarra) writes:
|> The fact that among Athena's 'tenets' is that of similarity from
|> workstation to workstation is both good and bad in my opinion. True, it
|> is reasonable to expect that Unix will behave the same on similar workstations
|> but one of the fundamental benifits of Unix is that the user gets to create
|> his own environment.

Our approach in no way prevents the user from creating his own environment.

|> Thus, we can argue the advantages and disadvantages of
|> using an undelete utililty but you seem to be of the opinion that non-
|> standard changes are not beneficial

No. What I am arguing is that users should have *access* to a similar
environment on all workstations. They can do with that environment whatever
the hell they want when they log in. They can use X, or not use X.
They can use mwm, or twm, or uwm, or gwm, or whatever-the-hell-wm they want.
They can use /bin/csh, or /bin/sh, or (more recently) zsh, or a shell
installed in a contributed software locker or in their home directory. They
can configure their accounts as much as anyone at any Unix site, if not more.

|> and I argue that most users don't use
|> a large number of different workstations

There are over 1000 workstations at Project Athena. Most users will log
into a different workstation every time they log in. The biggest cluster has
almost 100 workstations in it.

Please remember that your environment is not everyone's environment. I am
trying to explain why the design chosen by Project Athena was appropriate for
Project Athena's environment; your solution may be appropriate for your
environment (although I still believe that it does have problems).
Furthermore, I still believe that Project Athena's approach is more
generalized than yours, for the simple reason that our approach will work in
your environment, but your approach will not work in our environment.

|> and that we shouldn't reject a
|> better method just because it isn't standard.

The term "standard" has no meaning here, since we're talking about
implementing something that doesn't come "standard" with Unix.

|> I don't understand your setuid argument. All you do is have a directory
|> called /var/preserve/navarra and have each persons directory unaccessible to
|> others (or possibily have the sticky bit set on too) so that only a the owner
|> of the file can undelete it.

In order to be accessible from multiple workstations, the /var/preserve
filesystem has to be a remote filesystem (e.g. NFS or AFS) mounted on each
workstation.

Mounting one filesystem, from one fileserver, on over 1000 workstations is
not practical. Furthermore, it does not scale (e.g. what if there are 10000
workstations rather than 1000?), and another of Project Athena's main design
goals was scalability. Finally, since all of the remote file access at Athena
is authenticated using Kerberos (because both NFS and AFS are insecure when
public workstations can be rebooted by users without something like Kerberos),
all users would have to authenticate themselves to /var/preserve's fileserver
in order to access it (to delete or undelete files). Storing authentication
for every user currently logged in is quite difficult for one fileserver to
deal with.

We have over 10000 users at Project Athena. This means that either (a)
there will have to be over 10000 subdirectories of /var/preserve, or (b) the
directories will have to be created as they are needed, which means either a
world-writeable /var/preserve or a setuid program that can create directories
in a non-world-writeable directory. And setuid programs don't work with
authenticated remote filesystems, which was my original point.

Yes, many of these concerns are specific to Project Athena. But, as I said,
what I'm trying to explain is not why all of the problems with your scheme I
mentioned are problems everywhere (although some of them are), but rather why
all of them are problems at Project Athena.

|> Ahh, we can improve that. I can write a program called undelete that
|> will look at the filename argument and by default undelete it to $HOME
|> but can also include a second argument -- a directory -- to move the
|> undeleted material. I am pretty sure I could (or some better programmer
|> than I) could get it to move more than one file at a time or even be
|> able to do something like: undelete *.c $HOME/src and move all files
|> in /var/preserve/username with .c extensions to your src dir.
|> And if you don't have an src dir -- it will make one for you.

I'm sorry, but this does nothing to address my concerns. Leaving the files
in the directory in which they were deleted preserves the state indicating
where they were originally, so that they can be restored to exactly that
location without the user having to specify it.

Your way of accomplishing the same thing is a kludge at best and does *not*
accomplish the same thing, but rather a crude imitation of it.

|> As far as rm -r and undelete -r go, perhaps the best way to handle
|> this is when the -r option is called, the whole dir in which you are
|> removing files is just moved to /preserve. And then an undelete -r dir
|> dir2 where dir2 is a destination dir, would restore all those files.

What if I do "delete -r foo" and then realize that I want to restore the
file "foo/bar/baz/frelt" without restoring anything else? My "delete" deletes
a directory recursively by renaming the directory and all of its contents with
".#" prefixes, recursively. Undeleting a specific file several levels deep is
therefore trivial, and my delete does it using only rename() calls, which are
quite fast.
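(For the curious: the real delete is a C program, but the recursive
rename amounts to something like the following shell sketch, and
undeletion is just the same renames in the other direction.)

#!/bin/sh
# Sketch only -- not the actual Athena delete.  "delete -r dir":
# rename dir to .#dir, then give everything underneath it a .#
# prefix as well, deepest entries first, using nothing but renames.
# (Assumes a simple name in the current directory.)
mv "$1" ".#$1"
list=/tmp/del$$
find ".#$1" -depth -print > $list
while read f
do
	b=`basename "$f"`
	d=`dirname "$f"`
	case "$b" in
	.#*)	;;			# already has the prefix (the top)
	*)	mv "$f" "$d/.#$b" ;;
	esac
done < $list
/bin/rm -f $list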

Once again your system runs into the problem of /preserve being on a
different filesystem (if it can't be, then you have restricted all of your
files to reside on one filesystem), in which case copying directory structures
is slow as hell and can be unreliable. Since my system does no
inter-filesystem copying, it is fast (which was another requirement of the
design -- delete cannot be significantly slower than /bin/rm).

Let's see what your system has to do to undelete "foo/bar/baz/frelt".
First, it has to create the undeleted directory "foo". It has to give it the
same permissions as the deleted "foo", but it can't just rename() the "foo" in
/preserve, since that might be across filesystems and since it doesn't want
all of the *other* deleted files in /preserve/foo to show up undeleted. Then,
it has to do the same thing with "foo/bar" and "foo/bar/baz". Then, it has to
put "foo/bar/baz/frelt" back, copying it (slowly).

It seems to me that your system can reap deleted files quickly, but can
delete or undelete files rather slowly. My system reaps files slowly (using a
nightly "find" that many Unix sites already run), but runs very quickly from
the user's point of view. Tell me, whose time is more important at your site,
the user's or the computer's (late at night)?

|> Here are some more problems. Like rm, undelete would operate by looking
|> thru /preserve. But if rm did not store files in that dir but instead stored
|> them as .# in the current directory, then undelete would likewise have to
|> start looking in the current dir and work its way thru the directory structure
|> looking for .# files that matched a filename argument UNLESS you gave it
|> a starting directory as an argument in which case it would start there. That
|> seems like alot of hassle to me.

Um, "undelete" takes exactly the same syntax as "delete". If you give it an
absolute pathname, it looks in that pathname. If you don't, it looks relative
to the current path. If it can't find a file in the current directory, then
the file cannot be undeleted.
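To illustrate (made-up filenames, not actual program output):

	undelete notes.txt		# looks for ./.#notes.txt
	undelete src/foo/bar.c		# looks for src/foo/.#bar.c
	undelete /mit/jik/doc/plan	# looks for /mit/jik/doc/.#plan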

This functionality is identical to the functionality of virtually every
other Unix file utility. The system is not expected to be able to find a file
in an entire filesystem, given just its name. The user is expected to know
where the file is. That's how Unix works. Furthermore, the state is in the
filesystem, so that if the user forgets where something is, he can use "find"
or something to find it. It seems to me that Athena's design conforms more to
the Unix paradigm than yours.

|> You get a two day grace period -- then they are GONE! This is still faster
|> than searchin thru the current directory (in many cases) looking for .# files
|> to undelete.

The speed of searching is negligible. The speed of copying the file,
possibly very large, from another filesystem, is not. My program will
*always* run in negligible speed, yours will not.

|> SO now when I do an ls -las -- guess what!

You are one of the few people who has ever told me that he regularly uses
the "-a" flag to ls. Most people don't -- that's why ls doesn't display
dotfiles by default. Renaming files with a ".#" prefix to indicate that they
can be removed and to hide them is older than Athena's delete program; that's
why many Unix sites already search for ".#" files.

If you use "ls -a" so often that it is a problem for you, *and* if you
delete so many files that you will often see deleted files when you do "ls
-a", then don't do delete. You can't please all of the people all of the
time. But I would venture to say that new users, inexperienced users, the
users that "delete" is (for the most part) intended to protect, are not going
to have your problems.

|> make a shell variable RMPATH and you can set it to whatever PATH
|> you want. The default will be /var/preserve but you can set it to $HOME/tmp
|> or maybe perhaps it could work like the PS1 variable and have a $PWD
|> options in which case it is set to your current directory. Then when you
|> rm something or undelete something, the RMPATH will be checked.

This solves pretty much none of the problems I mentioned, and introduces
others. What if you delete something in one of your accounts that has a weird
RMPATH, and then want to undelete it later and can't remember who you were
logged in as when you deleted it? You've then got deleted files scattered all
over your filespace, and in fact they can be in places totally unrelated to
where they were originally. It makes much more sense to leave them where they
were when they were deleted -- if you know what the file is about, you
probably know in general where to look for it.

--

Jonathan I. Kamens

May 7, 1991, 5:59:12 AM

(I have addressed some of Bruce's points in my last posting, so I will not
repeat here any point I have made there.)

In article <11...@mentor.cc.purdue.edu>, a...@sage.cc.purdue.edu (The Grand Master) writes:
|> Explain something to me Jon - first you say that /var/preserve will not
|> exist on all workstations, then you say you want a non-differing
|> environment on all workstations. If so, /var/preserve SHOULD
|> exist on all workstations if it exists on any. Maybe you should make
|> sure it does.

The idea of mounting one filesystem from one fileserver (which is what
/var/preserve would have to be, if it were to look the same from any
workstation so that any file could be recovered from any workstation) on all
workstations in a distributed environment does not scale well to even 100
workstations, let alone the over 1000 workstations that we have, and our
environment was designed to scale well to as many as 10000 workstations or
more.

If it doesn't scale, then it doesn't work in our environment. So we can't
"make sure" that /var/preserve appears on all workstations.

|> However, what Jon fails to point out is that one must remember
|> where they deleted a file from with his method too. Say for example I do
|> the following.
|> $ cd $HOME/src/zsh2.00/man
|> $ delete zsh.1
|> Now later, when I want to retrieve zsh.1 - I MUST CHANGE DIRECTORIES
|> to $HOME/src/zsh2.00/man. I STILL HAVE TO REMEMBER WHAT DIRECTORY I
|> DELETED THE FILE FROM!!!! So you gain NOTHING by keeping the file in
|> the directory it was deleted from. Or does your undelete program also
|> search the entire damn directory structure of the system?

Um, the whole idea of Unix is that the user knows what's in the file
hierarchy. *All* Unix file utilities expect the user to remember where files
are. This is not something new, nor (in my opinion) is it bad. I will not
debate that issue here; if you wish to discuss it, start another thread. I
will only say that our "delete" was designed in conformance with the Unix
paradigm, so if you wish to criticize this particular design decision, you
must be prepared to criticize and defend your criticism of every other Unix
utility which accepts the same design criterion.

|> This is much better than letteng EVERY DAMN DIRECTORY ON THE SYSTEM
|> GET LARGER THAN IT NEEDS TO BE!!

How many deleted files do you normally have in a directory in any three-day
period, or seven-day period, or whatever?

|> Say I do this
|> $ ls -las
|> 14055 -rw------- 1 wines 14334432 May 6 11:31 file12.dat
|> 21433 -rw------- 1 wines 21860172 May 6 09:09 file14.dat
|> $ rm file*.dat
|> $ cp ~/new_data/file*.dat .
|> [ note at this point, my directory will probably grow to a bigger
|> size since therre is now a fill 70 Meg in one directory as opposed
|> to the 35 meg that should be there using John Navarra's method]

First of all, the size of a directory has nothing to do with the size of the
files in it. Only with the number of files in it. Two extra file entries in
a directory increase its size negligibly, if at all (since directories are
sized in block increments).

Second, using John Navarra's method, assuming a separate partition for
deleted files, I could do this:

1. Copy 300meg of GIF files into /tmp.

2. "rm" them all.

3. Every day or so, "undelete" them into /tmp, touch them to update the
modification time, and then delete them.

Now I'm getting away with using the preservation area as my own personal file
space, quite possibly preventing other people from deleting files.

Using $HOME/tmp avoids this problem, but (as I pointed out in my first
message in this thread), you can't always use $HOME/tmp, so there is probably
going to be a way for a user to spoof the program into putting the files
somewhere nifty.

You could put quotas on the preserve directory. But the user's home
directory already has a quota on it (if you're using quotas), so why not just
leave the file in whatever filesystem it was in originally? Better yet, in
the degenerate case, just leave it in the same directory it was in
originally, with the same owner, thus guaranteeing it will be counted under
the correct quota until it is permanently removed! That's a design
consideration I neglected to mention in my previous messages....

|> [work deleted]
|> $ rm file*.dat
|> (hmm, I want that older file12 back - BUT I CANNOT GET IT!)

You can't get it back in the other system suggested either.

I have been considering adding "version control" to my package for a while
now. I haven't gotten around to it. It would not be difficult. But the
issue of version control is equivalent in both suggested solutions, and is
therefore not an issue.

|> Well most of us try not to go mounting filesystems all over the place.
|> Who would be mounting your home dir on /mnt?? AND WHY???

In a distributed environment of over 1000 workstations, where the vast
majority of file space is on remote filesystems, virtually all file access
happens on mounted filesystems. A generalized solution to this problem must
therefore be able to cope with filesystems mounted in arbitrary locations.

For example, let's say I have a NFS home directory that usually mounts on
/mit/jik. But then I log into one of my development machines in which I have
a local directory in /mit/jik, with my NFS home directory mounted on
/mit/jik/nfs. This *happens* in our environment. A solution that does not
deal with this situation is not acceptable in our environment (and will
probably run into problems in other environments as well).

|> Is this system source code? If so, I really don't think you should be
|> deleting it with your own account.

First of all, it is not your prerogative to question the source-code access
policies at this site. For your information, however, everyone who has write
access to the "system source code" must authenticate that access using a
separate Kerberos principal with a separate password. I hope that meets with
your approval.

Second, this is irrelevant.

|> But if that is what you wish, how about
|> a test for if you are in your own directory. If yes, it moves the
|> deleted file to $HOME/tmp, if not, it moves it to ./tmp (or ./delete, or
|> ./wastebasket or whatever)

How do you propose a 100% foolproof test of this sort? What if I have a
source filesystem mounted under my home directory? For all intents and
purposes, it will appear to be in my home directory. What if I have a source
tree in my home directory, and I delete a file in it, then tar up the source
directory and move it into another project directory, and then realize a
couple of days later that I need to undelete the file, but it's not there
anymore because it was deleted in my home directory and not in the project
directory?

How do you propose to move state about deleted files when hierarchies are
moved in that manner?

Your suggested alternate solutions to this problem, which I have omitted,
all save state in a way that degenerates into saving the state in each
directory by leaving the files there. Furthermore, something that has not yet
been mentioned, the implementation of a set of utilities which leaves the
files in place is far less complex than any other implementation. And the
less complex an implementation is, the easier it is to get it right (and
optimize it, and fix any bugs that do pop up, etc.).

David Dick

May 7, 1991, 9:44:11 AM

>In article <11...@statware.UUCP> m...@statware.UUCP ( Mathieu Federspiel) writes:

[description of a renaming scheme for tentative file removal elided]

[desc. of tentative-removal directory scheme elided]

These two schemes seem to have their own advantages and disadvantages.

Renaming (prefixing filenames with something special-- ".#" was suggested)
has the advantage of leaving files in place in their filesystem
hierarchy, but reserves a class of names in the namespace, and
makes scavenging for too-old files slow (because the whole filesystem
must be searched).

Moving (copying the files with their original names into a fixed
directory) has the advantage of preserving original names and
not cluttering up the namespace, but full-path information is lost
and collisions of filenames can still occur.

How about a .deleted sub-directory in any directory where one
of these commands has been used? Then a tentatively-deleted file
can be moved there (very efficient, since only linking and unlinking
is necessary), the original name can be used, and full-path information
is preserved. The scavenger still needs to do a full filesystem search,
but I don't think it should be continuously running, anyway.
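For concreteness, a minimal sketch of such a tentative-delete command in C
(illustrative only, not any existing implementation; it handles only plain
filenames in the current directory) might look like this:

/* Sketch of a tentative "rm": move each FILE into ./.deleted/FILE
 * instead of unlinking it.  Only plain names in the current
 * directory are handled; no options, no subdirectories. */
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    char dest[1024];
    int i, status = 0;

    /* Create the per-directory wastebasket if it is not already there. */
    if (mkdir(".deleted", 0700) < 0 && errno != EEXIST) {
        perror(".deleted");
        return 1;
    }

    for (i = 1; i < argc; i++) {
        if (snprintf(dest, sizeof dest, ".deleted/%s", argv[i])
                >= (int) sizeof dest) {
            fprintf(stderr, "%s: name too long\n", argv[i]);
            status = 1;
            continue;
        }
        /* On the same filesystem a rename is just a link plus an
         * unlink, which is why this scheme is cheap. */
        if (rename(argv[i], dest) < 0) {
            perror(argv[i]);
            status = 1;
        }
    }
    return status;
}

A matching undelete is just the rename in the other direction.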

One additional thing that these schemes need, however hard it may be
to provide, is emergency deletion. That is, just as Macintosh
wastebasket contents get reclaimed if more blocks are needed, it
would be really nice if the same thing could happen, automatically,
if a filesystem ran out of space. On most bigger machines, this is
of little concern. But, for individual systems, barely scraping by,
this could be a real life-saver.
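One possible shape for that, sketched loosely here (the threshold is made up,
and a real scavenger would also have to choose which victims to reclaim), is
to have the cleanup job ask how full the filesystem is before deciding how
aggressive to be:

/* Decide whether the wastebasket on a filesystem should be purged
 * immediately.  Purely illustrative; the 95% threshold is arbitrary. */
#include <stdio.h>
#include <sys/statvfs.h>

/* Returns 1 if the filesystem containing `path` is more than `pct`
 * percent full, 0 if not, -1 on error. */
int fs_nearly_full(const char *path, double pct)
{
    struct statvfs vfs;
    double used;

    if (statvfs(path, &vfs) < 0) {
        perror(path);
        return -1;
    }
    if (vfs.f_blocks == 0)
        return -1;
    used = 100.0 * (double)(vfs.f_blocks - vfs.f_bavail)
                 / (double)vfs.f_blocks;
    return used > pct;
}

int main(int argc, char **argv)
{
    const char *dir = (argc > 1) ? argv[1] : ".";
    int full = fs_nearly_full(dir, 95.0);

    if (full > 0)
        printf("%s: over 95%% full, reclaim wastebasket files now\n", dir);
    else if (full == 0)
        printf("%s: space is fine, leave deleted files alone\n", dir);
    return 0;
}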

David Dick
Software Innovations, Inc. [the Software Moving Company (sm)]

The Grand Master

unread,
May 7, 1991, 4:58:41 PM5/7/91
to
In article <1991May7.0...@athena.mit.edu> j...@athena.mit.edu (Jonathan I. Kamens) writes:
}
} (I have addressed some of Bruce's points in my last posting, so I will not
}repeat here any point I have made there.)
}
}In article <11...@mentor.cc.purdue.edu>, a...@sage.cc.purdue.edu (The Grand Master) writes:
}|> environment on all workstations. If so, /var/preserve SHOULD
}|> exist on all workstations if it exists on any. Maybe you should make
} The idea of mounting one filesystem from one fileserver (which is what
}/var/preserve would have to be, if it were to look the same from any
Are you telling me that when you log in, you have to wait for your home
directory to be mounted on the workstation you log in on? - This is
absolutely Horrid!! However, my suggestion of PUCC's entomb (with a
coupla mods) is very useful. Here it goes:
First, you have a user named charon (yeah, the boatkeeper) which will
be in control of deleted files.
Next, at the top level of each filesystem you put a directory named
tomb - in other words, instead of the jik directory being the only
directory on your partition, there are two - jik and tomb.
Next, you use PUCC's entomb library. You will need to make a few mods,
but there should be little problem with that. The entomb library is full
of functions (unlink, link, etc.) which will be called from rm
instead of the "real" unlink, etc., and which will, if necessary,
call the real unlink, etc. What these functions will actually do is
call on a process entombd (which runs suid to root - ohmigod) to move
your files to the tomb directory. The current library does not
retain directory structure, but that is little problem to fix. The
important thing is that things are moved to the tomb directory that
is on the same filesystem as the directory from which they are deleted.
tomb is owned by charon, and is 700. The companion program unrm can
restore the files to their original location (note, in this case you do
not necessarily have to be in the same directory from which you deleted
them - though the files WILL be returned to the directory from which you
deleted them). Unrm will only let you restore a file if you have read
permission on the file, and write permission on the directory to which
it will be restored. Just as important, since the ownership, permissions,
and directory structure of the files will be kept, you still will not be
able to look at files you are not authorized to look at. You no longer
have to worry about moving files to a new filesystem. You no longer
have to worry about looking at stupid .# files. And since preend(1) also
takes care of cleaning out the tomb directories, you no longer need
to search for them. Another nice thing is that preend is capable of
specifying different times for different files. A few quotes from the
PUCC man page on entomb:

You can control whether or not your files get entombed with
the ENTOMB environment variable:

      variable setting        action
      ________________        ______
      "no"                    no files are entombed
      "yes" (the default)     all files are entombed
      "yes:pattern"           only files matching pattern are entombed
      "no:pattern"            all files except those matching pattern are entombed

.......
If the file to be entombed is NFS mounted from a remote
host, the entomb program would be unable to move it to the
tomb because of the mapping of root (UID 0) to nobody (UID
-2). Instead, it uses the RPC mechanism to call the entombd
server on the remote host, which does the work of entombing.

..........
Files destroyed by the library calls in the entombing
library, libtomb.a, are placed in subdirectories on each
filesystem. The preening daemon, preend, removes old files
from these tombs. If the filesystem in question is less
than 90% full, files are left in the tomb for 24 hours,
minus one second for each two bytes of the file. If the
filesystem is between 90 and 95% full, files last 6 hours,
again adjusted for file size. If the filesystem is between
95 and 100% full, files last 15 minutes. If the filesystem
is more than 100% full, all files are removed at about 5
minute intervals. An exception is made for files named
"a.out" or "core" and filenames beginning with a "#" or end-
ing in ".o" or "~", which are left in the tomb for at most
15 minutes.
........

The entombing library, libtomb.a, contains routines named
creat, open, rename, truncate, and unlink that are call-
compatible with the system calls of the same names, but
which as a side effect may execute /usr/local/lib/entomb to
arrange for the file in question to be entombed.

The user can control whether or not his files get entombed
with the ENTOMB environment variable. If there is no ENTOMB
environment variable or if it is set to "yes", all files
destroyed by rm, cp, and mv are saved. If the ENTOMB
environment variable is set to "no", no files are ever
entombed.

In addition, a colon-separated list of glob patterns can be
given in the ENTOMB environment variable after the initial
"yes" or "no". A glob pattern uses the special characters
`*', `?', and `[' to generate lists of files. See the
manual page for sh(1) under the heading "Filename Genera-
tion" for an explanation of glob patterns.

      variable setting        action
      ________________        ______
      "no"                    no files are entombed
      "yes" (the default)     all files are entombed
      "yes:pattern"           only files matching pattern are entombed
      "no:pattern"            all files except those matching pattern are entombed

If the ENTOMB environment variable indicates that the file
should not be entombed, or if there is no tomb directory on
the filesystem that contains the given file, the routines in
this library simply invoke the corresponding system call.

---------------------------------
If this is not a full enough explanation, please contact me via
email and I will try to be more thorough.
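
To make the library-interposition idea concrete, here is a rough sketch of
what an entomb-style unlink() wrapper might look like. It is not the actual
PUCC code; the tomb_path() helper below is a deliberately naive stand-in for
the real logic that finds the tomb at the top of the file's filesystem, and
the pattern handling and setuid entombd helper are omitted.

/* Sketch of an entomb-style unlink() wrapper.  Not the real PUCC
 * library; pattern handling and the setuid entombd helper are omitted. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Naive stand-in for the real helper: the real library would locate
 * the top of the file's filesystem; this one just uses ./tomb. */
static char *tomb_path(const char *file)
{
    const char *base = strrchr(file, '/');
    size_t len;
    char *grave;

    base = (base != NULL) ? base + 1 : file;
    len = strlen("tomb/") + strlen(base) + 1;
    if ((grave = malloc(len)) != NULL)
        snprintf(grave, len, "tomb/%s", base);
    return grave;
}

int entomb_unlink(const char *file)
{
    const char *env = getenv("ENTOMB");
    char *grave;

    /* ENTOMB=no means behave exactly like the real system call
     * (the "yes:pattern" and "no:pattern" forms are not handled here). */
    if (env != NULL && strcmp(env, "no") == 0)
        return unlink(file);

    grave = tomb_path(file);
    if (grave == NULL)              /* no tomb: really delete */
        return unlink(file);

    /* Same filesystem, so this is just a link plus an unlink; when
     * permissions forbid it, the real system hands the job to entombd. */
    if (rename(file, grave) == 0) {
        free(grave);
        return 0;
    }
    free(grave);
    return unlink(file);            /* fall back to really deleting */
}

Restoring is then just a rename back out of the tomb, subject to the
permission checks described above.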

}
}|> However, what Jon fails to point out is that one must remember
}|> where they deleted a file from with his method too. Say for example I do
}|> the following.
}|> $ cd $HOME/src/zsh2.00/man
}|> $ delete zsh.1
}|> Now later, when I want to retrieve zsh.1 - I MUST CHANGE DIRECTORIES
}|> to $HOME/src/zsh2.00/man. I STILL HAVE TO REMEMBER WHAT DIRECTORY I
}|> DELETED THE FILE FROM!!!! So you gain NOTHING by keeping the file in
}|> the directory it was deleted from. Or does your undelete program also
}|> search the entire damn directory structure of the system?
}
} Um, the whole idea of Unix is that the user knows what's in the file
}hierarchy. *All* Unix file utilities expect the user to remember where files

Not exactly true. Note this is the reason for the PATH variable, so that
you do not have to remember where every God-blessed command resides.


}
} How many deleted files do you normally have in a directory in any three-day
}period, or seven-day period, or whatever?

Often many - it depends on the day


}
}|> Say I do this
}|> $ ls -las
}|> 14055 -rw------- 1 wines 14334432 May 6 11:31 file12.dat
}|> 21433 -rw------- 1 wines 21860172 May 6 09:09 file14.dat
}|> $ rm file*.dat
}|> $ cp ~/new_data/file*.dat .
}|> [ note at this point, my directory will probably grow to a bigger
}|> size since therre is now a fill 70 Meg in one directory as opposed
}|> to the 35 meg that should be there using John Navarra's method]
}
} First of all, the size of a directory has nothing to do with the size of the
}files in it. Only with the number of files in it. Two extra file entries in

Ok, you are right - I wasn't thinking here


}
}1. Copy 300meg of GIF files into /tmp.
}
}2. "rm" them all.
}
}3. Every day or so, "undelete" them into /tmp, touch them to update the
} modification time, and then delete them.
}
}Now I'm getting away with using the preservation area as my own personal file
}space, quite possibly preventing other people from deleting files.

Well, I could copy 300meg of GIFs to /tmp and keep touching them
every few hours or so (say with a daemon I run from my crontab) and
the effect would be the same.


}
} Using $HOME/tmp avoids this problem, but (as I pointed out in my first

Yes it does, as does using filesystemroot:/tomb


}
} You could put quotas on the preserve directory. But the user's home
}directory already has a quota on it (if you're using quotas), so why not just
}leave the file in whatever filesystem it was in originally? Better yet, in

That is what entomb does!


} You can't get it back in the other system suggested either.

Some kind of revision control (though I am not sure how it works) is also
present with entomb.

}|> Well most of us try not to go mounting filesystems all over the place.
}|> Who would be mounting your home dir on /mnt?? AND WHY???
}
} In a distributed environment of over 1000 workstations, where the vast
}majority of file space is on remote filesystems, virtually all file access
}happens on mounted filesystems. A generalized solution to this problem must
}therefore be able to cope with filesystems mounted in arbitrary locations.

Well, then this is an absolute kludge. How ridiculous to have to mount and
unmount everyone's directory when they log in/out. ABSURD! You would be
better off to have a few powerful centralized systems with Xwindow terminals
instead of separate workstations.

In fact, what you have apparently makes it impossible for me to access
any other user's files that he might have purposefully left accessible
unless he is logged into the same workstation. Even if he puts some files
in /tmp for me, I HAVE TO LOG INTO THE SAME WORKSTATION HE WAS ON TO GET
THEM!! And if I am working on a workstation and 10 people happen to rlogin
to it at the same time, boy are my processes gonna keep smokin.
No, the idea of an Xterminal with a small processor to handle the
Xwindows, and a large system to handle the rest is MUCH MUCH more reasonable
and functional.


}
} For example, let's say I have a NFS home directory that usually mounts on
}/mit/jik. But then I log into one of my development machines in which I have
}a local directory in /mit/jik, with my NFS home directory mounted on
}/mit/jik/nfs. This *happens* in our environment. A solution that does not
}deal with this situation is not acceptable in our environment (and will
}probably run into problems in other environments as well).

Well, in most environments (as far as I know) the average user is not allowed
to mount file systems.

}
}|> Is this system source code? If so, I really don't think you should be
}|> deleting it with your own account.
} First of all, it is not your prerogative to question the source-code access
}policies at this site. For your information, however, everyone who has write
}access to the "system source code" must authenticate that access using a
}separate Kerberos principal with a separate password. I hope that meets with
}your approval.

It is my prerogative to announce my opinion on whatever the hell I choose,
and it is not yours to tell me I cannot. Again this seems like a worthless
stupid kludge. What is next - a password so that you can execute ls?


}
}--
}Jonathan Kamens USnail:
}MIT Project Athena 11 Ashford Terrace
}j...@Athena.MIT.EDU Allston, MA 02134
}Office: 617-253-8085 Home: 617-782-0710

While I understand the merits of your system, I still argue that it is
NOT a particularly good one. I remove things so that I do not have
to look at them anymore. And despite your ravings at John, ls -a is
not at all uncommon. In fact I believe it is the default if you are
root, is it not? Most people I know DO use -a most of the time, in
fact most have
alias ls 'ls -laF'
or something of the like. And I do not like being restricted from
ever naming a file .#jikisdumb or whatever I wanna name it.

As Always,
The Grand Master

Jonathan I. Kamens

unread,
May 7, 1991, 6:46:44 PM5/7/91
to
In article <12...@mentor.cc.purdue.edu>, a...@sage.cc.purdue.edu (The Grand Master) writes:
|> Are you telling me that when you log in, you have to wait for your home
|> directory to be mounted on the workstation you log in on?

Yes.

|> - This is
|> absolutely Horrid!!

I would suggest, Bruce, that you refrain from commenting about things about
which you know very little. Your entire posting is filled with jibes about
the way Project Athena does things, when you appear to know very little about
*how* we do things or about the history of Project Athena.

I doubt that DEC and IBM would have given Athena millions of dollars over
more than seven years if they thought it was a "kludge". I doubt that
Universities and companies all over the world would be adopting portions of
the Project Athena environment if they thought it was a "kludge". I doubt DEC
would be selling a bundled "Project Athena workstation" product if they
thought it was a "kludge". I doubt the OSF would have accepted major portions
of the Project Athena environment in their DCE if they thought it was a
"kludge".

You have the right to express your opinion about Project Athena. However,
when your opinion is based on almost zero actual knowledge, you just end up
making yourself look like a fool. Before allowing that to happen any more, I
suggest you try to find out more about Athena. There have been several
articles about it published over the years, in journals such as the CACM.

You also seem to be quite in the dark about the future of distributed
computing. The computer industry has recognized for years that personal
workstations in distributed environments are becoming more popular. I have
more computing power under my desk right now than an entire machine room could
hold ten years ago. With the entire computing industry moving towards
distributed environments, you assert that Project Athena, the first successful
large-scale distributed DCE in the world, would be better off "to have a few
powerful centralized systems with Xwindow terminals instead of separate
workstations." Whatever you say, Bruce; perhaps you should try to convince
DEC, IBM, Sun, HP, etc. to stop selling workstations, since the people buying
them would obviously be better off with a few powerful centralized systems.

|> Next, at the top level of each filesystem you put a directory named
|> tomb - in other words, instead of the jik directory being the only
|> directory on your partition, there are two - jik and tomb.

"Filesystems" are arbitrary in our environment. I can mount any AFS
directory as a "filesystem" (although AFS mounts are achieved using symbolic
links, the filesystem abstraction is how we keep NFS and AFS filesystems
parallel to each other). Furthermore, I can mount any *subdirectory* of any
NFS filesystem as a filesystem on a workstation, and the workstation has no
way of knowing whether that directory really is the top of a filesystem on the
remote host, or of getting to the "tomb" directory you propose.

As I think I've already pointed out now twice, we considered what you're
proposing when we designed Athena's "delete". But we also realized that in a
generalized environment that allows arbitrary mounting of filesystems,
top-level "tomb" or ".delete" or whatever directories just don't work, and
they degenerate into storing deleted files in each directory.

If your site uses "a few powerful centralized systems" and does not allow
mounting as we do, then your site can use the entomb stuff. But it just
doesn't cut it in a large-scale distributed environment, which is the point
I've tried to make in my previous two postings (and in this one).

In any case, mounting user home directories on login takes almost no time at
all; I just mounted a random user directory via NFS and it took 4.2 seconds.
That 4.2 seconds is well worth it, considering that they can access their home
directory on any of over 1000 workstations, any of which is probably as
powerful as one of your "powerful centralized systems."

|> What these functions will actually do is
|> call on a process entombd (which runs suid to root - ohmigod) to move
|> your files to the tomb directory.

One more time -- setuid does not work with authenticated filesystems, even
when moving files on the same filesystem. Your solution will not work in our
environment. I do not know how many times I am going to have to repeat it
before you understand it.

|>      variable setting        action
|>      ________________        ______
|>      "no"                    no files are entombed
|>      "yes" (the default)     all files are entombed
|>      "yes:pattern"           only files matching pattern are entombed
|>      "no:pattern"            all files except those matching pattern are entombed

Very nice. I could implement this in delete if I wanted to; this does not
seem specific to the issues we are discussing (although it's a neat feature,
and I'll have to consider it when I have time to spend on developing delete).

|> If the file to be entombed is NFS mounted from a remote
|> host, the entomb program would be unable to move it to the
|> tomb because of the mapping of root (UID 0) to nobody (UID
|> -2). Instead, it uses the RPC mechanism to call the entombd
|> server on the remote host, which does the work of entombing.

We considered this too, and it was rejected because of the complexity
argument I mentioned in my last posting. Your daemon has to be able to figure
out what filesystem to call via RPC, using gross stuff to figure out mount
points. Even if you get it to work for NFS, you've got to be able to do the
same thing for AFS, or for RVD, which is the other file protocol we use. And
when you add new file protocols, your daemon has to be able to understand them
to know whom to do the remote RPC to. Not generalized. Not scalable.

Furthermore, you have to come up with a protocol for the RPC requests. Not
difficult, but not easy either.

Furthermore, the local entombd has to have some way of authenticating to the
remote entombd. In an environment where root is secure and entombd can just
use a reserved port to transmit the requests, this isn't a problem. But in an
environment such as Athena's where anyone can hook up a PC or Mac or
workstation to the network and pretend to be root, or even log in as root on
one of our workstations (our public workstation root password is "mroot";
enjoy it), that kind of authentication is useless.

No, I'm not going to debate with you why people have root access on our
workstations. I've done that flame once, in alt.security shortly after it was
created. I'd be glad to send via E-mail to anyone who asks, every posting I
made during that discussion. But I will not debate it again here; in any
case, it is tangential to the subject currently being discussed.

By the way, the more I read about your entomb system, the more I think that
it is a clever solution to the problem it was designed to solve. It has lots
of nice features, too. But it is not appropriate for our environment.

|> } Um, the whole idea of Unix is that the user knows what's in the file
|> }hierarchy. *All* Unix file utilities expect the user to remember where files
|>
|> Not exactly true. Note this is the reason for the PATH variable, so that
|> you do not have to remember where every God-blessed command resides.

Running commands is different from manipulating files. There are very few
programs which manipulate files that allow the user to specify a filename and
know where to find it automatically. And those programs that do have this
functionality do so by either (a) always looking in the same place, or (b)
looking in a limited path of places (TEXINPUTS comes to mind). I don't know
of any Unix program which, by default, takes the filename specified by the
user and searches the entire filesystem looking for it. And no, find doesn't
count, since that's the one utility that was specifically designed to do this,
since nothing else does (although even find requires that you give it a
directory to start in).

|> Well, I could copy 300meg of GIFs to /tmp and keep touching them
|> every few hours or so (say with a daemon I run from my crontab) and
|> the effect would be the same.

You could, but I might not keep 300meg of space in my /tmp partition,
whereas I would probably want to keep as much space as possible free in my
entomb partitions, so that deleted files would not be lost prematurely.

|> Well, then this is an absolute kludge. How ridiculous to have to mount and
|> unmount everyone's directory when they log in/out. ABSURD!

See above. What you are calling "ABSURD" is pretty much accepted as the
wave of the future by almost every major workstation manufacturer and OS
developer in the world. Even the Mac supports remote filesystem access at
this point.

How else do you propose a network of 1000 workstations deal with all the
users' home directories? Oh, I forgot, you don't think anyone should need to
have a network of 1000 workstations. Right, Bruce.

|> In fact, what you have apparently makes it impossible for me to access
|> any other user's files that he might have purposefully left accessible
|> unless he is logged into the same workstation.

No, we have not. As I said above, you don't know what you're talking about,
and making accusations at Project Athena when you haven't even bothered to try
to find out if there is any truth behind the accusations is unwise at best,
and foolish at worst. Project Athena provides "attach", an interface to
mount(2) which allows users to mount any filesystem they want, anywhere they
want (at least, anywhere that is not disallowed by the configuration file for
"attach"). All someone else has to do to get to my home directory is type
"attach jik".

Do not assume that Project Athena is like Purdue and then assume what we do
on that basis. Project Athena is unlike almost any other environment in the
world (although there are a few that parallel it, such as CMU's Andrew system).

|> And if I am working on a workstation and 10 people happen to rlogin
|> to it at the same time, boy are my processes gonna keep smokin.

Workstations on Project Athena are private. One person, one machine (there
are exceptions, but they are just that, exceptions).

|> No, the idea of an Xterminal with a small processor to handle the
|> Xwindows, and a large system to handle the rest is MUCH MUCH more reasonable
|> and functional.

You don't know what you're talking about. Project Athena *used to be*
composed of several large systems connected to many terminals. Users could
only log in on the cluster nearest the machine they had an account on, and
near the end of the term, every machine on campus was unusable because the
loads were so high. Now, we can end up with 1000 people logged in at a time
on workstations all over campus, and the performance is still significantly
better than it was before we switched to workstations.

|> It is my prerogative to announce my opinion on whatever the hell I choose,
|> and it is not yours to tell me I cannot. Again this seems like a worthless
|> stupid kludge. What is next - a password so that you can execute ls?

You asserted that we should not be writing to system source code with our
own account. I responded by pointing out that, in effect, we are not. We
simply require separate Kerberos authentication, rather than a completely
separate login, to get source write access. Now you respond by saying that
that authentication is wrong, when it is in fact what you implied we should be
doing in the first place.

Robert J Carter

unread,
May 8, 1991, 4:12:46 AM5/8/91
to
In article <12...@mentor.cc.purdue.edu> a...@sage.cc.purdue.edu (The Grand Master) writes:

>} First of all, it is not your prerogative to question the source-code access
>}policies at this site. For your information, however, everyone who has write
>}access to the "system source code" must authenticate that access using a
>}separate Kerberos principal with a separate password. I hope that meets with
>}your approval.
>
>It is my prerogative to announce my opinion on whatever the hell I choose,
>and it is not yours to tell me I cannot. Again this seems like a worthless
>stupid kludge. What is next - a password so that you can execute ls?
>}

This WAS a real interesting thread, but it's going downhill - is there
any chance you two can keep the personalities out of it, and get on
with the discussion?

--
|=================================================================| ttfn!
| Robert J Carter Oghma Systems Ottawa, Ontario |
| Phone: (613) 565-2840 | @ @
| Fax: (613) 565-2840 (Phone First) r...@oghma.ocunix.on.ca | * *
|=================================================================| \_____/

j chapman flack

unread,
May 13, 1991, 8:57:51 AM5/13/91
to
In article <JIK.91Ma...@pit-manager.mit.edu>,
j...@athena.mit.edu (Jonathan I. Kamens) writes:
>
> both because setuid programs are a
> bad idea in general,

Could someone elaborate?
--
Chap Flack Their tanks will rust. Our songs will last.
ch...@art-sy.detroit.mi.us -MIKHS 0EODWPAKHS

Nothing I say represents Appropriate Roles for Technology unless I say it does.

Tom Christiansen

unread,
May 14, 1991, 6:14:50 AM5/14/91
to
From the keyboard of ch...@art-sy.detroit.mi.us (j chapman flack):
:In article <JIK.91Ma...@pit-manager.mit.edu>,
:j...@athena.mit.edu (Jonathan I. Kamens) writes:
:>
:> both because setuid programs are a
:> bad idea in general,
:
:Could someone elaborate?

Here's Henry Spencer's setuid(7) man page. I keep wanting to
update it a bit to further educate folks on suid script problems,
but haven't yet done so.

--tom

.TH SETUID 7 local
.DA 21 Feb 1987
.SH NAME
setuid \- checklist for security of setuid programs
.SH DESCRIPTION
Writing a secure setuid (or setgid) program is tricky.
There are a number of possible ways of subverting such a program.
The most conspicuous security holes occur when a setuid program is
not sufficiently careful to avoid giving away access to resources
it legitimately has the use of.
Most of the other attacks are basically a matter of altering the program's
environment in unexpected ways and hoping it will fail in some
security-breaching manner.
There are generally three categories of environment manipulation:
supplying a legal but unexpected environment that may cause the
program to directly do something insecure,
arranging for error conditions that the program may not handle correctly,
and the specialized subcategory of giving the program inadequate
resources in hopes that it won't respond properly.
.PP
The following are general considerations of security when writing
a setuid program.
.de P
.nr x \\w'\(sq'u+1n
.TP \\nxu
\(sq
..
.P
The program should run with the weakest userid possible, preferably
one used only by itself.
A security hole in a setuid program running with a highly-privileged
userid can compromise an entire system.
Security-critical programs like
.IR passwd (1)
should always have private userids, to minimize possible damage
from penetrations elsewhere.
.P
The result of
.I getlogin
or
.I ttyname
may be wrong if the descriptors have been meddled with.
There is
.I no
foolproof way to determine the controlling terminal
or the login name (as opposed to uid) on V7.
.P
On some systems (not ours), the setuid bit may not be honored if
the program is run by
.IR root ,
so the program may find itself running as
.IR root .
.P
Programs that attempt to use
.I creat
for locking can foul up when run by
.IR root ;
use of
.I link
is preferred when implementing locking.
Using
.I chmod
for locking is an obvious disaster.
.P
Breaking an existing lock is very dangerous; the breakdown of a locking
protocol may be symptomatic of far worse problems.
Doing so on the basis of the lock being `old' is sometimes necessary,
but programs can run for surprising lengths of time on heavily-loaded
systems.
.P
Care must be taken that user requests for i/o are checked for
permissions using the user's permissions, not the program's.
Use of
.I access
is recommended.
.P
Programs executed at user request (e.g. shell escapes) must
not receive the setuid program's permissions;
use of daughter processes and
.I setuid(getuid())
plus
.I setgid(getgid())
after
.I fork
but before
.I exec
is vital.
.P
Similarly, programs executed at user request must not receive other
sensitive resources, notably file descriptors.
Use of
.IR closeall (3)
or close-on-exec arrangements,
on systems which have them,
is recommended.
.P
Programs activated by one user but handling traffic on behalf of
others (e.g. daemons) should avoid doing
.IR setuid(getuid())
or
.IR setgid(getgid()) ,
since the original invoker's identity is almost certainly inappropriate.
On systems which permit it, use of
.I setuid(geteuid())
and
.I setgid(getegid())
is recommended when performing work on behalf of the system as
opposed to a specific user.
.P
There are inherent permission problems when a setuid program executes
another setuid program,
since the permissions are not additive.
Care should be taken that created files are not owned by the wrong person.
Use of
.I setuid(geteuid())
and its gid counterpart can help, if the system allows them.
.P
Care should be taken that newly-created files do not have the wrong
permission or ownership even momentarily.
Permissions should be arranged by using
.I umask
in advance, rather than by creating the file wide-open and then using
.IR chmod .
Ownership can get sticky due to the limitations of the setuid concept,
although using a daughter process connected by a pipe can help.
.P
Setuid programs should be especially careful about error checking,
and the normal response to a strange situation should be termination,
rather than an attempt to carry on.
.PP
The following are ways in which the program may be induced to carelessly
give away its special privileges.
.P
The directory the program is started in, or directories it may
plausibly
.I chdir
to, may contain programs with the same names as system programs,
placed there in hopes that the program will activate a shell with
a permissive
.B PATH
setting.
.B PATH
should \fIalways\fR be standardized before invoking a shell
(either directly or via
.I popen
or
.IR execvp/execlp ).
.P
Similarly, a bizarre
.B IFS
setting may alter the interpretation of a shell command in really
strange ways, possibly causing a user-supplied program to be invoked.
.B IFS
too should always be standardized before invoking a shell.
(Our shell does this automatically.)
.P
Environment variables in general cannot be trusted.
Their contents should never be taken for granted.
.P
Setuid shell files (on systems which implement such) simply cannot
cope adequately with some of these problems.
They also have some nasty problems like trying to run a
.I \&.profile
when run under a suitable name.
They are terminally insecure, and must be avoided.
.P
Relying on the contents of files placed in publically-writeable
directories, such as
.IR /tmp ,
is a nearly-incurable security problem.
Setuid programs should avoid using
.I /tmp
entirely, if humanly possible.
The sticky-directories modification (sticky bit on for a directory means
only owner of a file can remove it) (we have this feature) helps,
but is not a complete solution.
.P
A related problem is that
spool directories, holding information that the program will trust
later, must never be publically writeable even if the files in the
directory are protected.
Among other sinister manipulations that can be performed, note that
on many Unixes (not ours), a core dump of a setuid program is owned
by the program's owner and not by the user running it.
.PP
The following are unusual but possible error conditions that the
program should cope with properly (resource-exhaustion questions
are considered separately, see below).
.P
The value of
.I argc
might be 0.
.P
The setting of the
.I umask
might not be sensible.
In any case, it should be standardized when creating files
not intended to be owned by the user.
.P
One or more of the standard descriptors might be closed, so that
an opened file might get (say) descriptor 1, causing chaos if the
program tries to do a
.IR printf .
.P
The current directory (or any of its parents)
may be unreadable and unsearchable.
On many systems
.IR pwd (1)
does not run setuid-root,
so it can fail under such conditions.
.P
Descriptors shared by other processes (i.e., any that are open
on startup) may be manipulated in strange ways by said processes.
.P
The standard descriptors may refer to a terminal which has a bizarre
mode setting, or which cannot be opened again,
or which gives end-of-file on any read attempt, or which cannot
be read or written successfully.
.P
The process may be hit by interrupt, quit, hangup, or broken-pipe signals,
singly or in fast succession.
The user may deliberately exploit the race conditions inherent
in catching signals;
ignoring signals is safe, but catching them is not.
.P
Although non-keyboard signals cannot be sent by ordinary users in V7,
they may perhaps be sent by the system authorities (e.g. to
indicate that the system is about to shut down),
so the possibility cannot be ignored.
.P
On some systems (not ours)
there may be an
.I alarm
signal pending on startup.
.P
The program may have children it did not create.
This is normal when the process is part of a pipeline.
.P
In some non-V7 systems, users can change the ownerships of their files.
Setuid programs should avoid trusting the owner identification of a file.
.P
User-supplied arguments and input data
.I must
be checked meticulously.
Overly-long input stored in an array without proper bound checking
can easily breach security.
When software depends on a file being in a specific format, user-supplied
data should never be inserted into the file without being checked first.
Meticulous checking includes allowing for the possibility of non-ASCII
characters.
.P
Temporary files left in public directories
like
.I /tmp
might vanish at inconvenient times.
.PP
The following are resource-exhaustion possibilities that the
program should respond properly to.
.P
The user might have used up all of his allowed processes, so
any attempt to create a new one (via
.I fork
or
.IR popen )
will fail.
.P
There might be many files open, exhausting the supply of descriptors.
Running
.IR closeall (3),
on systems which have it,
is recommended.
.P
There might be many arguments.
.P
The arguments and the environment together might occupy a great deal
of space.
.PP
Systems which impose other resource limitations can open setuid
programs to similar resource-exhaustion attacks.
.PP
Setuid programs which execute ordinary programs without reducing
authority pass all the above problems on to such unprepared children.
Standardizing the execution environment is only a partial solution.
.SH SEE ALSO
closeall(3), standard(3)
.SH HISTORY
Locally written, although based on outside contributions.
.SH AUTHOR
Henry Spencer <he...@zoo.toronto.edu>
.SH BUGS
The list really is rather long...
and probably incomplete.
.PP
Neither the author nor the University of Toronto accepts any responsibility
whatever for the use or non-use of this information.
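
To make the shell-escape item above concrete, here is a small sketch (not
part of the man page itself) of the fork-then-drop-privileges pattern it
describes; error handling is kept minimal:

/* Sketch of the "daughter process" rule: a setuid program that must
 * run a user-requested command gives up its special identity after
 * fork() and before exec().  Minimal error handling only. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int run_as_invoker(const char *cmd)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {                        /* child */
        /* Drop the group first, then the user id; once setuid() has
         * given up the privileged uid, setgid() could no longer succeed. */
        if (setgid(getgid()) < 0 || setuid(getuid()) < 0)
            _exit(127);
        /* PATH and IFS should have been standardized before this point. */
        execl("/bin/sh", "sh", "-c", cmd, (char *) 0);
        _exit(127);                        /* exec failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}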
--
Tom Christiansen tch...@convex.com convex!tchrist
"So much mail, so little time."

j chapman flack

unread,
May 20, 1991, 8:08:33 AM5/20/91
to
In article <1991May14....@convex.com> tch...@convex.COM (Tom Christiansen) writes:
>Here's Henry Spencer's setuid(7) man page. I keep wanting to
...

And a very useful man page it is. I'll hang on to that.

The man page mentions that on "some" systems pwd(1) does not run setuid-root
and so can't pwd if the parent or an ancestor directory is unreadable.

My system is one of those. Is there something intrinsically unsafe about pwd
that would create holes if I made it setuid-root?

Also, I'm not sure I understand the effect of the resource-depletion types
of attacks. Someone recently suggested by email that a program can be made
to crash and leave the user in a privileged shell. When a program bombs,
doesn't its (privileged) process disappear?

...not arguing with the statements, just trying to understand the risks...

Tom Christiansen

unread,
May 21, 1991, 8:15:55 AM5/21/91
to
From the keyboard of ch...@art-sy.detroit.mi.us (j chapman flack):
:The man page mentions that on "some" systems pwd(1) does not run setuid-root
:and so can't pwd if the parent or an ancestor directory is unreadable.
:
:My system is one of those. Is there something intrinsically unsafe about pwd
:that would create holes if I made it setuid-root?

I can't really think of anything, but this is scant evidence, let alone
proof, of trustworthiness. Most of us seem to get by fine without a suid
pwd(1). It fails whenever a normal getwd(3) would fail, but few of us
consider this critical. So what? The fewer suid programs (and the fewer
programs root always runs) the less you have to worry about. And I don't
think implementing getwd(3) via a popen(3) to a suid pwd(1) is a very
elegant solution.

:Also, I'm not sure I understand the effect of the resource-depletion types
:of attacks. Someone recently suggested by email that a program can be made
:to crash and leave the user in a privileged shell. When a program bombs,
:doesn't its (privileged) process disappear?

I've heard people say this, too, but it makes little sense. All I
can imagine is that the program might be coerced into giving you
a NEW, interactive, privileged shell. It would have to be a very
naughty program to change the uid/gid privs of its calling process,
although as I've shown before, it can be done if you poke kmem.

--tom

Kartik Subbarao

unread,
May 21, 1991, 9:59:39 AM5/21/91
to
In article <1991May21....@convex.com> tch...@convex.COM (Tom Christiansen) writes:
>From the keyboard of ch...@art-sy.detroit.mi.us (j chapman flack):
>:The man page mentions that on "some" systems pwd(1) does not run setuid-root
>:and so can't pwd if the parent or an ancestor directory is unreadable.
>:
>:My system is one of those. Is there something intrinsically unsafe about pwd
>:that would create holes if I made it setuid-root?
>
>I can't really think of anything, but this is scant evidence, let alone
>proof, of trustworthiness. Most of us seem to get by fine without a suid
>pwd(1). It fails whenever a normal getwd(3) would fail, but few of us
>consider this critical. So what? The fewer suid programs (and the fewer
>programs root always runs) the less you have to worry about. And I don't
>think implementing getwd(3) via a popen(3) to a suid pwd(1) is a very
>elegant solution.

I agree. What people might be grumbling about is the fact that if you cd down
into subdirectories of a directory that is mode 711, /bin/pwd, since
it only does a straight getcwd(), fails because it can't find where it is
now. But, decent shells such as zsh have pwd as a builtin, so there's no
problem. It would seem that it is the shell's responsibility to do that kind
of stuff. An ofiles on your shell process should also tell you where you
are.
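
For what it's worth, a tiny sketch of that kind of fallback (trusting $PWD
blindly, which a careful program should not do):

/* Sketch only: getcwd() can fail when an ancestor directory is not
 * readable, but a shell that tracks its own working directory (and
 * exports it as PWD) still knows where it is. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];

    if (getcwd(buf, sizeof buf) != NULL) {
        printf("getcwd: %s\n", buf);
    } else {
        perror("getcwd");
        /* Fall back on the value maintained by the shell. */
        const char *pwd = getenv("PWD");
        if (pwd != NULL)
            printf("$PWD says: %s\n", pwd);
    }
    return 0;
}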


-Kartik

--
internet% ypwhich

subb...@phoenix.Princeton.EDU -| Internet
kar...@silvertone.Princeton.EDU (NeXT mail)
SUBB...@PUCC.BITNET - Bitnet
