How can I determine which files have been read?

Adrian Cook

Mar 18, 1999

Hello,

I am trying to find a way of determining which files have been read
(though not necessarily written to) in a certain directory tree on a UNIX
system. Determining which files have been modified is easy - just look at
the modification timestamp - but determining which files have been simply
read seems to be much more difficult.

Basically, I want to do this because we need to delete as many of these
files as possible due to limitations in disk space on the target machine.
The files in this directory tree belong to a certain application (an Oracle
RDBMS), and my idea was to watch to see which files were read and which
weren't during the normal operation of the application and then delete
those that weren't used. The system in question is a SCO OpenServer 5 box.
I've been able to come up with only a couple of possibilities:

1. Re-link all of the app's executables to include a customized version of
the open() system call that records each call to it and which file was
opened. This isn't a very realistic option since I don't think all of the
executables are re-linkable (i.e. we don't have the source code or
makefiles), and I don't have much such low-level programming experience
anyway.

2. We have a second SCO box with an identical configuration that's
connected to the first via a network. I was thinking that maybe I could NFS
mount the directory tree from the second SCO box onto the first, run the
app on the first box and watch the NFS traffic between the two boxes. The
problem is that I haven't been able to figure out how to do auditing of NFS
traffic. Surely this must be possible...isn't it? Has anyone out there done
anything like this?

(And to answer the obvious question...no, Oracle has *not* been very
helpful in determining which files can stay and which can go.)

If anyone has any other ideas as to how to do this, I'd love to hear from
you. (Remove "nospam" from my return address to respond to me directly.)

Thanks in advance,
Adrian Cook
adrian.no...@prior.ca


Clinton Pierce

Mar 18, 1999

[Poster CC'd in E-Mail]

On 18 Mar 1999 16:09:19 GMT, "Adrian Cook" <adrian.no...@prior.ca>
wrote:


>Hello,
>
> I am trying to find a way of determining which files have been read
>(though not necessarily written to) in a certain directory tree on a UNIX
>system. Determining which files have been modified is easy - just look at
>the modification timestamp - but determining which files have been simply
>read seems to be much more difficult.

> [...]


>I've been able to come up with only a couple of possibilities:
>
>1. Re-link all of the app's executables to include a customized version of

> [...]


>2. We have a second SCO box with an identical configuration that's

Ye Gods. This sounds like rabbit hunting with Thermonuclear Weapons.

Have you tried noting the inode a-time? This is updated each time a file
is opened for reading. In fact, the find(1) command will let you do
things like:

find /usr/oracle/junk -atime +1 -print

To show you all of the files that haven't been read in the last day.
Remember though, that backing up the files and in general poking around in
them will update the a-time. Also, find(1) modifies the a-time, so that
doing this a second time will show the files as accessed. In fact, IIRC,
ls(1) will show you atimes as well:

ls -lu /usr/oracle/junk

Will do the Right Thing, and you should be able to do this more than once.

--
Clinton A. Pierce
cli...@geeksalad.org
http://www.geeksalad.org
"If you rush a Miracle Man, you get rotten Miracles." -- Miracle Max, The Princess Bride

Dan Mercer

Mar 18, 1999

In article <01be715a$bd270940$a038418e@pc>,

"Adrian Cook" <adrian.no...@prior.ca> writes:
> Hello,
>
> I am trying to find a way of determining which files have been read
> (though not necessarily written to) in a certain directory tree on a UNIX
> system. Determining which files have been modified is easy - just look at
> the modification timestamp - but determining which files have been simply
> read seems to be much more difficult.
>

It's not. There are 3 times kept - modification time, inode modification
time and access time. You want the 3rd. To see the last access time
using ls, use ls -lu (-a was already taken). The applicable switches
to the find command are "-atime time_in_days" and "-newera file".

Please remember that you can't tell who is reading the file - so you
need to disable any backups while you figure out which files are
expendable. One last warning - disk space is cheap compared to
mistakes. If you have files being left around, you would be better
off fixing whatever process is creating but not removing them than
simply removing them at intervals.

> Basically, I want to do this because we need to delete as many of these
> files as possible due to limitations in disk space on the target machine.
> The files in this directory tree belong to a certain application (an Oracle
> RDBMS), and my idea was to watch to see which files were read and which
> weren't during the normal operation of the application and then delete
> those that weren't used. The system in question is a SCO OpenServer 5 box.

> I've been able to come up with only a couple of possibilities:
>

[[ BAD IDEAS REMOVED ]]
--
Dan Mercer
dame...@uswest.net

>
> (And to answer the obvious question...no, Oracle has *not* been very
> helpful in determining which files can stay and which can go.)
>
> If anyone has any other ideas as to how to do this, I'd love to hear from
> you. (Remove "nospam" from my return address to respond to me directly.)
>
> Thanks in advance,
> Adrian Cook
> adrian.no...@prior.ca
>

Opinions expressed herein are my own and may not represent those of my employer.
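
The marker-file test Dan mentions can be sketched roughly as follows. This
is only an illustration under stated assumptions: start_watch and
unread_since are hypothetical names, and -anewer is the GNU/BSD spelling of
the "accessed more recently than this file" test. It is not part of POSIX,
so check find(1) on the target system for the local equivalent (such as the
"-newera file" form given above).

```shell
# Hypothetical helpers, not from the thread.
# Assumes a find that supports -anewer (a GNU/BSD extension, not POSIX).

start_watch() {
    # Create (or refresh) the marker file. Its modification time becomes
    # the reference point. Keep it outside the tree being watched.
    touch "$1"
}

unread_since() {
    marker=$1
    dir=$2
    # Regular files whose access time is NOT newer than the marker's
    # modification time, i.e. candidates that were never read afterwards.
    find "$dir" -type f ! -anewer "$marker" -print
}
```

A possible session: run start_watch /tmp/marker, let the application go
through its normal workload with backups disabled, then run
unread_since /tmp/marker /usr/oracle/junk. The output is a list of
candidates to investigate, not a list of files that are safe to delete, and
it is only meaningful if the filesystem actually records access times (it
will not if mounted read-only, for instance).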


Tony Lawrence

Mar 18, 1999
to Adrian Cook
Adrian Cook wrote:
>
> Hello,
>
> I am trying to find a way of determining which files have been read
> (though not necessarily written to) in a certain directory tree on a UNIX
> system. Determining which files have been modified is easy - just look at
> the modification timestamp - but determining which files have been simply
> read seems to be much more difficult.
>
> Basically, I want to do this because we need to delete as many of these
> files as possible due to limitations in disk space on the target machine.
> The files in this directory tree belong to a certain application (an Oracle
> RDBMS), and my idea was to watch to see which files were read and which
> weren't during the normal operation of the application and then delete
> those that weren't used. The system in question is a SCO OpenServer 5 box.
> I've been able to come up with only a couple of possibilities:

find -atime will do it, or, if you need to be more precise, just:

l -ut

and look for current times.

--
Tony Lawrence (to...@aplawrence.com)
SCO ACE
SCO articles, help, book reviews: http://www.aplawrence.com

Ken Pizzini

Mar 18, 1999

On 18 Mar 1999 16:09:19 GMT, Adrian Cook <adrian.no...@prior.ca> wrote:
> I am trying to find a way of determining which files have been read
>(though not necessarily written to) in a certain directory tree on a UNIX
>system. Determining which files have been modified is easy - just look at
>the modification timestamp - but determining which files have been simply
>read seems to be much more difficult.

Use "ls -lu" instead of just "ls -l". The timestamp now shows
the last time the file was read (or the time of creation if it
never has been read). (The last-access timestamp will not
(of course) be updated on a read-only filesystem.)

If you want to know this within a C or Perl program, use the
st_atime field of the struct populated by stat(). If using
find you can also make use of the -atime test to help
you find recently accessed files.

--Ken Pizzini

Ken Pizzini

Mar 18, 1999

On Thu, 18 Mar 1999 17:31:26 GMT, Clinton Pierce <cpie...@ford.com> wrote:
> find /usr/oracle/junk -atime +1 -print
>
>To show you all of the files that haven't been read in the last day.
>Remember though, that backing up the files and in general poking around in
>them will update the a-time. Also, find(1) modifies the a-time, so that
>doing this a second time will show the files as accessed.

While find(1) will modify the atime of any directories it traverses,
it will not update the atime of any files it sees (matching or not).
(Unless (of course) it is a side-effect of some -exec or -ok action.)

--Ken Pizzini
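
Taken together, the find command quoted above and this correction suggest
restricting the search to regular files, since only directory access times
are disturbed by the walk itself. A minimal sketch, assuming a POSIX sh and
find; the name stale_files is hypothetical and not something from the
thread. One detail worth knowing: POSIX find discards fractions of a day
before comparing, so -atime +1 matches files last read at least two full
days ago, and -atime +0 is the test for "more than 24 hours ago".

```shell
# Hypothetical helper, not from the thread.
# Lists regular files under DIR whose last access is more than DAYS
# whole 24-hour periods old (DAYS defaults to 1).
stale_files() {
    dir=$1
    days=${2:-1}
    # -type f skips directories, whose atime find itself updates
    # while traversing the tree.
    find "$dir" -type f -atime +"$days" -print
}
```

For example, stale_files /usr/oracle/junk 0 would list the files not read
in the past 24 hours. Because find does not read file contents, the
command can be repeated without changing the results.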

Geoff Clare

Mar 24, 1999

cpie...@ford.com (Clinton Pierce) writes:

>Have you tried noting the inode a-time? This is updated each time a file
>is opened for reading. In fact, the find(1) command will let you do
    ^^^^^^^^^^^^^^^^^^
>things like:

> find /usr/oracle/junk -atime +1 -print

Nit-pick: just opening the file is not enough to update the a-time.
You have to read at least 1 byte from it.

--
Geoff Clare g...@unisoft.com
UniSoft Limited, London, England. g...@root.co.uk
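
The distinction Geoff draws can be checked on a scratch file. The sketch
below is illustrative only: atime_demo is a hypothetical name, and it
assumes a POSIX sh, touch and find. It ages the file's access time, opens
the file without reading from it, and reports whether find still considers
the file old. What ls -lu shows after the real read depends on the
filesystem: one that is read-only or otherwise not recording access times
will leave the old date in place.

```shell
# Hypothetical demonstration, not from the thread.
# Pass the path of a scratch file that may be created or overwritten.
atime_demo() {
    f=$1
    echo data > "$f"
    touch -a -t 199901010000 "$f"   # pretend the last read was in 1999

    : < "$f"                        # open for reading, read nothing, close
    if [ -n "$(find "$f" -atime +1 -print)" ]; then
        echo "open only: atime unchanged"
    else
        echo "open only: atime updated"
    fi

    cat "$f" > /dev/null            # actually read the contents
    ls -lu "$f"                     # access time after a real read
}
```

Running atime_demo /tmp/scratch prints the open-only verdict first,
followed by the ls -lu line for comparison.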
