
file-find


Bud

Mar 27, 1999, 3:00:00 AM
Has anyone written a simple find-file in LISP? I want to search all hard
drives, directories, and subdirectories for a file.


Sunil Mishra

Mar 27, 1999, 3:00:00 AM
"Bud" <bude...@email.msn.com> writes:

> Has anyone written a simple find-file in LISP? I want to search all hard
> drives, directories, and subdirectories for a file.

To find all lisp files in my home directory:

(directory "~/**/*.lisp")

(Not tested, but I'm pretty sure this will work fine.)

To find out more about lisp pathnames, check one of these two sites:

http://www.harlequin.com/books/HyperSpec/FrontMatter/Starting-Points.html
http://www-cgi.cs.cmu.edu/afs/cs.cmu.edu/project/ai-repository/ai/html/cltl/cltl2.html

It takes a little time to understand fully all the facilities lisp
provides, but it's well worth it.

Sunil

Christopher R. Barry

Mar 27, 1999, 3:00:00 AM
Sunil Mishra <smi...@whizzy.cc.gatech.edu> writes:

> "Bud" <bude...@email.msn.com> writes:
>
> > Has anyone written a simple find-file in LISP? I want to search all hard
> > drives, directories, and subdirectories for a file.
>
> To find all lisp files in my home directory:
>
> (directory "~/**/*.lisp")
>
> (Not tested, but I'm pretty sure this will work fine.)

It's implementation dependent whether that will work or not. It does
not work with CMU Common Lisp, and is _incredibly_ slow (the better
part of a minute compared to the fraction of a second "find" takes)
with Allegro CL. Additionally, it will fail with Allegro CL if any of
your files have a ":" in them, for Allegro will identify the part of
the file-name before the ":" as a host-name and will give an error.

The only way to do this portably is to test whether each name in a
directory is a file or a directory, and move into it and repeat if
it's a directory. There is no DIRECTORYP function in ANSI CL, but the
last time the CL file stuff was discussed I believe someone posted a
portable one.
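
A minimal sketch of that recursive approach, not the function referred to above. It assumes DIR is in directory form (trailing slash), that the implementation's DIRECTORY accepts a wild directory component, and that it returns subdirectories in directory form. The names DIRECTORY-PATHNAME-P, LIST-SUBDIRECTORIES, LIST-FILES and WALK-DIRECTORY are made up for illustration. The predicate only inspects the pathname; it does not ask the file system.

```lisp
;; Sketch only. DIRECTORY's handling of wildcards, subdirectories and
;; symbolic links is implementation-dependent, so treat this as a starting
;; point, not as portable code.

(defun directory-pathname-p (pathname)
  "True if PATHNAME is in directory form (no name and no type component).
This inspects the pathname only; it does not consult the file system."
  (flet ((emptyp (component)
           (member component '(nil :unspecific))))
    (and (emptyp (pathname-name pathname))
         (emptyp (pathname-type pathname))
         t)))

(defun list-subdirectories (dir)
  "Subdirectories of DIR, assuming a wild directory component is accepted."
  (remove-if-not #'directory-pathname-p
                 (directory (merge-pathnames
                             (make-pathname :directory '(:relative :wild))
                             dir))))

(defun list-files (dir)
  "Entries of DIR that are not in directory form."
  (remove-if #'directory-pathname-p
             (directory (merge-pathnames
                         (make-pathname :name :wild :type :wild)
                         dir))))

(defun walk-directory (dir handler)
  "Call HANDLER on each file under DIR, descending into subdirectories.
There is no guard against symbolic-link cycles."
  (dolist (file (list-files dir))
    (funcall handler file))
  (dolist (subdir (list-subdirectories dir))
    (walk-directory subdir handler)))
```

Calling (walk-directory (user-homedir-pathname) #'print) would print every file it reaches; a handler that checks PATHNAME-TYPE against "lisp" narrows that to Lisp sources.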

Christopher

Sunil Mishra

Mar 27, 1999, 3:00:00 AM
cba...@2xtreme.net (Christopher R. Barry) writes:

> Sunil Mishra <smi...@whizzy.cc.gatech.edu> writes:
>
> > "Bud" <bude...@email.msn.com> writes:
> >
> > > Has anyone written a simple find-file in LISP? I want to search all hard
> > > drives, directories, and subdirectories for a file.
> >
> > To find all lisp files in my home directory:
> >
> > (directory "~/**/*.lisp")
> >
> > (Not tested, but I'm pretty sure this will work fine.)
>
> It's implementation dependent whether that will work or not. It does
> not work with CMU Common Lisp, and is _incredibly_ slow (the better
> part of a minute compared to the fraction of a second "find" takes)
> with Allegro CL. Additionally, it will fail with Allegro CL if any of
> your files have a ":" in them, for Allegro will identify the part of
> the file-name before the ":" as a host-name and will give an error.

Arguably these are implementation issues. Speed certainly is. If ACL is
braindead enough to attempt to interpret the pathname-string of a file that
is found, then it alone is to blame. But looking at the original post
again, my solution is *very* faulty at a higher level (discussed below).

> The only way to do this portably is to test whether each name in a
> directory is a file or a directory, and move into it and repeat if
> it's a directory. There is no DIRECTORYP function in ANSI CL, but the
> last time the CL file stuff was discussed I believe someone posted a
> portable one.

This may not be sufficient to deal with the speed issue, especially if you
want to compare find with lisp.

Find has the following properties that allow it additional speed:

1. It does not collect by default. In other words, whatever replacement we
want in lisp should take a handler, not attempt to construct a list.
2. It does not build up a pathname structure. Lisp deals with pathnames,
which is nice for higher level operations. But I suspect that the amount
of structure wasted in building pathnames (especially for files that may
not even be relevant) is what is responsible for slowing down DIRECTORY.
3. It can take many, many more arguments to narrow down the scope of the
search.
4. It does not have to be file system independent.

Of these, the first three we can hope to do something about. The fourth has
to be a design decision. It may in fact be counter-productive to have a
file-system independent find.

The biggest slow-down, I'm willing to bet, would be from 2, and to deal
with it effectively I suspect the only way would be to try to go around the
lisp functionality. In other words, given a find command, I would like to:

0. Divide all the criteria presented into those that require parsing, and
those that do not.
1. Get all the entries in the directory, along with the associated
data. Trivially eliminate all the entries that do not meet the criteria.
2. Parse the entry. We know it is a file or a directory, so we should be
able to ignore all the nonsense about hosts and pathnames. Additionally
the underlying OS should also be able to distinguish between file and
directory (and link and ...) which is necessary for getting find to
work. This should further clarify how the entry should be parsed. Then
construct a pathname taking the start directory as the default. This
ought to result in a *huge* saving in time.
3. Apply the pathname-dependent constraints.
4. For those that satisfy all constraints, call the handler.
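
The steps above can be sketched roughly as follows. This is an illustration under assumptions, not a working find: ANSI CL has no call that returns raw directory entries, so ENTRY-NAMES stands in for whatever an OS-level directory read would supply, and every function name here is hypothetical. The caller performs step 0 by passing the string-level criteria as STRING-TEST and the pathname-level ones as PATHNAME-TEST. The pathname is built with MAKE-PATHNAME rather than by parsing a namestring, so no host parsing is applied to the entry name.

```lisp
;; Sketch only. ENTRY-NAMES stands in for the raw names an OS-level
;; directory read would return; the caller must supply them.

(defun split-entry-name (entry)
  "Split ENTRY at its last dot into name and type, without namestring
parsing. A leading dot (as in \".emacs\") is kept as part of the name."
  (let ((dot (position #\. entry :from-end t)))
    (if (and dot (plusp dot))
        (values (subseq entry 0 dot) (subseq entry (1+ dot)))
        (values entry nil))))

(defun entry-pathname (entry start-dir)
  "Build a pathname for ENTRY, with START-DIR supplying the defaults."
  (multiple-value-bind (name type) (split-entry-name entry)
    (make-pathname :name name :type type :defaults start-dir)))

(defun find-entries (start-dir entry-names handler
                     &key (string-test (constantly t))
                          (pathname-test (constantly t)))
  "Filter raw names first, build pathnames only for the survivors, apply
the pathname-level test, and call HANDLER instead of collecting a list."
  (dolist (entry entry-names)
    (when (funcall string-test entry)          ; step 1: cheap rejection
      (let ((pathname (entry-pathname entry start-dir))) ; step 2
        (when (funcall pathname-test pathname) ; step 3
          (funcall handler pathname))))))      ; step 4
```

Entries rejected by STRING-TEST never have a pathname constructed for them, which is where the saving in step 2 would come from if pathname construction really is the bottleneck.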

A similar mechanism would be necessary to get acceptable speed from
DIRECTORY. I bet CMUCL and ACL first obtain the entire string for the file
pathname, and then call PATHNAME on it. If that is the case, using the
portable DIRECTORY-P and DIRECTORY will be at least as slow as using
DIRECTORY as I had. We know a lot about the file already when we know how
to parse part of the path, and reusing this information will get us better
speed and correctness.

Sunil

Erik Naggum

Mar 28, 1999, 3:00:00 AM
* cba...@2xtreme.net (Christopher R. Barry)

| Additionally, it will fail with Allegro CL if any of your files have a
| ":" in them, for Allegro will identify the part of the file-name before
| the ":" as a host-name and will give an error.

which version are you using? _years_ have passed since I reported and
Franz Inc fixed that problem. there's a reason software has version
identifiers, you know. honest people do not fail to include them.

#:Erik

Christopher R. Barry

Mar 28, 1999, 3:00:00 AM
Erik Naggum <er...@naggum.no> writes:

I have a file named "Ezekial-13:18" in a bible-notes subdirectory of
my home directory.

Allegro CL Trial Edition 5.0 [Linux/X86] (8/29/98 10:57)
Copyright (C) 1985-1998, Franz Inc., Berkeley, CA, USA. All Rights Reserved.
;; Optimization settings: safety 1, space 1, speed 1, debug 2.
;; For a complete description of all compiler switches given the
;; current optimization settings evaluate (EXPLAIN-COMPILER-SETTINGS).
USER(1): (directory "~/**/*.lisp")
Error: host "Ezekial-13" not found in (sys:hosts.cl)
[condition type: SIMPLE-ERROR]

Restart actions (select using :continue):
0: :TRY-AGAIN
1: Return to Top Level (an "abort" restart)
[1] USER(2):

Christopher

Erik Naggum

Mar 28, 1999, 3:00:00 AM
* cba...@2xtreme.net (Christopher R. Barry)
| I have a file named "Ezekial-13:18" in a bible-notes subdirectory of
| my home directory.

I checked with an unpatched 5.0, and the error is still there. it turns
out I had made a fix long ago that avoided this situation entirely. I
have now filed a new bug report, on which you have been copied.

#:Erik

Gareth McCaughan

Mar 29, 1999, 3:00:00 AM
Christopher Barry wrote:

> I have a file named "Ezekial-13:18" in a bible-notes subdirectory of
> my home directory.

...


> USER(1): (directory "~/**/*.lisp")
> Error: host "Ezekial-13" not found in (sys:hosts.cl)
> [condition type: SIMPLE-ERROR]

Maybe it just noticed that you misspelled "Ezekiel" and thought
it should warn you. :-)

--
Gareth McCaughan Dept. of Pure Mathematics & Mathematical Statistics,
gj...@dpmms.cam.ac.uk Cambridge University, England.

Christopher R. Barry

Mar 29, 1999, 3:00:00 AM
Gareth McCaughan <gj...@dpmms.cam.ac.uk> writes:

> Maybe it just noticed that you misspelled "Ezekiel" and thought
> it should warn you. :-)

Yeah.

Christopher
