Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
Invoking a Lisp compiler
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  Messages 1 - 25 of 39 - Collapse all  -  Translate all to Translated (View all originals)   Newer >
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Matthew Economou  
View profile  
 More options Jan 21 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Matthew Economou <xenop...@irtnog.org>
Date: 1999/01/21
Subject: [historical] Invoking a Lisp compiler
Can anyone tell me (or point me to the relevant history) how the Lisp
compiler has traditionally been invoked?  I've found some
documentation on the internals of a Multics Lisp compiler called
"lcp", but no manual page.  The modern Lisps I've run across
(e.g. Allegro CL, GCL, Chez Scheme) don't have a "batch" mode, per se
(e.g., a non-interactive program like "cc").

--
"When I was a kid, I used to think that Dammit was God's last name, just like
Christ is Jesus' last name." - Kimberly Chapman in rec.humor.oracle.d


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erik Naggum  
View profile  
 More options Jan 22 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.no>
Date: 1999/01/22
Subject: Re: [historical] Invoking a Lisp compiler
* Matthew Economou <xenop...@irtnog.org>
| Can anyone tell me (or point me to the relevant history) how the Lisp
| compiler has traditionally been invoked?  I've found some documentation
| on the internals of a Multics Lisp compiler called "lcp", but no manual
| page.  The modern Lisps I've run across (e.g. Allegro CL, GCL, Chez
| Scheme) don't have a "batch" mode, per se (e.g., a non-interactive
| program like "cc").

  well, I'm not too sure about this part of Lisp lore, but since the
  function is COMPILE-FILE, I'd assume that it's called much like any other
  function in the listener that effectively replaces the shell command line.

  other than that, there is indeed a batch mode in Allegro CL.  e.g., I
  build a new Allegro CL image with this Makefile:

allegro.dxl: allegro libacl503.so customization.fasl make-allegro.cl update code
        ./allegro -batch -e '(load "make-allegro.cl")' -kill
        mv --force --backup --version-control=t new-lisp.dxl allegro.dxl

allegro:
        ln -s lisp allegro
        cp -p lisp.dxl allegro.dxl

customization.fasl:     allegro ../custom/*.cl
        ./allegro -batch -e '(load "../custom/customization.cl")' -kill

  and customization.cl goes like this:

(defsystem :customization
    (:pretty-name "Customizations for Naggum Software"
     :default-pathname #p"/home/franz/custom/")
  (:serial "readtable"
           "custom"
           "naggum-software"
           "defsys-fixes"
           "excl-fixes"
           "tpl-fixes"
           "format-fixes"
           "filesys-fixes"
           "socket-fixes"))

(load-system :customization :silent t :compile t :no-warn t)
(concatenate-system :customization "customization.fasl")

  incidentally, I forget why I did LOAD-SYSTEM instead of COMPILE-SYSTEM.
  there is a reason to it.

#:Erik
--
  SIGTHTBABW: a signal sent from Unix to its programmers at random
  intervals to make them remember that There Has To Be A Better Way.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Matthew X. Economou  
View profile  
 More options Jan 22 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: "Matthew X. Economou" <xenop...@irtnog.org>
Date: 1999/01/22
Subject: Re: [historical] Invoking a Lisp compiler

>>>>> "EN" == Erik Naggum <e...@naggum.no> writes:

    EN>   well, I'm not too sure about this part of Lisp lore, but
    EN> since the function is COMPILE-FILE, I'd assume that it's
    EN> called much like any other function in the listener that
    EN> effectively replaces the shell command line.

Would it be sensible to have a Lisp compiler that could be invoked
similarly to e.g. CC under UNIX?  I would suppose that running the
command "lcp somefile.lsp" would be semantically equivalent to the
(COMPILE-FILE "./somefile.lsp"), in this case.

    EN>   other than that, there is indeed a batch mode in Allegro CL.
    EN> e.g., I build a new Allegro CL image with this Makefile:

This prompts more questions related to compilers (and the external
environment).

EMACS uses the term "Lisp image" when describing the last stage in its
build process.  The build process, in this final step, starts up the
EMACS Lisp engine (src/temacs) and loads all of the various editing
functions into memory.  It then forces a core dump of this running
system, and the core file is then "undumped" (their terminology) into
a runnable executable (src/emacs).

Is this (effectively) what "build a new Allegro CL image" means, or
does Allegro (and, I'm assuming, other modern Lisp implementations) do
something more like the typical C or FORTRAN compiler?

(I looked up "Lisp image" in the HyperSpec and it doesn't seem to
specify this kind of build model.  This is probably a good thing.)

The related question is, can one generate DLLs and whatnot with
Allegro (and/or other modern Lisp implementations)?  Can one call DLLs
from within Allegro, and does spec say anything about this?  I'm
curious, because every use I've put Lisp to (AI, a Scheme
meta-interpreter for a class on programming language concepts, and
EMACS) invokes this "self-containedness" property (i.e. one doesn't
really call upon external libraries, just uses already available
facilites), whereas languages like ADA, Perl, C, FORTRAN, and Pascal
seem to be very good at mixing code (especially on platforms like VMS
and UNIX).

(The reason I'm asking is that I noticed a Java front end in the EGCS
code repository.  My assumption, possibly incorrect, is that the Java
front end generates the usual output that can be fed to standard tools
like as, ar, and ld, just like GCC and G++.  If Java -- a (in my mind)
primarily interpreted language, but with features like just-in-time
compilation and byte compilation -- if Java can be compiled into
object files, then linked into programs or libraries just like any C
or C++ program, why couldn't the same be done with Lisp or Scheme?

I seem to be having a difficult time picturing such a Lisp compiler
though; all my experience with Lisp has been with interpreted
environments like EMACS, GCL, or Chez Scheme.  _Lisp in Small Pieces_
arrived in the mail today, so I hope to clear some of my confusion
with Lisp compilers up.)

--
'Fed up with excitement?  Bored with fun?  Too many stimuli?  Let waves of
boredom wash over you with "The Incredibly Dull Cam".'
               -- (pointer at http://sg1.pcy.kcl.ac.uk/~foop/dullat.html)


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erik Naggum  
View profile  
 More options Jan 23 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.no>
Date: 1999/01/23
Subject: Re: [historical] Invoking a Lisp compiler
* "Matthew X. Economou" <xenop...@irtnog.org>
| Would it be sensible to have a Lisp compiler that could be invoked
| similarly to e.g. CC under UNIX?

  no, because the results aren't "linked" and "executed" from the
  command-line, except, of course, in a similar "batch" mode.

  note, however, that the command-line is really an interpreter and calls
  out to functions that are loaded from disk every time.  usually, this
  interpreter has some strange syntax for sub-evaluation and has only a few
  simple control structures, cannot use the value from the functions it has
  called except through arcane syntax, and usually you have to deal with
  their "output" through I/O, not actually any values per se.  re-using the
  output of a function may also need a lot of work because it has been
  optimized for people, not for re-use.  I prefer the REPL in Lisp...

| EMACS uses the term "Lisp image" when describing the last stage in its
| build process.  The build process, in this final step, starts up the
| EMACS Lisp engine (src/temacs) and loads all of the various editing
| functions into memory.  It then forces a core dump of this running
| system, and the core file is then "undumped" (their terminology) into a
| runnable executable (src/emacs).

  that last past is wrong.  the executable is not built from a core dump.
  by that time, too much is actually lost to be useful in building an
  executable on most operating systems.  instead, the information in memory
  is written out in a way and an order that causes it to be loaded into
  memory the same way.  basically, the process is that of preparing to make
  the operating system re-create the conditions that exist in the image at
  dump time.

| Is this (effectively) what "build a new Allegro CL image" means, or does
| Allegro (and, I'm assuming, other modern Lisp implementations) do
| something more like the typical C or FORTRAN compiler?

  no, they do basically the same thing: load it into memory, then write
  memory to disk in a way that causes it to be loaded in the same way.
  note that a C program doesn't necessarily know what it will look like in
  memory until actually loaded.  there are lots of "loose ends" that the
  operating system and the dynamic object loader take care of.

| The related question is, can one generate DLLs and whatnot with Allegro
| (and/or other modern Lisp implementations)?

  yes.  however, "DLL" is a Windows thing.  it is as yet not possible for
  end users to build shared objects or shared libraries, the Unix way.

| Can one call DLLs from within Allegro, and does spec say anything about
| this?

  which spec?  the documentation for Allegro CL details a foreign interface
  that I believe is fully capable of using any DLL the system provides.

| I'm curious, because every use I've put Lisp to (...) invokes this "self-
| containedness" property (...), whereas languages like ADA, Perl, C,
| FORTRAN, and Pascal seem to be very good at mixing code (...).

  Common Lisp implementations are better at this than most other languages
  -- they have to by virtue of being in a minority.  however, Lisp has a
  much different concept of what constitutes "interface" than C does, and
  the C calling conventions are sorely lacking in the needs that Lisp has.
  also, of course, statically typed languages discard type information by
  execution time, and Lisp keeps it around in the pointers and such.

  calling out to foreign functions is no big deal, you only need to declare
  them properly, but that is no news to people from the languages you need
  to talk to.  if the language is any good (and most aren't :), you should
  be able to generate the interface code in Lisp automatically.  however,
  there is often insufficient information on how the run-time system calls
  its own functions that it is possible to do this without actually
  compiling some interface functions and go through an expensive interface
  that has known properties.

| I seem to be having a difficult time picturing such a Lisp compiler
| though; all my experience with Lisp has been with interpreted
| environments like EMACS, GCL, or Chez Scheme.

  Emacs and GCL have compilers and you really need them.  don't know about
  Chez Scheme, but from what I have read, it does indeed compile files and
  all that stuff.

  how do you picture a C compiler?  would it be harder if you could type in
  C code to an interactive environment and see results immediately?

  BTW, I think you confuse "interpreted" with "interactive".  you really
  don't know what happens when ypu have typed in an expression.  for all
  you know or care, the system doesn't do (loop (print (eval (read)))), but
  (loop (print (funcall (compile nil `(lambda () ,(read))))))

#:Erik
--
  SIGTHTBABW: a signal sent from Unix to its programmers at random
  intervals to make them remember that There Has To Be A Better Way.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Rainer Joswig  
View profile  
 More options Jan 23 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: jos...@lavielle.com (Rainer Joswig)
Date: 1999/01/23
Subject: Re: [historical] Invoking a Lisp compiler
In article <w4og192r3j3....@nemesis.irtnog.org>, "Matthew X. Economou" <xenop...@irtnog.org> wrote:

> The related question is, can one generate DLLs and whatnot with
> Allegro (and/or other modern Lisp implementations)?

Lispworks for Windows supports compiling to DDL.

> I seem to be having a difficult time picturing such a Lisp compiler
> though; all my experience with Lisp has been with interpreted
> environments like EMACS, GCL, or Chez Scheme.

Hmm, hasn't GCL a compiler (compiles to C). Hmm, hasn't
Chez Scheme a compiler, too.

--
http://www.lavielle.com/~joswig


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Martin Rodgers  
View profile  
 More options Jan 23 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: m...@wildcard.butterfly.demon.co.uk (Martin Rodgers)
Date: 1999/01/23
Subject: Re: [historical] Invoking a Lisp compiler
In article <w4og192r3j3....@nemesis.irtnog.org>, xenop...@irtnog.org
says...

> Would it be sensible to have a Lisp compiler that could be invoked
> similarly to e.g. CC under UNIX?  I would suppose that running the
> command "lcp somefile.lsp" would be semantically equivalent to the
> (COMPILE-FILE "./somefile.lsp"), in this case.

You can do this with CLISP. Here's an example:

all : nodes.fas

nodes.fas : nodes.lisp
    clisp -c nodes.lisp

The clisp script invokes CLISP for me.
--
Remove insect from address to email me | You can never browse enough
     will write code that writes code that writes code for food


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erik Naggum  
View profile  
 More options Jan 23 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.no>
Date: 1999/01/23
Subject: Re: [historical] Invoking a Lisp compiler
* "Matthew X. Economou" <xenop...@irtnog.org>
| Would it be sensible to have a Lisp compiler that could be invoked
| similarly to e.g. CC under UNIX?

  just to update my previous response.  Allegro CL 5.0 has more command-
  line "punch" than I related in the previous article.  I wrote the build
  stuff for 4.3 and never really took the time to update them.  Allegro CL
  5.0 takes some new command-line options, listed here with the old
  equivalent option, minus the necessary quoting the shell needs.

-C file     -e (compile-file file)
-L file     -e (load file)

  some probably find this a lot easier to deal with, especially from
  scripts.

#:Erik
--
  SIGTHTBABW: a signal sent from Unix to its programmers at random
  intervals to make them remember that There Has To Be A Better Way.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Raffael Cavallaro  
View profile  
 More options Jan 24 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: raff...@mediaone.net (Raffael Cavallaro)
Date: 1999/01/24
Subject: Re: [historical] Invoking a Lisp compiler

In article <3126088259895...@naggum.no>, Erik Naggum <e...@naggum.no> wrote:
> BTW, I think you confuse "interpreted" with "interactive".  you really
>  don't know what happens when ypu have typed in an expression.  for all
>  you know or care, the system doesn't do (loop (print (eval (read)))), but
>  (loop (print (funcall (compile nil `(lambda () ,(read))))))

Exactly. For example, MCL does this and only this (i.e., there is no
interpreter, just an incremental compiler fast enough to be
indistinguishable from an interpreter). Everything the user enters
interactively (at the top-level prompt) is compiled to native PPC code
immediately.

raf

--
Raffael Cavallaro


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Rob Warnock  
View profile  
 More options Jan 25 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: r...@rigden.engr.sgi.com (Rob Warnock)
Date: 1999/01/25
Subject: Re: [historical] Invoking a Lisp compiler
Erik Naggum  <e...@naggum.no> wrote:
+---------------
| | The related question is, can one generate DLLs and whatnot with Allegro
| | (and/or other modern Lisp implementations)?
|
|   yes.  however, "DLL" is a Windows thing.  it is as yet not possible for
|   end users to build shared objects or shared libraries, the Unix way.
+---------------

Actually, the latest version of MzScheme provides a compiler-to-C ["mzc"]
that allows compiling Scheme code into a MzScheme "extension" which in Unix
is a ".so" file (a.k.a. DSO). Such files can later be linked into a running
MzScheme session with the [non-standard] call (load-extension "foo.so").

-Rob

-----
Rob Warnock, 8L-855             r...@sgi.com
Applied Networking              http://reality.sgi.com/rpw3/
Silicon Graphics, Inc.          Phone: 650-933-1673
2011 N. Shoreline Blvd.         FAX: 650-964-0811
Mountain View, CA  94043        PP-ASEL-IA


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erik Naggum  
View profile  
 More options Jan 25 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.no>
Date: 1999/01/25
Subject: Re: [historical] Invoking a Lisp compiler
* r...@rigden.engr.sgi.com (Rob Warnock)
| Actually, the latest version of MzScheme provides a compiler-to-C ["mzc"]
| that allows compiling Scheme code into a MzScheme "extension" which in Unix
| is a ".so" file (a.k.a. DSO). Such files can later be linked into a running
| MzScheme session with the [non-standard] call (load-extension "foo.so").

  the question was about Allegro CL, not about MzScheme.

  personally, I don't think "Scheme" is the answer to _any_ Common Lisp
  question to begin with, but may just be me.

#:Erik
--
  SIGTHTBABW: a signal sent from Unix to its programmers at random
  intervals to make them remember that There Has To Be A Better Way.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Matthew X. Economou  
View profile  
 More options Jan 25 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: "Matthew X. Economou" <xenop...@irtnog.org>
Date: 1999/01/25
Subject: Re: [historical] Invoking a Lisp compiler

>>>>> "RW" == Rob Warnock <r...@rigden.engr.sgi.com> writes:

    RW> Actually, the latest version of MzScheme provides a
    RW> compiler-to-C ["mzc"] that allows compiling Scheme code into a
    RW> MzScheme "extension" which in Unix is a ".so" file
    RW> (a.k.a. DSO). Such files can later be linked into a running
    RW> MzScheme session with the [non-standard] call (load-extension
    RW> "foo.so").

Can you point me to some sources?  I'd like to take a look at how it
handles linking and the runtime.

--
"When I was a kid, I used to think that Dammit was God's last name, just like
Christ is Jesus' last name." - Kimberly Chapman in rec.humor.oracle.d


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Matthew X. Economou  
View profile  
 More options Jan 25 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: "Matthew X. Economou" <xenop...@irtnog.org>
Date: 1999/01/25
Subject: Re: [historical] Invoking a Lisp compiler

>>>>> "EN" == Erik Naggum <e...@naggum.no> writes:

    EN>   the question was about Allegro CL, not about MzScheme.

Actually, my original question included Chez Scheme.  :)

    EN>   personally, I don't think "Scheme" is the answer to _any_
    EN> Common Lisp question to begin with, but may just be me.

Note to self: Never, ever, get on Erik's bad side.

--
"When I was a kid, I used to think that Dammit was God's last name, just like
Christ is Jesus' last name." - Kimberly Chapman in rec.humor.oracle.d


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erik Naggum  
View profile  
 More options Jan 26 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.no>
Date: 1999/01/26
Subject: Re: [historical] Invoking a Lisp compiler
* "Matthew X. Economou" <xenop...@irtnog.org>
| Note to self: Never, ever, get on Erik's bad side.

  is it not just great, then, that I tend to forget _who_ did something
  that annoyed me?  :)

#:Erik
--
  SIGTHTBABW: a signal sent from Unix to its programmers at random
  intervals to make them remember that There Has To Be A Better Way.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Rob Warnock  
View profile  
 More options Jan 26 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: r...@rigden.engr.sgi.com (Rob Warnock)
Date: 1999/01/26
Subject: Re: [historical] Invoking a Lisp compiler
Matthew X. Economou <xenop...@irtnog.org> wrote:
+---------------
| RW> Actually, the latest version of MzScheme provides a
| RW> compiler-to-C ["mzc"] that allows compiling Scheme code into a
| RW> MzScheme "extension" which in Unix is a ".so" file ...
|
| Can you point me to some sources?  I'd like to take a look at how it
| handles linking and the runtime.
+---------------

Start here:

        http://www.cs.rice.edu/CS/PLT/packages/mzscheme/

-Rob

-----
Rob Warnock, 8L-855             r...@sgi.com
Applied Networking              http://reality.sgi.com/rpw3/
Silicon Graphics, Inc.          Phone: 650-933-1673
2011 N. Shoreline Blvd.         FAX: 650-964-0811
Mountain View, CA  94043        PP-ASEL-IA


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Barry Margolin  
View profile  
 More options Jan 26 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Barry Margolin <bar...@bbnplanet.com>
Date: 1999/01/26
Subject: Re: [historical] Invoking a Lisp compiler
In article <3126305073280...@naggum.no>, Erik Naggum  <e...@naggum.no> wrote:

>* "Matthew X. Economou" <xenop...@irtnog.org>
>| Note to self: Never, ever, get on Erik's bad side.

>  is it not just great, then, that I tend to forget _who_ did something
>  that annoyed me?  :)

I'll bet you still remember me. :)

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Erik Naggum  
View profile  
 More options Jan 26 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Erik Naggum <e...@naggum.no>
Date: 1999/01/26
Subject: Re: [historical] Invoking a Lisp compiler
* Barry Margolin <bar...@bbnplanet.com>
| I'll bet you still remember me. :)

  *laugh*  yeah, I do. :)

#:Erik
--
  SIGTHTBABW: a signal sent from Unix to its programmers at random
  intervals to make them remember that There Has To Be A Better Way.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Discussion subject changed to "fuzzy string matcher in Lisp" by Dieter Menszner
Dieter Menszner  
View profile  
 More options Jan 28 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: mensz...@t-online.de (Dieter Menszner)
Date: 1999/01/28
Subject: fuzzy string matcher in Lisp

I am looking for a Lisp function which does a fuzzy string match
e.g.

   (fuzzy-match "document" "documet")
  => t

Its purpose is to be typing error tolerant.

Does anybody know of some downloadable source code ?

  Dieter Menszner


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Sunil Mishra  
View profile  
 More options Jan 28 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Sunil Mishra <smis...@whizzy.cc.gatech.edu>
Date: 1999/01/28
Subject: Re: fuzzy string matcher in Lisp

mensz...@t-online.de (Dieter Menszner) writes:
> I am looking for a Lisp function which does a fuzzy string match
> e.g.

>    (fuzzy-match "document" "documet")
>   => t

> Its purpose is to be typing error tolerant.

> Does anybody know of some downloadable source code ?

>   Dieter Menszner

I've written a partial set of routines for doing string search that you
could potentially adapt to do string matching. In your case, you seem to
want to handle substitution, deletion and insertion errors. This is what I
had done for string search allowing for these errors:

(defun smvs-match-first-item (pattern text evaluator key shifts errors)
  (loop with pattern-item = (funcall key (aref pattern 0))
        for text-index from 0 below (length text)
        for text-item = (funcall key (aref text text-index))
        do (setf (aref shifts text-index) text-index)
           (setf (aref errors text-index) (funcall evaluator pattern-item text-item))))

(defun smvs-match-next-item (pattern text evaluator key error-limit
                                     shifts last-shifts errors last-errors pattern-index)
  (flet ((eval-delete-both (pre-delete-error pattern-item text-item)
           (if (> pre-delete-error error-limit)
             pre-delete-error
             (+ pre-delete-error (funcall evaluator pattern-item text-item))))
         (save-match-result (shift error text-index)
           (setf (aref shifts text-index) shift)
           (setf (aref errors text-index) error)))
    (declare (inline eval-delete-both save-match-result))
    (loop with pattern-item = (funcall key (aref pattern pattern-index))
          for text-index from 1 below (length text)
          for text-item = (funcall key (aref text text-index))
          for delete-text-shift = (aref shifts (1- text-index))
          and delete-text-error = (1+ (aref errors (1- text-index)))
          and delete-pattern-shift = (aref last-shifts text-index)
          and delete-pattern-error = (1+ (aref last-errors text-index))
          and delete-both-shift = (aref last-shifts (1- text-index))
          and delete-both-error = (eval-delete-both (aref last-errors (1- text-index))
                                                    pattern-item text-item)
          do (if (> delete-text-error delete-pattern-error)
               (if (> delete-pattern-error delete-both-error)
                 (save-match-result delete-both-shift delete-both-error text-index)
                 (save-match-result delete-pattern-shift delete-pattern-error text-index))
               (if (> delete-text-error delete-both-error)
                 (save-match-result delete-both-shift delete-both-error text-index)
                 (save-match-result delete-text-shift delete-text-error text-index))))))

(defun string-match-variable-size (pattern text handler evaluator key error-limit)
  ;; Using dynamic programming
  ;; I don't have to keep around the entire substring match matrix, since
  ;; the match only depends on the current and the last row. This allows
  ;; for an enormous reduction in memory use for large patterns.
  (let* ((pattern-length (length pattern))
         (text-length (length text))
         (shifts (make-array text-length :element-type 'fixnum))
         (last-shifts (make-array text-length :element-type 'fixnum))
         (errors (make-array text-length))
         (last-errors (make-array text-length)))
    ;; Ignore first column
    (setf (aref last-errors 0) (1+ error-limit))
    ;; Initialization - match first pattern item
    (smvs-match-first-item pattern text evaluator key shifts errors)
    ;; Expansion - build on first match
    (loop for pattern-index from 1 below pattern-length
          do (psetq shifts last-shifts
                    last-shifts shifts
                    errors last-errors
                    last-errors errors)
             (setf (aref errors 0) (1+ error-limit))
             (smvs-match-next-item pattern text evaluator key error-limit
                                   shifts last-shifts errors last-errors pattern-index))
    ;; Collection - call handler
    (loop with last-shift = -1
          for text-index from  (if (> pattern-length 1) 1 0) below text-length
          for shift = (aref shifts text-index)
          when (and (<= (aref errors text-index) error-limit) (> shift last-shift))
            do (setq last-shift shift)
               (funcall handler pattern text shift (1+ text-index)))))


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Kent M Pitman  
View profile  
 More options Jan 28 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Kent M Pitman <pit...@world.std.com>
Date: 1999/01/28
Subject: Re: fuzzy string matcher in Lisp

mensz...@t-online.de (Dieter Menszner) writes:
> I am looking for a Lisp function which does a fuzzy string match
> e.g.

>    (fuzzy-match "document" "documet")
>   => t

> Its purpose is to be typing error tolerant.

Excellent goal.  I wish more people cared about such things.

> Does anybody know of some downloadable source code ?

I don't.  However...

A long time ago (around 1979) I had this problem to solve.  I wrote a
trivial program that scored matches by counting 1 point for every char
in the right position and 0.5 for every char that was off by one and
that scaled the result by the differences in the lengths.  so, for
example,

  d   o   c   u   m   e   n   t  
  d   o   c   u   m   e   t  

  1   1   1   1   1   1  0.5 = 6.5 out of possible 8 match chars

 So (6.5/8 match chars) * (7/8 match length) = 71.1% match total

or

  d   o   c   u   m   e   n   t

  d   o   c   u   m   n   e   t

  1   1   1   1   1  0.5 0.5  1 = 7/8 match chars

 * 8/8 match length = 87.5% match

I found that you rarely get more than 1 match higher than 0.5
in a small finite set that is relatively varied in nature.
That is, if you're trying to match one of
 "document" or "frog" or "apple"
If you have a lot of words that are the same, you may need to
sort by bestness when you get scores for each.

In case anyone is curious, I used the algorithm for allowing a user
interface to a game I'd written to do things like:

 PROGRAM:  Is your animal a frgo?

 USER: You misspelled frog.

 PROGRAM: Ok.  Is your animal a frog?

The program would match "frog" against each of the words in the
previous query and guess that "frog" was replacing "frgo" because
it had a good "closeness" to the given word according to the
metric that I've described above.

I later also generalized it to work with PDP10 Macsyma input so
you could get an interaction like:

 USER MIS-TYPES:  intergate(sin(x),x);

 SYSTEM DOESN'T UNDERSTAND "intergate" (should have been "integrate")

 USER PRESSES CONTROL-BACKSLASH

 SYSTEM RESPONDS: What did you misspell?

 USER TYPES:   integrate;

 SYSTEM RETRIES "integrate(sin(x),x);"

I note as an aside that Zork also had a command interface that
allowed you to say "oops <word>" when you misspelled a word.
I don't know what algorithm they used to match which word to
replace, but I strongly approve of the interface for not having
required you to mention both the old and new word.
A technique of this kinds is nice because it minimizes
gratuitous syntax that the user has to type.  No reasonable
person would ask "which word should be spelled `frog'"
and so, I claim, one of the problems with the programs most
people write is that they think it's reasonable for a program
to either ask that question or to ask "what word should I
replace?"  

I don't have code.  But if the problem is that you just
didn't have a theory of how to do the match at all, then
maybe the above gives you enough clue of how you might
do it.  Then again, my approach was entirely ad hoc
and you miht be able to think of another way of doing it
better.  Or someone might have some formal theory of
what it means to do an "exact" fuzzy match (if that's
not an oxymoron)...  in which case, the above might
be useles to you.  But it's the best I had to offer
since I'm too lazy to actually write the extremely
straightforward code to implement it today...
Good luck...
 --Kent


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
David B. Lamkins  
View profile  
 More options Jan 28 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: "David B. Lamkins" <dlamk...@teleport.com>
Date: 1999/01/28
Subject: Re: fuzzy string matcher in Lisp
In article <79hoaKzh...@0796153227-1000.dialin.t-online.de> ,

mensz...@t-online.de (Dieter Menszner) wrote:

> I am looking for a Lisp function which does a fuzzy string match
> e.g.

>    (fuzzy-match "document" "documet")
>   => t

> Its purpose is to be typing error tolerant.

> Does anybody know of some downloadable source code ?

There was an interesting article about computing the "Levenshtein distance"
between two strings presented by Ray Valdez in Dr. Dobbs Journal almost
seven years ago.

I had written a Pascal implementation to try it out, and I still have that
code in my archives.  Here's the header comment, which has all the source
references:

{Compute the Levenshtein distance between a pair of strings.}
{Adapted from C code presented in "Finding String Distances",}
{Ray Valdés, Dr. Dobb¹s Journal, April 1992, ppg. 56‹ 62, 107.}
{Algorithm due to V.I. Levenshtein, as presented in "Time Warps,}
{String Edits, and MacroMolecules: The Theory and Practice of}
{Sequence Comparison", Sankoff and Kruskal, eds., Addison­Wesley,}
{1983. Macintosh implementation for THINK Pascal by D.B.Lamkins.}

The API has a call to initialize the algorithm using four values to specify
the cost of matching, insertion, deletion, and transposition.  Then you call
the comparison function with a pair of strings, and the algorithm returns a
distance based upon the total cost of the edits required to transform one
string into another (and, optionally, a list of the edit operations needed.)

You could get your fuzzy string match by computing the Levenshtein distance
and comparing the result to some threshold.

I thought I had translated the algorithm into Common Lisp at one point, but
I can't find it now.  I'd be happy to send or post the source if someone
wants to take a shot at the translation, or you could check out the original
sources referenced above.

--
David B. Lamkins <http://www.teleport.com/~dlamkins/>

There are many ways to abbreviate something, but only one way not to.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Will Hartung  
View profile  
 More options Jan 29 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: "Will Hartung" <Will.Hart...@msoft.com>
Date: 1999/01/29
Subject: Re: fuzzy string matcher in Lisp

>and you >miht< be able to think of another way of doing it

  m   i   g   h   t
  m   i   h   t
  1   1 0.5 0.5
(3/5) * (4/5) = .48

*FAIL*

"I'm sorry, I don't understand 'miht'" :-)

Not to super relevant, and I may be mistaken, but you may be able to find
suitable code in the Squeak Smalltalk sources code.

I haven't played with it recently, but the original Smalltalk has a
"spelling corrector" when you mispelled variables and what not. When it
detected a misspelling, it popped up a list of possible matches.

Assuming that Squeak still has the functionality, then the source would be
in the distribution. Yeah, it's in Smalltalk, but its better than nothing.

You can find Squeak at http://www.create.ucsb.edu/squeak/

It could be a wild goose chase, but may be worth looking into. (Or for the
lazy, get on their mailing list or over on c.l.smalltalk and see if someone
can strip out the pertinent code.)

> But it's the best I had to offer
>since I'm too lazy to actually write the extremely
>straightforward code to implement it today...
>Good luck...
> --Kent

What he said.

 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
paul_rudin  
View profile  
 More options Jan 29 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: paul_ru...@scientia.com
Date: 1999/01/29
Subject: Re: fuzzy string matcher in Lisp
In article <79hoaKzh...@0796153227-1000.dialin.t-online.de>,
  mensz...@t-online.de (Dieter Menszner) wrote:

> I am looking for a Lisp function which does a fuzzy string match
> e.g.

>    (fuzzy-match "document" "documet")
>   => t

> Its purpose is to be typing error tolerant.

> Does anybody know of some downloadable source code ?

Gnus, the emacs newsreader, does some fuzzy matching of this kind for the
purposes of analysing news headers. This is written in elisp and is available
almost anywhere (and presumably subject to the GPL). It may be of some help to
you.

-----------== Posted via Deja News, The Discussion Network ==----------
http://www.dejanews.com/       Search, Read, Discuss, or Start Your Own    


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Robert Bechtel  
View profile  
 More options Jan 29 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Robert Bechtel <rbech...@home.com>
Date: 1999/01/29
Subject: Re: fuzzy string matcher in Lisp
Interlisp.  Spelling correction as part of the DWIM (Do What I Mean)
package.  Doubt that source is available, but if you can track down a
manual, there is a description of the algorithm used, as well as
descriptions of the functions that make up the spelling corrector.

While there's a lot more to it, the first paragraph of section 15.7.4 of
the October 1983 _Interlisp_Reference_Manual_ by Xerox says:

"The basic philosophy of DWIM spelling correction is to count the number
of disagreements between two words, and use this number divided by the
length of the longer of the two words as a measure of their relative
disagreement.  One minus this number is then the relative agreement or
closeness.  For example, CONS and CONX differ only in their last
character.  Such substitution errors count as one disagreement, so that
the two words are in 75% agreement.  Most calls to the spelling
corrector specify a relative agreement of 70, so that a single
substitution error is permitted in words of four characters of longer.
However, spelling correction on shorter words is possible since certain
types of differences such as single transpositions are not counted as
disagreements.  For example, AND and NAD have a relative agreement of
100.  Calls to the spelling corrector from DWIM use the value of
FIXSPELLREL, which is initially 70.  Note that by setting FIXSPELLREL to
100, only spelling corrections with "zero" mistakes, will be considered,
e.g., transpositions, double characters, etc."

There are several other well-thought-out variations (considering shifts,
keyboard locality, etc.), at least if you're typing "English Lisp"
("American Lisp"?).

Grumble.  Makes me remember all the other neat stuff that didn't make
the Common Lisp merge.

bob bechtel


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Kent M Pitman  
View profile  
 More options Jan 29 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Kent M Pitman <pit...@world.std.com>
Date: 1999/01/29
Subject: Re: fuzzy string matcher in Lisp

"Will Hartung" <Will.Hart...@msoft.com> writes:
> >and you >miht< be able to think of another way of doing it
>   m   i   g   h   t
>   m   i   h   t
>   1   1 0.5 0.5
> (3/5) * (4/5) = .48
> *FAIL*
> "I'm sorry, I don't understand 'miht'" :-)

heh.  yeah, it does degrade a bit for short words.  in fairness,
though, a typo makes more difference in a short word than in a
long one.  but your point is well-taken.

btw, i didn't just suddenly become a worse speller a couple months
ago.  rather, i switched to using a PC with a really annoying keypress
sensitivity that loses a lot of chars that other keyboards don't.  i
often end up finding i've lost letters on this keyboard... sometimes i
fix them, but sometimes it's too much trouble.  it's amazing to me
that with all the world relying so heavily on keyboards there isn't
more variety available in order to assure better quality.


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Philip Lijnzaad  
View profile  
 More options Jan 29 1999, 3:00 am
Newsgroups: comp.lang.lisp
From: Philip Lijnzaad <lijnz...@ebi.ac.uk>
Date: 1999/01/29
Subject: Re: fuzzy string matcher in Lisp
On Fri, 29 Jan 1999 03:56:49 GMT,
"Kent" == Kent M Pitman <pit...@world.std.com> writes:

Kent> ago.  rather, i switched to using a PC with a really annoying keypress
Kent> sensitivity that loses a lot of chars that other keyboards don't.  

Seeing that you use Gnus as a news reader, you could consider using emacs
flyspell-mode, which does on-the-fly spell checking (highlighting typo's and
repeated words), and works like a dream. It's built in to emacs 20.3 and
higher, but I suspect it will work on emacs 19.34.

                                                                      Philip
--
erh ... the laziness you wield is the patience you yield, sort of thing %-/
--------------------------------------------------------------------------- --
Philip Lijnzaad, lijnz...@ebi.ac.uk | European Bioinformatics Institute
+44 (0)1223 49 4639                 | Wellcome Trust Genome Campus, Hinxton
+44 (0)1223 49 4468 (fax)           | Cambridgeshire CB10 1SD,  GREAT BRITAIN
PGP fingerprint: E1 03 BF 80 94 61 B6 FC  50 3D 1F 64 40 75 FB 53


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Messages 1 - 25 of 39   Newer >
« Back to Discussions « Newer topic     Older topic »