Lisp problems (maybe emacs)

Charlie

Nov 18, 2002, 6:58:00 AM
When I do M-C-x (lisp-eval-defun) in emacs' normal lisp mode to the
following defun I get errors from both clisp and cmucl. I include the output
from cmucl because it looks more descriptive. When I copy+paste into either
interpreter or do a (load "") of the file I get no complaints and it all
works nicely. What's going on? (I do plan to switch to ILISP at some point,
but for now I'd like to keep using this.)
Thanks
Charlie

Here is the function and cmucl output:

(defun translate (x y z)
(make-array '(4 4) :initial-contents
`((1 0 0 ,x)
(0 1 0 ,y)
(0 0 1 ,z)
(0 0 0 1))))
;----------------------

Error in KERNEL::UNBOUND-SYMBOL-ERROR-HANDLER: the variable X is unbound.

Restarts:
0: [ABORT] Return to Top-Level.

Debug (type H for help)

(EVAL X)
Source: Error finding source:
Error in function DEBUG::GET-FILE-TOP-LEVEL-FORM: Source file no longer
exists:
target:code/eval.lisp.
0]

Daniel Barlow

Nov 18, 2002, 6:57:53 AM
"Charlie" <char...@zoom.co.uk> writes:

> When I do M-C-x (lisp-eval-defun) in emacs' normal lisp mode to the
> following defun I get errors from both clisp and cmucl. I include the output
> from cmucl because it looks more descriptive. When I copy+paste into either
> interpreter or do a (load "") of the file I get no complaints and it all
> works nicely. What's going on (I do plan to switch to ILISP at some point
> but for now I'd like to keep using this)
> Thanks
> Charlie
>
> Here is the function and cmucl output:
>
> (defun translate (x y z)
> (make-array '(4 4) :initial-contents

If your function actually is indented as shown, I think it's confusing
inferior-lisp.

Normally a defun starts when there is a char with open-parenthesis
syntax at the beginning of a line. If `defun-prompt-regexp' is
non-nil, then a string which matches that regexp may precede the
open-parenthesis, and point ends up at the beginning of the line.

(beginning-of-defun documentation)

so it's only sending the function body to the inferior lisp, instead
of the whole defun. I tried this with the simpler function

(defun badly-indented ()
(+ 1 2))

which does indeed cause cmucl to print ``3''.

Workaround: indent the function in the conventional fashion. This is
a good thing to do anyway.
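
To see why only the body gets sent, here is a minimal Common Lisp sketch of the rule in the documentation quoted above. It is not Emacs code: `guess-defun-start` and `*badly-indented*` are hypothetical names used purely for illustration, and the buffer is modelled as a plain list of lines.

```lisp
;; Hypothetical helper, not part of Emacs: mimic the rule
;; "an open paren in column 0 starts a defun" on a list of lines.
(defun guess-defun-start (lines current-line)
  "Return the index of the nearest line at or before CURRENT-LINE that
begins with an open parenthesis, or NIL if there is none."
  (loop for i from current-line downto 0
        for line = (nth i lines)
        when (and (plusp (length line))
                  (char= (char line 0) #\())
          return i))

;; The test function from this post, exactly as (not) indented.
(defparameter *badly-indented*
  '("(defun badly-indented ()"
    "(+ 1 2))"))

;; With the cursor on the second line the rule stops at that line,
;; so reading starts at the body rather than at the defun.
(guess-defun-start *badly-indented* 1)        ; => 1
(read-from-string (nth 1 *badly-indented*))   ; => (+ 1 2)
```

Indenting the second line by even one space makes the helper return 0 instead, which is the whole defun.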


-dan

--

http://ww.telent.net/cliki/ - Link farm for free CL-on-Unix resources

Charlie

Nov 18, 2002, 7:17:34 AM
Yep, that works. I feel stupid for asking such a simple question.
Thank-you.
Charlie

Tim Bradshaw

Nov 18, 2002, 8:41:58 AM
* charlieb wrote:
> Yep, that works. I feel stupid for asking such a simple question.

You shouldn't! Other than the indentation the function was actually
fine, which makes a change, and you'd clearly actually tried it (in
more than one implementation even) before asking. These are the kinds
of questions people *like*.

--tim

Charlie

Nov 18, 2002, 9:48:59 AM
You'd think I'd have tried indenting correctly before I went off and tried
it with all those different things! It's just frustrating is all.
Cheers
Charlie

"Tim Bradshaw" <t...@cley.com> wrote in message
news:ey37kfb...@cley.com...

Kurt B. Kaiser

Nov 18, 2002, 4:06:02 PM
Tim Bradshaw <t...@cley.com> writes:

> Other than the indentation the function was actually fine

Of course it's all those silly parens which make Lisp unambiguous :) :)

KBK

Tim Bradshaw

Nov 18, 2002, 5:49:31 PM
* Kurt B Kaiser wrote:
> Of course it's all those silly parens which make Lisp unambiguous :) :)

Obviated by annoying editors which don't actually look for a top-level
form but decide that the cheap trick of looking for an open paren at
the top level is enough. Which it is, just often enough so that when
it isn't it *really* hurts.

--tim

Charlie

Nov 19, 2002, 4:11:46 AM
You know I tried placing the cursor all over the place: before the defun, on
the defun, after the defun. Nothing worked. I'm not about to start insulting
emacs but you'd think an application written in lisp would have better
support for it without having to go the non-libre (for the moment) ilisp
route.

Charlie.

"Tim Bradshaw" <t...@cley.com> wrote in message

Marco Antoniotti

Nov 19, 2002, 9:52:06 AM

"Charlie" <char...@zoom.co.uk> writes:

> You know I tried placing the cursor all over the place: before the defun, on
> the defun, after the defun. Nothing worked. I'm not about to start insulting
> emacs but you'd think an application written in lisp would have better
> support for it without having to go the non-libre (for the moment) ilisp
> route.

ILISP is "non libre" according to the strict interpretations of the
word as it accrued meaning in the context of "Free Software".

However, ILISP is essentially "gratis"¹ and it is "very very very
easily redistributable", for a more pragmatic interpretation of the
ILISP license and of the difficulties that we (meaning the ILISP
developers) have, in deciding whether it should and could be released
under "Libre" licenses.

Losing the historical perspective on anything is always problematic.
In the case of ILISP, I had a conversation with RMS many many years
ago (1993/94 - I can dig it out I guess) about what steps were
necessary to make ILISP part of GNU Emacs. It turned out that too
many people had been involved in ILISP (no: I am not one of the
original developers) that it was too difficult to contact all of them
and have them sign papers assigning the rights to the FSF. Over the
years this situation has just gotten worse.

There is a discussion going on in the ILISP mailing lists in order to
decide what to do about it. This has been carried out by the
fantastic efforts of Kevin Rosenberg as a "good faith effort" to
contact as many people who contributed to ILISP as possible. Given
that we have conflicting goals (maintaining "a" "libre" status and
making ILISP usable by all the CL vendors), I do not know what the
outcome will be.

So, please. Before making blanket statements about the "liberty" of a
piece of software - and more specifically, of one that, IMHO, has been
beneficial to the community as a whole - try to understand what are
the difficulties involved and why a pragmatic solution may - in this
and other cases - be better.

Cheers

¹ Turns out that ILISP is "too gratis" to meet the newly reinterpreted
DFSGs.

--
Marco Antoniotti ========================================================
NYU Courant Bioinformatics Group tel. +1 - 212 - 998 3488
715 Broadway 10th Floor fax +1 - 212 - 995 4122
New York, NY 10003, USA http://bioinformatics.cat.nyu.edu
"Hello New York! We'll do what we can!"
Bill Murray in `Ghostbusters'.

Charlie

Nov 19, 2002, 10:46:21 AM
My apologies, I had no idea it was such a complex problem. I fully intend to
switch to Ilisp when I have time to get, install and learn it. I was just
annoyed that it wasn't included on the debian woody cds because they have
very strict rules about what they consider "free". I certainly didn't mean
to imply that ilisp is in some sense crippled by its licensing issues
because I have heard nothing but praise for it.

Charlie.

"Marco Antoniotti" <mar...@cs.nyu.edu> wrote in message
news:y6cof8l...@octagon.valis.nyu.edu...

Will Deakin

Nov 19, 2002, 10:44:14 AM
Charlie wrote:
> My apologies, I had no idea it was such a complex problem. I fully intend to
> switch to Ilisp when I have time to get, install and learn it. I was just
> annoyed that it wasn't included on the debian woody cds because they have
> very strict rules about what they consider "free".
If you run xemacs then the ilisp package is (fairly) up to date. (Let
me know if you have problems with it). Alternatively, Kevin Rosenberg is
doing sterling work packaging ilisp for emacs at which point apt is
your friend...

:)w

Charlie

Nov 19, 2002, 11:22:32 AM
Thank-you all. I'll let you know if I have any more problems.
Cheers
Charlie.

"Will Deakin" <aniso...@hotmail.com> wrote in message
news:ardm3v$9r3$1...@newsreaderg1.core.theplanet.net...

Daniel Barlow

Nov 19, 2002, 11:37:07 AM
Marco Antoniotti <mar...@cs.nyu.edu> writes:

> There is a discussion going on in the ILISP mailing lists in order to
> decide what to do about it. This has been carried out by the
> fantastic efforts of Kevin Rosenberg as a "good faith effort" to
> contact as many people who contributed to ILISP as possible. Given
> that we have conflicting goals (maintaining "a" "libre" status and
> making ILISP usable by all the CL vendors), I do not know what the
> outcome will be.

I'm on the ILISP mailing lists and don't recall having seen any
discussion of problems for the CL vendors that a switch to GPL
would entail. Which CL vendors distribute ILISP anyway, as a matter
of interest?

The "contacting all known contributors" issue is a biggie, I agree.
I applaud Kevin's work here.

> So, please. Before making blanket statements about the "liberty" of a
> piece of software - and more specifically, of one that, IMHO, has been
> beneficial to the community as a whole - try to understand what are
> the difficulties involved and why a pragmatic solution may - in this
> and other cases - be better.

I agree. In passing I note that ILISP is already available on the
CDROMs sold by the FSF, so if even _they_ can be pragmatic I think the
rest of us can probably use ILISP happily. ;-)

> ¹ Turns out that ILISP is "too gratis" to meet the newly reinterpreted
> DFSGs.

BTW, "newly reinterpreted" is not really accurate. There have been
discussions about this on debian-legal and the xemacs lists over two
years ago - in fact, one of the ILISP maintainers was involved ...

http://list-archive.xemacs.org/xemacs-beta/200005/msg00338.html

Marco Antoniotti

Nov 19, 2002, 12:35:49 PM

Daniel Barlow <d...@telent.net> writes:

> Marco Antoniotti <mar...@cs.nyu.edu> writes:
>
> > There is a discussion going on in the ILISP mailing lists in order to
> > decide what to do about it. This has been carried out by the
> > fantastic efforts of Kevin Rosenberg as a "good faith effort" to
> > contact as many people who contributed to ILISP as possible. Given
> > that we have conflicting goals (maintaining "a" "libre" status and
> > making ILISP usable by all the CL vendors), I do not know what the
> > outcome will be.
>
> I'm on the ILISP mailing lists and don't recall having seen any
> discussion of problems for the CL vendors that a switch to GPL
> would entail. Which CL vendors distribute ILISP anyway, as a matter
> of interest?

I am just putting my hands in front of me. The hope is to have the
vendors distribute ILISP of course.

> The "contacting all known contributors" issue is a biggie, I agree.
> I applaud Kevin's work here.
>

...

> > ¹ Turns out that ILISP is "too gratis" to meet the newly reinterpreted
> > DFSGs.
>
> BTW, "newly reinterpreted" is not really accurate. There have been
> discussions about this on debian-legal and the xemacs lists over two
> years ago - in fact, one of the ILISP maintainers was involved ...
>
> http://list-archive.xemacs.org/xemacs-beta/200005/msg00338.html

I sent that message CC:ing the xemacs mailing list. After that, we
did not change anything since the issue just faded. I do not follow
`debian-legal' and I became aware of these issues again only recently
thanks to Kevin.

Cheers

Gareth McCaughan

Nov 19, 2002, 7:15:15 PM
Marco Antoniotti wrote:

> There is a discussion going on in the ILISP mailing lists in order to
> decide what to do about it. This has been carried out by the
> fantastic efforts of Kevin Rosenberg as a "good faith effort" to
> contact as many people who contributed to ILISP as possible. Given
> that we have conflicting goals (maintaining "a" "libre" status and
> making ILISP usable by all the CL vendors), I do not know what the
> outcome will be.

Why is there a conflict between being "libre" and being
usable by all the CL vendors? I can see that there might
be a conflict between being *GPL* and being usable by
all the vendors, but the GPL is not the only "libre"
licence. (For instance, the BSD licence without the
"obnoxious advertising clause" is certainly "libre"
and it's hard to imagine what problem any vendor could
have with it.)

--
Gareth McCaughan Gareth.M...@pobox.com
.sig under construc

Charlie

Nov 20, 2002, 5:08:09 AM
I just got ilisp going. It was a lot easier than I expected both on win32 and
debian. The indentation problem is exactly the same though:

C-M-x anywhere in this function and the interpreter will return 3 and
possibly complain. Now where do I go to find out about/report a bug in
emacs?

(defun badly-indented ()
(+ 1 2))

Charlie.

"Will Deakin" <aniso...@hotmail.com> wrote in message
news:ardm3v$9r3$1...@newsreaderg1.core.theplanet.net...

Will Deakin

Nov 20, 2002, 7:41:56 AM
Charlie wrote:
> I just got ilisp going. It was a lot easier than I expected both on win32 and
> debian.
Cool.

> C-M-x anywhere in this function and the interpreter will return 3 and
> possibly complain. Now where do I go to find out about/report a bug in
> emacs?

Hmmm. I think this "feature" is known about but I could be wrong.
However, to report a bug with emacs try:
www.gnu.org/software/emacs/emacs.html#YouHelp

:)w

Marco Antoniotti

Nov 20, 2002, 10:36:40 AM

Gareth McCaughan <Gareth.M...@pobox.com> writes:

I am wary of potential conflicts. As you pointed out, making a
package GPL'ed does have implications. Of course, we'll have to hear
the vendors on this.

Cheers

Paolo Amoroso

Nov 20, 2002, 11:18:17 AM
On Tue, 19 Nov 2002 16:37:07 +0000, Daniel Barlow <d...@telent.net> wrote:

> would entail. Which CL vendors distribute ILISP anyway, as a matter
> of interest?

Xanalys provided patches to make ILISP work better with LispWorks, and
offered some help to the maintainers for testing ILISP with the licensed
version of the product. I don't know whether they distribute or recommend
ILISP.


Paolo
--
EncyCMUCLopedia * Extensive collection of CMU Common Lisp documentation
http://www.paoloamoroso.it/ency/README

Eduardo Muñoz

Nov 20, 2002, 6:26:13 PM
"Charlie" <char...@zoom.co.uk> writes:

> I just got ilisp going. It was a lot easier than I expected both on win32 and
> debian. The indentation problem is exactly the same though:
>
> C-M-x anywhere in this function and the interpreter will return 3 and
> possibly complain. Now where do I go to find out about/report a bug in
> emacs?

This is by design. See Emacs info node 'Left margin paren':

"..."
" In the earliest days, the original Emacs found defuns by moving
upward a level of parentheses or braces until there were no more levels
to go up. This always required scanning all the way back to the
beginning of the buffer, even for a small function. To speed up the
operation, we changed Emacs to assume that any opening delimiter at the
left margin is the start of a defun. This heuristic is nearly always
right, and avoids the need to scan back to the beginning of the buffer.
However, it mandates following the convention described above."
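
The slow-but-exact strategy that passage describes can be sketched in Common Lisp roughly as follows. This is only an illustration under stated assumptions, not how Emacs implements anything: `top-level-form-start` and `*buffer*` are hypothetical names, the buffer is a plain string, and only double-quoted strings and `;` comments are recognised (character literals such as `#\(` and `#| |#` block comments are ignored).

```lisp
;; Hypothetical sketch: scan forward from the start of the text, tracking
;; nesting depth, and remember where the current top-level form opened.
(defun top-level-form-start (text point)
  "Return the index of the open parenthesis that begins the top-level
form containing POINT, or NIL if POINT is not inside one.
Simplified: only double-quoted strings and ; comments are understood."
  (let ((depth 0) (start nil)
        (in-string nil) (in-comment nil) (escaped nil))
    (loop for i from 0 below (min point (length text))
          for ch = (char text i)
          do (cond (in-comment
                    (when (char= ch #\Newline) (setf in-comment nil)))
                   (escaped (setf escaped nil))   ; skip the escaped char
                   (in-string
                    (case ch
                      (#\\ (setf escaped t))
                      (#\" (setf in-string nil))))
                   ((char= ch #\;) (setf in-comment t))
                   ((char= ch #\") (setf in-string t))
                   ((char= ch #\()
                    (when (zerop depth) (setf start i))
                    (incf depth))
                   ((char= ch #\))
                    (when (plusp depth) (decf depth))
                    (when (zerop depth) (setf start nil)))))
    start))

(defparameter *buffer*
  (format nil "(defun badly-indented ()~%(+ 1 2))"))

;; Point sits on the + in the second line; the scan still finds the defun.
(top-level-form-start *buffer* (search "+ 1 2" *buffer*))   ; => 0
```

The cost is the one the manual mentions: every call walks the text from the beginning up to point, however small the function around point is.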

--

Eduardo Muñoz

Charlie

Nov 21, 2002, 4:30:28 AM
The people at gnu.emacs.help suggested I put

(setq beginning-of-defun-function (lambda nil
(re-search-backward "^(defun")))

in my .emacs to make emacs look for "(defun" at the start of a line as the
beginning of a function not just an "(". What's wrong with this being the
default?
If this is getting too far off topic please ignore me it's really just
curiosity at this point.

Charlie


"Eduardo Muñoz" <e...@jet.es> wrote in message news:u8yzn3...@jet.es...

Tim Bradshaw

Nov 21, 2002, 3:44:30 AM
* Eduardo Muñoz wrote:

> "..."
> " In the earliest days, the original Emacs found defuns by moving
> upward a level of parentheses or braces until there were no more levels
> to go up. This always required scanning all the way back to the
> beginning of the buffer, even for a small function. To speed up the
> operation, we changed Emacs to assume that any opening delimiter at the
> left margin is the start of a defun. This heuristic is nearly always
> right, and avoids the need to scan back to the beginning of the buffer.
> However, it mandates following the convention described above."

This is classic bad design. So it was too slow on a PDP10. Is it
going to be too slow on a 2GHz Pentium? I think not. But do they put
in a `wrong but fast / slow but right' toggle, because they know about
Moore's law? No. And this stuff is written by the people who put Unix
down for not doing the Right Thing.

And it's not even *true*, of course. The system could maintain state
about where various top-level points were, meaning it almost never
needs to scan everything. But no, let's just crap over our users for
ever.

--tim

Raymond Wiker

Nov 21, 2002, 4:56:19 AM
"Charlie" <char...@zoom.co.uk> writes:

> The people at gnu.emacs.help suggested I put
>
> (setq beginning-of-defun-function (lambda nil
> (re-search-backward "^(defun")))
>
> in my .emacs to make emacs look for "(defun" at the start of a line as the
> beginning of a function not just an "(". What's wrong with this being the
> default?
> If this is getting too far off topic please ignore me it's really just
> curiosity at this point.

It won't work with defparameter, defconstant, defmacro,
defpackage, in-package, eval-when, setq and a whole bunch of other
forms that occur as top-level forms.
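
A rough way to see the problem in Common Lisp, assuming a hypothetical file whose top-level lines are the strings below (`starts-with-p` and `*top-level-lines*` are made-up names for this sketch): a test for "(defun" at the start of a line finds only one of the five forms, while the plain open-paren test finds them all.

```lisp
;; Hypothetical helper: does LINE begin with PREFIX?
(defun starts-with-p (prefix line)
  (and (>= (length line) (length prefix))
       (string= prefix line :end2 (length prefix))))

;; Made-up sample of forms that commonly appear at top level.
(defparameter *top-level-lines*
  '("(defpackage :demo (:use :cl))"
    "(in-package :demo)"
    "(defparameter *scale* 2)"
    "(defmacro twice (x) `(* 2 ,x))"
    "(defun double (x) (twice x))"))

;; What a "^(defun" search would treat as the start of a definition.
(remove-if-not (lambda (line) (starts-with-p "(defun" line))
               *top-level-lines*)
;; => ("(defun double (x) (twice x))")

;; What the plain "open paren in column 0" rule accepts.
(count-if (lambda (line) (starts-with-p "(" line))
          *top-level-lines*)
;; => 5
```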

--
Raymond Wiker Mail: Raymon...@fast.no
Senior Software Engineer Web: http://www.fast.no/
Fast Search & Transfer ASA Phone: +47 23 01 11 60
P.O. Box 1677 Vika Fax: +47 35 54 87 99
NO-0120 Oslo, NORWAY Mob: +47 48 01 11 60

Try FAST Search: http://alltheweb.com/

Erik Naggum

Nov 21, 2002, 5:53:21 AM
* "Charlie" <char...@zoom.co.uk>

| What's wrong with this being the default?

That you will evaluate the first `defun´ before point instead of the
expression you want to evaluate if it does not match the expectations
of this over-eager regexp.

Just indent your code correctly, and the whole problem goes away.

--
Erik Naggum, Oslo, Norway

Act from reason, and failure makes you rethink and study harder.
Act from faith, and failure makes you blame someone and push harder.

Erik Naggum

Nov 21, 2002, 5:57:13 AM
* Tim Bradshaw

| And it's not even *true*, of course. The system could maintain state
| about where various top-level points were, meaning it almost never needs
| to scan everything. But no, let's just crap over our users for ever.

It is in fact pretty pathetic that people use this braindamaged font-lock
shit, which /really/ consumes computrons for no good reason and which has
/already/ identified the whole form so it could crayon all over it, and
then think that such a backward scan is problematic.

On the other hand, if you reach a paren at the beginning of a line and it
is not in a string literal and not the start of a top-level form, the user
should get a brief electrical shock with wall-socket voltage and fix it.

Espen Vestre

Nov 21, 2002, 6:06:23 AM

Erik Naggum <er...@naggum.no> writes:

> It is in fact pretty pathetic that people use this braindamaged font-lock
> shit, which /really/ consumes computrons for no good reason and which has
> /already/ identified the whole form so it could crayon all over it, and
> then think that such a backward scan is problematic.

"Braindamaged font-lock shit"?

YMMV wrt. how useful coloring is to you. And I'd like to see the still-
in-use machine that uses a significant share of its cpu cycles for
font-lock-mode!

(off-topic: It's pretty fascinating to observe how long my laptop can
run on battery when it runs linux, with emacs and several LispWorks
processes, each with umpteen (font-lock-mode!) windows, compared to
how fast Windows 98 can suck life out of the same batteries, doing
close to nothing)
--
(espen)

Tim Bradshaw

Nov 21, 2002, 6:08:45 AM
* Erik Naggum wrote:

> On the other hand, if you reach a paren at the beginning of a line and it
> is not in a string literal and not the start of a top-level form, the user
> should get a brief electrical shock with wall-socket voltage and fix it.

Well, actually, I admit to doing stuff like this occasionally:

#+com.cley.weld/test
(progn

;;;

(defun ...)
(def...)

(test...)

)

On the other hand, the various things in the PROGN kind of *are*
top-level forms both in the formal CL sense, and in the sense that if
I ask the editor to evaluate one, I really only want it evaluated, not
the whole PROGN. A better way is to have a separate
conditionally-loaded test-harness file though, I think.

(This article is arguing against my previous article to some degree,
but not in the larger sense that I'd like to have the *choice* of
`look for /^(/' or `do the right thing', without having to implement
the latter myself.)

--tim

Charlie

Nov 21, 2002, 7:22:21 AM
Don't worry, I intend to. This thread has already got much longer and in
depth than I had ever envisioned. And my problem went away after the first
reply anyway.
Cheers,
Charlie

"Erik Naggum" <er...@naggum.no> wrote

Erik Naggum

Nov 21, 2002, 6:23:10 PM
* Espen Vestre
| "Braindamaged font-lock shit"?

It is regexp-based and therefore extremely unintelligent and it also does
a stupid, trivial syntax-only highlighting. A more literal interpretation
of "braindamaged" is hardly possible. Working with crayons like that just
looks amazingly retarded to me. I mean, English is harder to get exactly
right than any programming language, but do people need (or get) color
highlighting of something trivially based on regexp-matching? No. Font-
lock mode is a fancy gimmick because it is trivially possible, not because
it is actually useful.

| And I'd like to see the still- in-use machine that uses a significant
| share of its cpu cycles for font-lock-mode!

That is not the point. The point is that the CPU time it takes to do
font-locking is more than the CPU time it would take to scan back to the
beginning of the buffer and count matching parentheses, and
if you use the font-lock nonsense, the buffer already has that information
pre-computed. So it is an argument against the relative expense of the
mechanism that was sacrificed while another very wasteful technology was
adopted.

Christian Nybø

Nov 21, 2002, 7:09:41 PM
Erik Naggum <er...@naggum.no> writes:

> It is regexp-based and therefore extremely unintelligent

I've lately seen examples of this relation -- are there papers
covering the subject?
--
chr

Erik Naggum

Nov 21, 2002, 11:14:35 PM
* Christian Nybø

| I've lately seen examples of this relation -- are there papers covering
| the subject?

I have not seen any papers on it, but those who think about the issue for
just a few seconds realize that parsing is a stateful process and regular
expressions are inherently stateless and the amount of state information
you can thereby employ in the syntax recognition process is very limited.

Some of the things I want help with when looking at code is to click on a
symbol name and see its lexical scope and all references highlighted, or
to show free variable references inside each enclosing scope. Such things
are not hard to do if you edit the actual code instead of just characters.

Espen Vestre

Nov 22, 2002, 4:09:35 AM
Erik Naggum <er...@naggum.no> writes:

> Some of the things I want help with when looking at code is to click on a
> symbol name and see its lexical scope and all references highlighted, or
> to show free variable references inside each enclosing scope. Such things
> are not hard to do if you edit the actual code instead of just characters.

ah, this reminds me of good old SEdit!

I agree that this is a different world from the stupid idea of using
regular expressions for something which is a text-book example of a
real context-free language. However, font-lock-mode helps me read my
code faster, so I use it. (But now that you pointed it out, I feel
kind of guilty, that I actually use a crappy piece of software because
it happens to work most of the time, just like I accuse the MS Crowd
of doing ;-))
--
(espen)

Pascal Costanza

Nov 22, 2002, 6:09:49 AM
Erik Naggum wrote:
> * Christian Nybø
> | I've lately seen examples of this relation -- are there papers covering
> | the subject?
>
> I have not seen any papers on it, but those who think about the issue for
> just a few seconds realize that parsing is a stateful process and regular
> expressions are inherently stateless and the amount of state information
> you can thereby employ in the syntax recognition process is very limited.

I strongly agree. I have never understood the hype about regular
expressions. Recursive-descent parsers are extremely simple to write
once you've got it, and even if you don't need state information upfront
it's easy to add state when you happen to need it later on. (They are a
little bit more wordy though.)
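
As a minimal sketch of that claim, assuming a toy grammar of atoms and parenthesised lists (every function name here is hypothetical, and atoms are kept as plain strings rather than interned symbols), a recursive-descent parser in Common Lisp can look like this. The nesting that a regular expression cannot track is carried by the recursion itself.

```lisp
(defun skip-whitespace (text pos)
  "Return the index of the first non-blank character at or after POS."
  (or (position-if-not (lambda (ch) (member ch '(#\Space #\Tab #\Newline)))
                       text :start pos)
      (length text)))

(defun parse-atom (text pos)
  "Read characters up to the next blank or parenthesis.
Returns the atom as a string and the index just past it."
  (let ((end (or (position-if
                  (lambda (ch) (member ch '(#\Space #\Tab #\Newline #\( #\))))
                  text :start pos)
                 (length text))))
    (values (subseq text pos end) end)))

(defun parse-list (text pos)
  "Parse elements after an open paren until the matching close paren."
  (let ((items '()))
    (loop
      (setf pos (skip-whitespace text pos))
      (cond ((>= pos (length text)) (error "Missing )"))
            ((char= (char text pos) #\))
             (return (values (nreverse items) (1+ pos))))
            (t (multiple-value-bind (item next) (parse-sexp text pos)
                 (push item items)
                 (setf pos next)))))))

(defun parse-sexp (text &optional (pos 0))
  "Parse one form from TEXT starting at POS.
Returns the parsed form and the index just past it."
  (let ((pos (skip-whitespace text pos)))
    (cond ((>= pos (length text)) (error "Unexpected end of input"))
          ((char= (char text pos) #\() (parse-list text (1+ pos)))
          ((char= (char text pos) #\)) (error "Unexpected ) at ~D" pos))
          (t (parse-atom text pos)))))

(parse-sexp "(defun f (x) (+ x 1))")
;; first value => ("defun" "f" ("x") ("+" "x" "1"))
```

Adding state later, such as the current nesting depth or a table of names seen so far, means passing one more argument through these functions.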


Pascal

--
Pascal Costanza University of Bonn
mailto:cost...@web.de Institute of Computer Science III
http://www.pascalcostanza.de Römerstr. 164, D-53117 Bonn (Germany)

Espen Vestre

Nov 22, 2002, 7:29:23 AM
Pascal Costanza <cost...@web.de> writes:

> I strongly agree. I have never understood the hype about regular
> expressions.

context-free grammars are a bit hard to specify in a one-liner...
--
(espen)

Tim Bradshaw

Nov 22, 2002, 8:08:07 AM
* Espen Vestre wrote:
> context-free grammars are a bit hard to specify in a one-liner...

So are complicated regexps. Sure, you can do it and people do, but
they look like line noise. And they also don't actually work in lots
of cases because you need a more powerful language.

--tim

Joe Marshall

Nov 22, 2002, 9:19:02 AM
Espen Vestre <espen@*do-not-spam-me*.vestre.net> writes:

> However, font-lock-mode helps me read my code faster, so I use it.

Same here. But it's not the fontification that's the problem, it's
the stupid way it is done.

> (But now that you pointed it out, I feel kind of guilty, that I
> actually use a crappy piece of software because it happens to work
> most of the time, just like I accuse the MS Crowd of doing ;-))

Fontification has a rather soft failure mode, and it's hard enough
trying to come up with regular expressions that work even *some* of
the time.

I think I've become complacent.

Erik Naggum

Nov 22, 2002, 9:36:46 AM
* Tim Bradshaw

| So are complicated regexps. Sure, you can do it and people do, but they
| look like line noise. And they also don't actually work in lots of cases
| because you need a more powerful language.

What has worried me since I sat down to write a regular expression that
only matched all valid SGML start- and end-tags is that both the false
negatives (things it should match but does not) and the false positives
(things it should not match but does) are extremely hard to catch.

Nils Goesche

Nov 22, 2002, 10:58:05 AM
Pascal Costanza <cost...@web.de> writes:

> Erik Naggum wrote:
> > I have not seen any papers on it, but those who think about the
> > issue for just a few seconds realize that parsing is a stateful
> > process and regular expressions are inherently stateless and the
> > amount of state information you can thereby employ in the syntax
> > recognition process is very limited.

> I strongly agree. I have never understood the hype about regular
> expressions. Recursive-descent parsers are extremely simple to write
> once you've got it, and even if you don't need state information
> upfront it's easy to add state when you happen to need it later
> on. (They are a little bit more wordy though.)

For one thing, regular expressions are theoretically interesting.
Have you ever tried to write your own regexp matcher? It's not as
easy as one might think. Regexps are a direct application of the
theory of finite automatons, and if you are not aware of that getting
it right will be extremely hard.

Using them is another matter. They are a convenient tool for
searching, for instance: Do you really only use fgrep, never grep? I
also use isearch-forward-regexp in Emacs quite a lot. In Common Lisp,
I like APROPOS; wouldn't it be nice if you could do, say, (apropos
"most.*float")? Recently I installed Michael Parker's REGEX package,
and wrote

(defun grep (regex &optional package
                   &key (upcase t) (internals t))
  (unless package
    (setq package (list-all-packages)))
  (let ((matcher (regex:compile-str (if upcase
                                        (string-upcase regex)
                                        regex)))
        (count 0))
    (flet ((check-package (next)
             (loop
               (multiple-value-bind (morep sym)
                   (funcall next)
                 (unless morep
                   (return))
                 (when (regex:scan-str matcher (symbol-name sym))
                   (incf count)
                   (fresh-line)
                   (prin1 sym)
                   (cond ((special-operator-p sym)
                          (format t " -- special form"))
                         ((macro-function sym) (format t " -- macro"))
                         ((fboundp sym) (format t " -- ~S"
                                                (symbol-function sym)))
                         ((boundp sym) (format t " -- value: ~S"
                                               (symbol-value sym)))))))))
      (if internals
          (with-package-iterator (next package :internal :external)
            (check-package (lambda () (next))))
          (with-package-iterator (next package :external)
            (check-package (lambda () (next))))))
    (format t "~&~[No~;One~:;~:*~D~] match~:*~[~;~:;es~]." count)
    (values)))

Now it works:

CL-USER 6 > (grep "most.*float")
MOST-NEGATIVE-SHORT-FLOAT -- value: -1.7976931348623165E308
MOST-NEGATIVE-LONG-FLOAT -- value: -1.7976931348623165E308
MOST-POSITIVE-LONG-FLOAT -- value: 1.7976931348623165E308
MOST-POSITIVE-SHORT-FLOAT -- value: 1.7976931348623165E308
MOST-POSITIVE-DOUBLE-FLOAT -- value: 1.7976931348623165E308
MOST-NEGATIVE-SINGLE-FLOAT -- value: -1.7976931348623165E308
MOST-NEGATIVE-DOUBLE-FLOAT -- value: -1.7976931348623165E308
MOST-POSITIVE-SINGLE-FLOAT -- value: 1.7976931348623165E308
8 matches.

Another area where they are useful is -- parsing :-) Not for writing
the whole parser, of course. Yes, regexps are misused horribly when
people write complete parsers with them (or rather, don't write a
complete parser at all and just hope that their regexp will work most
of the time); but -- most parsers start with a tokenizer, a lexer.
The complexity of regexps that describe just a single /token/, rather
than the whole input, is usually still manageable. And -- a lexer
with a good regex engine under the hood will be fast, and I mean
*super-fast*! Handwritten lexers tend to be slow, consing monsters;
sure, it is possible to do it right, but so /tedious/ that it seems to
me like writing assembly language instead of a high level language.
(Even though the regexp version will /look/ more like machine language
;-). I'd rather have that code generated for me.

In short: Use regexps for what they're good at: Teaching finite
automatons, interactive searching, and generating lexers. (But /not/
as a substitute for parsers).

Regards,
--
Nils Gösche
"Don't ask for whom the <CTRL-G> tolls."

PGP key ID 0x0655CFA0

Pascal Costanza

Nov 22, 2002, 11:48:04 AM
Nils Goesche wrote:
> Pascal Costanza <cost...@web.de> writes:
>
>
>>Erik Naggum wrote:
>>
>>> I have not seen any papers on it, but those who think about the
>>> issue for just a few seconds realize that parsing is a stateful
>>> process and regular expressions are inherently stateless and the
>>> amount of state information you can thereby employ in the syntax
>>> recognition process is very limited.
>>
>
>>I strongly agree. I have never understood the hype about regular
>>expressions. Recursive-descent parsers are extremely simple to write
>>once you've got it, and even if you don't need state information
>>upfront it's easy to add state when you happen to need it later
>>on. (They are a little bit more wordy though.)
>
>

[various interesting uses of regular expressions]


> In short: Use regexps for what they're good at: Teaching finite
> automatons, interactive searching, and generating lexers. (But /not/
> as a substitute for parsers).

OK, you have several points there. I have never experienced a situation
where I had the impression that learning regular expressions would buy
me an important advantage though. But maybe this is one of those cases
where you realize the advantages only afterwards...

Alexander Schmolck

Nov 23, 2002, 12:05:16 PM
Pascal Costanza <cost...@web.de> writes:

Yes, regexps suck badly, but I think that the fact that regexp-based parsing
doesn't work properly might actually partly be a feature in emacs. From my
experience, emacs language modes tend to be relatively robust in the face of
keyword or construct additions etc. Thus, although things like font-lock
etc. never work 100%, they often still work reasonably for slightly different,
or newer versions of languages. Of course having language modes that have a
proper understanding of the underlying languages would be *much* more useful.

The other reason for the popularity of regexps is presumably that they are
more or less standardized and available for a variety of languages. Moreover,
often those languages are fairly slow, so that their regexp-engines, usually
written in C, are the only way to "parse" text efficiently.

alex
