Q: parsing strings

The Glauber

Sep 27, 2000, 3:00:00 AM

Hello,

is there an easy way in CL to parse fields contained within strings?

For example, suppose a file name:
"LASP.P.MVR.KS.TITL.S04752.D120765"

Is there an easy way to break this into a list:
("LASP" "P" "MVR" "KS" "TITL" "S04752" "D120765")

In Perl, for example, you could do this either by using the "split"
function or regular expressions.
Help me break out of my Perl habit! :-) :-) :-)

Thanks!

glauber

--
Glauber Ribeiro
thegl...@my-deja.com http://www.myvehiclehistoryreport.com
"Opinions stated are my own and not representative of Experian"

Sent via Deja.com http://www.deja.com/
Before you buy.

The Glauber

Sep 27, 2000, 3:00:00 AM

In article <8qt214$vu7$1...@nnrp1.deja.com>,

The Glauber <thegl...@my-deja.com> wrote:
> Hello,
>
> is there an easy way in CL to parse fields contained within strings?
>
> For example, suppose a file name:
> "LASP.P.MVR.KS.TITL.S04752.D120765"
>
> Is there an easy way to break this into a list:
> ("LASP" "P" "MVR" "KS" "TITL" "S04752" "D120765")

This is my shamefully novice attempt: it takes 2 parameters: a string to
parse and a separator character.

(defun my-split (source-str separator-char)
  (let ((result-list nil)
        (start-pos 0)
        (end-pos 0))
    (dotimes (num-elements (count separator-char source-str))
      ;; Search from START-POS itself so that empty fields are found too.
      (setf end-pos
            (position separator-char source-str :start start-pos))
      (setf result-list
            (append result-list (list (subseq source-str start-pos end-pos))))
      (setf start-pos (+ 1 end-pos)))
    ;; START-POS is still 0 when there is no separator, so the whole
    ;; string is returned as the only field.
    (setf result-list
          (append result-list (list (subseq source-str start-pos))))
    result-list))

Christian Nybø

Sep 27, 2000, 3:00:00 AM

The Glauber <thegl...@my-deja.com> writes:

> is there an easy way in CL to parse fields contained within strings?
>
> For example, suppose a file name:
> "LASP.P.MVR.KS.TITL.S04752.D120765"

There is READ-DELIMITED-LIST, but it tries to read the stream, so it's
not quite useful for your purpose. In Franz' AllegroServe, there's a
function that does what you want;

USER(17): (net.aserve::split-on-character
"LASP.P.MVR.KS.TITL.S04752.D120765" #\.)

("LASP" "P" "MVR" "KS" "TITL" "S04752" "D120765")

As the code is GPL'ed, I suppose it's all right to copy it here. For
context, get the whole aserve package at ftp.franz.com/serve.

;; The if* macro used in this code can be found at:
;;
;; http://www.franz.com/~jkf/ifstar.txt

(defun split-on-character (str char &key count)
  ;; given a string return a list of the strings between occurrences
  ;; of the given character.
  ;; If the character isn't present then the list will contain just
  ;; the given string.
  (let ((loc (position char str))
        (start 0)
        (res))
    (if* (null loc)
       then ; doesn't appear anywhere, just the original string
            (list str)
       else ; must do some work
            (loop
              (push (subseq str start loc) res)
              (setq start (1+ loc))
              (if* count then (decf count))
              (setq loc (position char str :start start))
              (if* (or (null loc)
                       (eql 0 count))
                 then (if* (< start (length str))
                         then (push (subseq str start) res)
                         else (push "" res))
                      (return (nreverse res)))))))

--
chr

The Glauber

Sep 27, 2000, 3:00:00 AM

In article <8766nhg...@lapchr.siteloft.com>,

cn...@eunet.no (Christian Nybø) wrote:
> The Glauber <thegl...@my-deja.com> writes:
>
> > is there an easy way in CL to parse fields contained within strings?
> >
> > For example, suppose a file name:
> > "LASP.P.MVR.KS.TITL.S04752.D120765"
>
> There is READ-DELIMITED-LIST, but it tries to read the stream, so it's
> not quite useful for your purpose.

How about read-delimited-list using a string-stream?
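
A minimal sketch of what that would look like, assuming the standard readtable and a hypothetical helper name `read-objects-until`. READ-DELIMITED-LIST reads Lisp objects up to a closing delimiter character, so it suits input such as "1 2 3)" rather than dot-separated fields: with the default readtable a dot is an ordinary token constituent, not a separator.

```lisp
;; Hypothetical helper, not from the thread: READ-DELIMITED-LIST
;; applied to a string through a string input stream.
(defun read-objects-until (string end-char)
  "Read Lisp objects from STRING until END-CHAR is reached."
  (let ((*read-eval* nil))              ; do not evaluate #. forms in the input
    (with-input-from-string (in string)
      (read-delimited-list end-char in))))

;; Works for whitespace-separated Lisp objects closed by a delimiter:
(read-objects-until "1 2 3)" #\))
```

Splitting "LASP.P.MVR" this way would need a modified readtable, which is why the POSITION/SUBSEQ approaches elsewhere in the thread are the more direct route.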

glauber

Sunil Mishra

Sep 27, 2000, 3:00:00 AM

Maybe I'm missing something, but I just don't see why the function in
aserve is so complex... This does essentially the same job, minus the
count argument, which should be really easy to introduce.

(defun split (string char)
  (loop for start = 0 then (1+ end)
        for end = (position char string :start start)
        collect (subseq string start (or end (length string)))
        while end))

Christian Nybų wrote:

> The Glauber <thegl...@my-deja.com> writes:
>
>
>> is there an easy way in CL to parse fields contained within strings?
>>
>> For example, suppose a file name:
>> "LASP.P.MVR.KS.TITL.S04752.D120765"
>
>
> There is READ-DELIMITED-LIST, but it tries to read the stream, so it's

David E. Lamy

Sep 27, 2000, 3:00:00 AM

In article <8qt214$vu7$1...@nnrp1.deja.com>, The Glauber wrote:
>Hello,

>
>is there an easy way in CL to parse fields contained within strings?
>
>For example, suppose a file name:
>"LASP.P.MVR.KS.TITL.S04752.D120765"
>

>Is there an easy way to break this into a list:

>("LASP" "P" "MVR" "KS" "TITL" "S04752" "D120765")
>

Here follows a beginner's solution:

(defun my-split (orig key)
  (let (list-of-subseqs)
    (if (null orig)
        list-of-subseqs
        (let ((pos-key (position key orig)))
          (cond ((null pos-key)
                 (append (append list-of-subseqs (list orig))
                         (my-split nil key)))
                (t
                 (append
                  (append list-of-subseqs (list (subseq orig 0 pos-key)))
                  (my-split (subseq orig (1+ pos-key)) key))))))))

I hope that this is not too inelegant a use of recursion.

--
__o As of 09/27/2000, free software is still the state of an
_`\<,_ open mind. Develop it, share it and use it!
(*)/ (*) David Emile Lamy del...@mint.net
pgp key available at http://pgp5.ai.mit.edu/~bal

Hannu Koivisto

Sep 27, 2000, 3:00:00 AM

Sunil Mishra <sunil....@everest.com> writes:

| Maybe I'm missing something, but I just don't see why the function in
| aserve is so complex... This does essentially the same job, minus the
| count argument, which should be really easy to introduce.

And minus the optimization that if the character doesn't appear
anywhere in the string, the original string is returned in a list.
But you are right, when I first saw that code it looked so
convoluted that I didn't even bother to try to read and understand
all of it.

(defun string-split (str &optional (separator #\Space))
  "Splits the string STR at each SEPARATOR character occurrence.
The resulting substrings are collected into a list which is returned.
A SEPARATOR at the beginning or at the end of the string STR results
in an empty string in the first or last position of the list
returned."
  (declare (type string str)
           (type character separator))
  (loop for start = 0 then (1+ end)
        for end = (position separator str :start start)
        collect (subseq str start end)
        until (null end)))

When you replace that unnecessary (OR END (LENGTH STRING)) with
just END, you get almost my version above (where I'm being only
slightly more explicit with the ending condition), which I think I
posted here a while ago. If the original poster happens to read
this, I'd recommend trying Deja next time. Questions of this kind
are asked quite frequently.

And yes, introducing the count argument is easy, just one line
more. If one also wants that "character doesn't appear anywhere in
the string -> return the original string in a list" optimization,
it's also just two more lines or so if added explicitly to this
function, but one could also consider introducing an alternative
subseq function, which doesn't make the guarantee that it always
allocates a new sequence for a result and doesn't share storage
with an old sequence, and use that. In any case, even with both
those additions, the result would be less complex (more readable,
less than half in length and probably faster too) than the one from
aserve.
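
As a rough illustration of the count extension, here is one way it might look. The name `string-split-n`, the keyword arguments, and the reading of COUNT as "perform at most COUNT splits" are assumptions made for this sketch; they are not taken from aserve or from any posted version.

```lisp
;; Sketch only: STRING-SPLIT with an optional limit on the number of splits.
(defun string-split-n (str &key (separator #\Space) count)
  "Split STR at SEPARATOR. With COUNT, split at most COUNT times;
the remainder of STR becomes the last element."
  (loop for start = 0 then (1+ end)
        for splits from 0
        ;; Stop looking for separators once COUNT splits have been made.
        for end = (and (or (null count) (< splits count))
                       (position separator str :start start))
        collect (subseq str start end)
        until (null end)))

(string-split-n "LASP.P.MVR.KS" :separator #\. :count 2)
```

Relative to the two-argument version, the limit costs one extra FOR clause plus the guard around POSITION.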

--
Hannu

Rainer Joswig

Sep 27, 2000, 3:00:00 AM

In article <slrn8t4f72...@opal.mint.net>, del...@opal.mint.net
(David E. Lamy) wrote:

> In article <8qt214$vu7$1...@nnrp1.deja.com>, The Glauber wrote:
> >Hello,
> >
> >is there an easy way in CL to parse fields contained within strings?
> >
> >For example, suppose a file name:
> >"LASP.P.MVR.KS.TITL.S04752.D120765"
> >
> >Is there an easy way to break this into a list:
> >("LASP" "P" "MVR" "KS" "TITL" "S04752" "D120765")
> >
> Here follows a beginner's solution:
>
> (defun my-split (orig key)
>   (let (list-of-subseqs)
>     (if (null orig)
>         list-of-subseqs
>         (let ((pos-key (position key orig)))
>           (cond ((null pos-key)
>                  (append (append list-of-subseqs (list orig))
>                          (my-split nil key)))
>                 (t
>                  (append
>                   (append list-of-subseqs (list (subseq orig 0 pos-key)))
>                   (my-split (subseq orig (1+ pos-key)) key))))))))
>
> I hope that this is not a too inelegant use of recursion.

Try to avoid APPEND.
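
For illustration, a recursive variant without APPEND might look like the sketch below. APPEND copies every argument except the last, whereas here each field is simply consed onto the result of the recursive call. The name `split-recursive` and the optional START parameter are assumptions for this example, not part of the quoted code.

```lisp
;; Sketch only: recursive split that builds the list with CONS.
(defun split-recursive (string separator &optional (start 0))
  "Return the fields of STRING from START onward, split at SEPARATOR."
  (let ((end (position separator string :start start)))
    (cons (subseq string start end)     ; END = NIL means up to end of string
          (if end
              (split-recursive string separator (1+ end))
              nil))))

(split-recursive "LASP.P.MVR.KS.TITL.S04752.D120765" #\.)
```

Passing an index instead of a fresh SUBSEQ of the remainder also avoids copying the tail of the string at every step.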

--
Rainer Joswig, Hamburg, Germany
Email: mailto:jos...@corporate-world.lisp.de
Web: http://corporate-world.lisp.de/

Lieven Marchand

Sep 27, 2000, 3:00:00 AM

The Glauber <thegl...@my-deja.com> writes:

> In Perl, for example, you could do this either by using the "split"
> function or regular expressions.
> Help me break out of my Perl habit! :-) :-) :-)

There have been several implementations of split posted and refined on
this group. Use deja.com.

PS: Anyone know of another comprehensive archive to refer people to
now that deja.com seems to gradually get out of the archiving
business?

--
Lieven Marchand <m...@bewoner.dma.be>
Lambda calculus - Call us a mad club

Frank A. Adrian

Sep 27, 2000, 3:00:00 AM

Why not rename the sequence var, take out the declarations, and name the
function split? It seems to work for any type of sequence. 'Twould be more
handy, though not necessarily more efficient, methinks.
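
A sketch of that generalization, assuming the same loop with the type declarations dropped. POSITION and SUBSEQ are sequence functions, so the function accepts lists and vectors as well as strings; the parameter names here are illustrative.

```lisp
;; Sketch only: the same loop, usable on any sequence type.
(defun split (seq separator)
  "Split the sequence SEQ at each element that is EQL to SEPARATOR."
  (loop for start = 0 then (1+ end)
        for end = (position separator seq :start start)
        collect (subseq seq start end)
        until (null end)))

(split "LASP.P.MVR" #\.)       ; strings
(split '(1 0 2 3 0 4) 0)       ; lists
(split #(1 0 2) 0)             ; vectors
```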

faa

"Hannu Koivisto" <az...@iki.fi.ns> wrote in message
news:87bsx98...@senstation.vvf.fi...

Dirk Zoller

Oct 10, 2000, 2:46:38 AM

Hello,

I always felt that the very powerful (format) lacks a counterpart
for input. Like in C there is scanf() which in some sense reverses
the effect of printf().

I then tried to write such a thing. It seems possible. Please find
attached a first attempt to do it.

I ran the attached program on a large file with stock-index prices.
In order to make it fast enough I added declarations and proclamations
which on CMUCL actually resulted in a very compact and fast machine
representation. Not too far away from what C's scanf() would do.

Why isn't such a nice function already defined in Common Lisp?

Kind regards

Dirk

--
Dirk Zoller Fon: 06106-876566
Obere Marktstraße 5 e-mail: d...@sol-3.de
63110 Rodgau

scan.lisp

Rahul Jain

Oct 10, 2000, 3:00:00 AM

In article <39E2BB4E...@onlinehome.de> on Tue, 10 Oct 2000 08:46:38 +0200,
Dirk Zoller <d...@onlinehome.de> wrote:

> Hello,
>
> I always felt that the very powerful (format) lacks a counterpart
> for input. Like in C there is scanf() which in some sense reverses
> the effect of printf().
>
> I then tried to write such a thing. It seems possible. Please find
> attached a first attempt to do it.

...

>
> Why isn't such a nice function already defined in Common Lisp?

Because of *print-readably* and (read).
For most objects, it works quite well.
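
A small sketch of that round trip, using only standard functions. The helper name `round-trip` is made up for this example, and it only covers objects that have a readable printed representation.

```lisp
;; Sketch only: print an object readably, then read it back.
(defun round-trip (object)
  "Print OBJECT to a string in standard syntax and read it back."
  (let ((text (with-standard-io-syntax
                (let ((*print-readably* t))   ; signal an error if unreadable
                  (prin1-to-string object)))))
    (with-standard-io-syntax
      (let ((*read-eval* nil))                ; refuse #. in the text
        (values (read-from-string text))))))

(round-trip '(1 "two" (3 4)))
```

This covers data that Lisp itself wrote; it does not help with text produced by other programs.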

--
-> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <-
-> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <-
-> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <-
-> -/- http://photino.sid.rice.edu/ -=- mailto:rahul...@usa.net -\- <-
|--|--------|--------------|----|-------------|------|---------|-----|-|
Version 11.423.999.210020101.23.50110101.042
(c)1996-2000, All rights reserved. Disclaimer available upon request.

Dirk Zoller

Oct 10, 2000, 3:00:00 AM

Rahul Jain wrote:
> > I always felt that the very powerful (format) lacks a counterpart
> > for input. Like in C there is scanf() which in some sense reverses
> > the effect of printf().
> >

> > Why isn't such a nice function already defined in Common Lisp?
>

> Because of *print-readably* and (read).
> For most objects, it works quite well.

And what if I have a file with data not printed by Lisp which I want
to read in (not read back in)?

By pointing out the analogy printf()/scanf() I didn't mean that
I just want to read data which I previously wrote (this is particularly
easy in Lisp, although I found the Lisp reader of at least one system
pretty slow).

No, I meant to read *any* formatted text data from the outside world,
like (format) can very nicely format data to the outside world.

Don't Lisp programmers do that?

--
Dirk Zoller Phone: +49-69-50959861
Sol-3 GmbH&Co KG Fax: +49-69-50959859
Niddastraße 98-102 e-mail: d...@sol-3.de
60329 Frankfurt/M
Germany

Christophe Rhodes

Oct 10, 2000, 3:00:00 AM

Dirk Zoller <d...@sol-3.de> writes:

> Rahul Jain wrote:
> > > I always felt that the very powerful (format) lacks a counterpart
> > > for input. Like in C there is scanf() which in some sense reverses
> > > the effect of printf().
> > >

> > > Why isn't such a nice function already defined in Common Lisp?
> >

> > Because of *print-readably* and (read).
> > For most objects, it works quite well.
>
> And what if I have a file with data not printed by Lisp which I want
> to read in (not read back in)?
>
> By pointing out the analogy printf()/scanf() I didn't mean that
> I just want to read data which I previously wrote (this is particularly
> easy in Lisp, although I found the Lisp reader of at least one system
> pretty slow).
>
> No, I meant to read *any* formatted text data from the outside world,
> like (format) can very nicely format data to the outside world.
>
> Don't Lisp programmers do that?

This seems to come up every once in a while. I think the answer is
"yes, they do", but that people write their own, dedicated code for
the particular application.

Having said that, I amused myself a while ago (when I was meant to be
revising for finals) by writing some code to do (setf format). You're
most welcome to a copy, with of course no warranty, etc, etc,
etc. It's at
<URL:http://www-jcsu.jesus.cam.ac.uk/~csr21/format-setf.lisp>. Last I
checked, it didn't break on load.

However, the reasons that there isn't anything like this in the
language include the uncertainty of the semantics of (setf (format nil
"~d~d" x y) "123"), and so on, and you'll find that my code will
probably break on ambiguous input.

Christophe
--
Jesus College, Cambridge, CB5 8BL +44 1223 524 842
(FORMAT T "(~@{~w ~}~3:*'~@{~w~^ ~})" 'FORMAT T "(~@{~w ~}~3:*'~@{~w~^ ~})")

Dirk Zoller

Oct 10, 2000, 3:00:00 AM

Christophe Rhodes wrote:

> This seems to come up every once in a while. I think the answer is
> "yes, they do", but that people write their own, dedicated code for
> the particular application.

That was the answer I got two years ago when I last played with Lisp
and asked the same stupid question. I just thought maybe Lispers, who
seem otherwise very clever people, meanwhile got tired of reinventing
such an obviously required wheel. Seems not?

> However, the reasons that there isn't anything like this in the
> language include the uncertainty of the semantics of (setf (format nil
> "~d~d" x y) "123"), and so on, and you'll find that my code will
> probably break on ambiguous input.

This example isn't ambiguous: x is assigned 123, y is left alone, a value
of 1 is returned.

But why so ambitious? Why this added (setf) complexity? I would just
return multiple values or a list of results. That would also be more
"functional".

Rainer Joswig

Oct 10, 2000, 3:00:00 AM

In article <39E2DDCD...@sol-3.de>, Dirk Zoller <d...@sol-3.de>
wrote:

> Rahul Jain wrote:
> > > I always felt that the very powerful (format) lacks a counterpart
> > > for input. Like in C there is scanf() which in some sense reverses
> > > the effect of printf().
> > >

> > > Why isn't such a nice function already defined in Common Lisp?
> >

> > Because of *print-readably* and (read).
> > For most objects, it works quite well.
>
> And what if I have a file with data not printed by Lisp which I want
> to read in (not read back in)?
>
> By pointing out the analogy printf()/scanf() I didn't mean that
> I just want to read data which I previously wrote (this is particularly
> easy in Lisp, although I found the Lisp reader of at least one system
> pretty slow).
>
> No, I meant to read *any* formatted text data from the outside world,
> like (format) can very nicely format data to the outside world.

I don't have a definitive answer for that, but it seems
that there are no "simple" solutions available. Somebody
has to propose a design (an implementation would also
be nice) and the community will have to see if they like
it.

> Don't Lisp programmers do that?

They do. There are some regexp-packages (non-standard, too)
and some parser tools that people seem to use.

Unfortunately sometimes a "naive" design mixes with slow
basic performance of some Lisp facilities (STREAMS, ...),
so handwritten code is often faster.

Rainer Joswig

Christophe Rhodes

Oct 10, 2000, 3:00:00 AM

Dirk Zoller <d...@sol-3.de> writes:

> Christophe Rhodes wrote:
>
> > This seems to come up every once in a while. I think the answer is
> > "yes, they do", but that people write their own, dedicated code for
> > the particular application.
>
> That was the answer I got two years ago when I last played with Lisp
> and asked the same stupid question. I just thought maybe Lispers, who
> seem otherwise very clever people, meanwhile got tired of reinventing
> such an obviously required wheel. Seems not?

I don't think that the correct solution is at all obvious. That's the
problem (well, not that _I_ don't think the solution is obvious, but
that lots of people don't think the solution is obvious. Um. You know
what I mean).

> > However, the reasons that there isn't anything like this in the
> > language include the uncertainty of the semantics of (setf (format nil
> > "~d~d" x y) "123"), and so on, and you'll find that my code will
> > probably break on ambiguous input.
>
> This example isn't ambiguous: x is assigned 123, y is left alone, a value
> of 1 is returned.
>
> But why so ambitious? Why this added (setf) complexity? I would just
> return multiple values or a list of results. That would also be more
> "functional".

Well, the short answer is that it was because it was fun (remember, I
should have been working on passing a Physics degree).

However, you can get what you want in any case with a little simple
macrology on top of the setf stuff, no?

Dirk Zoller

Oct 10, 2000, 3:00:00 AM

Christophe Rhodes wrote:
> I don't think that the correct solution is at all obvious. That's the
> problem (well, not that _I_ don't think the solution is obvious, but
> that lots of people don't think the solution is obvious. Um. You know
> what I mean).

Well, probably due to my lack of experience with Lisp, I don't see
any problems. I'd say, what I've posted earlier are the beginnings
of a nice solution.

The only drawback is that the Lisp code to fiddle around with each
character of both format string and input string is probably pretty
inefficient (unless heavily pumped up with type declarations).
Therefore this should be in the System implementation, where it could
be done in C or whatever chosen by the system maker.

The C-programmer (although she could) typically doesn't write her own
scanf(), so why should the Lisp programmer? Both the amount of work
and the benefits seem similar to me in both worlds. One provides such
a solution, the other doesn't. Strange.

Marco Antoniotti

Oct 10, 2000, 3:00:00 AM

Dirk Zoller <d...@onlinehome.de> writes:

> Hello,
>
> I always felt that the very powerful (format) lacks a counterpart
> for input. Like in C there is scanf() which in some sense reverses
> the effect of printf().

Your deed is a worthy one. However, note that CL has READ which makes
scanf pretty much useless in many contexts.

> I then tried to write such a thing. It seems possible. Please find
> attached a first attempt to do it.
>
> I ran the attached program on a large file with stock-index prices.
> In order to make it fast enough I added declarations and proclamations
> which on CMUCL actually resulted in a very compact and fast machine
> representation. Not too far away from what C's scanf() would do.

See above.

Cheers

--
Marco Antoniotti =============================================================
NYU Bioinformatics Group tel. +1 - 212 - 998 3488
719 Broadway 12th Floor fax +1 - 212 - 995 4122
New York, NY 10003, USA http://galt.mrl.nyu.edu/valis
Like DNA, such a language [Lisp] does not go out of style.
Paul Graham, ANSI Common Lisp

Marco Antoniotti

Oct 10, 2000, 3:00:00 AM

Dirk Zoller <d...@sol-3.de> writes:

> Rahul Jain wrote:
> > > I always felt that the very powerful (format) lacks a counterpart
> > > for input. Like in C there is scanf() which in some sense reverses
> > > the effect of printf().
> > >

> > > Why isn't such a nice function already defined in Common Lisp?
> >

> > Because of *print-readably* and (read).
> > For most objects, it works quite well.
>
> And what if I have a file with data not printed by Lisp which I want
> to read in (not read back in)?

That is the crux of the problem. XML is *a* solution not only for CL
but for the rest of the world as well :)

> By pointing out the analogy printf()/scanf() I didn't mean that
> I just want to read data which I previously wrote (this is particularly
> easy in Lisp, although I found the Lisp reader of at least one system
> pretty slow).
>
> No, I meant to read *any* formatted text data from the outside world,
> like (format) can very nicely format data to the outside world.
>

> Don't Lisp programmers do that?

Yes. And in the most general case you must resort to writing a
complex parser (which may rely on scanf) to handle the quirkiness of
the format. That is why XML is a step in the right direction.

Duane Rettig

Oct 10, 2000, 3:00:00 AM

Dirk Zoller <d...@onlinehome.de> writes:

> Hello,
>
> I always felt that the very powerful (format) lacks a counterpart
> for input. Like in C there is scanf() which in some sense reverses
> the effect of printf().

Perhaps this is a naive question, but how often do you really use
scanf? Is it really useful to you?

Or perhaps you are fitting simpler problems into scanf solutions:
I found a few uses of scanf in our own code, but most of them are
of the form:

sscanf( argv[1], "%d", &n );

which is much more simply and efficiently written as a call to atoi()
or atol().

Perhaps read-from-string, followed by a type test, is a simpler
way to solve some of the higher level problems you would like to
solve with a (setf format) function.
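
A minimal sketch of that idea, with a hypothetical helper name `read-integer-or-nil`. It reads one object from the string and accepts it only if the type test passes. Note that reading arbitrary text with the Lisp reader can intern new symbols as a side effect.

```lisp
;; Sketch only: READ-FROM-STRING followed by a type test.
(defun read-integer-or-nil (text)
  "Read one object from TEXT; return it if it is an integer, else NIL."
  (let* ((*read-eval* nil)                       ; never evaluate #. forms
         (object (ignore-errors (read-from-string text))))
    (if (typep object 'integer)
        object
        nil)))

(read-integer-or-nil "42")     ; an integer is accepted
(read-integer-or-nil "4.2")    ; a float fails the type test
```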

--
Duane Rettig Franz Inc. http://www.franz.com/ (www)
1995 University Ave Suite 275 Berkeley, CA 94704
Phone: (510) 548-3600; FAX: (510) 548-8253 du...@Franz.COM (internet)

Erik Naggum

Oct 10, 2000, 3:00:00 AM

* Dirk Zoller <d...@sol-3.de>

| Well, probably due to my lack of experience with Lisp, I don't see
| any problems. I'd say, what I've posted earlier are the beginnings
| of a nice solution.

You responded "why so ambitious" to someone's argument that you
can't do it all. If you have the beginnings, where is it going, if
you don't want others to be so ambitious?

| The only drawback is that the Lisp code to fiddle around with each
| character of both format string and input string is probably pretty
| inefficient (unless heavily pumped up with type declarations).
| Therefore this should be in the System implementation, where it could
| be done in C or whatever chosen by the system maker.

Wrong conclusion. Therefore the Common Lisp implementation should
be improved to the point where such things are fast enough to be
written in a good language. In practice, they are. Doing parsing
in C is insane. C is unsuited to process textual input. (One might
also argue that Common Lisp is unsuited to process binary input.)
Perl is the answer to the needs of the C programmers. And if this
doesn't scare you, you're too much of a cynic for your own good.

| The C-programmer (although she could) typically doesn't write her
| own scanf(), so why should the Lisp programmer? Both the amount of
| work and the benefits seem similar to me in both worlds. One
| provides such a solution, the other doesn't. Strange.

I wasn't aware that C programmers were female¹, but C programmers
don't write their own necessary tools because they have something
that almost fits the bill, for a relaxed understanding of "fits",
and so they never quite get it right, especially when they discover
that to get it entirely right requires massive support from tools
not at their disposal. Hence the gargantuan libraries of C++ and
Java, which both do a much better approximation to "fit" than C ever
could hope for, except they are also _massively_ expensive to learn
to use well and very cumbersome and verbose in practice to boot.

Incidentally, tools such as yacc and lex are very good, but they are
much, much slower than anyone who contemplates using them could even
conceive that they would be. It's C, so it must be fast, right?
Well, they're C all right, and _therefore_ slow, because C doesn't
have the necessary machinery to process textual input efficiently,
so those who wanted it made half-assed attempts and were satisfied
with them prematurely, like the immigrants who stop improving their
English as soon as they are no longer actively bothered by repeating
themselves to those who don't understand them, or find others who can
understand their inferior language skills and pronunciation.

I find that Lisp's very nature makes writing parsers easy and very
straight-forward. Much easier than doing them in C with all sorts
of inferior tools that don't quite cut it. Like scanf, regexps, ...

#:Erik
-------
¹ I am, however, aware of the silly, annoying trend among some people
who don't appreciate the history of the English language to think
that "he" and "man" refers to the _male_. They don't. The male in
English doesn't have his own pronouns and sex²-specific terms the
way the female does. And so now you want to take everything away
from the males? To what end is this productive and constructive?
² Yes, it's "sex", not "gender", too.
--
If this is not what you expected, please alter your expectations.

Dirk Zoller

Oct 10, 2000, 3:00:00 AM

Duane Rettig wrote:
> > I always felt that the very powerful (format) lacks a counterpart
> > for input. Like in C there is scanf() which in some sense reverses
> > the effect of printf().
>

> Perhaps this is a naive question, but how often do you really use
> scanf? Is it really useful to you?

Less often than xprintf(), but for good reasons. I have to deal with
messages formatted by other software and sscanf() is extremely handy
to chop these into the right pieces and dig out the values at the same
time.

I also define simple config file formats with a certain flexibility
which I achieve without massive parsers just by trying to sscanf() a
line in various ways. Without much hassle you get the information:
Is this the line you're expecting, what are the values?

> Or perhaps you are fitting simpler problems into scanf solutions:
> I found a few uses of scanf in our own code, but most of them are
> of the form:
>
> sscanf( argv[1], "%d", &n );
>
> which is much more simply and efficiently written as a call to atoi()
> or atol().

Only if you can live with the uncertainty of whether what you handed
over to atoi() was a number at all. I usually prefer something like

if (1 != sscanf (line, "%d %n", &value, &length) ||
    length != strlen (line))
  /* that was no number */;

Simple enough, as it also catches cases where I expect a number of
numbers or strings with certain delimiters. That's just great.
(Don't try this with Borland or Microsoft C, their scanf() is broken.)
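
For comparison, a rough Common Lisp counterpart of that "is the whole line a number?" check can be built on the standard PARSE-INTEGER, which without :JUNK-ALLOWED signals a PARSE-ERROR unless the string is an integer optionally surrounded by whitespace. The wrapper name `whole-line-integer` is an assumption for this sketch.

```lisp
;; Sketch only: accept LINE only if it is an integer and nothing else.
(defun whole-line-integer (line)
  "Return the integer in LINE, or NIL if LINE is not just an integer
optionally surrounded by whitespace."
  (handler-case (parse-integer line)
    (parse-error () nil)))

(whole-line-integer " 42 ")    ; surrounding whitespace is allowed
(whole-line-integer "42x")     ; trailing junk is rejected
```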

> Perhaps read-from-string, followed by a type test, is a simpler
> way to solve some of the higher level problems you would like to
> solve with a (setf format) function.

Sounds a little like, hey there's no need for sscanf(), you can
achieve all that with a little char* hackery. You can, but it's not
convenient. In the case of Lisp it is also very inefficient.

(To support that efficiency argument: I sed/awk-ed my input file into
nice Lisp expressions. That took a second. Then I handed it to (read),
that took 20 seconds. Besides having to resort to such a technique for
a simple data input problem is not exactly what I expect from a general
purpose high level language, the performance of (read) when applied to
lots of data was -- at least with my system -- very poor.)

All people pointing to (read) are missing the point. The most grotesque
idea I heard was hacking read macros to do some parsing on input. This
is probably very exciting, but I'd still prefer if I could achieve
the same with a simple and effective tool which works similarly to
(format).

Dirk Zoller

Oct 10, 2000, 3:00:00 AM

Erik Naggum wrote:

> You responded "why so ambitious" to someone's argument that you
> can't do it all. If you have the beginnings, where is it going, if
> you don't want others to be so ambitious?

Negative. I meant it is pointless (over-ambitious) to do it in that
(setf()) disguise, at least at first. I'd care for the functionality
first. This can be perfectly presented using a list as return value or
multiple values.
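
One possible shape for such an interface, sketched with the standard PARSE-INTEGER. The function `scan-integers` is hypothetical and much narrower than a general scanf counterpart; it is not the code from the attached scan.lisp.

```lisp
;; Sketch only: return parsed fields as multiple values.
(defun scan-integers (line)
  "Return the whitespace-separated integers at the front of LINE
as multiple values. Parsing stops at the first non-integer."
  (let ((pos 0)
        (results '()))
    (loop
      (multiple-value-bind (n next)
          (parse-integer line :start pos :junk-allowed t)
        (when (null n)                  ; nothing more to parse
          (return))
        (push n results)
        (setf pos next)))               ; continue after the parsed digits
    (values-list (nreverse results))))

(multiple-value-bind (x y) (scan-integers "12 34 rest")
  (list x y))
```

The caller picks up as many values as it expects with MULTIPLE-VALUE-BIND, and missing ones simply come back as NIL.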

> I wasn't aware that C programmers were female¹, but C programmers

Personally, I wish more programmers were female.

The rest is blah blah, partly right, partly wrong, mostly off topic,
just the stuff to heat up the discussion a little.

Raymond Laning

Oct 10, 2000, 3:00:00 AM

the people responding to your postings evidently never had to deal with
integrating legacy (e.g. paleolithic) systems. I had to write a
formatted-read function to read output from fortran programs that could
not be maintained because the people that wrote them were no longer
employed (or in some cases, living). I am sorry that the sourcecode for
my function is no longer available to me else I would pass it along, but
IIRC it had many similarities to yours

Dirk Zoller wrote:
>
<snip>

The Glauber

Oct 10, 2000, 3:00:00 AM

In article <39E2BB4E...@onlinehome.de>,

d...@onlinehome.de, d...@sol-3.de wrote:
> Hello,
>
> I always felt that the very powerful (format) lacks a counterpart
> for input. Like in C there is scanf() which in some sense reverses
> the effect of printf().
>
> I then tried to write such a thing. It seems possible. Please find
> attached a first attempt to do it.

[...]

Dirk,

this file doesn't load. According to CLISP:
CL-USER[7]> (load "scan.lsp")

;; Loading file scan.lsp ... *** - READ: input stream #<BUFFERED FILE-STREAM
CHARACTER #P"scan.lsp" @262> ends within an object. Last opening parenthesis
probably in line 200.

Dirk Zoller

Oct 10, 2000, 3:00:00 AM

The Glauber wrote:

> this file doesn't load. According to CLISP:
> CL-USER[7]> (load "scan.lsp")
>
> ;; Loading file scan.lsp ... *** - READ: input stream #<BUFFERED FILE-STREAM
> CHARACTER #P"scan.lsp" @262> ends within an object. Last opening parenthesis
> probably in line 200.

Yes I accidentally posted some old garbage file.

Please find attached a version I just checked to be working with clisp.

Anyway I must warn you that this file is in the midst of being reworked
from scanning strings to scanning streams. Most functions still scan strings
and are not actually being called. Only a simple integer conversion has
already transformed into the stream-scanning style and is called in place
of all other conversions.

Given that, the idea comes across anyway. I'm not sure what is better (scanning
strings vs. scanning streams). Maybe that will be a performance/versatility
tradeoff. Strings can be treated as streams but not vice versa. OTOH scanning
strings might perform better. You see, I'm experimenting.

I'll be happy to send this test file "daxa.asc" by mail. It's to big to post.

Regards

scan.lisp

Tim Bradshaw

Oct 10, 2000, 3:00:00 AM

* Raymond Laning wrote:
> the people responding to your postings evidently never had to deal with
> integrating legacy (e.g. paleolithic) systems. I had to write a
> formatted-read function to read output from fortran programs that could
> not be maintained because the people that wrote them were no longer
> employed (or in some cases, living). I am sorry that the sourcecode for
> my function is no longer available to me else I would pass it along, but
> IIRC it had many similarities to yours

If I had to do that (and I have done related stuff), I'd tend to do
the grotty massaging in awk or (nowadays) perl or something rather
than spend a whole lot of time doing stuff in Lisp. This stems from
experiences trying to do similar things with scanf in C and eventually
giving up because it just ended up easier to do the conversion in a
dedicated string-bashing language: scanf is pretty fragile.

In fact if I had to do it again, I'd probably write something which
looked like a formatted-read function but was actually the data-massaging
utility with its stdout piped into a Lisp stream.

Now of course purists will hate me for being willing to use perl as
well as believing that READ isn't always the answer. Oh well.

--tim

Duane Rettig

Oct 10, 2000, 3:00:00 AM

Raymond Laning <rcla...@west.raytheon.com> writes:

> the people responding to your postings evidently never had to deal with
> integrating legacy (e.g. paleolithic) systems.

Why did you assume that? What evidence do you give?

> I had to write a
> formatted-read function to read output from fortran programs that could
> not be maintained because the people that wrote them were no longer
> employed (or in some cases, living). I am sorry that the sourcecode for
> my function is no longer available to me else I would pass it along, but
> IIRC it had many similarities to yours

A couple of questions for you:

1. Did you ever have personal control over such sources? Or was
it owned by the organization you wrote it for? If it was owned
by others, did they have a policy of non-propagation of such
innovations to the outside world?

2. If it was at all under your control to promulgate the sources,
did you consider the functionality to be of general-purpose
use? Whether true or not, did you seek outside help to further
generalize it?

3. If the code was fully general, did you try to pass this code along
as a potential enhancement to the Common Lisp spec?

4. If you get to this question without rejecting the other three,
why did you then not pass it along while you had control of the
code?

These questions are rhetorical; I do not want the answers to them.
I am simply revealing my thought process for asking questions
of Mr. Zoller, who is in the very beginning stages of a similar
process.

Duane Rettig

Oct 10, 2000, 3:00:00 AM

Dirk Zoller <d...@sol-3.de> writes:

> Duane Rettig wrote:
> > > I always felt that the very powerful (format) lacks a counterpart
> > > for input. Like in C there is scanf() which in some sense reverses
> > > the effect of printf().
> >

> > Perhaps this is a naive question, but how often do you really use
> > scanf? Is it really useful to you?
>
> Less often than xprintf(), but for good reasons. I have to deal with
> messages formatted by other software and sscanf() is extremely handy
> to chop these into the right pieces and dig out the values at the same
> time.
>
> I also define simple config file formats with a certain flexibility
> which I achieve without massive parsers just by trying to sscanf() a
> line in various ways. Without much hassle you get the information:
> Is this the line you're expecting, what are the values?

Understood. It seems, though, like you are not talking about the
inverse of format, but instead the inverse of scanf, in lisp. And
while it might be possible to provide a tool that does both, they
are really worlds apart (or should I say: languages apart?)
If this description is allegorical, or if you are looking for a tool
that parses both lisp output and C output, then you should state
what you want to accomplish specifically, otherwise I am taking
your example to be literal.

> > Or perhaps you are fitting simpler problems into scanf solutions:
> > I found a few uses of scanf in our own code, but most of them are
> > of the form:
> >
> > sscanf( argv[1], "%d", &n );
> >
> > which is much more simply and efficiently written as a call to atoi()
> > or atol().
>
> Only if you can live with the uncertainty whether what you handed over
> to atoi() was a number at all. I usually prefer something like
>
> if (1 != sscanf (line, "%d %n", &value, &length) ||
> length != strlen (line))
> that was no number;
>
> Simple enough, as it also catches cases where I expect a number of
> numbers or strings with certain delimiters. That's just great.
> (Don't try this with Borland or Microsoft C, their scanf() is broken.)

This is not failsafe because C does not type its data; whether it is
a number or a string or a struct, "it's all bits".

> > Perhaps read-from-string, followed by a type test, is a simpler
> > way to solve some of the higher level problems you would like to
> > solve with a (setf format) function.
>
> Sounds a little like, hey there's no need for sscanf(), you can
> achieve all that with a little char* hackery. You can, but it's not
> convenient. In the case of Lisp it is also very inefficient.
>
> (To support that efficiency argument: I sed/awk-ed my input file into
> nice Lisp expressions. That took a second. Then I handed it to (read),
> that took 20 seconds. Besides having to resort to such a technique for
> a simple data input problem is not exactly what I expect from a general
> purpose high level language, the performance of (read) when applied to
> lots of data was -- at least with my system -- very poor.)

This doesn't really support an efficiency argument, because it is purely
anecdotal. Please provide real data and/or code with which to reproduce
this.

> All people pointing to (read) are missing the point. The most grotesque
> idea I heard was hacking read macros to do some parsing on input. This
> is probably very exciting, but I'd still prefer if I could achieve
> the same with a simple and effective tool which works similarly to
> (format).

It is not always correct to use read. If you are trying to parse C data,
Lisp's reader is not the one to use.

If, on the other hand, you are talking about inter-language communication
in general, then that is a much larger issue.

Erik Naggum

Oct 10, 2000, 3:00:00 AM

* Marco Antoniotti <mar...@cs.nyu.edu>

| Yes. And in the most general case you must resolve to writing a
| complex parser (which may rely on scanf) to handle the quirkiness of
| the format. That is why XML is a step in the right direction.

Bzzzt. Just Plain Wrong. XML does _exactly_ nothing to help this.
It doesn't even _enable_ something that helps the situation. XML is
just syntax for naming elements in a structure. That structure has
a view, according to the granularity at which you want to process it.

In Common Lisp, the Lisp reader increases the granularity to the
object level. This is very good. This is in fact brilliant. XML
does no such thing. XML only names the _strings_ that somehow make
up the objects, the operative word being "somehow".

How do you write a date in XML? I favor <date>2000-10-10</date>.
Others <date><year>2000</year><month>10</month><day>10</day></date>,
and yet others prefer to omit the century (nothing learned from
Y2K), write the date in human-friendly forms, or even using the
names of the months, abbreviated, in local languages, misspelled.

You can teach the Common Lisp reader to accept @2000-10-10 as a date
object, and I have. It works like a charm. What does XML offer
above and beyond specific object notations? And speaking of those
"objects", XML is supposedly object-oriented, but in reality, it's
_only_ character-oriented, as in: no object in sight, as in: those
who get the named strings (counting both elements and attributes)
from XML need to parse them for their own contents -- because, and
this surprises the XML people when the limitation of their bogus
approach dawns on them, real data is _not_ made up of strings.
Strings constitute _representations_ of the data, which must be
parsed, checked for consistency and used to create objects, and if
this sounds like we're back at square 1, that's exactly the case.
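
No code for the @2000-10-10 notation is shown in this thread. A minimal sketch of how such a reader extension could look, assuming a hypothetical SIMPLE-DATE structure and a private readtable (both invented here, not taken from the post), is:

```lisp
;; Hypothetical date object; the actual representation is not shown in the thread.
(defstruct simple-date year month day)

(defun read-simple-date (stream char)
  "Reader-macro function: parse the ten characters YYYY-MM-DD that follow
CHAR and return a SIMPLE-DATE."
  (declare (ignore char))
  (let ((text (make-string 10)))
    ;; READ-SEQUENCE returns the index of the first element not filled in.
    (unless (= (read-sequence text stream) 10)
      (error "Truncated date after @"))
    (unless (and (char= (char text 4) #\-) (char= (char text 7) #\-))
      (error "Malformed date: ~S" text))
    (make-simple-date :year  (parse-integer text :start 0 :end 4)
                      :month (parse-integer text :start 5 :end 7)
                      :day   (parse-integer text :start 8 :end 10))))

;; Install the macro character in a copy of the standard readtable, so the
;; global reader is left untouched.
(defparameter *date-readtable*
  (let ((readtable (copy-readtable nil)))
    (set-macro-character #\@ #'read-simple-date nil readtable)
    readtable))

(defun read-with-dates (string)
  "Read one object from STRING using *DATE-READTABLE*, with #. disabled."
  (let ((*readtable* *date-readtable*)
        (*read-eval* nil))
    (read-from-string string)))

;; Usage: the reader hands back an object, not a string.
(simple-date-year (read-with-dates "@2000-10-10"))   ; => 2000
```

The sketch does not check month or day ranges. It only shows the reader producing an object rather than a string that still has to be parsed.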

XML is a giant step in no direction at all. If a syntax doesn't
produce objects that may be manipulated as such, it's worthless.
In the case of XML, it's a step in the right direction for all the
hopeless twits who otherwise wouldn't have a job in the booming IT
industry, for all those H1B visa applicants who would never have a
chance to get out of their rotten, backward countries, etc, but as
far as the information is concerned, our ability to read and write
data consistently and portably, XML offers us exactly _nothing_, but
carries huge expenses and causes investments and time to be diverted
from every smarter solution, which could be a competitor... which is
why such fine information custodians as Microsoft are adopting and
embracing it.

This is not to say that XML can't be used productively, but it isn't
_XML_ that's doing it when it's done, it's the semantics you add to
the syntax that does it, the objects that wind up in memory in some
computer somewhere and which there exists code to manipulate. You
can do that better, cheaper, faster, and even better standardized
without XML than with XML. XML is a truly _magnificent_ waste.

#:Erik

Erik Naggum

Oct 10, 2000, 3:00:00 AM

* Dirk Zoller <d...@sol-3.de>

| The rest is blah blah, partly right, partly wrong, mostly off topic,
| just the stuff to heat up the discussion a little.

Why are you so easily manipulated? Why are you telling everyone?

Where's the purpose to your communication that enables others to see
where you're going and to share your journey with you? Responses
like the above are clear indicators that you don't have any purpose
of your own and get side-tracked by any contrary information. This
also explains why you think the way you do about how to use Lisp and
why you are reinventing an inferior non-solution to a non-problem.
Think carefully about what you want to accomplish, do not focus on
the means of accomplishing it until you are ready. Above all, don't
get all self-defensive because you weren't ready in time -- just
think about it more and _become_ ready at some later time. It's OK
to make mistakes if they are recognized as such, but not OK if you
defend them as if they weren't.

This was intended to return the discussion to a normal temperature
despite your stupid desire to heat it up for your own entertainment.
If you feel heated up, answer the first two questions honestly, and
realize that you have exposed yourself _way_ too much already.

Erik Naggum

Oct 10, 2000, 3:00:00 AM

* Raymond Laning <rcla...@west.raytheon.com>

| the people responding to your postings evidently never had to deal
| with integrating legacy (e.g. paleolithic) systems.

Or that's just what they had, but they did it the right way.

| I had to write a formatted-read function to read output from fortran
| programs that could not be maintained because the people that wrote
| them were no longer employed (or in some cases, living). I am sorry
| that the sourcecode for my function is no longer available to me
| else I would pass it along, but IIRC it had many similarities to
| yours

Then there's no wonder you, too, feel that legacy systems are
painful and that the right solution lies in simple-minded but overly
powerful tools like regular expressions and simple-minded parsers.

Actually _understanding_ a legacy data format is not easy, as most
of the people who write their own data formats are incredibly stupid
and short-sighted (as in writing years with two digits), and you're
trying to use all your brainpower to be as dumb as someone who
didn't have a clue that someday someone would have to think like
they did, because they didn't think at all. Clearly, a regular
expression or something like "scanf" can't hack this -- both are
rife with the same kind of short-sightedness that produce such
random results. Hoping for a match between the outcomes of two
random processes is just insane.

Writing an input processor ("reader") for some foreign language or
data format is not something you do by reversing "format". Hell,
you don't _use_ format to produce syntactically correct output in
other syntaxes, either. format is meant for _human_ consumption.

Some day, programmers will understand that there are three ways to
represent information: computer-to-human, human-to-computer, and
computer-to-computer; they have exactly _nothing_ in common which
you can use to deal with another when you have dealt with one.
Tools that seem to work most of the time (perl), or that promise
something they cannot possibly deliver (XML), will only delay it.

Duane Rettig

Oct 10, 2000, 3:00:00 AM

Dirk Zoller <d...@onlinehome.de> writes:

> Duane Rettig wrote:
>
> > Understood. It seems, though, like you are not talking about the
> > inverse of format, but instead the inverse of scanf, in lisp.
>

> and more evidence of confusion, like

>
> > This is not failsafe because C does not type its data; whether it is
> > a number or a string or a struct, "it's all bits".
>

> Sigh!!
>
>
> Please folks, stop this ignorant ranting against other languages.
> That's the constant habit in all *losing* languages groups.

I now see my mistake. It was in taking the question you posed
originally:

>> Why isn't such a nice function already defined in Common Lisp?

at face value, and trying to lead you to some reasonable set of
possible answers. Instead, it is clear that it wasn't an honest
question at all; it was nothing but a troll.

I apologize to you and to the rest of this newsgroup for wasting
everyone's time.

Johannes Beck

Oct 10, 2000, 3:00:00 AM

Duane Rettig wrote:

>
> Dirk Zoller <d...@onlinehome.de> writes:
>
> > Hello,
> >
> > I always felt that the very powerful (format) lacks a counterpart
> > for input. Like in C there is scanf() which in some sense reverses
> > the effect of printf().
>

> Perhaps this is a naive question, but how often do you really use
> scanf? Is it really useful to you?
>

> Or perhaps you are fitting simpler problems into scanf solutions:
> I found a few uses of scanf in our own code, but most of them are
> of the form:
>
> sscanf( argv[1], "%d", &n );
>
> which is much more simply and efficiently written as a call to atoi()
> or atol().
>

> Perhaps read-from-string, followed by a type test, is a simpler
> way to solve some of the higher level problems you would like to
> solve with a (setf format) function.

Over the years I have come to the point of absolutely avoiding
read-from-string to parse string data from the outside world (e.g. user
interface, text files, sockets). You always forget to put an
error handler around it or to adjust the reader so that it is safe to
use read-from-string. And afterwards you have to check whether the type of
the result is what you expected.

if there's the chance that some wrong input will cause an error in your
program someone will make this wrong input (imagine how strings like
"(foo" "#.bla" are treated by read-from-string).

So there's definitely a need for good and stable string-parsing
functions besides the reader.
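
A minimal sketch of what that can look like with standard functions only. The names PARSE-INTEGER-FIELD and SAFE-READ-NUMBER are made up for this illustration. The first avoids the reader entirely, the second shows the precautions needed if the reader is used anyway:

```lisp
;; Hypothetical helper: accept STRING only if it is an integer from start to end.
(defun parse-integer-field (string)
  "Return the integer denoted by STRING, or NIL if STRING is not entirely
an integer."
  (multiple-value-bind (value end)
      (parse-integer string :junk-allowed t)
    ;; VALUE is NIL when no integer was found; END is where parsing stopped.
    (and value (= end (length string)) value)))

;; Hypothetical helper: use the reader, but with #. disabled, an error
;; handler around it, and a type test on the result.
(defun safe-read-number (string)
  "Return the number read from STRING, or NIL on any reader error or if
the object read is not a number."
  (handler-case
      (let* ((*read-eval* nil)            ; refuse #. evaluation
             (object (read-from-string string)))
        ;; Note: reading may still intern symbols as a side effect.
        (and (numberp object) object))
    (error () nil)))

;; Usage with the inputs mentioned above:
(parse-integer-field "42")      ; => 42
(parse-integer-field "(foo")    ; => NIL
(safe-read-number "#.bla")      ; => NIL
```

The PARSE-INTEGER variant never interns symbols or runs reader macros, which is why it is the safer default for untrusted input.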

Bye
Joe

--
Johannes Beck http://home.arcor-online.de/johannes.beck/

Dirk Zoller

Oct 10, 2000, 8:39:44 PM

Duane Rettig wrote:

> Understood. It seems, though, like you are not talking about the
> inverse of format, but instead the inverse of scanf, in lisp.

and more evidence of confusion, like

> This is not failsafe because C does not type its data; whether it is
> a number or a string or a struct, "it's all bits".

Sigh!!

Please folks, stop this ignorant ranting against other languages.
That's the constant habit in all *losing* languages groups.

> This doesn't really support an efficiency argument, because it is purely
> anecdotal. Please provide real data and/or code with which to reproduce
> this.

Correct. My "anecdote" didn't technically support an argument.
I phrased that badly.

But this experiment I did was good enough to convince me that (read) is
not the answer, not even with massive aid from evil alien tools which
were written in -- shudder -- C.

> It is not always correct to use read. If you are trying to parse C data,
> Lisp's reader is not the one to use.

There might be "Lisp-data" in the same sense as there are "Dbase data" or
"MS-Word 2000 data".

But there is no such thing as "C-data" out there.

And at first glance, Lisp does not seem to be up to this situation.
Leave it like that. Continue to take pride in either your skills to
work around this huge gap or your ignorance of not seeing it.
But don't complain about people preferring other tools.

Dirk Zoller

Oct 10, 2000, 9:33:14 PM

Sorry list, I should have known better.
Maybe it has some entertaining value.

Erik Naggum wrote:
> * Dirk Zoller <d...@sol-3.de>
> | The rest is blah blah, partly right, partly wrong, mostly off topic,
> | just the stuff to heat up the discussion a little.
>
> Why are you so easily manipulated? Why are you telling everyone?

Well, your comments tend in all sorts of directions which seemed useless
with respect to what I was trying to say.

You also make cheap points insofar as you simply claim that when the world is
not perfect, it just has to be perfect (compilers just must be compiling
well, data formats just have to be well designed, etc.). Of course it's
you who decides what is perfect. A never ending source of joy for you.

This is not helpful to me and yes, this made your posting off topic blah blah.

> Where's the purpose to your communication that enables others to see
> where you're going and to share your journey with you?

Parse error in sentence.

> Responses
> like the above are clear indicators that you don't have any purpose
> of your own and get side-tracked by any contrary information.

I would not go so far as to call your postings information.
In fact I'm trying to get not -- too -- distracted.
You're doing a good job at distracting. Years of practice I guess.

> This
> also explains why you think the way you do about how to use Lisp and
> why you are reinventing an inferior non-solution to a non-problem.

See above, your trick is to aggressively deny anything not matching your
notion of perfection of the day.

Doing so at maximum length, implicitly pretending you could do better,
probably never showing a constructive solution. Yawn.

> Think carefully about what you want to accomplish, do not focus on
> the means of accomplishing it until you are ready. Above all, don't
> get all self-defensive because you weren't ready in time -- just
> think about it more and _become_ ready at some later time. It's OK
> to make mistakes if they are recognized as such, but not OK if you
> defend them as if they weren't.

You're so kind master. I really buy that you mean that.

> This was intended to return the discussion to a normal temperature

Here I'd been your friend again :-)

> despite your stupid desire to heat it up for your own entertainment.

But that spoiled it :-(

> If you feel heated up, answer the first two questions honestly, and

I feel annoyed. Bugged.

> realize that you have exposed yourself _way_ too much already.

Oh really? Damn!

Erik Naggum

Oct 10, 2000, 10:21:29 PM

* Dirk Zoller <d...@onlinehome.de>

| Sorry list, I should have known better.

Yes, you should. I'll repeat the questions you should ask yourself
and answer honestly without posting any more personally revealing
nonsense.

Why are you so easily manipulated? Why are you telling everyone?

You have chosen to take on the role of a village idiot on tour,
asking everyone to yank your chain in order to entertain them. It
isn't entertaining. It's very stupid and annoying to watch to boot.

Go reimplement scanf in XML, now.

The Glauber

Oct 11, 2000, 3:00:00 AM

In article <39E388DE...@onlinehome.de>,
d...@onlinehome.de, d...@sol-3.de wrote:
[...]

> Yes I accidentally posted some old garbage file.
>
> Please find attached a version I just checked to be working with clisp.

[...]

Thanks, i'll give it a try.

Incidentally, there's a library of Scheme functions called SLIB
(http://swissnet.ai.mit.edu/~jaffer/SLIB.html)
which includes scanf, and even (horror!) printf.
Someone with a lot of free time might try to translate that to Lisp.

Marco Antoniotti

Oct 11, 2000, 3:00:00 AM

Erik Naggum <er...@naggum.net> writes:

> * Marco Antoniotti <mar...@cs.nyu.edu>
> | Yes. And in the most general case you must resolve to writing a
> | complex parser (which may rely on scanf) to handle the quirkiness of
> | the format. That is why XML is a step in the right direction.
>
> Bzzzt. Just Plain Wrong. XML does _exactly_ nothing to help this.
> It doesn't even _enable_ something that helps the situation. XML is
> just syntax for naming elements in a structure. That structure has
> a view, according to the granularity at which you want to process
> it.

Come on! I did not say that XML is an absolutely good thing. I was
merely implying that XML is a reinforcement of the Fundamental Law of
Programming Languages

\lim_{y \rightarrow \mathrm{today} + \epsilon} PL_{y}
= \mathrm{Common Lisp}_{1989}
+ \mathrm{Type Inference}

(where $y$ is the year and $PL_y$ is any programming language other
than Common Lisp as it is used in year $y$). (I know the TeX may be
wrong! :) )

XML *almost* serves as S-exprs for the rest of the world (namely the
C/C++, Perl and Java world). The pragmatics of this fact are IMHO
very important. Hype has its importance.

As for the rest of your message, it is right on the money.

Marco Antoniotti

Oct 11, 2000, 3:00:00 AM

The Glauber <thegl...@my-deja.com> writes:

> In article <39E388DE...@onlinehome.de>,
> d...@onlinehome.de, d...@sol-3.de wrote:
> [...]

> > Yes I accidentally posted some old garbage file.
> >
> > Please find attached a version I just checked to be working with clisp.

> [...]
>
> Thanks, i'll give it a try.
>
> Incidentally, there's a library of Scheme functions called SLIB
> (http://swissnet.ai.mit.edu/~jaffer/SLIB.html)
> which includes scanf, and even (horror!) printf.
> Someone with a lot of free time might try to translate that to Lisp.

Why? :)

Rainer Joswig

Oct 11, 2000, 3:00:00 AM

In article <39E3714B...@arcormail.de>, Johannes Beck
<johann...@arcormail.de> wrote:

> So there's definitely a need for a good and stable string parsing
> functions besides the reader.

There are simple options like the META parser from
Henry Baker.

In a more interesting world we all might use CLIM:

http://www.xanalys.com/software_tools/reference/lwu41/climuser/GUID_105.HTM
http://www.xanalys.com/software_tools/reference/lwu41/climuser/GUID_106.HTM

Short explanation:

There are basic presentation types like INTEGER or STRING.
Presentation types can take PARAMETERS (like min-value and max-value) and OPTIONS
(like the base of the integer to be read).
There are more complex presentation types like SEQUENCE, OR, AND, ...
Presentation types form a hierarchy (like in CLOS).
One can ACCEPT and PRESENT objects. Special methods can
define own parsers/presenters for certain presentation types.
ACCEPTING from strings is possible via ACCEPT-FROM-STRING:

CLIM:ACCEPT-FROM-STRING type string
&key view default default-type activation-gestures
additional-activation-gestures delimiter-gestures
additional-delimiter-gestures start end

Like ACCEPT, except that the input is taken from string, starting at
the position specified by start and ending at end. view, default, and
default-type are as for accept.

ACCEPT-FROM-STRING returns an object and a presentation type (as in ACCEPT),
but also returns a third value, the index at which input terminated.

Parsing an integer between 0 and 1000 with base 16:

? (clim:accept-from-string '((clim:integer 0 1000) :base 16)
"FF")
255
((INTEGER 0 1000) :BASE 16)
2

One of FOO or BAR:

? (clim:accept-from-string '(clim:member-sequence (foo bar))
"foo")
FOO
((CLIM:COMPLETION (FOO BAR) :TEST EQL) :HIGHLIGHTER #<Compiled-function CLIM-INTERNALS::HIGHLIGHT-COMPLETION-CHOICE #xED72B6E> :PRINTER #<Compiled-function CLIM:WRITE-TOKEN #xEC47BAE>)
3

Some of FOO, BAR and BAZ.

? (clim:accept-from-string '(clim:subset-sequence (foo bar baz))
"foo,bar")
(FOO BAR)
((CLIM:SUBSET-COMPLETION (FOO BAR BAZ) :TEST EQL) :HIGHLIGHTER #<Compiled-function CLIM-INTERNALS::HIGHLIGHT-COMPLETION-CHOICE #xED72B6E> :PRINTER #<Compiled-function CLIM:WRITE-TOKEN #xEC47BAE>)
7

Yes or No:

? (clim:accept-from-string 'clim:boolean "Yes")
T
CLIM:BOOLEAN
3

A keyword:

? (clim:accept-from-string 'clim:keyword ":foo")
:FOO
KEYWORD
4

A float between 0.3 and 0.5:

? (clim:accept-from-string '(clim:float 0.3 0.5) "0.4")
0.4
(FLOAT 0.3 0.5)
3

A sequence of floats separated by comma:

? (clim:accept-from-string '(clim:sequence clim:float) "0.4,0.3,0.4")
(0.4 0.3 0.4)
(SEQUENCE FLOAT)
11

A boolean, a string and an integer separated by comma:

? (clim:accept-from-string
'(clim:sequence-enumerated clim:boolean clim:string (clim:integer 0 100))
"Yes,Rainer,36")
(T "Rainer" 36)
(CLIM:SEQUENCE-ENUMERATED CLIM:BOOLEAN STRING (INTEGER 0 100))
13

A boolean or an integer:

? (clim:accept-from-string
'(clim:or clim:boolean (clim:integer 0 100))
"Yes")
T
(OR CLIM:BOOLEAN (INTEGER 0 100))
3

A boolean, a string and an integer separated by space:

? (clim:accept-from-string
'((clim:sequence-enumerated clim:boolean clim:string (clim:integer 0 100))
:separator #\space)
"Yes Rainer 36")
(T "Rainer" 36)
((CLIM:SEQUENCE-ENUMERATED CLIM:BOOLEAN STRING (INTEGER 0 100)) :SEPARATOR #\Space)
13

Make it shorter by defining an abbreviation MY-RECORD-ROW:

? (clim:define-presentation-type-abbreviation my-record-row ()
'((clim:sequence-enumerated clim:boolean clim:string (clim:integer 0 100))
:separator #\space))
MY-RECORD-ROW

? (clim:accept-from-string 'my-record-row "Yes Rainer 36")
(T "Rainer" 36)
((CLIM:SEQUENCE-ENUMERATED CLIM:BOOLEAN STRING (INTEGER 0 100)) :SEPARATOR #\Space)
13

It is especially interesting, because you can define new presentation types, which
inherit from others (like "age" could inherit from integer with a specified range).
Sure you can define presentations based on CLOS classes, etc...

Paolo Amoroso

Oct 12, 2000, 3:00:00 AM

On Wed, 11 Oct 2000 21:37:17 +0200, Rainer Joswig
<jos...@corporate-world.lisp.de> wrote:

> In a more interesting world we all might use CLIM:

Just a quick reminder that an effort is under way to develop a free
implementation of CLIM:

http://www.mikemac.com/mikemac/McCLIM/index.html

Mailing list:

http://www2.cons.org:8000/mailman/listinfo/free-clim

Mailing list archives:

http://www2.cons.org:8000/pipermail/free-clim/ (new)
http://www3.cons.org/maillists/free-clim (old)

Paolo
--
EncyCMUCLopedia * Extensive collection of CMU Common Lisp documentation
http://cvs2.cons.org:8000/cmucl/doc/EncyCMUCLopedia/

Raymond Laning

Oct 17, 2000, 3:00:00 AM

<snip>

> Why did you assume that? What evidence do you give?

<snip>

> 1. Did you ever have personal control over such sources? Or was
> it owned by the organization you wrote it for? If it was owned
> by others, did they have a policy of non-propagation of such
> innovations to the outside world?
>
> 2. If it was at all under your control to promulgate the sources,
> did you consider the functionality to be of general-purpose
> use? Whether true or not, did you seek outside help to further
> generalize it?
>
> 3. If the code was fully general, did you try to pass this code along
> as a potential enhancement to the Common Lisp spec?
>
> 4. If you get to this question without rejecting the other three,
> why did you then not pass it along while you had control of the
> code?
>
> These questions are rhetorical; I do not want the answers to them.
> I am simply revealing my thought process for asking questions
> of Mr. Zoller, who is in the very beginning stages of a similar
> process.
>

> --
> Duane Rettig Franz Inc. http://www.franz.com/ (www)
> 1995 University Ave Suite 275 Berkeley, CA 94704
> Phone: (510) 548-3600; FAX: (510) 548-8253 du...@Franz.COM (internet)

Although you did not wish answers for these questions, I will provide
them in case they might be useful to Mr. Zoller:

the code was developed for Wisdom Systems, which was a competitor to
ICAD. In order to deploy our system at one of our clients, it was
necessary to interface to some FORTRAN analysis code that was not
maintained. While rewriting the analysis code in the Concept Modeller
(an object-oriented, Lisp system) would have been the Right Thing, it
was not an option at the time. Wisdom Systems struggled to make a
profit on its software, so giving away such ancillary code, while
desirable, was not an option. In hindsight, I probably should have
pursued the actions you ask about, because the sources are now likely at
the bottom of the Charles River along with the rest of Wisdom Systems'
code, where it went after ICAD bought WS.

And yes, it was general enough to be useful, IMHO. I'm sure
ICAD/Concentra would gladly part with the code, since I doubt they still
support the Concept Modeller ;-)