ANSI CL proposal: format

Kent M Pitman

Feb 21, 1998

Sam Steingold <s...@usa.net> writes:

> 1. The existing user-defined format functionality of ~/function/ has a
> disadvantage of the requirement that function is in the package
> COMMON-LISP-USER.
>
> 2. ~/function/ is far longer than ~= and less readable (IMHO).

Suppose module FOO defined ~= to print an object as its boolean (e.g.,
3 or ALPHA or (C . D) as T, and NIL as NIL). Suppose module BAR
defined ~= to print a sequence as its length, e.g., "Foo" or (A B C)
as 3. Suppose I load them both. Won't the second one I load break
the first? CL users don't have define-format-char not because it's
hard to think of or to implement; it's because it's TOO valuable--no
portable program can assume it loads in a clean environment, nor that
it is the ONLY user.

~/xxx/ may not seem very useful, but ~/foo:xxx/ should, since it keeps modules
FOO and BAR from colliding.
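
As a rough illustration of the point above: the package names FOO and BAR come from the example, while the function name SHOW and its printing behavior are assumptions made up for this sketch. Each module defines its own directive function under the same name, and neither interferes with the other:

```lisp
;; Hypothetical modules FOO and BAR, each with its own SHOW function.
(defpackage "FOO" (:use "COMMON-LISP"))
(defpackage "BAR" (:use "COMMON-LISP"))

(in-package "FOO")
;; A ~/.../ function receives the stream, the argument, the colon and
;; at-sign flags, and any prefix parameters.
(defun show (stream arg colon-p at-sign-p &rest params)
  (declare (ignore colon-p at-sign-p params))
  (prin1 (if arg t nil) stream))       ; print the object as its boolean

(in-package "BAR")
(defun show (stream arg colon-p at-sign-p &rest params)
  (declare (ignore colon-p at-sign-p params))
  (prin1 (length arg) stream))         ; print a sequence as its length

(in-package "COMMON-LISP-USER")
;; Both directives coexist; load order does not matter.
(format nil "~/foo:show/ ~/bar:show/" '(a b c) '(a b c))   ; => "T 3"
```

The package prefix does the partitioning here, which a single global ~= character could not.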

Erik Naggum

Feb 22, 1998

* Kent M Pitman

| Suppose module FOO defined ~= to print an object as its boolean (e.g., 3
| or ALPHA or (C . D) as T, and NIL as NIL). Suppose module BAR defined ~=
| to print a sequence as its length, e.g., "Foo" or (A B C) as 3. Suppose
| I load them both. Won't the second one I load break the first? CL users
| don't have define-format-char not because it's hard to think of or to
| implement; it's because it's TOO valuable--no portable program can assume
| it loads in a clean environment, nor that it is the ONLY user.

hm. cannot this argument be repeated for reader macros and PPRINT and
most anything useful that uses global resources? would it not make more
sense to define some global object that controlled this, like *READTABLE*
or *PRINT-PPRINT-DISPATCH* that removed the possibilities for collisions
to the programmer's control?

#:Erik
--
God grant me serenity to accept the code I cannot change,
courage to change the code I can, and wisdom to know the difference.

Howard R. Stearns

Feb 23, 1998

(This is a spin-off thread from the original.)

Why are the package qualification rules within ~/ different than those
for symbols? In other words, why are "~/x:foo/" and "~/x::foo/" the
same when handled by the format string "reader" yet "x:foo" and "x::foo"
are different when handled by the ordinary symbol reader when FOO is not
exported from package X?

If it is an issue where we don't want to burden the implementors, would
it be wrong for an implementation to signal an error (at either run-time
or compile-time) for (format x "~/x:foo/" arg) if FOO is not exported
from package X?

----------------------
From "22.3.5.4 Tilde Slash: Call Function"

All of the characters in name are treated as if they were upper case. If
name contains a single colon (:) or double colon (::), then everything
up to but not including the first ":" or "::" is
taken to be a string that names a package. Everything after the first
":" or "::" (if any) is taken to be a string that
names a symbol. The function corresponding to a ~/name/ directive is
obtained by looking up the symbol that has the
indicated name in the indicated package. If name does not contain a ":"
or "::", then the whole name string is looked
up in the COMMON-LISP-USER package.
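
The rule quoted above can be sketched as a small parsing function. This is only an illustration of the text, not how any particular implementation does it; the name PARSE-TILDE-SLASH-NAME is made up for the example:

```lisp
;; Hypothetical helper: split a ~/name/ directive name into a package
;; name and a symbol name, following the quoted passage.
(defun parse-tilde-slash-name (name)
  "Return two values: the package name and the symbol name, both strings."
  (let* ((upper (string-upcase name))          ; treated as upper case
         (colon (position #\: upper)))         ; first colon, if any
    (if (null colon)
        ;; No qualifier: the whole name is looked up in COMMON-LISP-USER.
        (values "COMMON-LISP-USER" upper)
        ;; Skip one colon, or two if the first is followed by another.
        (let ((start (if (and (< (1+ colon) (length upper))
                              (char= (char upper (1+ colon)) #\:))
                         (+ colon 2)
                         (1+ colon))))
          (values (subseq upper 0 colon) (subseq upper start))))))

(parse-tilde-slash-name "x:foo")    ; => "X", "FOO"
(parse-tilde-slash-name "x::foo")   ; => "X", "FOO"
(parse-tilde-slash-name "foo")      ; => "COMMON-LISP-USER", "FOO"
```

Note that nothing in this procedure consults whether the symbol is exported, which is the disparity being asked about.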


Barry Margolin

Feb 23, 1998

In article <34F1BE...@elwood.com>,
Howard R. Stearns <how...@elwood.com> wrote:
>Why are the package qualification rules within ~/ different than those
>for symbols? In other words, why are "~/x:foo/" and "~/x::foo/" the
>same when handled by the format string "reader" yet "x:foo" and "x::foo"
>are different when handled by the ordinary symbol reader when FOO is not
>exported from package X?

Because format strings are processed at run time, not read time, and FORMAT
has no way of knowing what the state of the package system was when the
source was read in. Someone working on a source file in the FOO package
should be able to refer to his internal symbols using FOO:BAR, rather than
FOO::BAR as if he were referencing them from another package.
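
A small sketch of the contrast being discussed, assuming a hypothetical package X that exports nothing and a made-up directive function FOO. FORMAT accepts either spelling, while the reader is expected to reject the single-colon form for an internal symbol:

```lisp
(defpackage "X" (:use "COMMON-LISP"))   ; hypothetical package; exports nothing

(in-package "X")
(defun foo (stream arg colon-p at-sign-p &rest params)
  (declare (ignore colon-p at-sign-p params))
  (format stream "<~A>" arg))

(in-package "COMMON-LISP-USER")
;; FORMAT resolves the name at run time and ignores export status.
(format nil "~/x:foo/" 42)    ; => "<42>"
(format nil "~/x::foo/" 42)   ; => "<42>"

;; The reader, by contrast, is expected to reject X:FOO because FOO is
;; internal to X. The exact condition type varies by implementation,
;; so this handler catches any ERROR.
(handler-case (read-from-string "x:foo")
  (error () :not-external))
```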

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Cambridge, MA
Support the anti-spam movement; see <http://www.cauce.org/>
Please don't send technical questions directly to me, post them to newsgroups.

Howard R. Stearns

Feb 24, 1998

Barry Margolin wrote:
>
> In article <34F1BE...@elwood.com>,
> Howard R. Stearns <how...@elwood.com> wrote:
> >Why are the package qualification rules within ~/ different than those
> >for symbols? In other words, why are "~/x:foo/" and "~/x::foo/" the
> >same when handled by the format string "reader" yet "x:foo" and "x::foo"
> >are different when handled by the ordinary symbol reader when FOO is not
> >exported from package X?
> >
> > If it is an issue where we don't want to burden the implementors, would
> > it be wrong for an implementation to signal an error (at either run-time
> > or compile-time) for (format x "~/x:foo/" arg) if FOO is not exported
> > from package X?
>
> Because format strings are processed at run time, not read time, and FORMAT
> has no way of knowing what the state of the package system was when the
> source was read in. Someone working on a source file in the FOO package
> should be able to refer to his internal symbols using FOO:BAR, rather than
> FOO::BAR as if he were referencing them from another package.
>

Thanks for your reply, Barry, but I guess I'm still confused. I don't
see WHY a user would feel he should be able to reference internal
symbols using foo:bar rather than foo::bar.

I understand that there is a difference between resolving a symbol
reference at read time vs. processing time for the format-string.
However, the things affected by this are:
1. What package is to be assumed when no qualifier is used.
2. The time at which an error is signalled if the symbol is not
accessible, etc.
The time at which the lookup is performed should NOT affect whether or
not the symbol is exported any more than it should affect whether the
symbol is available at all. (Let's not get into time invariants for
packages unless we really have to. My point is that if the package
system is modified, all bets are off anyway, not just the issue of
whether an accessible symbol is also exported.)

So, I'm still left with the questions:
1. Why the disparity between the semantics of ":" vs "::" in the two
situations, and
2. Would there be anything wrong with an implementation signalling the
same error at format string processing time as it would for symbol
lookup by the reader if the symbol is referenced with a single colon and
not, in fact, exported?
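
To make the second question concrete, here is a minimal sketch of the stricter check being asked about. It is not something the standard specifies for FORMAT; the function name RESOLVE-SINGLE-COLON and the package contents are assumptions for illustration. The second value of FIND-SYMBOL already tells a run-time lookup whether the symbol is external:

```lisp
;; Hypothetical package with one exported and one internal symbol.
(defpackage "X" (:use "COMMON-LISP") (:export "PUB") (:intern "PRIV"))

(defun resolve-single-colon (pkg name)
  "Look up NAME in PKG the way the reader treats PKG:NAME.
Signals an error unless the symbol exists and is external."
  (multiple-value-bind (symbol status) (find-symbol name pkg)
    (case status
      (:external symbol)
      ((nil) (error "No symbol named ~A in package ~A." name pkg))
      ;; :INTERNAL or :INHERITED: accessible, but not exported.
      (t (error "~A is not exported from package ~A." name pkg)))))

(resolve-single-colon "X" "PUB")          ; returns the symbol X:PUB
(handler-case (resolve-single-colon "X" "PRIV")
  (error (c) (princ-to-string c)))        ; returns the error message text
```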

Kent M Pitman

Feb 25, 1998

Erik Naggum <cle...@naggum.no> writes:

> * Kent M Pitman
> | Suppose module FOO defined ~= to print an object as its boolean (e.g., 3
> | or ALPHA or (C . D) as T, and NIL as NIL). Suppose module BAR defined ~=
> | to print a sequence as its length, e.g., "Foo" or (A B C) as 3. Suppose
> | I load them both. Won't the second one I load break the first? CL users
> | don't have define-format-char not because it's hard to think of or to
> | implement; it's because it's TOO valuable--no portable program can assume
> | it loads in a clean environment, nor that it is the ONLY user.
>
> hm. cannot this argument be repeated for reader macros

No, because LOAD binds *READTABLE* and so you can do [as I often do]:

(IN-PACKAGE "MINE")
(EVAL-WHEN (:COMPILE-TOPLEVEL :LOAD-TOPLEVEL :EXECUTE)
  (SETQ *READTABLE* (MY-SPECIALIZED-READTABLE)))

This is entirely safe and local if you don't modify the global readtable.
As to modifying the global readtable--you just have to provide equivalent
debugging functions that allow the user to select which one to be active
at any given time--you may not be able to mix syntaxes, but the interactive
user should be able to cope with that.
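
A minimal sketch of why this is safe, under the assumption of a made-up bracket syntax: the function names MAKE-BRACKET-READTABLE and READ-WITH-BRACKETS are hypothetical, and an explicit LET stands in for the binding that LOAD performs. The custom syntax lives in a copy, so the standard readtable is never modified:

```lisp
(defun make-bracket-readtable ()
  "Return a fresh readtable in which [a b c] reads as (LIST a b c)."
  (let ((rt (copy-readtable nil)))      ; NIL means copy the standard readtable
    (set-macro-character
     #\[
     (lambda (stream char)
       (declare (ignore char))
       (cons 'list (read-delimited-list #\] stream t)))
     nil rt)
    ;; Make ] terminate tokens the same way ) does.
    (set-macro-character #\] (get-macro-character #\) nil) nil rt)
    rt))

(defun read-with-brackets (string)
  ;; The binding is dynamic, so the new syntax is active only here.
  (let ((*readtable* (make-bracket-readtable)))
    (read-from-string string)))

(read-with-brackets "[1 2 3]")      ; => (LIST 1 2 3)
(read-from-string "[1")             ; outside the binding, [1 is just a symbol
```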

> and PPRINT

Modifications to PPRINT should, IMO, only be done for symbols in one's own
package. I agree that if you change the print syntax for the standard syntax
stuff, you're asking for trouble. (Even there, *PRINT-PPRINT-DISPATCH*
gives some sense of partitioning, but since operators are side-effecting
potentially shared structure there, I'm queasy in a way that I'm less so about
readtables. There are more ways to share a list than there are to share
a readtable...)

> and
> most anything useful that uses global resources?

Not if those global resources are partitioned in a way that is "closed over".

Symbol semantics is effectively closed-over in CL via the package system;
something not true in Scheme, for example, where there is no package system
and simply going through READ is not enough to attach a partition.

Readmacros, once executed, are "closed over" in the sense that later changing
the syntax doesn't undo a previous correct parse, so the fact that the readtable
is constantly changing is not a problem.

> would it not make more
> sense to define some global object that controlled this, like *READTABLE*
> or *PRINT-PPRINT-DISPATCH* that removed the possibilities for collisions
> to the programmer's control?

In principle, yes, but in practice, no. It happened on the Lisp
Machine and was an absolute and utter disaster (IMO)
compatibility-wise. ~nX in Zetalisp meant do-n-spaces and ~X in CL
means Hex (consume an arg). The problem is that format strings are a kind of
data. When you see doc for ERROR and it says "takes a format string",
you need to know "what kind?". You don't want to document for every function
which kind of format string it takes or it defeats a lot of the purpose.
BETTER would be to have a syntax that reifies the meaning of a format string
using the format-control hair, turning them into compiled format strings.
e.g., a #"..." that would produce a function which was the compiled format string.
Then the prevailing syntax at the time of READ (i.e., the caller's choice) would
dominate, rather than the prevailing syntax at the time of definition of the
operator being called (something that ought by right be a private issue).
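
A rough sketch of what such a #"..." syntax might look like, built on the existing FORMATTER macro. The function names here are hypothetical, and the dispatch character is installed in a copy of the standard readtable rather than globally:

```lisp
(defun sharp-double-quote (stream subchar arg)
  "Read the rest of a string and wrap it in a FORMATTER form."
  (declare (ignore arg))
  ;; Reuse the standard string reader to consume up to the closing quote.
  (let ((control (funcall (get-macro-character #\" nil) stream subchar)))
    (list 'formatter control)))

(defun make-format-syntax-readtable ()
  (let ((rt (copy-readtable nil)))
    (set-dispatch-macro-character #\# #\" #'sharp-double-quote rt)
    rt))

(defparameter *form*
  (let ((*readtable* (make-format-syntax-readtable)))
    (read-from-string "#\"~D item~:P\"")))
;; *FORM* is now (FORMATTER "~D item~:P")

;; FORMAT accepts the resulting function in place of a control string.
(format nil (eval *form*) 3)    ; => "3 items"
```

Because the string is turned into a function at read time, the syntax in force for the caller's source file is the one that decides what the directives mean.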

> #:Erik

Hey, what happened to that #\Erik character? I'd rather not have this
conversation again and again with a bunch of Erik clones.

Erik Naggum

Feb 27, 1998

* Kent M Pitman

| No, because LOAD binds *READTABLE* and so you can do [as I often do]:
|
| (IN-PACKAGE "MINE")
| (EVAL-WHEN (:COMPILE-TOPLEVEL :LOAD-TOPLEVEL :EXECUTE)
|   (SETQ *READTABLE* (MY-SPECIALIZED-READTABLE)))

point taken. incidentally, this seems like a super candidate for a
(standard) macro.

| Not if those global resources are partitioned in a way that is "closed
| over".

point taken.

| The problem is that format strings are a kind of data. When you see doc
| for ERROR and it says "takes a format string", you need to know "what
| kind?".

ok, let me back out a bit and suggest that the mechanisms available to
the FORMATTER macro should be available with a user-specified language.
that way, FORMAT and all its users and friends continue to work as they
used to do, but FORMATTER could take an optional second argument with a
table of some kind that specifies the behavior of each character, so the
remaining power of the FORMAT language could be utilized.

the reason people want to extend FORMAT is not to make global changes to
the formatter, but to be relieved of writing the entire FORMATTER engine
anew when they need something specific. (at least if I can extrapolate from
"me" to "people".) it's a royal pain to create a macro that hacks up a
string to build another string it passes to FORMATTER so it can be used
the same way FORMATTER can. I tried this once with a TIME-FORMATTER, but
it stranded before I got it general enough to even satisfy myself. I
wrote FORMAT-TIME-STRING for Emacs because I hate hard-coded time formats
that can't be made to format according to ISO 8601, and wanted something
with similar power and ease. (another problem with TIME-FORMATTER was
that it should be able to accept a universal-time OR a list of decoded
time values, but that's not FORMAT's fault.)
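
To give a feel for the kind of interface suggested above, here is a deliberately tiny sketch. It is not the proposed FORMATTER extension itself (which would reuse the full FORMAT machinery); MAKE-TABLE-FORMATTER, the table layout, and the ~Y/~M/~D directives are all assumptions invented for the example. It only shows that a function driven by a user-supplied table can be handed to FORMAT wherever a format control is accepted:

```lisp
(defun make-table-formatter (control table)
  "Return a function of (STREAM &REST ARGS) that interprets CONTROL.
Each tilde directive is looked up in TABLE, a hash table from characters
to functions of (STREAM ARG)."
  (lambda (stream &rest args)
    (let ((i 0)
          (n (length control)))
      (loop while (< i n)
            do (let ((ch (char control i)))
                 (cond ((and (char= ch #\~) (< (1+ i) n))
                        (let* ((key (char control (1+ i)))
                               (fn (gethash key table)))
                          (unless fn
                            (error "Unknown directive ~~~C." key))
                          (funcall fn stream (pop args))
                          (incf i 2)))
                       (t
                        (write-char ch stream)
                        (incf i))))))
    ;; Like a FORMATTER function, return the arguments left unused.
    args))

;; A hypothetical table for date fields.
(defparameter *date-table*
  (let ((table (make-hash-table)))
    (setf (gethash #\Y table) (lambda (s arg) (format s "~4,'0D" arg))
          (gethash #\M table) (lambda (s arg) (format s "~2,'0D" arg))
          (gethash #\D table) (lambda (s arg) (format s "~2,'0D" arg)))
    table))

(format nil (make-table-formatter "~Y-~M-~D" *date-table*) 1998 2 27)
;; => "1998-02-27"
```

The table is private to whoever builds the formatter, so existing callers of FORMAT see no change.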

(and while I'm at it, it would be nice to add LOOP constructs, too. :)

| Hey, what happened to that #\Erik character? I'd rather not have this
| conversation again and again with a bunch of Erik clones.

heh. I figured #\Erik was _way_ too constant (although EQ-ness was not
guaranteed). so now it's a fresh me with every article. this _may_ be
an exaggeration in the opposite direction, of course.

Kent M Pitman

Feb 27, 1998

Erik Naggum <cle...@naggum.no> writes:

> * Kent M Pitman
> | No, because LOAD binds *READTABLE* and so you can do [as I often do]:
> |
> | (IN-PACKAGE "MINE")
> | (EVAL-WHEN (:COMPILE-TOPLEVEL :LOAD-TOPLEVEL :EXECUTE)
> |   (SETQ *READTABLE* (MY-SPECIALIZED-READTABLE)))
>
> point taken. incidentally, this seems like a super candidate for a
> (standard) macro.

As the issue name IN-SYNTAX might suggest, I originally proposed it as
a macro and no one bought it. Desperate for the functionality (since
readmacros indeed made no sense to me without it), I ended up gutting
the proposal and turning it into a request for LOAD and COMPILE-FILE
to bind *READTABLE*. The group took mercy on me at that point. At
the time, no one implemented that, but all had seen the ill effects of
its absence and no one suggested any harm. Maybe with some experience
with the idiom, someone will evolve back the originally proposed
name. :-)

> | The problem is that format strings are a kind of data. When you see doc
> | for ERROR and it says "takes a format string", you need to know "what
> | kind?".
>
> ok, let me back out a bit and suggest that the mechanisms available to
> the FORMATTER macro should be available with a user-specified language.
> that way, FORMAT and all its users and friends continue to work as they
> used to do, but FORMATTER could take an optional second argument with a
> table of some kind that specifies the behavior of each character, so the
> remaining power of the FORMAT language could be utilized.
>
> the reason people want to extend FORMAT is not to make global changes to
> the formatter, but to be relieved of writing the entire FORMATTER engine
> anew when they need something specific. (at least if I can extrapolate from
> "me" to "people".) it's a royal pain to create a macro that hacks up a
> string to build another string it passes to FORMATTER so it can be used
> the same way FORMATTER can. I tried this once with a TIME-FORMATTER, but
> it stranded before I got it general enough to even satisfy myself. I
> wrote FORMAT-TIME-STRING for Emacs because I hate hard-coded time formats
> that can't be made to format according to ISO 8601, and wanted something
> with similar power and ease. (another problem with TIME-FORMATTER was
> that it should be able to accept a universal-time OR a list of decoded
> time values, but that's not FORMAT's fault.)

This is not a bad idea but is something that would need the details worked out.
I'd suggest keeping the time formatter in the back of your mind and writing a
general format engine, just as you suggest and then contributing it to
some public code repository for use and/or getting vendors to try it. You're
talking about a thing that's pretty big... it would probably need some field
testing to shake out the bugs, etc.

Alternatively, see if you can convince one or more vendors to generalize
their format implementation offering such a set of operators.

It sounds like an extremely productive idea, but it needs more than
one person's support and experience behind it, I'd think.

> (and while I'm at it, it would be nice to add LOOP constructs, too. :)

They were lost at the very last minute in the iteration proposal, as I recall.
I'm not sure why. There was a perfectly workable proposal, and it was ripe
for standardization.