
#'format and string interpolation


David Bakhash

Jul 22, 1999, 3:00:00 AM
Hi,

Is there a FORMAT-like routine out there that has a special escape sequence
for printing the values of string variables directly inside the format
strings? This would work as follows:

(setq str-1 "hello"
      str-2 "world")

(format t "$str-1 beautiful $str-2!")

would print:

"hello beautiful world!"

(without the quotes, of course)

here, I used dollar sign, since that's what's used in Perl, but that's
irrelevant. Any pointers appreciated. I've found that (concatenate 'string
...) is also too adequate.

thanks,
dave

Vassil Nikolov

Jul 22, 1999, 3:00:00 AM
to comp.la...@list.deja.com
David Bakhash wrote: [1999-07-22 14:26 -0400]

> Hi,
>
> Is there a FORMAT-like routine out there that has a special escape sequence
> for printing the values of string variables directly inside the format
> strings? This would work as follows:
>
> (setq str-1 "hello"
> str-2 "world")
>
> (format t "$str-1 beautiful $str-2!")
>
> would print:
>
> "hello beautiful world!"
>
> (without the quotes, of course)

[...]

Why is (FORMAT T "~A beautiful ~A!" str-1 str-2) unacceptable to you?


Vassil Nikolov
Permanent forwarding e-mail: vnik...@poboxes.com
For more: http://www.poboxes.com/vnikolov
Abaci lignei --- programmatici ferrei.

Russell Senior

Jul 22, 1999, 3:00:00 AM
>>>>> "Vassil" == Vassil Nikolov <vnik...@poboxes.com> writes:

David> Is there a FORMAT-like routine out there that has a special
David> escape sequence for printing the values of string variables
David> directly inside the format strings? This would work as
David> follows:

David> (setq str-1 "hello" str-2 "world")

David> (format t "$str-1 beautiful $str-2!")

David> would print:

David> "hello beautiful world!"

David> (without the quotes, of course)

Vassil> Why is (FORMAT T "~A beautiful ~A!" str-1 str-2) unacceptable
Vassil> to you?

I had a similar problem recently trying to insert a DOS-style CRLF
line ending. I couldn't seem to get a literal #\Return into my
control-string from source code and I didn't want to pollute all of my
format calls with the extra parameter. My solution was simply to
build the control string in another format statement. In my case:

(defvar *fmtstr* (format nil "~~{~~:[.~~;~~:*~~A~~]~~^ ~~}~C~~%" #\Return))

which yields, approximately:

"~{~:[.~;~:*~A~]~^ ~}^M
"

then, I used the control string elsewhere, e.g.:

(format t *fmtstr* '(list of stuff))
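
A minimal sketch of the same trick, assuming the implementation recognizes the semi-standard character name #\Return. The names *crlf-fmtstr* and crlf-line are hypothetical and only illustrate the idea: the outer FORMAT turns each ~~ into a single tilde and ~C into the Return character, so the resulting control string can be reused without passing #\Return on every call.

```lisp
;; Build the control string once; ~~ emits a literal tilde and ~C emits
;; the #\Return character itself.
(defvar *crlf-fmtstr*
  (format nil "~~{~~:[.~~;~~:*~~A~~]~~^ ~~}~C~~%" #\Return))

;; Hypothetical helper: each element is printed with ~A, NIL is shown
;; as ".", elements are separated by spaces, and the line ends with
;; Return followed by Newline.
(defun crlf-line (items)
  (format nil *crlf-fmtstr* items))
```

For example, (crlf-line '("a" nil "b")) returns the text "a . b" followed by Return and Newline.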


--
Russell Senior ``The two chiefs turned to each other.
sen...@teleport.com Bellison uncorked a flood of horrible
profanity, which, translated meant, `This is
extremely unusual.' ''

David Bakhash

Jul 22, 1999, 3:00:00 AM
Vassil Nikolov <vnik...@poboxes.com> writes:

> > Is there a FORMAT-like routine out there that has a special escape sequence
> > for printing the values of string variables directly inside the format
> > strings?
> >
> [...]
>
> Why is (FORMAT T "~A beautiful ~A!" str-1 str-2) unacceptable to you?

If you've ever used Perl for CGI scripts, you might see how a format string
with over a dozen substitutions can become cumbersome to read, seeing a ~A and
then traversing through the other ~A's to find out its index, and then to
count down the list of &rest args to see which one goes there. It's shameful
in that context, though I'm sure that there's a simple enough fix. If not,
one can write a parser and a macro to do this stuff at compile-time, and I'm
sure it's been done. Maybe in CL-HTTP?

dave

Vassil Nikolov

Jul 23, 1999, 3:00:00 AM
to comp.la...@list.deja.com
Russell Senior wrote: [1999-07-22 13:40 -0700]

[...]


> I had a similar problem recently trying to insert a DOS-style CRLF
> line ending. I couldn't seem to get a literal #\Return into my
> control-string

[...]

By the way, the effect of writing a #\Return to a text stream is
implementation-defined.

Pierre R. Mai

Jul 23, 1999, 3:00:00 AM
David Bakhash <ca...@bu.edu> writes:

It seems to me that a high-level approach might be better suited for
your problem, i.e. don't think about strings and string substitution,
but think about the stuff you want to generate. What is it?
Some structured markup language like HTML/DocBook/whatever? LaTeX?
Tables?

Then define a mapping between that and some internal Lisp
data structure, and write a rendering function, that traverses that
data structure and emits the stringified external representation.

Finally write some nice macrology to specify templates and
substitution at the data structure level.

If you take the short-cut of using Lisp lists as the internal data
structure, you can even use backquote for this, so you only have to
define a nice mapping and write the rendering function(s). For
example for HTML (or something similar):

(defun render-html-to-stream (stream html)
  (if (consp html)
      (render-html-element stream (car html) (cdr html))
      (format stream "~A" html)))

(defun render-html-element (stream element-spec contents)
  (let ((name (if (consp element-spec) (car element-spec) element-spec))
        (attributes (if (consp element-spec) (cdr element-spec) nil)))
    (format stream "<~A~{ ~A=\"~A\"~}>" name attributes)
    (mapcar #'(lambda (html) (render-html-to-stream stream html)) contents)
    (format stream "</~A>" name)))

(render-html-to-stream my-stream
  `(html
     (head
       (title "This is a test page for " ,user-name ", the holy master!"))
     ((body :bgcolor "#2f2f2f")
       (h1 "Welcome back " ,user-name)
       (p "The text elements are concatenated...."))))

For a real solution you'll need to escape certain strings rendered,
keep track of some context to insert linefeeds in the right places,
and you'll probably want to have more fine-grained control over the
way you render certain types (i.e. have a special object type for
colors, or similar things), instead of relying on the Lisp printer to
do the right thing...
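
As a rough sketch of the escaping step only, assuming that the four characters &, <, > and " are the ones that need entity references in the output: the names escape-html and render-text are hypothetical, and render-text would stand in for the (format stream "~A" html) branch of render-html-to-stream.

```lisp
(defun escape-html (string)
  "Return a copy of STRING with &, <, > and \" replaced by entity references."
  (with-output-to-string (out)
    (loop for ch across string
          do (case ch
               (#\& (write-string "&amp;" out))
               (#\< (write-string "&lt;" out))
               (#\> (write-string "&gt;" out))
               (#\" (write-string "&quot;" out))
               (t (write-char ch out))))))

;; Hypothetical replacement for the atom branch of the renderer.
(defun render-text (stream object)
  ;; PRINC-TO-STRING yields the same text that ~A would print.
  (write-string (escape-html (princ-to-string object)) stream))
```

Context tracking for linefeeds and type-specific rendering are not covered by this sketch.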

The "right" approach is to use your own high-level datastructures, and
invent some nice (read-)macros to make typing in templates easy. Or
read in templates from a file. Or a database.

See Common SQL's reader-syntax for an example of this: #\[ is heavily
overloaded to create the right SQL-expression objects, which are then
rendered to the backend database engine...

Regs, Pierre.

--
Pierre Mai <pm...@acm.org> PGP and GPG keys at your nearest Keyserver
"One smaller motivation which, in part, stems from altruism is Microsoft-
bashing." [Microsoft memo, see http://www.opensource.org/halloween1.html]

Stig Hemmer

Jul 23, 1999, 3:00:00 AM
David Bakhash <ca...@bu.edu> writes:
> here, I used dollar sign, since that's what's used in Perl, but that's
> irrelevant. Any pointers appreciated. I've found that (concatenate 'string
> ...) is also too adequate.

too adequate?

You might want to make a reader macro that expands for example
#"$str-1 beautiful $str-2!"
into for example
(concatenate 'string str-1 " beautiful " str-2)
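
One way such a reader macro could look, as an untested-in-anger sketch rather than a finished tool. All names here (name-char-p, parse-interpolated, read-interpolated-string, *interpolating-readtable*) are hypothetical. It assumes variable names consist only of alphanumerics and hyphens, that the variables hold strings (CONCATENATE will not convert other objects), and it installs #" in a copy of the standard readtable so the global one is left alone.

```lisp
(defun name-char-p (ch)
  (or (alphanumericp ch) (char= ch #\-)))

(defun parse-interpolated (string)
  "Split STRING into literal pieces and symbols named after each $."
  (let ((parts '())
        (start 0))
    (loop
      (let ((dollar (position #\$ string :start start)))
        (when (null dollar)
          (when (< start (length string))
            (push (subseq string start) parts))
          (return (nreverse parts)))
        (when (< start dollar)
          (push (subseq string start dollar) parts))
        (let ((end (or (position-if-not #'name-char-p string
                                        :start (1+ dollar))
                       (length string))))
          (push (if (= end (1+ dollar))
                    "$"                 ; a lone dollar stays literal
                    (read-from-string string t nil
                                      :start (1+ dollar) :end end))
                parts)
          (setf start end))))))

(defun read-interpolated-string (stream subchar arg)
  (declare (ignore subchar arg))
  ;; Let the standard string reader consume the text up to the closing
  ;; quote, then build the CONCATENATE form at read time.
  (let ((text (funcall (get-macro-character #\") stream #\")))
    `(concatenate 'string ,@(parse-interpolated text))))

(defvar *interpolating-readtable*
  (let ((rt (copy-readtable nil)))
    (set-dispatch-macro-character #\# #\" #'read-interpolated-string rt)
    rt))
```

With *readtable* bound to *interpolating-readtable*, reading #"$str-1 beautiful $str-2!" yields (concatenate 'string str-1 " beautiful " str-2 "!"). Because the symbols are created at read time, they land in whatever package is current when the source is read.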

Stig Hemmer,
Jack of a Few Trades.

Tim Bradshaw

Jul 23, 1999, 3:00:00 AM
* David Bakhash wrote:

> If you've ever used Perl for CGI scripts, you might see how a format
> string with over a dozen substitutions can become cumbersome to
> read, seeing a ~A and then traversing through the other ~A's to find
> out its index, and then to count down the list of &rest args to see
> which one goes there.

This is kind of a nice point. FORMAT really has what I think
linguists call a `cross-serial' dependency between the format string
and the arguments, and these basically don't occur in natural
languages, presumably because they're a pain for humans to
parse. Actually I think FORMAT is much simpler than the NL case but it
still might be a pain to read.

Anyway, is something like this what you are after?


(defun stringify (&rest strings/objects)
  (apply #'concatenate 'string
         (mapcar #'(lambda (x)
                     (typecase x
                       (string x)
                       (t (princ-to-string x))))
                 strings/objects)))

* (stringify "this is " 'stringify " thing with " 4 " arguments")
"this is stringify thing with 4 arguments"

? It's not quite as simple as perl, but it's OK I think.

--tim

Erik Naggum

Jul 23, 1999, 3:00:00 AM
* Russell Senior <sen...@teleport.com>

| I had a similar problem recently trying to insert a DOS-style CRLF
| line ending.

this particular problem is better dealt with at the streams level.

#:Erik
--
suppose we blasted all politicians into space.
would the SETI project find even one of them?

Erik Naggum

Jul 23, 1999, 3:00:00 AM
* David Bakhash <ca...@bu.edu>

| If you've ever used Perl for CGI scripts, you might see how a format
| string with over a dozen substitutions can become cumbersome to read,
| seeing a ~A and then traversing through the other ~A's to find out its
| index, and then to count down the list of &rest args to see which one
| goes there. It's shameful in that context, though I'm sure that there's
| a simple enough fix. If not, one can write a parser and a macro to do
| this stuff at compile-time, and I'm sure it's been done.

string substitution has always been the wrong approach, but since you
cannot easily build and use more advanced structures in Perl (or many
other Unix tools), that's what you use, because it works most of the
time. when it doesn't work is when you have magic characters that alter
the meaning of the resulting string. Unix tools are rife with security
holes because of this. e.g., CGI scripts have to be careful when passing
input strings to programs so they don't actually run other programs when
some other program re-interprets the strings.

building strings to be passed around and parsed at every junction in the
data flow is probably the dumbest design ever created. instead, build
and use real data structures. Lisp got this right, and the rest of the
world hasn't, so there's no need to import their braindamage into Lisp.
(Pierre R. Mai wrote what I had in mind on how to do this.)

David Bakhash

Jul 23, 1999, 3:00:00 AM
pm...@acm.org (Pierre R. Mai) writes:

> your problem, i.e. don't think about strings and string substitution,
> but think about the stuff you want to generate. What is it?
> Some structured markup language like HTML/DocBook/whatever? LaTeX?
> Tables?

> [...]


> Then define a mapping between that and some internal Lisp
> data structure, and write a rendering function, that traverses that
> data structure and emits the stringified external representation.

The markup is HTML, and I'm sure one day it may be one of these others.
That's why I'd rather not invent a macrology, though I think one can go nuts
here, and do something really cool. But the point is...

there already is a language that's nice and high-level here.

i.e. why invent another one. I'll explain further...

> The "right" approach is to use your own high-level datastructures, and
> invent some nice (read-)macros to make typing in templates easy. Or
> read in templates from a file. Or a database.

I don't want to turn this into a major Lisp project. There is a mechanism
that does this right. It's fine. I believe Perl does it right, with string
interpolation. For example, let's say I want to create a table, after doing a
database lookup, and getting a bunch of rows back. In Perl you'd do something
like:

print "
<p>The rows are shown below:</p>
<TABLE border>
<TR>
<TH align=center> col-1</TH>
<TH align=center> col-2</TH>
<TH align=center> col-3</TH>
</TR>
";
foreach $row (@rows) {
my %row = %$row;
print "
<TR>
<TD align=left> $row{'col-1'}</TD>
<TD align=left> $row{'col-2'}</TD>
<TD align=left> $row{'col-3'}</TD>
</TR>
";
}
...

It's nice, simple, and it works just fine. If you already know HTML then why
should you have to learn something else? And so fine...you write this
macrology. Do you think people would use this over something like what I have
above, which is actually simpler, almost no learning curve, and works for any
markup?

> See Common SQL's reader-syntax for an example of this: #\[ is heavily
> overloaded to create the right SQL-expression objects, which are then
> rendered to the backend database engine...

I use LispWorks/SQL, and I just write the calls in SQL and use
#'sql:execute-command and #'sql:query instead of the other stuff. It's easier
to just embed the SQL, at least to me.

dave

David Bakhash

Jul 23, 1999, 3:00:00 AM
Erik Naggum <er...@naggum.no> writes:

> building strings to be passed around and parsed at every junction in the
> data flow is probably the dumbest design ever created.

I don't think they're re-parsed at every junction. AFAIK, there's some
"magic" that occurs somewhere. I agree with the security issues, though.
It's dangerous to be able to run functions from inside a string, but all I
wanted was for something like this:

(format* t "where is the ${obj}?")

to be converted to:

(format t "where is the ~A?" $obj)

is evaluating a variable so dangerous? In Perl, yeah, because of things like
ties, which run methods every time the value of a variable is fetched. But
I don't think in Lisp this is necessarily a bad thing, especially if used
properly.

dave

Kent M Pitman

Jul 23, 1999, 3:00:00 AM
David Bakhash <ca...@bu.edu> writes:

Your chosen syntax transformation is very bizarre in failing to use a ~ op.
You'd want to survey the space of ops and find a hole. There are a few.
I notice that ~:% is open, for example. So is ~0w, so notations like
~0w{FOO} or ~:%{FOO} are possible.

Also, you probably want to translate to *FOO*, not $FOO. So even something
like ~:%{FOO} => *FOO* would be better. Or even ~:%*FOO* in the format string,
making only * variables accessible.

Those problems are ok, but these are harder:

(defpackage "FOO"
  (:use "CL"))
(defpackage "BAR"
  (:use))
(in-package "FOO")
(defun foo ()
  (format* t "where is the ${obj}"))
(defun foo1 ()
  (let ((*package* (find-package "BAR")))
    (foo)))

Note here that the package has changed. Since FORMAT* gets a string, and
since the package of definition time ["FOO"] is not the package of execution
time ["BAR"], how will FORMAT* know to use foo::$obj and not bar::$obj?

The PPRINT solution (in ~/.../) to this is to require the use of a
package prefix. I don't like that solution very much. (I almost lean
toward thinking all format strings should be compiled, and that #"..."
should denote a format string compiled at readtime, so that ~/.../ can
be resolved at read-time, when *package* is the expected value in ordinary
code, or at some known and well-controlled time when a constructed format
string is used.) But in any case, you have to hair it up at least that much.
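
For reference, a small sketch of how the existing ~/.../ directive names its function. The function shout and the wrapper where-is are hypothetical; shout is placed in CL-USER here purely to keep the example short. The directive calls the named function with the stream, the argument, the colon and at-sign flags, and any prefix parameters, and the explicit package prefix is what makes the lookup independent of the run-time value of *package*.

```lisp
;; Hypothetical directive function; ~/.../ passes STREAM, the argument,
;; the colon and at-sign flags, and any prefix parameters.
(defun cl-user::shout (stream arg colon-p at-sign-p &rest params)
  (declare (ignore colon-p at-sign-p params))
  (write-string (string-upcase (princ-to-string arg)) stream))

;; The function name inside the control string carries its own package
;; prefix, so it resolves the same way whatever *PACKAGE* is at run time.
(defun where-is (obj)
  (format nil "where is the ~/cl-user::shout/?" obj))
```

Calling (where-is "toast") gives "where is the TOAST?" even when *package* is rebound around the call.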

There is also an issue that this kind of thing is horribly opaque to compilers
trying to do the right thing in dumping out images with the right amount
of functionality in them (tree-shaking). Any string with a certain substring
in it, almost regardless of its context, might get passed around to a place
where (format* t x) was and suddenly become a need to access that symbol.

None of this is fatal, but it's all more complicated than you're suggesting.

Of course, it's easy for users to add this stuff themselves. They only
have to say "this works for me" and that's the end of it. That's not to say
the language shouldn't care about this--just that when languages do, they have
to care about more things and there are more constraints on action.
A public library might be a good way to test the waters on an alternative.


Tim Bradshaw

Jul 23, 1999, 3:00:00 AM
David Bakhash <ca...@bu.edu> writes:

> (format* t "where is the ${obj}?")

Well, if you change the braces to quotes and get rid of the $:

"where is the "obj"?"

And wrap it in a call to my stringify function:

(stringify "where is the "obj"?")

I think you are done!

--tim

Tim Bradshaw

Jul 23, 1999, 3:00:00 AM
David Bakhash <ca...@bu.edu> writes:

> It's nice, simple, and it works just fine. If you already know HTML then why
> should you have to learn something else? And so fine...you write this
> macrology. Do you think people would use this over something like what I have
> above, which is actually simpler, almost no learning curve, and works for any
> markup?
>

Of course they would not use it. But perhaps they should.

What happens if you want to produce different markup from the same
data? Let's say I want to produce some kind of tabular output. I
probably need to do HTML, maybe several variants of it depending on
the browser. I may need to produce XML sometime. I probably need a
printable one since the HTML output of most browsers is unacceptably
bad, so maybe I produce PS or TeX for printing. I might also need
some kind of csv output for spreadsheet input.

How do you do that with the string-interpolation approach? 4 or 5
different templates, with some kind of subvariants for the HTML case
at least, each of which depends on the exact details of your data and
has to be edited every time you change it. A software engineering
nightmare!

Alternatively, you build a data structure that represents your table,
then you walk over it with little HTML-spitters or XML-spitters. Now
you have a general purpose table structure, and a general purpose set
of spitters[1], and you are winning.
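
A toy version of that arrangement, as a sketch only: the structure data-table and the functions spit-html and spit-csv are hypothetical names, and neither spitter escapes its output, which a real one would have to do.

```lisp
(defstruct data-table
  headers   ; list of column titles
  rows)     ; list of rows, each a list of cell values

(defun spit-html (table stream)
  (format stream "<TABLE>~%<TR>~{<TH>~A</TH>~}</TR>~%"
          (data-table-headers table))
  (dolist (row (data-table-rows table))
    (format stream "<TR>~{<TD>~A</TD>~}</TR>~%" row))
  (format stream "</TABLE>~%"))

(defun spit-csv (table stream)
  (format stream "~{~A~^,~}~%" (data-table-headers table))
  (dolist (row (data-table-rows table))
    (format stream "~{~A~^,~}~%" row)))
```

The same data-table instance can be handed to either spitter, so adding a new output format means writing one more function rather than editing every template.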

So I guess the string-interpolation approach works fine if you're only
interested in quite small problems or aren't willing to do the
thinking up front to make all future problems easier.

--tim

[1] I don't know if the term `spitter' is just me or if I got it from
somewhere. I use it to mean something that takes a structure and
spits out some nice view of it usually in a markup language of some
kind.

David Hanley

Jul 23, 1999, 3:00:00 AM

I can't believe people are talking so much instead
of just writing this simple function:


(defmacro perl (str)
  (let ((escaped nil)
        (name nil)
        (ss (make-array 1 :element-type 'character))
        (vars nil)
        (result nil))
    (map nil
         #'(lambda (char)
             (if escaped
                 (if (char= char #\space)
                     (progn
                       ;; reset NAME so the next variable starts clean
                       (setf escaped nil)
                       (let* ((sn (coerce (reverse name) 'string))
                              (syn (read-from-string sn)))
                         (setf name nil)
                         (push `(symbol-value (quote ,syn)) vars)
                         (push "~d " result)))
                     (push char name))
                 (if (char= char #\$)
                     (setf escaped T)
                     (progn
                       (setf (aref ss 0) char)
                       (push (copy-seq ss) result)))))
         (if (symbolp str) (symbol-value str) str))
    ;; VARS was built by PUSH, so reverse it to match the directive order
    `(format nil ,(apply #'concatenate 'string (reverse result))
             ,@(reverse vars))))


(defun te ()
  (let ((action "eat"))
    (format t (perl "I like to $action toast!"))))

Yes, it does need gensyms but I didn't feel like doing them now. It's
relatively efficient, as the real work is done at compile-time. The
macro substitution results in a simple format statement. In other
words, that is why the code is a bit croddy. Also, the ( apply
#'concatenate ) should be rewritten to a #'reduce as it may
choke on very long strings. Hey, you get what you pay for :) ,
I just wanted to demonstrate that it can be done in lisp, very easily.

Maybe someday someone will pay me to write stuff like this. :)

dave


Barry Margolin

Jul 23, 1999, 3:00:00 AM
In article <ey3wvvr...@lostwithiel.tfeb.org>,

Tim Bradshaw <t...@tfeb.org> wrote:
>Anyway, is something like this what you are after?
>
>
> (defun stringify (&rest strings/objects)
> (apply #'concatenate 'string
> (mapcar #'(lambda (x)
> (typecase x
> (string x)
> (t (princ-to-string x))))
> strings/objects)))
>
> * (stringify "this is " 'stringify " thing with " 4 " arguments")
> "this is stringify thing with 4 arguments"
>
>? It's not quite as simple as perl, but it's OK I think.

Or how about something like this:

(defmacro interpolate (string)
  (do ((i 0)
       (format-string (make-array (length string) :element-type 'character
                                  :fill-pointer 0 :adjustable t))
       (format-args '()))
      ((>= i (length string))
       `(format nil ,format-string ,@(nreverse format-args)))
    (let ((cur-char (char string i)))
      (case cur-char
        ;; Tildes need to be doubled to prevent future interpretation
        (#\~ (vector-push-extend cur-char format-string)
             (vector-push-extend cur-char format-string)
             (incf i))
        ;; Backslash prevents interpretation of Dollar
        (#\\ (incf i)
             (vector-push-extend (char string i) format-string)
             (incf i))
        ;; Dollar followed by alphanumeric identifier is variable interpolation
        (#\$ (let* ((name-start (1+ i))
                    (name-end (or (position-if-not #'alphanumericp string
                                                   :start name-start)
                                  (length string))))
               (push (read-from-string string t nil
                                       :start name-start :end name-end)
                     format-args)
               (vector-push-extend #\~ format-string)
               (vector-push-extend #\A format-string)
               (setq i name-end)))
        ;; Everything else is literal
        (otherwise (vector-push-extend cur-char format-string)
                   (incf i))))))

I haven't tested it, but the idea is that:

(interpolate "Foo $abc bar")

should macroexpand into:

(FORMAT NIL "Foo ~A bar" ABC)

I made it a macro so that it will interact properly with lexical
variables. If someone wants to get fancier with the parser, you could
allow more than simple alphanumeric variables after the dollar sign.

--
Barry Margolin, bar...@bbnplanet.com
GTE Internetworking, Powered by BBN, Burlington, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.

David Hanley

Jul 23, 1999, 3:00:00 AM

David Hanley wrote:

> (push `(symbol-value (quote ,syn)) vars)

That line should just be (push syn vars)

dave

