the underscore dilemma

Ernesto Rico Schmidt

Apr 5, 1996, 3:00:00 AM

Hi!

I'm working on a hobby project that should provide a (Knuth's)
WEB-like literate programming system (or a system for structured
documentation, if you like) for Modula-3 and LaTeX.

In WEB you can have underscores "_" in Pascal identifiers; TANGLE then
converts them into something Pascal can accept (Pascal doesn't allow
underscores in identifiers) and puts everything in UPPERCASE (since
Pascal is not case-sensitive).

this_readable_identifier (in the WEB file) appears in the Pascal code
as:
THISREADABLEIDENTIFIER (ugly, isn't it?)

OK, but where is the dilemma here?

1. Modula-3 *is* case sensitive
2. Modula-3 *allows* underscores in identifiers

So if I write (in, say, the m3web file) something like:

this_nice_identifier it could appear as:

this_nice_identifier or as

This_Nice_Identifier or as:

ThisNiceIdentifier or as:

THISNICEIDENTIFIER, etc.

in the Modula-3 code. Do you see what I mean? The problem is that these
are all different (incompatible) identifiers. And what if I write
(to improve the readability of the code) something like:

My_Library.My_Procedure and I have in the INTERFACE something like:

MyLibrary.MyProcedure ?

Do you understand the dilemma, or am I inventing it? I don't think it
would look nice to have some identifiers with and some
identifiers without underscores in the same code.
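
A minimal sketch of the spellings listed above, assuming the tangling
step simply splits the source name at underscores. The function name
render_variants and the convention labels are made up for illustration;
they are not part of WEB or of any existing Modula-3 tool.

```python
def render_variants(name: str) -> dict:
    """Return several ways one underscore-separated name could be tangled."""
    words = name.split("_")
    return {
        "as_written": name,
        "capitalized": "_".join(w.capitalize() for w in words),
        "camel": "".join(w.capitalize() for w in words),
        "squashed_upper": "".join(words).upper(),
    }


variants = render_variants("this_nice_identifier")
for convention, spelling in variants.items():
    print(f"{convention:15} {spelling}")

# In a case-sensitive language each spelling is a different identifier.
print(len(set(variants.values())), "distinct identifiers")
```

Whichever convention is picked, it has to be applied to every occurrence
of a name, including names that come from an INTERFACE written outside
the web file.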

Any Comments?

Thanks for your time and comments,

Ernesto.

--
Ernesto Rico Schmidt
email: ne...@sbox.tu-graz.ac.at

Deutscher

Apr 5, 1996, 3:00:00 AM

Ernesto Rico Schmidt (ne...@sbox.tu-graz.ac.at) wrote:
: Hi!
:
: I'm working on a hobby project that should provide a (Knuth's)
: WEB-like literate programming system (or a system for structured
: documentation, if you like) for Modula-3 and LaTeX.

[... snip ...]

: Do you understand the dilemma, or am I inventing it? I don't think it
: would look nice to have some identifiers with and some
: identifiers without underscores in the same code.

IMHO, you are inventing a dilemma. Why not have underscores where you
like them? Why bother to take them out if Modula-3 can handle them?
Indeed, I use things like this in C (and even Fortran) to make names more
readable: for globals I always append _glb to variable names,
whereas for function or procedure names I use capitalization, say
function GetTheData, and sometimes underscores, function GetF_of_x().

If your language can handle it, why change identifier names at all? In
the ideal case, you never need to look at the generated source code, so
why bother about the appearance of the names, as long as it compiles
properly? Sometimes, however, you may need to look at the generated
source code, and it is nice if the names are at least related to the
ones in your web file.

Kind regards. Stefan
==========================================================================
Stefan A. Deutscher, s...@utk.edu, (001)-423-[522-7845|974-7838|574-5897]
home^ UTK^ ORNL^
==========================================================================
If there is software you'd like to have in a native version, visit the:
OS/2 E-mail Campaign Page http://www.andrews.edu/~boyko/email.html
--------------------------------------------------------------------------

David Kastrup

Apr 6, 1996, 3:00:00 AM

ne...@sbox.tu-graz.ac.at (Ernesto Rico Schmidt) wrote:
> I'm working on a hobby project that should provide a (Knuth's)
>WEB-like literate programming system (or a system for structured
>documentation, if you like) for Modula-3 and LaTeX.

First you might want to check out the Spider system which has a parser
for M2 already, if I remember correctly. You might also get as wise as
Norman Ramsey (the author of Spider) and change to the language
independent, simplistic noweb (written by Norman Ramsey).

Much better suited to the task of cranking out Literate Programs in
tolerable time, I think.
--
David Kastrup, Goethestr. 20, D-52064 Aachen Tel: +49-241-72419
Email: d...@pool.informatik.rwth-aachen.de Fax: +49-241-79502

Deutscher

Apr 7, 1996, 4:00:00 AM

David Kastrup (d...@pool.informatik.rwth-aachen.de) wrote:

: ne...@sbox.tu-graz.ac.at (Ernesto Rico Schmidt) wrote:
: > I'm working on a hobby project that should provide a (Knuth's)
: >WEB-like literate programming system (or a system for structured
: >documentation, if you like) for Modula-3 and LaTeX.

: First you might want to check out the Spider system which has a parser
: for M2 already, if I remember correctly. You might also get as wise as
: Norman Ramsey (the author of Spider) and change to the language
: independent, simplistic noweb (written by Norman Ramsey).

There are a couple of other ones mentioned in the FAQ, and there is nuweb
as well, but I think Ernesto mentioned a _hobby project_, so why not let
him have the fun? Apart from that, it may teach you a great deal to do
such a project even though it has been done already.
There are people who build their own furniture even though
you can buy that stuff. For some it's just a hobby, and that I think is
perfectly fine. Stefan

: Much better suited to the task of cranking out Literate Programs in
: tolerable time, I think.
: --
: David Kastrup, Goethestr. 20, D-52064 Aachen Tel: +49-241-72419
: Email: d...@pool.informatik.rwth-aachen.de Fax: +49-241-79502

--

Spencer Allain

Apr 9, 1996, 3:00:00 AM

In article <4k2up5$k...@fstgal00.tu-graz.ac.at> ne...@sbox.tu-graz.ac.at (Ernesto Rico Schmidt) writes:

   Hi!

   In WEB you can have underscores "_" in Pascal identifiers; TANGLE then
   converts them into something Pascal can accept (Pascal doesn't allow
   underscores in identifiers) and puts everything in UPPERCASE (since
   Pascal is not case-sensitive).

   OK, but where is the dilemma here?

   1. Modula-3 *is* case sensitive
   2. Modula-3 *allows* underscores in identifiers

   So if I write (in, say, the m3web file) something like:

   this_nice_identifier it could appear as:

   this_nice_identifier or as

   This_Nice_Identifier or as:

   ThisNiceIdentifier or as:

   THISNICEIDENTIFIER, etc.

Since the idea is that you don't see the underlying source code that
is generated, it is up to you to come up with a convention. Instead
of TANGLE, you'll have your own thing that will demangle or translate
everything into a consistent syntax.

I don't think it's that important whether the generated source code
looks like

this_nice_identifier or as
This_Nice_Identifier or as
ThisNiceIdentifier

as long as you are consistent -- although I really don't like
THISNICEIDENTIFIER. :-)

Beware though, in Modula-3, it is often a good convention to use the
same name with different capitalizations to handle overrides and the
like.

-Spencer

----------------------------------------------------------------------
Spencer Allain E-mail: spe...@era.com
Engineering Research Associates Phone : (703) 734-8800 x1414
1595 Spring Hill Road Fax : (703) 827-9411
Vienna, VA 22182-2235

<A HREF=http://www.research.digital.com/SRC/modula-3/html/home.html>
Modula-3 Home Page DEC SRC</A>
<A HREF=http://www.vlsi.polymtl.ca/m3/>Modula-3 FAQ, etc. </A>
----------------------------------------------------------------------

Ernesto Rico Schmidt

Apr 9, 1996, 3:00:00 AM

s...@utkux.utcc.utk.edu (Deutscher) writes:

   David Kastrup (d...@pool.informatik.rwth-aachen.de) wrote:
   : ne...@sbox.tu-graz.ac.at (Ernesto Rico Schmidt) wrote:
   : > I'm working on a hobby project that should provide a (Knuth's)
   : >WEB-like literate programming system (or a system for structured
   : >documentation, if you like) for Modula-3 and LaTeX.

   : First you might want to check out the Spider system which has a parser
   : for M2 already, if I remember correctly. You might also get as wise as
   : Norman Ramsey (the author of Spider) and change to the language
   : independent, simplistic noweb (written by Norman Ramsey).

   There are a couple of other ones mentioned in the FAQ, and there is
   nuweb as well, but I think Ernesto mentioned a _hobby project_, so why
   not let him have the fun? Apart from that, it may teach you a great
   deal to do such a project even though it has been done already.
   There are people who build their own furniture even though
   you can buy that stuff. For some it's just a hobby, and that I think is
   perfectly fine. Stefan

You got it! I will check out this Spider thing David mentioned, but as
you (Stefan) said, *it is* a hobby project of mine, and I hope it teaches
me a lot about Literate Programming and becomes (sometime in the
not-so-near future) an(other) useful Literate Programming Tool.

As I wrote in a reply to David that never left my computer (I hope
this one does :-), what I want in this *hobby project* is to provide
a Literate Programming Tool for Modula-3 with things noweb *can't* do,
like pretty printing, macros, etc., and to (hopefully) have fun and learn
more about Literate Programming (and Modula-3).

[...]
--
Ernesto Rico-Schmidt
email: ne...@sbox.tu-graz.ac.at
www: http://www.sbox.tu-graz.ac.at/home/nene

Strive for perfection in everything. Take the best that exists and
make it better. If it doesn't exist, create it. Accept nothing
nearly right or good enough.

- Sir Henry Royce, co-founder of Rolls-Royce

Dr. Thomas Tensi

Apr 12, 1996, 3:00:00 AM

Ernesto Rico Schmidt (ern...@Leonardo.neneNet) wrote:

> [...]

> As I wrote in a reply to David that never left my computer (I hope
> this one does :-), what I want in this *hobby project* is to provide
> a Literate Programming Tool for Modula-3 with things noweb *can't* do,
> like pretty printing, macros, etc., and to (hopefully) have fun and learn
> more about Literate Programming (and Modula-3).

Some time ago I wrote a Spider grammar for Modula-3. If anybody is
interested in that, I can mail or post it.

Thomas

--
----------------------------------------------------------------------------
Dr. Thomas Tensi |s |d &|m | software design & management GmbH & Co. KG
| | | | Thomas-Dehler-Str. 27, D-81737 M"unchen
thomas...@sdm.de | | | | Tel: (089) 63812-313, Fax: (089) 63812-150

Norman Ramsey

Apr 12, 1996, 3:00:00 AM

In article <7yu3yt2...@Leonardo.neneNet>,
Ernesto Rico Schmidt <ne...@sbox.tu-graz.ac.at> wrote:
> what I want in this *hobby project* is to provide
>a Literate Programming Tool for Modula-3 with things noweb *can't* do,
>like pretty printing, macros, etc., and to (hopefully) have fun and learn
>more about Literate Programming (and Modula-3).

I can't let this pass. noweb can and does do prettyprinting. Kostas
Oikonomou and Kaelin Colclasure have written prettyprinting filters.
Felix Gaertner has written a prettyprinter generator that generates
prettyprinters compatible with noweb.

noweb does not currently do macros, but a couple of years ago Lee
Wittenberg put up an excellent proposal for macros (basically treat
[[x]] in a chunk name as a parameter, for any identifier x), which
nobody has ever implemented.
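
One possible reading of that proposal, as a rough sketch only. The
thread gives no details beyond "treat [[x]] in a chunk name as a
parameter", so everything else here is an assumption: that a use names
the chunk with actual arguments in the same [[...]] positions, and that
expansion is whole-word textual substitution in the chunk body. The
function names are hypothetical and nothing here is part of noweb.

```python
import re
from typing import Optional

PARAM = re.compile(r"\[\[(\w+)\]\]")


def compile_definition(def_name: str):
    """Turn a chunk name with [[x]] parameters into a matcher for its uses."""
    params = PARAM.findall(def_name)
    pattern, pos = "", 0
    for m in PARAM.finditer(def_name):
        # Literal text is escaped; each formal parameter becomes a capture group.
        pattern += re.escape(def_name[pos:m.start()]) + r"\[\[(.+?)\]\]"
        pos = m.end()
    pattern += re.escape(def_name[pos:])
    return params, re.compile(pattern)


def expand(def_name: str, body: str, use_name: str) -> Optional[str]:
    """Return the body with actual arguments substituted, or None if no match."""
    params, matcher = compile_definition(def_name)
    m = matcher.fullmatch(use_name)
    if m is None:
        return None
    if not params:
        return body
    actuals = dict(zip(params, m.groups()))
    formals = "|".join(re.escape(p) for p in params)
    # One pass over the body, so swapping two parameters works as expected.
    return re.sub(r"\b(" + formals + r")\b", lambda w: actuals[w.group(1)], body)


print(expand("swap [[x]] and [[y]]",
             "t := x; x := y; y := t",
             "swap [[lo]] and [[hi]]"))
```

A real implementation would have to decide what counts as an identifier
in the target language and how parameterized chunks interact with chunk
concatenation; this sketch ignores both.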

You might be able to have fun and learn about LP while working within
an existing framework, and you might have the added satisfaction of
seeing your stuff widely used.

My current favorite literate-programming problem is: how do you build
cross-file indices between source units? This would be very useful
for Modula-3...

--
Norman Ramsey
http://www.cs.purdue.edu/homes/nr

Marc van Leeuwen

Apr 15, 1996, 3:00:00 AM

In article <4kmg91$e...@labrador.cs.purdue.edu>, n...@cs.purdue.edu (Norman
Ramsey) writes:

|> I can't let this pass. noweb can and does do prettyprinting. Kostas
|> Oikonomou and Kaelin Colclasure have written prettyprinting filters.
|> Felix Gaertner has written a prettyprinter generator that generates
|> prettyprinters compatible with noweb.

Now that you raise the point, maybe you could tell just what prettyprinting
filters can do for you. (I know, I could go and find out for myself, but
there must be others who can tell this directly from experience.)

Knowing CWEB and its sources, it is clear to me that doing a good job (or
even a fairly good job) of prettyprinting is not a simple task. Of course,
recognising keywords and setting them in boldface is a trivial matter
(although this already becomes much more difficult if one wants to include
user defined type identifiers, especially if one also wants to preserve the
freedom of specifying chunks in any order, including those that define and
use type identifiers). One would also expect that math formulas come out
right, with the same kind of spacing subtleties that TeX routinely uses;
this should also be quite easy (as it is essentially a lexical matter). The
things I am more worried about are larger structures, in particular when
line breaks are involved, since my (very brief) experience with noweb has
shown that it seems to be obsessed with line breaks(*). If a source line
has to be broken because of the limited width of my text editor's window, but
the corresponding output will comfortably fit on a line, will it in fact
come out like that (this is something that happens extremely often to me)?
Conversely, if I have very long lines (maybe some huge formula), will line
breaks be chosen in the output in a reasonable way, depending on the actual
page width? What about indentation levels? Somehow the term ``prettyprinting
filters'' suggests to me a more superficial attitude towards prettyprinting
than would be required to get these points right (or would one also call TeX
a ``typesetting filter''?).

Marc van Leeuwen
CWI, Amsterdam

(*) To be specific, it appears that noweb will never introduce or remove a
newline character. In particular, if a formula cited in a documentation
part happens to be split across source lines (which may very well be
caused by paragraph filling of the text editor), then the formula will
also be broken in the printed output. Also whatever transformations
noweb has made to a source line, it appears to believe that the result
can always be passed safely to TeX on a single line, ignoring the fact
that it has a finite input buffer (for the standard version of TeX this
is 500 characters, which is quite enough for any input prepared by
humans, but which can be easily overflowed by the kind of things noweb
emits.)

Norman Ramsey

Apr 15, 1996, 3:00:00 AM

In article <DpwA4...@cwi.nl>, Marc van Leeuwen <ma...@cwi.nl> wrote:
>Now that you raise the point, maybe you could tell just what prettyprinting
>filters can do for you. (I know, I could go and find out for myself, but
>there must be others who can tell this directly from experience.)

They can perform either or both of two tasks:
- choose fonts and glyphs to represent each source token
- choose indentation and line breaks of your code
I find these features of more cost than benefit (except possibly when
preparing for book publication), but lots of people like them.
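
A small sketch of the first task (choosing fonts per token), assuming
plain TeX output where keywords go in boldface. It is not a noweb
filter: a real one would read and write noweb's intermediate form,
which is not shown in this thread. The keyword set is a small sample of
Modula-3 reserved words, and bold_keywords is a made-up name.

```python
import re

# A small sample of Modula-3 reserved words, not the full list.
KEYWORDS = {"BEGIN", "END", "IF", "THEN", "ELSE", "PROCEDURE",
            "VAR", "MODULE", "RETURN"}

WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def bold_keywords(code_line: str) -> str:
    """Wrap keywords in {\\bf ...} and escape underscores in other words."""
    def fmt(match):
        word = match.group(0)
        if word in KEYWORDS:
            return r"{\bf " + word + "}"
        # TeX treats a bare underscore as a subscript marker outside verbatim.
        return word.replace("_", r"\_")
    return WORD.sub(fmt, code_line)


print(bold_keywords("IF my_flag THEN RETURN END"))
```

Other TeX special characters (such as & or %) are left alone here, so
this only covers the "trivial matter" of keywords; user-defined type
names and layout are untouched.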

>Knowing CWEB and its sources, it is clear to me that doing a good job (or
>even a fairly good job) of prettyprinting is not a simple task.

No, and the TeX line-breaking algorithm doesn't provide the right
support. You really want an Oppen-style line-breaking algorithm, but
I don't know of any standard implementations. Too bad, because it's
simple dynamic programming.

I had an article in the 9/89 CACM which was mostly about prettyprinting.

> my (very brief) experience with noweb has
>shown that it seems to be obsessed with line breaks(*). If a source line
>has to be broken because of the limited width of my text editor's window, but
>the corresponding output will comfortably fit on a line, will it in fact
>come out like that (this is something that happens extremely often to me)?

>(*) To be specific, it appears that noweb will never introduce or remove a
> newline character.

Noweb takes the reasonable position that the programmer is the best
judge of where to put the line breaks and how much to indent code.
In some languages (Miranda, Haskell, awk, Icon), line breaks and/or
indentation carry meaning, and to change them would be to change the
meaning of the user's program.

>Conversely, if I have very long lines (maybe some huge formula), will line
>breaks be chosen in the output in a reasonable way, depending on the actual
>page width? What about indentation levels? Somehow the term ``prettyprinting
>filters'' suggests to me a more superficial attitude towards prettyprinting
>than would be required to get these points right (or would one also call TeX
>a ``typesetting filter''?).

You would have to write a prettyprinting tool to do this job.
Oppen's back end is the obvious way to do this; relying on TeX (as
CWEB does) works but is clearly second best (*pace* Knuth).

`filter' is simply a noweb technical term meaning that the program
must read and write the noweb intermediate form (never seen by users)
as opposed to say noweb source markup or flat ASCII.

> Also whatever transformations
> noweb has made to a source line, it appears to believe that the result
> can always be passed safely to TeX on a single line, ignoring the fact
> that it has a finite input buffer (for the standard version of TeX this
> is 500 characters, which is quite enough for any input prepared by
> humans, but which can be easily overflowed by the kind of things noweb
> emits.)

I made this choice so that the line numbers used in TeX error messages
would be the same as the line numbers used in the noweb source. Major
win. Today 640K machines are a minimum and almost all TeX users can
afford a 3000-character input buffer. One or two people who have
found the limitation galling have written noweb filters to split very
long lines. I think it's a short sed script. For myself I find it
easier to split the offending chunk (it's always the index information
that runs over), which is usually at least two pages.
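
For illustration, a sketch of the line-splitting idea in Python rather
than sed. It works on plain text, not on noweb's intermediate form, so
it is not one of the filters mentioned above. It assumes that a break
at a space is harmless, which holds for ordinary TeX input (an end of
line acts like a space) but not inside verbatim material. The name
split_long_lines and the default limit of 500 are illustrative.

```python
def split_long_lines(text: str, limit: int = 500) -> str:
    """Break lines longer than `limit` at the last space that fits."""
    out = []
    for line in text.split("\n"):
        while len(line) > limit:
            cut = line.rfind(" ", 0, limit + 1)  # last space at or before limit
            if cut <= 0:
                break  # no usable space; leave the rest of the line alone
            out.append(line[:cut])
            line = line[cut + 1:]
        out.append(line)
    return "\n".join(out)


print(split_long_lines("aaaaa bbbbb ccccc", limit=6))
```

Note the trade-off described above: once lines are split, TeX's line
numbers no longer match the line numbers of the noweb source.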

Norman

@article{ramsey:building,
author="Norman Ramsey",
title="{L}iterate Programming: {\hskip 0pt plus 0.5em}{W}eaving a
Language-Independent {{\tt WEB}}",
journal=cacm,
month=sep,
volume=32,
number=9,
pages="1051--1055",
year="1989"}

Marc van Leeuwen

Apr 16, 1996, 3:00:00 AM

In article <4kuvti$n...@labrador.cs.purdue.edu>, n...@cs.purdue.edu (Norman
Ramsey) wrote:
|> In article <DpwA4...@cwi.nl>, Marc van Leeuwen <ma...@cwi.nl> wrote:
|> >Knowing CWEB and its sources, it is clear to me that doing a good job (or
|> >even a fairly good job) of prettyprinting is not a simple task.
|>
|> No, and the TeX line-breaking algorithm doesn't provide the right
|> support. You really want an Oppen-style line-breaking algorithm, but
|> I don't know of any standard implementations. Too bad, because it's
|> simple dynamic programming.

I'll agree that TeX's line-breaking is not always ideal, although it can do
more than you might think. In particular, I cannot let it make a decision
like the following one for me: if this compound statement (e.g., one
following `else') will fit on the current line, make it so; otherwise lay it
out vertically, aligning opening and closing symbols at the left side. This
is impossible because it involves multiple line breaks that are not
independent (if one is taken, all should be taken). On the other hand, it is
remarkably easy to have TeX break long formulas at good points (i.e., not
within parentheses, and at operators of low priority, if possible); this is
done in CWEBx. I have never heard about Oppen-style line-breaking before,
but would be very interested to find out what it is.
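
The all-or-nothing decision described above (put the compound statement
on one line if it fits, otherwise take every break and align the
opening and closing symbols) can be sketched with a naive recursive
layout. This is not Oppen's algorithm and not what CWEB or CWEBx do;
the Block type, its field names, and the fixed indent of two columns
are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Block:
    """A bracketed construct whose breaks are taken all together or not at all."""
    opener: str
    body: List[Union[str, "Block"]]
    closer: str
    indent: int = 2


def flat(doc) -> str:
    """Render a document on a single line."""
    if isinstance(doc, str):
        return doc
    return " ".join([doc.opener, *(flat(item) for item in doc.body), doc.closer])


def layout(doc, width: int, column: int = 0) -> str:
    """Lay out `doc` starting at `column`; break a block only if it does not fit."""
    if isinstance(doc, str):
        return doc
    one_line = flat(doc)
    if column + len(one_line) <= width:
        return one_line
    # The block does not fit, so every break in it is taken.
    inner = column + doc.indent
    lines = [doc.opener]
    for item in doc.body:
        lines.append(" " * inner + layout(item, width, inner))
    lines.append(" " * column + doc.closer)
    return "\n".join(lines)


stmt = Block("BEGIN", ["x := 1;", "y := 2;"], "END")
print(layout(stmt, width=40))
print(layout(stmt, width=20))
```

Nested blocks are decided independently, so an inner block may stay on
one line while the block around it is laid out vertically.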

|> Noweb takes the reasonable position that the programmer is the best
|> judge of where to put the line breaks and how much to indent code.

I'd call that the WYSIWYG approach. As (La)TeX users know, a machine can
routinely do a better job of formatting text than the author can, or wants to
deal with, at least in the realm of ordinary (technical) text. I don't claim
this automatically also holds for formatting of computer programs, although
I cannot see any differences of principle; obviously the rules controlling
the formatting will be different, but I would be somewhat embarrassed if I
found that the way I prefer my programs to be formatted cannot be described
(at least approximately) by *any* set of rules. And even if one takes the
position that the programmer decides about line breaks, at least the
possibility should be considered that the programmer needs to insert a line
break into the source for some practical reason, without implying that she
wants a line break in the output as well (this is comparable to string breaks
in C and other languages, in order to allow long strings without newline
characters to be specified conveniently).

Also, I mentioned that noweb preserves line breaks even for code embedded in
a documentation part. Certainly in this case I don't think it is reasonable
to leave the decision about line breaks to the programmer, while the line
breaks in the remainder of the paragraph are decided mechanically.

|> In some languages (Miranda, Haskell, awk, Icon), line breaks and/or
|> indentation carry meaning, and to change them would be to change the
|> meaning of the user's program.

Much as I despise such features in programming languages, they do not imply
that automatic (re)formatting of programs is impossible. If the line breaks
and indentation of the source text carry meaning, then that meaning can be
extracted mechanically, and a prettyprinting program can then make sure that
the meaning is preserved in the reformatted version. This is of course very
language specific, but prettyprinting always is, isn't it? I am not aware of
the precise way that layout is significant in these languages, but I presume
that the regime is still flexible enough that a line break may be inserted
at any point if the need arises, as long as the indentation following the
break is chosen appropriately.

Marc van Leeuwen

Norman Ramsey

Apr 16, 1996, 3:00:00 AM

In article <Dpy59...@cwi.nl>, Marc van Leeuwen <ma...@cwi.nl> wrote:
>I have never heard about Oppen-style line-breaking before,
>but would be very interested to find out what it is.

@article{oppen:prettyprinting,
month=oct,
pages="465--483",
number="4",
author="Derek C. Oppen",
title="Prettyprinting",
journal=toplas,
year="1980",
volume="2"}

>As (La)TeX users know, a machine can
>routinely do a better job of formatting text than the author can, or wants to
>deal with, at least in the realm of ordinary (technical) text.

Horsefeathers. TeX commands like ~ \, \! exist precisely because the
machine isn't always smart enough to know where to insert space and
newlines. Lamport (of LaTeX) has written a technical report on how to
write long formulas (manually).

>I would be somewhat embarrassed if I
>found that the way I prefer my programs to be formatted cannot be described
>(at least approximately) by *any* set of rules.

Many programmers use spacing to highlight some fact about the
semantic structure of programs.
Here are a few trivial examples:

<<exported macro definitions>>=
#define emitfast0(BITS, SIZE) do { \
  if      ((SIZE) == sizeof(unsigned long))  emitfastul0((BITS)); \
  else if ((SIZE) == sizeof(unsigned))       emitfastu0((BITS)); \
  else if ((SIZE) == sizeof(unsigned short)) emitfastus0((BITS)); \
  else if ((SIZE) == sizeof(unsigned char))  emitfastuc0((BITS)); \
  else                                       emitm((BITS), (SIZE)); \
} while(0)
@
<<exported macro definitions>>=
#define emitfast2(BITS, SIZE) do { \
  register unsigned char *EMITFAST_p = currb_p; \
  ((SIZE) == sizeof(unsigned long))  ? Xemitfastul2((BITS)) : \
  ((SIZE) == sizeof(unsigned))       ? Xemitfastu2((BITS))  : \
  ((SIZE) == sizeof(unsigned short)) ? Xemitfastus2((BITS)) : \
  ((SIZE) == sizeof(unsigned char))  ? Xemitfastuc2((BITS)) : \
                                       emitm((BITS), (SIZE)); \
} while(0)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

case type(c) of {
  "Efitsu" : {f := "0x%x"; s := "unsigned"}
  "Efitss" : {f := "%d";   s := "signed"  }
  default  : impossible("width condition")
}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

every ipt := inputs_of(cons) do     ## emit bit fields
  case type(ipt.meaning) of {
    "field"   : put(pp, "$nunsigned " || ipt.name || ":" || fwidth(ipt.meaning)||";")
    "integer" : put(pp, "$nint "      || ipt.name || ":" || ipt.meaning        ||";")
  }
every ipt := inputs_of(cons) do     ## emit other inputs
  case type(ipt.meaning) of {
    "null"     : put(pp, "$nint "   || ipt.name ||";")
    "string"   : put(pp, "$nRAddr " || ipt.name ||";")
    "constype" : put(pp, "$n" || ipt.meaning.name || "_Instance " || ipt.name ||";")
    "field" | "integer" : &fail
    default    : impossible("input meaning")
  }

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

There is obvious structure (rules) here, but automatically discovering
what structure the programmer thought important enough to highlight in
this way is a nontrivial task---and, IMO, one not worth automating.
(Note also that in the first two examples changing line breaks would
change the meaning of the program.)

>And even if one takes the
>position that the programmer decides about line breaks, at least the
>possibility should be considered that the programmer needs to insert a line
>break into the source for some practical reason, without implying that she
>wants a line break in the output as well (this is comparable to string breaks
>in C and other languages, in order to allow long strings without newline
>characters to be specified conveniently).

Thereby having one semantics for the input and another for the output?
A dangerous idea...

>Much as I despise [semantically meaningful white space], they do not imply
>that automatic (re)formatting of programs is impossible. If the line breaks
>and indentation of the source text carry meaning, then that meaning can be
>extracted mechanically, and a prettyprinting program can then make sure that
>the meaning is preserved in the reformatted version. This is of course very
>language specific, but prettyprinting always is, isn't it? I am not aware of
>the precise way that layout is significant in these languages, but I presume
>that the regime is still flexible enough that a line break may be inserted
>at any point if the need arises, as long as the indentation following the
>break is chosen appropriately.

This thread started as a complaint about noweb's default behavior.
I hope I've made my case that the proper default is to leave the
author's spacing untouched. If you want to massage your code
automatically, you can write prettyprinters. It is easy to make
prettyprinters compatible with noweb, and several people have done
so. What is not so easy is working around tools that insist they
know best about newlines, indentation, and white space. They deprive
authors of the ability to use white space to help explain programs.
I think that's an ``anti-literate'' feature.

N

Marc van Leeuwen

Apr 17, 1996, 3:00:00 AM

In article <4l11hv$p...@labrador.cs.purdue.edu>, n...@cs.purdue.edu (Norman
Ramsey) wrote a long response to my article <Dpy59...@cwi.nl>, that I
don't want to comment on in detail, as I think the points on both sides
have been made sufficiently clear for a discussion in this forum. It was never my
intention to start an argument pro or contra prettyprinting; obviously both
approaches have supporters who would not feel comfortable with tools
supporting the other approach, and fortunately there are some tools
available for both parties, and even some taking an intermediate position.

The main point I wanted to make is of a more philosophical nature, namely
that the contrast between verbatim representation of programs and
prettyprinting more or less parallels that between WYSIWYG preparation of
text and structured markup. In both areas there will be styles of writing
that become either impossible or extremely tedious using the opposite
paradigm. For instance, an author who attaches special meaning to certain
vertical alignments between words in a paragraph is almost forced to use the
WYSIWYG approach; on the other hand, if one prepares text without knowing
formatting parameters such as page width beforehand, some form of structured
markup is more appropriate. I think the examples Norman gave to indicate
that spacing may be used to reveal some semantic fact about the program
illustrate this point nicely. On one hand they show that detailed control of
spacing can be put to a ``literate'' use, on the other hand they seem rather
laborious to prepare, and if much narrower margins would have to be
accommodated, that would cause a serious problem.

|> This thread started as a complaint about noweb's default behavior.
|> I hope I've made my case that the proper default is to leave the
|> author's spacing untouched. If you want to massage your code
|> automatically, you can write prettyprinters. It is easy to make
|> prettyprinters compatible with noweb, and several people have done
|> so. What is not so easy is working around tools that insist they
|> know best about newlines, indentation, and white space. They deprive
|> authors of the ability to use white space to help explain programs.
|> I think that's an ``anti-literate'' feature.

I don't think we have to be that dogmatic about things. No tool needs to be
everything to everyone. People who wish to use white space to help explain
programs should not try to work around the limitations of a tool that
ignores their spacing, but should choose another tool. People who regard
code as a formal specification of an algorithm, and who prefer explicit
commentary to informally indicate noteworthy properties of that algorithm to
human readers, can be quite happy with a system that extracts only the
formally significant aspects of the code parts of the source text (e.g.,
ignores white space if it has no semantic meaning in the programming
language) and displays the program so as to most clearly reveal that formal
structure. Neither point of view is implied by ``literate programming''.

Marc van Leeuwen

Christian Lynbech

Apr 17, 1996, 3:00:00 AM

>>>>> "Ernesto" == Ernesto Rico Schmidt <ne...@sbox.tu-graz.ac.at> writes:

Ernesto> In WEB you can have underscores "_" in Pascal identifiers; TANGLE then
Ernesto> converts them into something Pascal can accept (Pascal doesn't allow
Ernesto> underscores in identifiers) and puts everything in UPPERCASE (since
Ernesto> Pascal is not case-sensitive).

I know there have been other answers, but you might also want to consider
going in the direction of Knuth's original WEB.

There he insisted that the tangled Pascal code should be considered
*object code* rather than normal source. Thus the original tool (I don't
know about the more modern CWEB) produced not only the squashed identifiers
but also Pascal files without line breaks, or perhaps just filled as
close to 80 chars as was syntactically possible. The philosophy is
that you should never feel tempted to work with the generated Pascal
code. You work only with the WEB side (which of course could give you
problems in debugging).

As long as you use a consistent convention, and as long as you do not
want to rely on "violations" (such as having two distinct variables
which differ only in case, which IMHO is a highly questionable thing
to do anyway), you do not need to care (much) how the identifiers
actually tangle.
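
A quick way to check for such "violations" is to tangle every name and
look for clashes. The sketch below assumes the squashing rule described
in this thread (drop underscores, uppercase everything); squash and
find_collisions are made-up names, not part of any WEB tool.

```python
from collections import defaultdict


def squash(name: str) -> str:
    """Tangle a name the way the thread describes: drop '_' and uppercase."""
    return name.replace("_", "").upper()


def find_collisions(identifiers):
    """Group source identifiers that tangle to the same output name."""
    groups = defaultdict(set)
    for ident in identifiers:
        groups[squash(ident)].add(ident)
    return {out: sorted(srcs) for out, srcs in groups.items() if len(srcs) > 1}


names = ["my_procedure", "MyProcedure", "buffer", "Buffer", "count"]
for tangled, sources in find_collisions(names).items():
    print(tangled, "<-", ", ".join(sources))
```

With a convention that keeps case and underscores as written, squash
would be the identity and no clashes could be introduced by tangling.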

---------------------------+--------------------------------------------------
Christian Lynbech | Computer Science Department, University of Aarhus
Office: R0.32 | Ny Munkegade, Building 540, DK-8000 Aarhus C
Phone: +45 8942 3218 | lyn...@daimi.aau.dk -- www.daimi.aau.dk/~lynbech
---------------------------+--------------------------------------------------
Hit the philistines three times over the head with the Elisp reference manual.
- pet...@hal.com (Michael A. Petonic)

Paolo Ciccone

Apr 18, 1996, 3:00:00 AM

Marc van Leeuwen wrote:

> |> Noweb takes the reasonable position that the programmer is the best
> |> judge of where to put the line breaks and how much to indent code.
>
> I'd call that the WYSIWYG approach. As (La)TeX users know, a machine can
> routinely do a better job of formatting text than the author can, or wants to
> deal with, at least in the realm of ordinary (technical) text. I don't claim
> this automatically also holds for formatting of computer programs, although
> I cannot see any differences of principle; obviously the rules controlling
> the formatting will be different, but I would be somewhat embarrassed if I
> found that the way I prefer my programs to be formatted cannot be described
> (at least approximately) by *any* set of rules.

I agree with Norman that the programmer is the best judge. When I write a
portion of code I cannot avoid formatting it the way I like. That way is what
I consider readable code. I don't like the idea of printing it in a different
way because the typesetter "thinks" otherwise. In addition, it's a waste of
time and energy to add a set of rules to the program just to preserve what
you have already written. A pretty-printing routine should print the language
elements without changing the appearance of the code or substituting symbols,
as WEB does. When you present your code to another programmer and the
operators have been changed to something nicer but different, that will not
make a good impression on the reader.

Paolo
--
The opinions expressed here are exclusively my own

felix gaertner

Apr 19, 1996, 3:00:00 AM

In article <31767A...@borland.com>, Paolo Ciccone <pcic...@borland.com> writes:
>
> I agree with Norman that the programmer is the best judge [about the
> formatting of source code]. When I write a portion of code I cannot
> avoid formatting it the way I like. That way is what I consider
> readable code. I don't like the idea of printing it in a different
> way because the typesetter "thinks" otherwise. In addition it's a
> waste of time and energy to add a set of rules to the program just
> to preserve what you have already written. A pretty printing routine
> should print the language elements without changing the appearance
> of the code or substituting symbols like in WEB. When you present
> your code to another programmer and the operators are changed to
> something nicer but different, that will not have a good impact on
> the reader.

Besides enhancing the readability of source code by typographical
means (like different fonts and local spacing), prettyprinting has
one additional advantage: you can automatically transform source code
into a form that follows a set of formatting standards that a group of
people have agreed upon. This eases communication between these
people if they have to revise each other's code periodically.

It is a common experience that reading somebody else's source code is
a little painful if he/she uses an obviously different formatting
style. Using a prettyprinter in these cases can help. For example, if
you've read one CWEB program, you can easily adapt to others.

In addition, if you have a set of rules that describe your way of
formatting, well, hey, then it's easy to transform any code part
into a style which looks like yours -- and if this isn't readable
code, what else is? ;-)

Felix

felix gaertner

Apr 19, 1996, 3:00:00 AM

In article <Dpy59...@cwi.nl>, ma...@cwi.nl (Marc van Leeuwen) writes:
>
> I cannot see any differences of principle [between formatting
> technical text and computer programs]; obviously the rules
> controlling the formatting will be different, but I would be
> somewhat embarrassed if I found that the way I prefer my programs to
> be formatted cannot be described (at least approximately) by *any*
> set of rules.

The papers by Oppen (1980) and by Rose and Welsh (1981) cited below
show that the major aspects of prettyprinting (local formatting of
identifiers, indentation of code) can be expressed in the form of
``formatted grammars''. In short, everything that has to do with the
language syntax can be automatically performed by a prettyprinter.

But this is also the drawback: By merely using formatted grammars
(which are language grammars enhanced by formatting information) you
cannot do ``semantic'' formatting. This starts with typesetting user
types (e.g. `foo' in `class foo' for C++) as if they were reserved
words, continues with the problem of different layout styles of
identical constructs in different parts of the program, and ends with
the problem of aligning a row of assignments, for example (although
Woodman [1986] proposes a first attempt at dealing with this problem).
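
As an illustration of the last problem, the mechanical half of aligning a row
of assignments is easy to sketch in Python. The function `align_assignments`
and its `op` parameter are invented for this example; this is not Woodman's
method. What the sketch cannot do is decide which lines form a "row" that
should be aligned together: the caller has to pass exactly those lines in,
and that decision is the part a formatted grammar does not capture.

```python
def align_assignments(lines, op=":="):
    """Pad the left-hand sides so that `op` starts in the same column.

    Lines that do not contain `op` are returned unchanged.
    """
    split = [line.split(op, 1) if op in line else None for line in lines]
    # Widest left-hand side among the lines that really are assignments.
    width = max((len(parts[0].rstrip()) for parts in split if parts), default=0)
    out = []
    for line, parts in zip(lines, split):
        if parts is None:
            out.append(line)
        else:
            lhs, rhs = parts
            out.append(f"{lhs.rstrip():<{width}} {op} {rhs.strip()}")
    return out


if __name__ == "__main__":
    row = ["x := 1;", "count := 22;", "total_sum := 333;"]
    print("\n".join(align_assignments(row)))
```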

I have not seen any convincing theoretical attempt at dealing with
these problems in a `clean' and structured manner. Many prettyprinters
allow the use of switches to control certain aspects of the layout (as
does CWEBx, at least as far as I gather from what I have heard). Others
allow the placing of additional `semantic' formatting instructions
within the source code (like Knuth's `@/' or `@#'). And after working
on this subject for a while, I am in doubt whether there actually _is_
a way of specifying this on a more abstract level. (This might be a
little embarrassing for Marc then :-).

I've written a project report on the problems of prettyprinting which
contains a large bibliography on this subject. I am interested in
hearing from people who are actually working on `new' approaches in
this area.

Felix

PS: The report is available from me on request. Sorry, it's not on the
net. And sorry that my English language syntax is German ;-)

Bibliography:
-------------

@article{Oppen80,
author = {Oppen, Derek C.},
title = {Prettyprinting},
journal = "ACM Transactions on Programming Languages and Systems",
volume = 2,
number = 4,
month = oct,
year = 1980,
pages = {465--483},
}

@article{Rose81,
author = {Rose, G. A. and Welsh, J.},
title = {Formatted Programming Languages},
journal = "Software -- Practice \& Experience",
volume = 11,
number = {},
month = {},
year = 1981,
pages = {651--669},
}

@article{Rubin83,
author = {Rubin, Lisa F.},
title = {Syntax-Directed Pretty Printing --- A First Step Towards a
Syntax-Directed Editor},
journal = "IEEE Transactions on Software Engineering",
volume = "SE-9",
number = 2,
month = mar,
year = 1983,
pages = {119--127},
}

@article{Woodman86,
author = {Woodman, M.},
title = {Formatted Syntaxes and {Modula-2}},
journal = "Software -- Practice \& Experience",
volume = 16,
number = 7,
month = jul,
year = 1986,
pages = {605--626},
}

Dr E. Buxbaum

Apr 19, 1996, 3:00:00 AM

As far as I know, the underscore IS legal in Pascal names (at least
in every compiler I ever worked with; I don't know about the standard).

Dave Love

Apr 25, 1996, 3:00:00 AM

>>>>> On 15 Apr 1996 21:16:18 -0500, n...@cs.purdue.edu (Norman Ramsey) said:

Norman> You really want an Oppen-style line-breaking algorithm, but
Norman> I don't know of any standard implementations. Too bad,
Norman> because it's simple dynamic programming.

I don't know about `standard', but in the functional programming
world there are tractable Oppen-like (but different) systems in
Paulson's `ML for the Working Programmer' and in John Hughes'
prettyprinting library, which is distributed with Haskell compilers
<http://www.cs.chalmers.se/~rjmh/Papers/pretty.html>. Hughes' paper
is interesting per se, and Paulson's treatment is potentially useful
as a tutorial.
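
For readers who have not seen this style of prettyprinter, here is a very
small sketch in Python of the general idea: describe the text as a document
with optional break points, then choose per group whether it is laid out flat
or broken. This is not Oppen's algorithm, Hughes' library, or Paulson's code;
all names (`Text`, `Line`, `Nest`, `Concat`, `Group`, `render`, `call`) are
invented for the illustration. It uses a simplified greedy rule that only asks
whether a group fits in the remaining width and ignores any text that follows
the group on the same line.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Text:
    s: str


@dataclass(frozen=True)
class Line:
    """Possible break: a space when laid out flat, else newline plus indent."""


@dataclass(frozen=True)
class Nest:
    indent: int
    doc: "Doc"


@dataclass(frozen=True)
class Concat:
    parts: Tuple["Doc", ...]


@dataclass(frozen=True)
class Group:
    doc: "Doc"


Doc = Union[Text, Line, Nest, Concat, Group]


def flat_width(doc):
    """Width of `doc` when every Line in it is rendered as a single space."""
    if isinstance(doc, Text):
        return len(doc.s)
    if isinstance(doc, Line):
        return 1
    if isinstance(doc, (Nest, Group)):
        return flat_width(doc.doc)
    return sum(flat_width(part) for part in doc.parts)


def render(doc, width=80):
    """Lay out `doc`, keeping a group on one line whenever it fits."""
    out, column = [], 0
    stack = [(0, False, doc)]  # entries are (indent, flat?, document)
    while stack:
        indent, flat, d = stack.pop()
        if isinstance(d, Text):
            out.append(d.s)
            column += len(d.s)
        elif isinstance(d, Line):
            if flat:
                out.append(" ")
                column += 1
            else:
                out.append("\n" + " " * indent)
                column = indent
        elif isinstance(d, Nest):
            stack.append((indent + d.indent, flat, d.doc))
        elif isinstance(d, Concat):
            for part in reversed(d.parts):
                stack.append((indent, flat, part))
        else:  # Group: flat if the enclosing group is flat or this one fits
            fits = flat or column + flat_width(d.doc) <= width
            stack.append((indent, fits, d.doc))
    return "".join(out)


def call(name, args):
    """Build a document for a call such as f(a, b, c), breakable after commas."""
    inner = []
    for i, arg in enumerate(args):
        if i:
            inner += [Text(","), Line()]
        inner.append(Text(arg))
    body = Nest(len(name) + 1, Concat(tuple(inner)))
    return Group(Concat((Text(name + "("), body, Text(")"))))


if __name__ == "__main__":
    doc = call("f", ["alpha", "beta", "gamma"])
    print(render(doc, width=80))
    print(render(doc, width=10))
```

With a generous width the call stays on one line; with a narrow width every
break in the group is taken and the arguments are indented under the opening
parenthesis.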