I'm a convinced noweb user, and I would like to address the language
independence issue: I'm wondering whether noweb is really as language
independent as is claimed.
I'm using noweb to maintain a library of filters (much the same
way noweb itself operates) in multiple languages. These include
C, Perl, AWK, sh, and Prolog. These filters are documented in
a SINGLE noweb file, and I refuse to split them up by
language. I switch languages whenever I find it
appropriate. As a result, a typical language sequence
looks like "C AWK C Perl C AWK Prolog C". As you can see, the
C chunks occur in multiple places between other chunks. Isolating
them would require a separate noweb file for each, which
hampers maintainability and mixes formatting issues
with programming issues.
In my opinion, the following conditions have to be fulfilled
to obtain true language independence:
1) Executable code, yes/no
One has to be able to distinguish between executable
files and files that aren't. One has to be able to make shell
scripts executable from within the literate programming source,
not with external scripts possibly part of a makefile.
I solve this using a LaTeX DSC (Document Structuring
Conventions) comment.
Making a script executable is done with the LaTeX comment
%unix chmod +x filename. A noweb filter extracts these
instructions and executes the commands after the tangling phase.
This could also be achieved by executing a script that is built
up within the literate programming source and removed in the
weaving phase with the elide filter. I prefer the LaTeX DSC way
because it needs less typing.
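The extraction pass described above can be sketched in a few lines. This is a
hypothetical sketch, not Francky's actual filter: "doc.nw" is a placeholder
for the web, and it assumes the %unix comments start at the beginning of a line.

```shell
# Hypothetical post-tangle pass (a sketch; "doc.nw" stands for your web):
# collect every "%unix <command>" LaTeX comment from the noweb source
# and hand the commands to the shell after tangling.
awk '/^%unix / { sub(/^%unix /, ""); print }' doc.nw | sh
```

Anything the shell can do (chmod, install, symlink) can then be requested
from inside the literate source this way.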
2) line directives.
Each language has a specific syntax for line directives.
One should be able to specify this within the literate programming
source, not through tangling (notangle) options that apply to
the complete literate programming source.
I have no solution for this. It could be done through LaTeX DSC
or based on the extension of the code chunk.
3) (optional) pretty printing. Suppose that literate programming
filters were available for all the languages mentioned (leaving
us something to dream about). In that case one would need a method
to differentiate between the languages of the code chunks.
Conclusion: to obtain language independence within a single file
that uses multiple languages, one has to be able to specify the language
of a specific chunk, and one must be able to specify whether a chunk is
executable or not.
Hoping to initiate/trigger a discussion on this topic,
Best regards,
Francky
_____________________________________________________________________________
Francky Leyn                       URL: http://www.esat.kuleuven.ac.be/~leyn/
K.U.Leuven - ESAT MICAS            E-mail: Franck...@esat.kuleuven.ac.be
Kardinaal Mercierlaan 94 - 91.21   Tel: ++32 - (0)16 32.10.85
B-3001 Heverlee - Belgium          Fax: ++32 - (0)16 32.19.75
_____________________________________________________________________________
> I'm a convinced noweb user, and I would like to address the language
> independence issue: I'm wondering whether noweb is really as language
> independent as is claimed.
>
> I'm using noweb to maintain a library of filters (much the same
> way noweb itself operates) in multiple languages. These include
> C, Perl, AWK, sh, and Prolog. These filters are documented in
> a SINGLE noweb file, and I refuse to split them up by
> language. I switch languages whenever I find it
> appropriate. As a result, a typical language sequence
> looks like "C AWK C Perl C AWK Prolog C". As you can see, the
> C chunks occur in multiple places between other chunks. Isolating
> them would require a separate noweb file for each, which
> hampers maintainability and mixes formatting issues
> with programming issues.
>
> In my opinion, the following conditions have to be fulfilled
> to obtain true language independence:
>
> 1) Executable code, yes/no
> One has to be able to distinguish between executable
> files and files that aren't. One has to be able to make shell
> scripts executable from within the literate programming source,
> not with external scripts possibly part of a makefile.
I disagree here. The task of tangling is deliberately separate from the
task of compiling (or marking a shell script as executable), because
they are two separate steps in the programming process. The purpose
of make is to automate this process, so IMO the chmod belongs in the
makefile, after the tangle step.
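A minimal makefile fragment makes the point concrete. This is a sketch only:
the web name "library.nw" and the root chunk name "nocount" are illustrative,
not taken from either poster's actual setup.

```make
# Hypothetical rule: tangle the script out of the web, then mark it
# executable -- chmod lives in the makefile, not in the web.
# (Recipe lines must begin with a tab.)
nocount: library.nw
	notangle -Rnocount library.nw > nocount
	chmod +x nocount
```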
BTW, there's no real reason you can't include your makefile as part of
the web, too. I'll admit that it does lead to a philosophically
interesting "chicken and egg" problem, but it's not really a problem
in practice (much like writing the compiler for a new language in the
language itself).
>
> 2) line directives.
> Each language has a specific syntax for line directives.
> One should be able to specify this within the literate programming
> source, not through tangling (notangle) options that apply to
> the complete literate programming source.
Unless you use "noweb" instead of "notangle" to tangle your webs, this
shouldn't be a problem (except for those languages that don't support
any kind of #line equivalent). Again, this is a job for the makefile,
and one that make handles quite well with rules. You can make a
.nw.pl rule to change .nw files into .pl scripts, supplying the proper
-L argument in the notangle command to generate the appropriate Perl
directives.
If you don't want .pl as an extension on your executable Perl scripts,
you can have another rule that simply renames the .pl file without
the extension (and, perhaps, chmod's it). It takes more steps, but
the computer's doing the work, not you, and it won't mind (and human
time is much more valuable than computer time).
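The suffix rule described above might look as follows. This is a sketch under
stated assumptions: notangle's -L option takes a format string (%F for the
source file, %L for the line number, %N for a newline), and Perl honors
C-style #line comments; the file names are illustrative.

```make
# Sketch of a .nw -> .pl rule: tangle, emitting Perl-compatible line
# directives so runtime errors point back into the web.
# (Recipe lines must begin with a tab.)
.SUFFIXES: .nw .pl
.nw.pl:
	notangle -R$@ -L'#line %L "%F"%N' $< > $@
```

A further rule can then strip the .pl extension and chmod the result, as
described above.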
> 3) (optional) pretty printing. Suppose that literate programming
> filters were available for all the languages mentioned (leaving
> us something to dream about). In that case one would need a method
> to differentiate between the languages of the code chunks.
This is a tougher problem. The best solution, as you say, seems to be
to put the name of the language into the chunk name, perhaps using the
nocond syntax, i.e., ((Perl)).
-- Lee
------------------------------------------------------------------------
Lee Wittenberg |
Computer Science Department | Routine is the death of security.
Kean University |
Union, NJ 07083 | -- Donald Westlake
| "Smoke" (1995)
le...@samson.kean.edu |
------------------------------------------------------------------------
| In my opinion, the following conditions have to be fulfilled
| to obtain true language independence:
|
| 1) Executable code, yes/no
| One has to be able to distinguish between executable
| files and files that aren't. One has to be able to make shell
| scripts executable from within the literate programming source,
| not with external scripts possibly part of a makefile.
I see the chmod as something that would be better off in the makefile;
I think noweb's only responsibility should be to create the documents.
Why do you think it has to be done by noweb?
| 2) line directives.
| 3) (optional) pretty printing.
I would add 4) indexing.
Some of us brought this up a few months ago. Norman Ramsey introduced
the @language keyword to the noweb intermediate representation for
this purpose, though I don't believe anyone has done anything with it
yet. See the noweb Hacker's Guide at
<http://www.cs.virginia.edu/~nr/noweb/guide.html>.
| Conclusion: to obtain language independence within a single file
| with multiple languages used, one has to be able to specify the language
| of a specific chunk
Agreed.
| and one must be able to specify whether a chunk is executable or not.
I think this is overkill, though, as I mentioned above.
I should note that Norman seems to be rather against the idea of the
noweb source explicitly naming what language each root chunk (and by
implication each chunk) is in, I guess because of the complexity that
introduces. I disagree, but hey, it's his system. :)
--
Dan Schmidt -> df...@harmonixmusic.com, df...@alum.mit.edu
Honest Bob & the http://www2.thecia.net/users/dfan/
Factory-to-Dealer Incentives -> http://www2.thecia.net/users/dfan/hbob/
Gamelan Galak Tika -> http://web.mit.edu/galak-tika/www/
>1) Executable code, yes/no
> One has to be able to distinguish between executable
> files and files that aren't.
This is a Unix-ism, and in my opinion, a red herring. If your
operating system is so broken that it can't recognize an executable
script without being told chmod +x mumble, it's not noweb's job to fix
it.
>2) line directives [different for each language]
>3) (optional) pretty printing [different for each language]
and also
4) Indexing for each language.
The architecture of noweb supports this, but nobody has gone the extra
mile to make it work. Here's a sketch of what's needed:
A) Identify the programming language used in each chunk.
For preference I would do that as:
A.1) Identify the languages used in each root chunk.
A.2) Propagate the information from uses to defs
For A.1, the cool way to do it is to look at the tokens in the
chunk and identify the language that way. The easy way to do it
is to develop a naming convention for root chunks. It would be
sensible to use the same conventions that are used in Makefiles, e.g.,
<<foo.c>>, <<foo.h>> C
<<foo.m3>>, <<foo.i3>> Modula-3
<<foo.sml>>, <<foo.sig>> Standard ML
<<foo.icn>> Icon
And for languages that don't have Makefile conventions, one could
play a couple of games with naming:
<<perl: htmltoc>> Perl Script named htmltoc
<<awk: noidx>> Awk script named noidx
<<sh: nocount>> Bourne Shell script named nocount
And so on.
If someone reading this group would care to coordinate an effort
to develop a naming convention, I will enshrine it in the
Hacker's Guide.
Then somebody has to write filters A.1 and A.2, which will
decorate each code chunk with an @language directive.
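An A.1 filter based on the naming convention could be sketched as a single
awk pass over the noweb pipeline representation. This is a hypothetical
sketch, not an existing noweb tool: it guesses the language from the
extension in each @defn line and emits an @language directive right after it
(the exact placement of @language within a chunk would need checking against
the Hacker's Guide).

```shell
# Hypothetical A.1 filter (a sketch): read noweb pipeline markup on stdin,
# pass everything through, and tag root chunks whose names look like
# file names with an @language directive based on the extension.
awk '
    { print }
    /^@defn .*\.c$/   { print "@language c" }
    /^@defn .*\.pl$/  { print "@language perl" }
    /^@defn .*\.awk$/ { print "@language awk" }
'
```

An A.2 filter would then propagate the directive from uses to the chunks
they name, which requires a second pass over the file.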
Then, to implement the desiderata above, we need
For 2), a more intelligent version of the noweb script. I will try
to include this as part of noweb 3, which should include support for
making this lightning fast (ha ha ha).
For 3), people who write prettyprinters should avoid touching chunks
that are labelled with an incompatible @language directive. (By
continuing to prettyprint unlabelled code chunks, they will preserve
existing behavior in cases where the language hasn't been determined
explicitly.)
For 4), if somebody prods me, I will modify autodefs to avoid
touching chunks that are labelled with an incompatible @language
directive. In a more ambitious world, I should make finduses
sensitive to @language as well---and if I have a really good day, I
should combine all autodefs into a single filter (don't hold your
breath).
Note that I have no time for any of this :-)
Norman
One idea that occurs to me is to create a noweb comment string such
that (for instance @#) turns the rest of the line into a typeset
comment; i.e.,
<<*>>=
int x; @# Typeset comment to be placed to the right of the code.
Where @#... could be ignored completely by the tangle, but processed
and typeset by the weave.
Is this a practical idea? Any other suggestions?
stephen
| I use both noweb and cweb, and find that the one feature of cweb that I
| miss in noweb is the typeset comments. Is there any plan to
| incorporate a comment typesetting facility into noweb? I don't want
| full pretty-printing.
dpp, my C/C++ pretty-printer for noweb, has an option to feed comments
directly into TeX. It would not be hard to have yet another mode in
which comments are TeX'ed but the rest of the source is not
pretty-printed at all. I'll add it if there's demand.
Such a facility is relatively easy to incorporate using noweb filters.
> One idea that occurs to me is to create a noweb comment string such
> that (for instance @#) turns the rest of the line into a typeset
> comment; i.e.,
>
> <<*>>=
> int x; @# Typeset comment to be placed to the right of the code.
>
> Where @#... could be ignored completely by the tangle, but processed
> and typeset by the weave.
I'm wildly opposed to this idea. Please use the standard comment
syntax of your programming language. For example, if you're writing
C, something like the following awk filter might work:
/^@begin docs / { code = 0 }
/^@begin code / { code = 1 }
code && /^@text .*\/\*.*\*\// {
if (match($0, /\/\*.*\*\//)) {
printf("%s\n", substr($0, 1, RSTART-1))
printf("@literal \\comment{%s}\n", substr($0, RSTART, RLENGTH))
printf("@text %s\n", substr($0, RSTART + RLENGTH))
next
}
}
{ print }
Put a suitable definition of \comment in your TeX code and you're all set.
Of course, you have to work harder if you want to avoid picking up
comments in strings, comments split across multiple lines, etc.
If anybody gets this to work, please let me know so I can put it in
the noweb FAQ.
Norman
> I use both noweb and cweb, and find that the one feature of cweb that I
> miss in noweb is the typeset comments. Is there any plan to
> incorporate a comment typesetting facility into noweb? I don't want
> full pretty-printing.
>
> One idea that occurs to me is to create a noweb comment string such
> that (for instance @#) turns the rest of the line into a typeset
> comment; i.e.,
>
> <<*>>=
> int x; @# Typeset comment to be placed to the right of the code.
>
> Where @#... could be ignored completely by the tangle, but processed
> and typeset by the weave.
>
> Is this a practical idea? Any other suggestions?
This is unnecessary. You don't have to fiddle with noweb at all, much
less add extra features. All you need to do is write a fairly simple
filter that replaces all comments in code chunks (@text's between
@begin and @end code's) with @literal lines invoking whatever TeX
macro you decide to create. For example, the (contrived) C code line
@text dist = sqrt(dx*dx+dy*dy); /* Compute the distance ($\sqrt{\Delta x^2+\Delta y^2}$) */
would become
@text dist = sqrt(dx*dx+dy*dy);
@literal \C{Compute the distance ($\sqrt{\Delta x^2+\Delta y^2}$)}
(assuming you use CWEB's \C macro for setting your comments), and will
be typeset properly. Noweb's @nl's will take care of line breaks, as
usual. If you want to get a bit more sophisticated, you can figure
out how to deal with multi-line comments, but a simple filter should do
the trick. You probably won't need more than a dozen lines of Awk,
Perl, or Icon.
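The filter described above can be sketched in awk along the following lines.
This is an untested sketch, not the poster's actual filter: it handles only
single-line /* ... */ comments in C code chunks, assumes CWEB's \C macro is
defined on the TeX side, and ignores comments inside strings.

```shell
# A minimal sketch of the comment-to-@literal filter: read noweb pipeline
# markup on stdin; inside code chunks, replace a trailing /* ... */ comment
# on an @text line with an @literal line invoking the \C macro.
awk '
    /^@begin code/ { code = 1 }
    /^@begin docs/ { code = 0 }
    code == 1 && /^@text / && match($0, /\/\*.*\*\//) {
        print substr($0, 1, RSTART - 1)          # code before the comment
        printf "@literal \\C{%s}\n", substr($0, RSTART + 2, RLENGTH - 4)
        rest = substr($0, RSTART + RLENGTH)      # anything after the comment
        if (rest != "") print "@text " rest
        next
    }
    { print }
'
```

Plugged into the weave with noweave's filter mechanism, this leaves tangling
untouched, exactly as the poster argues it should.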