January 29, 2007
* Second Draft *
Rationale
=========
Problem
-------
Most Forth users find it useful to have a block comment -- a comment
command that covers multiple lines. There are many names for this
because there is no standard.
Names include (( (* /* ( etc.
One standard portable approach is 0 [IF] ..... [THEN] This works on
every standard Forth that implements [IF] [THEN] but it is ugly. It's
easy to rename 0 [IF] to, say, [DOCUMENTATION] and still end with
[THEN] . This solution is not generally accepted as shown by the
existence and use of so many alternatives.
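As a minimal sketch of the renaming mentioned above, assuming a system that provides [IF] and [THEN] from TOOLS EXT (the name [DOCUMENTATION] is only the illustrative one used in this paragraph, not a standard word):

```forth
\ Sketch only: wrap 0 [IF] in an immediate word with a friendlier name.
: [DOCUMENTATION] ( -- )  0 POSTPONE [IF] ; IMMEDIATE

[DOCUMENTATION]
Everything here is skipped word by word, across lines,
until the matching terminator is reached.
[THEN]
```

Because [IF] still tracks nested conditionals while it skips, a stray [ELSE] in the commented text would resume interpretation, which is one more reason this approach is awkward for free-form prose.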
Current Practice
----------------
Many Forth systems provide this capability. I am not yet aware of two
that use the same name.
Solution
--------
Agree on a common name. The more Forth systems that provide a standard
name, the more code will use it and the less trouble it will be
porting that code. Code that uses some other name will still run on
every system that supports it and software can be easily written to
support it elsewhere, but to the extent we use a standard name that
bother can be avoided. All that is needed is that we agree on the
name.
Proposal
========
15.6.2.xxxs (( "comments" TOOLS EXT
Compilation: Perform the execution semantics given below.
Execution: ( "ccc" -- )
Parse text until the word
))
is found or until the current parse area cannot be refilled.
Discard the found )) string. (( is an immediate word.
An ambiguous condition exists if the word )) is not found and the
parse area cannot be refilled.
Typical use
-----------
(( This is a comment, it requires supporting statements.
These statements can go on their own lines.
There might be lots of them. ))
(( ---------------------------------------------------------------
People have lots of styles for commenting.
They like to make their comments stand out as obvious comments.
--------------------------------------------------------------- ))
( ----------------------------------------------------------------)
( They could do that with just ( as traditionally defined. )
( But usually they don't. )
( ----------------------------------------------------------------)
Reference Implementation
------------------------
: MAYBE-REFILL ( S: -- flag ) \ refill if at end of line, 0 if can't
SOURCE NIP >IN @ > DUP 0= IF DROP REFILL THEN ;
: (( ( S: -- ) \ multi-line comment, ended by ))
BEGIN
[CHAR] ) PARSE 2DROP BL WORD COUNT S" )" COMPARE WHILE
MAYBE-REFILL WHILE
REPEAT
." Multiline comment needed )) and never found it -- ambiguous condition."
CR ." Your system can do anything at this point and still be a standard system."
THEN ; IMMEDIATE
WORD can be replaced by PARSE-NAME when the time comes.
It would not be difficult to make a CREATE DOES> word which defines
new words like (( with new ending strings different from )) . I may do
that.
Test Cases
----------
(( 123 . )) 456 .
(( 123 .
456 .
789 . )) .( aaa )
: foo (( 123 .
456 . )) 789 . ;
foo
Remarks
-------
I don't care what name we use provided we can agree on one.
I dislike using /* because there might be some use for files where
Forth code is in C comments and vice versa.
/*
Forth code here
((
*/
C code here
/*
))
More Forth code here
Etc.
That isn't a major concern but /* isn't my top choice.
I don't like to just extend ( ) . One typo in the wrong place with a
looming deadline and you could spend 5 minutes tearing your hair out
figuring out what happened. Also, your multiline comment can't include
a ) which greatly reduces its value.
I kind of like Marcel Hendrix's (* . It encourages intense visuals.
(* ******************************************
* This is a bold comment! *
****************************************** *)
(( )) seems bland and inoffensive, something we could all agree on.
If you prefer something else that's fine.
Mishaps
-------
My first day using APL, somebody came by my keyboard while I was
getting a drink of water, and they started the command that gets a
user string. When I came back nothing I did could get the computer to
notice me. I tried to logout. I tried everything I could think of. I
was stuck. Then somebody suggested I end the string.
Extended comments in files or EVALUATE strings are self-limiting. But
not at the keyboard. If you accidentally get into a comment and the
system seems to be frozen, how long will it take for you to remember
to try )) ? Does your system provide another way to get out of that?
Should and could there be a standard way out?
If there is no obvious way out on your system, a word very similar to
(( could become a password. Type LOCK and when you get a drink of
water the system will ignore all keyboard input until you type XLERB
or whatever you choose for a password. Could that be used maliciously?
Could someone sneak in a bit of Forth code that gets executed during
keyboard entry, and your keyboard freezes until you guess the password
or you abort the Forth?
What if input is redirected to some other source, a streaming source,
and somehow it all gets commented out. All input from that source will
be ignored until it sends the end-comment signal or until you break
the connection. The foreign source can't even tell you that they're
done. Your Forth system hears them but it isn't listening. Is this a
problem?
And I would very much appreciate not having it named /( or, indeed,
anything else beginning with forward-slash.
I was trying to rewrite some code to make it more portable. It had
multiline comments. I wrote a quick routine to handle them. No big
deal, just one more routine cluttering up the files and the system, so
you can use multiline comments that are basically just like everybody
else's multiline comments but with a different name.
It looked like a good thing to standardise, not any big deal but it
saves time whenever you start to port a file and it doesn't have yet
another brand new name for multiline comments.
> I already have ---[ and will keep on using it
Maybe I should get to work on that CREATE DOES> routine to make new
multiline comments with a single short line of code. Thank you for
being frank about it.
Forth is capable of keying on
2 or more spaces and
other non sense stuff , to key on
comments .
[ B_Register45_99 should be zero ]
/ Comments
Notice how i can use [/] for "line above"
Forth parses spaces and that by itself
is a comment , then sees "and ", and its final !
The solution to stop this debate , is to stop
making precise rules and forcing them on
your opponent !
There is no need to keep Forth parsing
so stupid .
Forth only parses TIB when interacting
with humans , not at run time .
Stop killing Forth with rules ...
A programmer makes his own rules
as he writes .
________________________________________
Here's how I do it in my string parsing library, parsing.fs:
: s-input-after ( s -- flag )
(
Advance the input stream to just after the next occurrence of
the string across lines, and leave true if found. If the string
is not found before the end of the file, leave the input stream
positioned there with false. Based on the $> reference in
PARSE-AREA@.
)
( s) 2>r
BEGIN
parse-area@ 2r@ ( area.s s)
search ( area.s false | &found #rem true)
0= WHILE \ not found
( area.s) 2drop
refill 0= IF \ end of file
( s) 2r> 2drop false EXIT THEN
REPEAT \ found
( &found #rem) 2r> ( s) nip /string parse-area! true ;
: (( ( -- ) s" ))" s-input-after drop ; immediate
I've tried a lot of names for a multiline comment word, and
didn't find many of them very aesthetic. My current favorite
is
: --- ( -- ) s" ---" s-input-after drop ; immediate
Here's an example of its use:
: next-input-name ( "<name>" -- word.s ) \ Wil Baden's NEXT-WORD
---
Parse the next word from the input stream across lines. If
word.s is empty, at most whitespace was found. A parsing
implementation of Wil Baden's NEXT-WORD.
NOTE: The implementation does not echo CR's when the lines
in question are being entered interactively at the terminal.
---
BEGIN parse-name dup IF EXIT THEN refill
WHILE 2drop REPEAT ;
I would not vote against (( if there's substantial support for
it.
-- David
Note that ---[ is a different approach ... that each line is a comment
until an explicit \ comment line indicates that Forth code has
restarted. It is essentially for Forth code embedded in text
commentary, rather than for commentary embedded in Forth code.
Extremely ugly.
> It's easy to rename 0 [IF] to, say,
> [DOCUMENTATION] and still end with [THEN] . This solution
> is not generally accepted as shown by the existence and use
> of so many alternatives.
The situation has not been properly factored. There does not
need to be a new block comment word. There needs to be a word to
tell the system to start interpreting or compiling Forth code
and when to tell it to stop. Everything outside these limits
will be comments. This will allow whatever is needed to print a
Forth program as a document with diagrams, mathematical symbols,
and even as a foreign language that does not use the ASCII Latin alphabet.
--
Michael Coughlin m-cou...@comcast.net Cambridge, MA USA
> The situation has not been properly factored. There does not
> need to be a new block comment word. There needs to be a word to
> tell the system to start interpreting or compiling Forth code
> and when to tell it to stop. Everything outside these limits
> will be comments. This will allow whatever is needed to print a
> Forth program as a document with diagrams, mathematical symbols,
> and even as a foreign language that does not use the ASCII Latin alphabet.
To do that without anything new, you could take your block comment
command and simply start parsing the file with it. Then when you're
ready to compile code you back out of the block comment until you're
ready to comment again.
If you want you could make a word similar to INCLUDED which starts out
in comment mode. That looks extremely easy to me too, the only concern
being that some rare source code might actually start out with code.
So if we have a standard syntax, then it makes sense for files to
start out with the command that starts a block comment, and it makes
sense for file interpreters to start out in comment mode.
What is the difference between what you want and what I want?
> To do that without anything new, you could take your block comment
> command and simply start parsing the file with it. Then when you're
> ready to compile code you back out of the block comment until you're
> ready to comment again.
I find it more natural to have a word that fits into the context of
the non-Forth text.
The difference between text/forth mixing words and block-comment words
is essentially one of the extent of non-interpreted text that we have in
mind. A block comment word is trying to embed a comment block in Forth
source ... so it is reasonable to think of the beginning of the block
to be somewhere in view, and even, as with the (* *) naming that I
prefer, with the comment block looking like
(* *****************************
* * If this was a real comment block,
* * this would say something useful
* *)
And note here that in this text/forth mixing style, it is possible to
quote a comment block without confusing anything ... or even code, if
you block indent the code.
> If you want you could make a word similar to INCLUDED which starts out
> in comment mode. That looks extremely easy to me too, the only concern
> being that some rare source code might actually start out with code.
I'm happy with the source starting with the word that is intended to
put it into literate mode ... whether ---[ or <HTML>
OTOH, there is no need to standardise either ---[ or <HTML> ... a
collection of texts that use that can easily include those in its core
toolkit.
> 15.6.2.xxxs (( "comments" TOOLS EXT
[..]
"((" was used in F-PC for something slightly different. AFAIR, it
switched between comments and compiling, and "))" was a word.
The current Win32Forth has "((" like you propose, so that's a plus.
iForth has inherited (( and )) from FysForth, where it is used
as a method to catapult TO for array indexing.
E.g. 12 TO (( 3 )) mydata puts 12 into mydata[3] . I myself do
not object to ignoring this use of (( )) because it is better to
promote the FSL array / matrix notation.
I must point out that your reference implementation does not seem
to allow for nested comments, which can be useful to block out
larger parts of source code (which [if] [then] supports).
Win32Forth has a problem there too:
(( this should (( work )) ))
^^
Error(-13): )) is undefined
BTW: so does iForth :-(
-marcel
> I find it more natural to have a word that fits into the context of
> the non-Forth text.
I may be missing something here. It sounds like you're talking about
an esthetic thing, about the form of the word that switches between
text and Forth. We're talking about the exact same functionality and
the difference is a psychological one, something to do with how we
choose to look at it?
> I'm happy with the source starting with the word that is intended to
> put it into literate mode ... whether ---[ or <HTML>
>
> OTOH, there is no need to standardise either ---[ or <HTML> ... a
> collection of texts that use that can easily include those in its core
> toolkit.
I see a big variety of names for something that's essentially the same
functionality. If some name gets declared standard and more people use
it, that reduces the work of porting code.
It isn't necessary that everybody use the same name, but to the extent
that people do settle on one name it makes things a little easier for
porting.
> I must point out that your reference implementation does not seem
> to allow for nested comments, which can be useful to block out
> larger parts of source code (which [if] [then] supports).
Someone else commented on that too. Thank you. I'll fix it.
Too similar to the forth operator */
Anything more than 2 characters becomes a nuisance. Alphabetic
characters raise issues of case. Using the same character e.g. (( is
prone to mistyping as well as not standing out enough.
Of the ones proposed (* is perhaps most recognizable even by
those that don't use Pascal. In forth, it could be said to build upon
the other comment character (
Thats why i improved Forth to be always
in the RUN mode and never in a COMPILE
mode . It is impossible to crash .
It allowed me to toss [State] and
many other DEFERRED Words .
You DO NOT TELL Forth to
1) stop
2) slow down
3) defer
4) change the order of a DoCol
5 ) [ Compile]
6) use stack to "compile" Words
No matter what keys or Forth commands
you type , it wont lock up , even if you're
testing code !
Testing code uses the assembler
to create the OpCodes , so if you misspell
a mnemonic , it will assemble the closest
Match and start tracing code .
At High level , you dont know you are
using an assembler !
No excuses ...
Comments are more important than code. This is an idea that
Forth programmers cannot grasp. I know since I try to explain
this all the time and get nowhere.
Now that we have more powerful computers, we can write
program source that is meant to be understood by human beings
and the computing system can quickly skip thru it to execute the
computer code. Doing this when programs were written on punch
cards was very inconvenient. It made the source decks too big to
handle.
(* and <HTML> do not have exactly the same functionality ... <HTML>
with friends allows you to view the same file as an html text that you
load as a Forth source.
(* and ---[ do not have exactly the same functionality ... ---[ allows
you to quote any snippet of Forth code in the text description, without
restriction on the contents of the code, simply by block indenting. (*
is always sensitive to *)
On the other hand, I'm a big fan of commenting, and if (* *)
encourages more commenting, I'm all for it.
I much prefer (* *) over (( )).
-- David
> > What is the difference between what you want and what I want?
>
> Comments are more important than code. This is an idea that
> Forth programmers cannot grasp. I know since I try to explain
> this all the time and get nowhere.
>
> Now that we have more powerful computers, we can write
> program source that is meant to be understood by human beings
> and the computing system can quickly skip thru it to execute the
> computer code. Doing this when programs were written on punch
> cards was very inconvenient. It made the source decks too big to
> handle.
I see a deep philosophical meaning here. But what operational
difference does it make in the routine to switch between comments and
code?
Not if those characters are the same. For me it's easier to type ((( than (*,
which in fact seems to be tailor-made for american keyboard layouts.
The main point of literate programming is that you never have to look
for the documentation of a source file ... it's in the source file
itself.
With many languages, that requires special tools to mix and unmix the
documentation and the code, but defining a literate programming
wordset that allows Forth code to be embedded in its own documentation
is a one-screener.
Now, if the code is embedded in the documentation, then you do not
want to have the start of the Forth segment determined by some two-
character sequence defined in terms of the name of a block comment
word that will often lie pages further up in the file.
So the literate programming "skip the next stuff" words are not
defined as "opening and close delimiters". If you choose to make
<code> the "start interpreting" token, then you can simply define:
\ Skip interpreting until <code>
: </code> ( --- ) <HTML> ;
and in a case-sensitive forth implementation, you define:
\ the <html> tag is case insensitive
: <HTML> <html> ;
And with the "section title" word, ---[, if you want to have a
"chapter title" and "subsection title" to support automatic conversion
of the plaintext-layout file into html, then you define:
: ----[ ---[ ; \ chapter title
: --[ ---[ ; \ subsection title
The switching into Forth interpretation mode would always be when
there is a "\" comment starting in the first position of the line, and
the name of the section title word doing the work (when INCLUDED) of
passing over the documentation until the Forth code is found does not
matter. That means that the name of that Forth word is able to be
something that is useful for the documentation.
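One way such a section-title word could behave is sketched below. This is a guess at the mechanism described above, not the poster's actual code: it abandons the rest of the title line, then refills line by line until a line begins with "\" in the first position, and lets the interpreter resume there.

```forth
\ Hypothetical sketch of a section-title word; not the poster's code.
\ Skips the rest of the title line and every following line until one
\ starts with a backslash in column one, then lets interpretation resume.
: ---[ ( -- )
  BEGIN REFILL WHILE              \ fetch the next line; stop at end of input
    SOURCE IF                     \ non-empty line?
      C@ [CHAR] \ = IF EXIT THEN  \ backslash in column one: resume here
    ELSE DROP THEN
  REPEAT ;

---[ A section title
Plain text that is never interpreted.
\ Forth code follows this line
: HELLO ( -- ) ." hi" ;
```

Since the word only looks at the first character of each line, its own name never has to match a closing delimiter, which is why aliases with other names can simply call it.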
On the other hand, an *actual* comment block really is embedded in the
Forth code, and the definition of the word with a matching "reversed
delimiter" is a clear, intuitive way to show where the block ends.
NO ! NO ! No ! Do not agree on common
names , force them to use your No-Text G.U.I.
If they hack your code , it behoves them to
create new icons , as they are hacking !
The Paradigm , is they can't go back to text !
Impossible ! Try it you'll see what i mean .
You CAN argue English ,
but who argues the Textless ,
wavy line on a traffic sign :
~~~~~~~
( dangerous curve ahead )
everyone , even C programmers , slow
down and turn the wheel left , as they should .
My Forth has no text , especially not Buzz Word
Text .
I agree, not least because I'm entirely likely to include mathematical
expressions in code comments:
(( This word implements the function
f = (a+b)(c*(d+e))
for positive integer values of a,b,c,d,e
))
Ed
> Not if those characters are the same. For me it's easier to type ((( than (*,
> which in fact seems to be tailor-made for american keyboard layouts.
But refer to Ed Beroset's comment on the particular pitfalls of
multiple ))'s, for those of us poor souls who may be commenting a
formula.
Sure. Where's the URL?
Though I admit, I have trouble coming up with names sometimes ...
designing an icon for each new word I define is a bit daunting.
-------------------------------------------------
No ,because you already have an image
in your brain , of the code frag .
That image cant be shown here , nor
can it be "posted" , because would necessarily
be translated to your left brain , and that
would take 14 pages , and certainly be buggier
than 1 line of C code .
DONT translate it , just start doodling
with a pencil . draw images .
I must warn you ! You will fail , time
and time again , because your Left side
will fight you , like a
Trade Unionist !
Are you following this ?
we humans instinctively , MUST have
everything avail' quickly , in the Left
side , so we can speak it .
that means every intelligent thought
must already be in the Left side , where
there is insuff' ability to DESCRIBE !!
Because it cant handle value judgements !
Your right side is the only productive
part of your brain , and if you train it,
you will become rich and enjoy a much
higher "peer" group .
Your left side is only trying to figure
how to SPEAK , what the Right side has
accurately mapped out .
Ill make it easy for you ,
lets redefine chars , but not even close
to what they are now , and the rule
is they must be very general VERBS .
> does NOT mean LT , we cant use
it because it would confuse ,
so instead , lets do
/ means the line above or
something above and NOT this line .
\ lower line or lower something ..
! caution , an error , store a number
# a structure a pattern , a array .
The objective is to redefine , so ppl
will NOT struggle nor argue on
the redefine . I dont want them
to stop for 1 millisecond , trying to
decipher it , so it must be totally
new definition .
Example , if i used ^ to mean [UP]
what would i use for [down] !
No good ...
General , so one can suffix to
narrow the "range"
[ / line 32 ]
notice how this takes the burden
off the Left side , it must read , present
line to figure if line 32 is
ABOVE or is BELOW the present line .
This is how you learn to program ,
with your right side , by turning OFF
or preventing the left side from
"calculating" anything .
Left side is your worst enemy , it wants
to become a college proff and make $28,000
a year , while your
Right side "see's" a picture of people making $$ ,
this image has linking to show you all the closed
doors , in making that $$ .
Thus all successful ppl , use their R side .
and the left side can "explain" why he
can't get a job .
R side has NO excuses, only plans
I am a Rich E.E., programmer ,
developer , cause i trained
my R side to toss excuses .
Wanna join a "no-excuses" peer
group ?
>15.6.2.xxxs (( "comments" TOOLS EXT
>
> Compilation: Perform the execution semantics given below.
>
> Execution: ( "ccc" -- )
>
> Parse text until the word ))
> is found or until the current parse area cannot be refilled.
> Discard the found )) string. (( is an immediate word.
>
> An ambiguous condition exists if the word )) is not found and the
>parse area cannot be refilled.
I presume that the phrase "the word ))" means that it is white-space
delimited. This matches the MPE implementation, which has 15-20 years
of history.
Stephen
--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads
Don't bother. Non-nestable supports 95+% of the problem.
"0 [if] ... [then]" fixes the rest.
Please note that by convention the )) is white space delimited,
which is rare in formulae.
By "whitespace" do you mean blank or blank|tab|cr|control-character|
etc?
It isn't that rare for a commented formula to end with )) or ))) etc.
For that matter I sometimes include blanks in commented formulas
because that makes them easier for me to read.
You have lots of code with (( and changing its meaning on you might
break some of it.
So if we wind up with a standard nestable multiline comment it should
have a different name.
> So if we wind up with a standard nestable multiline comment it should
> have a different name.
(* .. *)
or
\\ .. \\ (as a personal matter of taste I like this one more)
would be fine with me
Without too much thought, probably okay with me, too.
How about leaving it up to the user, and proposing instead a
defining word (maybe JET already has a DOES> reference version?)
that would be some variation on
: comment: ( "introducer.name" terminator.s -- )
(
Define a word named by the next word in the parse area, which
skips text across lines, and leaves the input stream positioned
just after the ANS Forth string terminator.s, or at the end of
the file if terminator.s is not found.
)
... ;
Syntax example:
s" ))" comment: ((
Or make it an entirely nonparsing word with syntax:
s" ((" s" ))" comment
Or one with entirely parsing syntax:
comment: (( ))
comment: \\ \\
I personally prefer the entirely parsing version, with the
entirely nonparsing version as runner-up.
Then it would be easy to have multiple comment delimiters in the
same document. I don't know what would be the best name for
COMMENT: or COMMENT ...
-- David
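The entirely parsing variant could look roughly like the sketch below. It is not David's or JET's code: SKIP-TO is a hypothetical helper name, PARSE-NAME is assumed to be available (it was only a proposal at the time), and the terminator is treated as a whitespace-delimited word that does not nest.

```forth
\ Hypothetical sketch; SKIP-TO is an invented helper name.
\ Skip whitespace-delimited words, refilling as needed, until one
\ equals the terminator or the input source is exhausted.
: SKIP-TO ( c-addr u -- )
  BEGIN
    BEGIN PARSE-NAME DUP WHILE            ( term. word. )
      2OVER COMPARE 0= IF 2DROP EXIT THEN ( term. )
    REPEAT 2DROP                          ( term. )
    REFILL 0=
  UNTIL 2DROP ;

\ COMMENT: <introducer> <terminator> defines an immediate word
\ <introducer> that skips text up to and including <terminator>.
: COMMENT: ( "introducer" "terminator" -- )
  CREATE IMMEDIATE
  PARSE-NAME DUP ,                       \ store the terminator length
  HERE OVER CHARS ALLOT SWAP CHARS MOVE  \ then its characters
  ALIGN
  DOES> ( -- ) DUP CELL+ SWAP @ SKIP-TO ;

COMMENT: (( ))
(( Skipped text, across
   lines, up to the terminator. ))
```

Storing the terminator in the defined word's body keeps each comment word independent, so several introducer/terminator pairs can coexist in one file. If the terminator is never found, this sketch simply stops skipping when REFILL fails.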
>By "whitespace" do you mean blank or blank|tab|cr|control-character|
>etc?
blank|tab|cr|control-character|...
I won't have a problem with nesting, white-space delimited
multiline-comments, but my implementation is different, and doesn't need
white space delimiters (however, it also doesn't nest, similar to C's /*
*/). I have three *-containing multiline comments, (* *), /* */ and \* *\
: ?refill source nip >in @ = IF refill 0= ELSE false THEN ;
: parse* parse + source + over <> swap 1- c@ '* = ;
: _*) BEGIN ') parse* and ?EXIT ?refill UNTIL ;
: _*/ BEGIN '/ parse* and ?EXIT ?refill UNTIL ;
: _*\ BEGIN '\ parse* and ?EXIT ?refill UNTIL ;
: Com: Create ' A, immediate DOES> perform ;
Com: (* _*) Com: /* _*/ Com: \* _*\
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
> How about leaving it up to the user, and proposing instead a
> defining word (maybe JET already has a DOES> reference version?)
> that would be some variation on
>
> : comment: ( "introducer.name" terminator.s -- )
We could do that. And it needs no standardisation whatsoever.
The advantage I see to a standard version is that to the extent people
use it, the code will port without having to be touched at all. It's
one less thing to worry about.
But with a defining word for multiline comments it would at least be
easy to deal with idiosyncratic commenting.
> Or make it an entirely nonparsing word with syntax:
> s" ((" s" ))" comment
>
> Or one with entirely parsing syntax:
>
> comment: (( ))
> comment: \\ \\
>
> I personally prefer the entirely parsing version, with the
> entirely nonparsing version as runner-up.
If it's a tool to deal with the crazy things other people do, I'd
rather the ending part at least be parsed first. Just in case they
want to have spaces in it. I can have it last and parse to the end of
the line, but there's always the chance I'll need to use a text editor
that lets me have spaces at the end of the line.
These are all minor rare aggravations, though, and for code that isn't
standardised but available to anybody to modify as they see fit it
doesn't matter much.
Again though, when multiline comments follow some standard format they
don't cause any aggravation at all. It's one less thing to deal with.
February 2, 2007
* Third Draft *
Significant Differences from 2nd Version
============================
Name change from (( to (*
Nesting, (* (* *) *) works.
Rationale
=========
Problem
-------
Most Forth users find it useful to have a block comment -- a comment
command that covers multiple lines. There are many names for this
because there is no standard.
Names include (( (* /* ( etc.
One standard portable approach is 0 [IF] ..... [THEN] This works on
every standard Forth that implements [IF] [THEN] but it is ugly. It's
easy to rename 0 [IF] to, say, [DOCUMENTATION] and still end with
[THEN] . This solution is not generally accepted as shown by the
existence and use of so many alternatives.
Current Practice
----------------
Many Forth systems provide this capability. MPE's Forths, Win32Forth,
and FPC have used the name (( for a version that does not nest, but
several commenters prefer a standard commenting command that does
nest.
Solution
--------
Agree on a common name. The more Forth systems that provide a standard
name, the more code will use it and the less trouble it will be
porting that code. Code that uses some other name will still run on
every system that supports it and software can be easily written to
support it elsewhere, but to the extent we use a standard name that
bother can be avoided.
Proposal
========
15.6.2.xxxs (* "comments" TOOLS EXT
Compilation: Perform the execution semantics given below.
Execution: ( "ccc" -- )
Parse text until the whitespace-delimited string
*)
is found or until the current parse area cannot be refilled.
Discard the found *) string. (* commands are nestable.
(* is an immediate word.
An ambiguous condition exists if the word *) is not found and the
parse area cannot be refilled.
Typical use
-----------
(* This is a comment, it requires supporting statements.
These statements can go on their own lines.
There might be lots of them. *)
(* ***************************************************************
People have lots of styles for commenting.
They like to make their comments stand out as obvious comments.
*************************************************************** *)
(* Sometimes people want to increase the range of their comments.
: BUGGY-CODE 0 >R ;
(* What was I thinking when I wrote this? *)
: MORE-BUGGY-CODE DO :
(* What a headache. I must have been having fun, why can't I remember?
*)
Better comment out this whole section. *)
Reference Implementation
------------------------
: MAYBE-REFILL ( S: -- flag ) \ refill if at end of line, 0 if can't
SOURCE NIP >IN @ > DUP 0= IF DROP REFILL THEN ;
: (* ( S: -- ) \ multi-line comment, ended by *)
BEGIN
BL WORD COUNT 2DUP S" *)" COMPARE WHILE
S" (*" COMPARE 0= IF RECURSE THEN
MAYBE-REFILL WHILE
REPEAT
0 0 CR ." Your system can do anything at this point and still be a standard system."
THEN 2DROP ; IMMEDIATE
BL WORD COUNT can be replaced by PARSE-NAME when the time comes.
It would not be difficult to make a CREATE DOES> word which defines
new words like (* with new ending strings different from *) . I may do
that.
Test Cases
----------
(* CR .( Don't print this.) *) CR .( Print first comment )
(* CR .( Don't print this either. )
CR .( And don't print this. ) *) CR .( Print second comment )
: foo (* CR .( Don't print while compiling )
CR .( Or the second line. ) *) CR ." Print third comment " ;
foo
(* CR .( Don't print another ) (*
CR .( Don't print nested. )
*) CR .( Still don't print ) *) CR .( Print fourth and last comment )
Remarks
-------
I don't care what name we use provided we can agree on one.
I dislike using /* because there might be some use for files where
Forth code is in C comments and vice versa. Also */ is a Forth word.
(( has been in use for many years by MPE and by Win32Forth as a non-
nestable multiline comment. A nestable version should have a different
name. Also, (( is used in iForth for something completely different.
Other suggestions for names: \\ --- ( in both cases, the next
instance toggles rather than nests)
Thanks
------
Thanks to Robert Epprecht, George Hubert, Michael Hore, Bruce
McFarling, David N. Williams, Marcel Hendrix, Ed Beroset, Stephen
Pelc, and Michael Coughlin.
Comments
========
<snip>
> Or make it an entirely nonparsing word with syntax:
> s" ((" s" ))" comment
Won't always work, there are (still ?) ISO Forths that provide only one
location to hold S"-strings in interpret state ;-(
--
Coos
CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html
One post here mentioned internesting of Forth comments and C comments
so that both languages can exist within the other's comment fields.
Using *) for a closing delimiter in Forth might create problems with C
conventions such as (int *) other_var. Perhaps another pair of
delimiters should be used. I don't know that this is a *major*
concern regarding block comments, but I think it is worth
consideration.
I have two thoughts on the matter. One is that block comment
delimiters don't have to be all that short considering that they are
not typed on every line. So a longer delimiter such as ----{ }----
or whatever had been suggested should not be very onerous and should
help to prevent any overlap with constructs not yet devised or from
other languages.
The other is do we even need block comment delimiters? One of the
things I have given some thought to as I learn about forth is how Jeff
Fox has explained some of the advanced work they have done. When they
come across something that can be moved from run time to compile time
or from compile time to edit time, they do it. There are not many
editors that are not capable of using line comment delimiters to
comment out a block of text. I use this myself and have not found it
to be a burden. So why are these words needed? If you add them, make
sure they don't do any harm (such as making it hard to embed C source
in a Forth file or vice versa).
BTW, I do think that if you add block comments, it is important to
make them nestable. The main reason I have used block comments in the
past is to take out sections of code. If they aren't nestable and the
code is already using these, it don't work!
> On Feb 2, 9:42 am, "David N. Williams" <willi...@umich.edu> wrote:
[..]
> Again though, when multiline comments follow some standard format they
> don't cause any aggravation at all. It's one less thing to deal with.
Exactly. Please don't give up now. Standardization is never easy.
You wouldn't have started this thread if it was *really* completely
stress-free to port multiline comments.
-marcel
> Op Fri, 02 Feb 2007 14:42:47 GMT schreef David N. Williams:
> <snip>
>> Or make it an entirely nonparsing word with syntax:
>> s" ((" s" ))" comment
> Won't always work, there are (still ?) ISO Forths that provide only one
> location to hold S"-strings in interpret state ;-(
RENAME-FILE FILE EXT
( c-addr1 u1 c-addr2 u2 -- ior )
Rename the file named by the first character string specified by c-addr1
u1 to the name in the second character string. ior is the
implementation-defined I/O result code.
An ISO Forth that doesn't implement RENAME-FILE , or only with
compiled strings? That way one can't debug from the command line.
I guess that 'feature' isn't too loudly advertized, then :-)
-marcel
> One post here mentioned internesting of Forth comments and C comments
> so that both languages can exist within the other's comment fields.
> Using *) for a closing delimiter in Forth might create problems with C
> conventions such as (int *) other_var. Perhaps another pair of
> delimiters should be used. I don't know that this is a *major*
> concern regarding block comments, but I think it is worth
> consideration.
Good idea. (* is the name that's gotten the most support so far, but
it's still only a few people. You could suggest another name.
> I have two thoughts on the matter. One is that block comment
> delimiters don't have to be all that short considering that they are
> not typed on every line. So a longer delimiter such as ----{ }----
> or whatever had been suggested should not be very onerous and should
> help to prevent any overlap with constructs not yet devised or from
> other languages.
That's true. I didn't want ----[ because Bruce said it has other side
effects that I'm not proposing. Of course, we could propose to add
those effects....
> The other is do we even need block comment delimiters?
My concern is that I see code that does use block comments, and code
from different people uses different delimiters. It's an irritant,
though not a giant one. To the extent that people agree on a single
delimiter (or a short list) that irritant is removed.
If you're right that we don't need block comments at all, and if we
have a standard block comment command that can be written in standard
Forth, then programmers can simply not use it. Then implementors will
be stuck putting a feature in their languages that nobody uses, a bad
thing. Only, they can have standard systems if they do as little as
put the code for the multiline-comment command in their documentation!
It isn't a giant imposition on them, either.
It might be that this proposal wastes my time and the time of people
who read and comment about it. But I don't think the drawbacks go much
beyond that. In the best case we get one less little glitch when we
port code from other people. In the worst case nothing happens and we
waste the time of a few people who thought that something might
happen.
> BTW, I do think that if you add block comments, it is important to
> make them nestable. The main reason I have used block comments in the
> past is to take out sections of code. If they aren't nestable and the
> code is already using these, it don't work!
Agreed. The proposal now reflects that, and if someone wants to argue
that they shouldn't be nestable then we can argue it out.
> An ISO Forth that doesn't implement RENAME-FILE , or only with
> compiled strings? That way one can't debug from the command line.
> I guess that 'feature' isn't too loudly advertized, then :-)
Of course you can debug from the command line, you just have to use an
implementation-specific way to do it. T" ..." or " ..." or whatever.
And in normal use the existing file name is stored somewhere, so S" is
not needed to name it.
(* *)
If someone wants to start a section of forth code with a block
comment that intersperses the block comments used below and more
commentary on how they fit together ... why, I'm perfectly happy if my
preferred block comment naming supports that.
But the )) in JET's code did not seem to be ... parse for a ")", then
check if it is followed by ") "
... and if there is ANY implementation of (( ... )) that permits it to
not be white space delimited, in a closer analogue to ( ...), then
standardizing that )) *must* be white space delimited is breaking some
existing code somewhere.
I see a nestable (* *) as being a loop of PARSE-WORD's which then test
the word found for being *), in which case EXIT, or (*, in which case
RECURSE. So that would be naturally a white-space delimited *) as a
delimiter.
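The loop described above could be sketched roughly as follows. This is only a sketch, assuming a system that provides PARSE-NAME (the Forth 200x name for PARSE-WORD), REFILL, and COMPARE from the String word set; giving up quietly when the input cannot be refilled is an assumption, not something the proposal fixes.

```forth
\ Sketch: a nesting (* built from PARSE-NAME, REFILL and RECURSE.
\ Both delimiters must be white-space delimited.
: (*  ( "ccc" -- )
   BEGIN
      PARSE-NAME DUP 0= IF              \ parse area exhausted
         2DROP REFILL 0= IF EXIT THEN   \ no more input: give up quietly
      ELSE
         2DUP S" *)" COMPARE 0= IF 2DROP EXIT THEN   \ closing delimiter
         S" (*" COMPARE 0= IF RECURSE THEN           \ nested comment
      THEN
   AGAIN ; IMMEDIATE
```

Because each word is fetched with PARSE-NAME, a closing delimiter glued to other text, such as foo*) , is not recognised.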
That should not be a reason not to use it if people wanted to ... I
just like the way that:
---[ The Display Subsystem ]--------------------[ 17May2007 ]---
looks ... where, obviously, ]--- is where the header line ends, and
where the rest of the block comment starts! I am experimenting with
plaintext layout that converts easily to html (and, with restrictions
on the html, conversely), I prefer to have the plaintext layout be
something that I like the look of.
But it's not in any published code yet, so I can easily switch to
something else.
Just please don't use /( because I still hope to get back around to
SLIDERS and I really like that way that looks for starting a substring
match inside a string match.
If *) is problematic for interspersing C and Forth (not something I
would do, but still ...), then
(-- and --) would give an alternative style of "box drawing".
Indeed, it would be even clearer for interspersing C and Forth, with
------- being box drawing for a Forth comment and ******* being box
drawing for a C comment.
That is on the right track, but has a similar problem, "--)" can
occur in C as part of (foo--). I don't see "**" anywhere in the C
reference. How about (** and **)? Or maybe (== and ==)? Would
-{ and }- make people happy?
Actually the idea of mixing Forth and C in one file that can be used
under either language is **very** appealing to me. It is a good way to
be able to move between the two languages using one as a rapid
prototyping tool and the other as the formal software deliverable...
which do you think would be which? ;^)
Anyone see a problem with -{ and }- other than the fact that you have
to use the SHIFT key? I guess C can be written without an intervening
space which means }- could occur?
** looks like a pointer to a pointer to me, but I don't recall whether
that is just in use, or whether you can do that in a declaration with
*(*) ... it's been a long time since I did any C-coding.
==) should be safe ... it's a binary operator, so it's defined so that
there must be something between a == and a )
(== ==) inherits the same argument as (-- --) in terms of comment
boxing.
/* *********************************
* Obviously this is a C block comment
* ...
* ...
********************************* */
(== ===================
|| Obviously this is a Forth block comment
|| (== with a nested comment ==)
|| ...
================== ==)
Indeed, one who wanted to intersperse Pascal and Forth or Modula-2 and
Forth might not want (* *) either!
> Anyone see a problem with -{ and }- other than the fact that you have
> to use the SHIFT key? I guess C can be written without an intervening
> space which means }- could occur?
That would be the problem ... better to use a binary operator like ==
so you know that something has to lay between it and ( or ).
With { used in locals and } a common operator in arrays, I'd
rather leave { } alone.
I still prefer (* *), but the (-- --) or (== ==) is also OK by me.
How is } used in arrays? I know { and } are used for blocks, but what
else?
> I still prefer (* *), but the (-- --) or (== ==) is also OK by me.
Yeah, I like (== and ==). In mixed mode it would be
/* (== **************************************************** ==)
Forth code goes here,
forth would also have to define "/*" to be a no-op.
(== **********************************************************/
C code goes here
/********************************************************** ==)
More forth code
(== **********************************************************/
More C code
/* The end of the final forth comment would be the EOF */
Not as pretty as I would like, but it would work.
IMO it's better to choose a name one likes and works well in the
Forth environment rather than avoiding all possible conflict.
The few percent of cases where conflict arises can be worked
around with \ or 0 [IF]..[THEN].
I imagine that C and Pascal had to make similar compromises.
It's also a good idea to rank one's conflicts. Which are you
more likely to use in comments - math, C, Pascal, other etc?
> How is } used in arrays? I know { and } are used for blocks, but what
> else?
Dereference. } vector dereference, }} rectangular matrix dereference.
And then things that take the same stack input as } are by convention
given names starting with }, and those with the same stack input as }}
are by convention given names starting with }}.
The reason it doesn't conflict with { } for locals is that there is no
{ array word, just the convention of ending the name of a vector as,
eg, Y{ and a two-dimensional matrix as, eg, IO{{
> > I still prefer (* *), but the (-- --) or (== ==) is also OK by me.
>
> Yeah, I like (== and ==). In mixed mode it would be
>
> /* (== **************************************************** ==)
: /* (== ;
and then every switch can be
/* ********************************************************* ==)
and
(== **********************************************************/
or even:
/* *** TO FORTH ********************************************** ==)
Forth code
(== ** TO C **************************************************/
C code
/* *** TO FORTH ********************************************** ==)
> On Feb 2, 11:36 am, "J Thomas" <jethom...@gmail.com> wrote:
>> RfD -- ((
>>
>> February 2, 2007
>> * Third Draft *
>>
>> Significant Differences from 2nd Version
>> ============================
>> Name change from (( to (*
>>
>> Nesting, (* (* *) *) works.
> ...snip...
>> Comments
>> ========
>
> One post here mentioned internesting of Forth comments and C comments
> so that both languages can exist within the other's comment fields.
> Using *) for a closing delimiter in Forth might create problems with C
> conventions such as (int *) other_var. Perhaps another pair of
> delimiters should be used. I don't know that this is a *major*
> concern regarding block comments, but I think it is worth
> consideration.
(Just raising an idea that came to me while lurking in c.l.f.)
Here's a potential solution that preserves the new syntax: The number of
starting comment asterisks must match the number of ending comment
asterisks.
So while "typical use" still applies:
(* This is a comment, it requires supporting statements.
These statements can go on their own lines.
There might be lots of them. *)
(* ***************************************************************
People have lots of styles for commenting.
They like to make their comments stand out as obvious comments.
*************************************************************** *)
The last comment could be rewritten as:
(*****************************************************************
People have lots of styles for commenting.
They like to make their comments stand out as obvious comments.
*****************************************************************)
Only the sequence
*****************************************************************)
will terminate the comment. In particular, neither (int *) other_var nor
(int ****************************************************************)
other_var will end the block comment.
(Relurking)
Regards,
Adam
That describes a non-nesting (* comment syntax. iForth implements
that. BTW, maybe you should also change the Subject of the postings
to reflect the word name change.
> (* commands are nestable.
That is in contradiction to the description above. It is also quite
imprecise; e.g., is there any requirement on delimiters around "(*"?
BTW, for those people worrying about nesting C, Forth or other languages:
char *a="consider that any string can occur in C";
s" and in Forth as well, without respect for nesting: (* */ (( etc."
>BL WORD COUNT can be replaced by PARSE-NAME when the time comes.
The time is already here.
>It would not be difficult to make a CREATE DOES> word which defines
>new words like (* with new ending strings different from *) . I may do
>that.
IMO (* is already unnecessary: I use \ instead, and we have ( ...) and
0 [IF]...[THEN] as well. A word for defining custom comment
delimiters seems excessive, and the main effect would be to reduce
readability.
- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
Forth-Tagung 2007: http://www.complang.tuwien.ac.at/anton/forth-tagung07/
EuroForth 2007: September 13-16, 2007
pad s" old.txt" tuck pad swap cmove s" new.txt" rename-file
is a pretty awful workaround, but it works on all ANS systems with the
file-access word set. I guess almost everyone falls at least once into the
trap of using
s" old.txt" s" new.txt" rename-file
on systems that provide only one location for interpreter strings. It might
be worth considering changing the standard to require at least two
locations.
Regards,
Stephan
What I said before was daft. It's
: /*** ;
: ***/ ;
Start of file: /*** (== ***/
End of file: /*** ==) ***/
/***************************************************************
* ... C block comment, creation date, dependencies ... all the normal stuff
* ...
* ...
***************************************************************/
... C code here if desired
/********************************************************* ==)
... Forth code
(== ****************************************************/
... C code
/********************************************************* ==)
... (and so on)
... Forth code
(== ****************************************************/
... C code here if desired
/*** ==) ***/
No dangling block comment at the EOF.
> BTW, for those people worrying about nesting C, Forth or other languages:
>
> char *a="consider that any string can occur in C";
> s" and in Forth as well, without respect for nesting: (* */ (( etc."
Yes. I believe that worrying about compatibility with other
programming languages might best be a low priority.
> IMO (* is already unnecessary: I use \ instead, and we have ( ...) and
> 0 [IF]...[THEN] as well.
\ only works for one line. So using \ for long blocks of text requires
the text processor to behave in a certain way. Not all of them do.
One advantage of block comment words like (* ... *) over 0 [if] ...
[then] is when doing a LOCATE. Not all Forths store the file offsets
of each definition. So when performing a LOCATE without an offset
(i.e., search from the beginning of the file) it is much easier to
skip over (* ... *) than 0 [if] or 0 [if] to avoid hits on
comments.
I observe that several Forths already use (* ... *): Win32Forth,
Carbon MacForth, and PowerMops. There may be more. Other Forths use
the equivalent of (* .. *), e.g., (( ... )). Apparently there are
plenty of people that find these words useful. I am one of them.
Besides, I'd much rather type (* ... *) than 0 [if] ... [then]
> pad s" old.txt" tuck pad swap cmove s" new.txt" rename-file
> is a pretty awful workaround.
So you are saying that PAD" is:
: PAD" ( "ccc<quote>" -- c-addr u ) [CHAR] " PARSE TUCK PAD SWAP CMOVE PAD SWAP ;
Why is PAD" a pretty awful workaround? Seems perfectly reasonable to
me.
PAD" old.txt" S" new.txt" RENAME-FILE
But having a second string buffer handy is one reason that I've always
defined T" as a "write-over" stack using an explicit TDUP.
Literate programming? Phooie! What does that mean? Start out
thinking that source code needs to be read by human beings. Then
the first lines of a program will always be plain text telling
you what the program is supposed to do. After you've explained
yourself you can switch to Forth code. So you need a Forth word
to start interpreting/compiling Forth code the same way you need
a Forth word to switch to a Forth machine code assembler.
When the system starts reading Forth source, it interprets
it as Forth words. This is the wrong way to start. I'm
contending a Forth system should start reading the source as
being comments until told otherwise. Since I have never used a
system that acted that way I don't know what the disadvantages
might be.
> Now, if the code is embedded in the documentation, then you
> do not want to have the start of the Forth segment determined
> by some two-character sequence defined in terms of the name
> of a block comment word that will often lay pages further up
> in the file.
Right. A two-character sequence will not indicate that you
are using modern Forth which has turned into a programming
language that is easy to read. But what is this business about
"block comment" and "pages further up in the file"? Are you
thinking of Forth block code that loads numbered screens? That
is the spaghetti screen problem. Bad style. A problem that needs
further research to solve.
> So the literate programming "skip the next stuff" words are
> not defined as "opening and close delimiters". If you choose
> to make <code> the "start interpreting" token, then you can
> simply define:
>
> \ Skip interpreting until <code>
> : </code> ( --- ) <HTML>
>
> and in a case-sensitive forth implementation, you define:
> \ the <html> tag is case insensitive
> : <HTML> <html> ;
>
[snip]
Now you've lost me. A discussion of how to write program
code clearly should be clearly written. The subject is a matter
of style, the sort of thing English teachers and editors
complain about. It's not something you can solve by making up
rules or making more new Forth words. Forth is about using few
words. Unfortunately this style of writing cannot be applied in
the same way to writing documentation.
--
Michael Coughlin m-cou...@comcast.net Cambridge, MA USA
> > The main point of literate programming is that you never have
> > to look for the documentation of a source file ... its in the
> > source file itself.
> Literate programming? Phooie! What does that mean? Start out
> thinking that source code needs to be read by human beings. Then
> the first lines of a program will always be plain text telling
> you what the program is supposed to do. After you've explained
> yourself you can switch to Forth code. So you need a Forth word
> to start interpreting/compiling Forth code the same way you need
> a Forth word to switch to a Forth machine code assembler.
How does a Forth word to start interpreting/compiling ever get
interpreted, if you are not interpreting when you start? How does it
execute? You haven't started interpreting/compiling yet.
Literate programming means that the source is intended to be shared by
the computer system and human beings, and so the beginning of the file
*is* plain text explaining what the program is supposed to do.
The computer doesn't need to see that, so an executable word that
says, "ignore this stuff until I start talking Forth" is all that is
required. However, it is preferable if it is an integral part of the
text layout.
> > Now, if the code is embedded in the documentation, then you
> > do not want to have the start of the Forth segment determined
> > by some two-character sequence defined in terms of the name
> > of a block comment word that will often lay pages further up
> > in the file.
> Right. A two-character sequence will not indicate that you
> are using modern Forth which has turned into a programming
> language that is easy to read. But what is this business about
> "block comment" and "pages further up in the file"? Are you
> thinking of Forth block code that loads numbered screens? That
> is the spaghetti screen problem. Bad style. A problem that needs
> further research to solve.
Pages, as in, "page up", as in, "read this and see what you think". By
pages up I mean pages up. If I said "pages up" with respect to a
novel, you'd understand it ... same thing. "back there somewhere out
of sight".
> Now you've lost me. A discussion of how to write program
> code clearly should be clearly written. The subject is a matter
> of style, the sort of thing English teachers and editors
> complain about. It's not something you can solve by making up
> rules or making more new Forth words. Forth is about using few
> words. Unfortunately this style of writing cannot be applied in
> the same way to writing documentation.
The omitted text wasn't a discussion of how to write program code, it
was a discussion of the code from the perspective of the Forth
compiler. "Ignore this stuff" is all that is required from the side of
the Forth compiler.
As to the matter of style, I would expect that if I published code
using these tools, I would be more on the receiving side of style
advice than the delivery side.
Are they nesting or non-nesting?
In PowerMops they are definitely nesting. In Win32Forth and Carbon
MacForth they appear to be not nesting (I could be wrong here if there
is a user settable switch). Regardless, at least two ways of readily
making them nesting have been demonstrated, which of course would be
backwardly compatible.
Oh right ... I was thinking the other direction, but as long as it's
backwardly compatible, then a prelude file for Win32Forth and
CarbonMacForth could redefine (* *) without worrying about breaking
any existing code written specifically for those implementations.
That would suggest that if (* *) is standardized, it *ought* to be
nesting, because otherwise it could interfere with the native PowerMops
(* *)
It certainly is easier with editor support (although not required).
If your editor does not support it, get a better editor.
The advantage of \ is that it is easy to see what is commented and
what is not. Programmers in programming languages with only
bracket-comments often put something on every line to gain that
advantage, e.g.
(* Slaying entails certain sacrifices,
* blah,
* blah,
* bitty blah,
* I'm so stuffy,
* give me a scone. *)
C started with bracket comments, and C++ adopted them for
compatibility, but also added line comments (starting with "//").
And, true to C++'s name, C eventually also acquired that comment
syntax. If bracket-style comments were so great, why did they add
line comments?
>One advantage of block comment words like (* ... *) over 0 [if] ...
>[then] is when doing a LOCATE. Not all Forths store the file offsets
>of each definition. So when performing a LOCATE without an offset
>(i.e., search from the beginning of the file) it is much easier to
>skip over (* ... *) than 0 [if] or 0 [if] to avoid hits on
>comments.
A high-quality Forth system should deal with all standard forms of
comments. So adding another form of comment will make life harder for
Forth systems that do not store the position of the definition.
Can you explain your reasoning?
As far as I can see, if all Forth systems that have a (* have it
non-nesting, then we should standardize on a non-nesting (* or
standardize on a nesting thing with a different name (but if they are
all non-nesting, then apparently the nesting feature was not important
enough in practice).
If there are some that have a nesting (* and some that have a
non-nesting (*, then we must not standardize (* (just like we did not
standardize NOT), and have to use a different name for the new comment
syntax (whether nesting or non-nesting); or we could standardize on (*
and make the occurrence of (* inside a comment an ambiguous condition.
> > Parse text until the whitespace-delimited string
> > *)
> > is found or until the current parse area cannot be refilled.
> > Discard the found *) string.
>
> That describes a non-nesting (* comment syntax. iForth implements
> that.
I wasn't sure how to say it. Do you have a suggestion?
> > (* commands are nestable.
>
> That is in contradiction to the description above. It is also quite
> imprecise; e.g., is there any requirement on delimiters around "(*"?
If it's a command, I would have thought there would be. I need to
learn to look at all the weird interpretations other people might put
on the wording of things that are claimed to be standard.
> BTW, for those people worrying about nesting C, Forth or other languages:
>
> char *a="consider that any string can occur in C";
> s" and in Forth as well, without respect for nesting: (* */ (( etc."
I'm not too concerned about finding ways to make it work with every
possible C code.
But you're right, any string can be present in a S" ." ABORT" etc
string that isn't intended to affect comments but that would affect a
multiline comment. And it looks like it would take extreme diligence to
prevent all such cases to interfere with (* or for that matter any
similar command.
So my thought is to accept that possibility. This is a programming
tool. Similar tools are already used by many people. If your code has
a rare glitch that makes it fail to work correctly, you can deal with
it.
> >It would not be difficult to make a CREATE DOES> word which defines
> >new words like (* with new ending strings different from *) . I may do
> >that.
>
> IMO (* is already unnecessary: I use \ instead, and we have ( ...) and
> 0 [IF]...[THEN] as well. A word for defining custom comment
> delimiters seems excessive, and the main effect would be to reduce
> readability.
My thinking is that such commands are already in widespread use. I'd
prefer that one such command be used rather than multiple ones. But if
we're going to get multiple versions anyway, then I may have a use for
a tool to import them even easier than cut-and-paste.
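Such a defining word might look roughly like the sketch below. The name DELIMITED-COMMENT is hypothetical, the comments it creates are non-nesting, and PARSE-NAME, REFILL and COMPARE are assumed to be available.

```forth
\ Sketch: DELIMITED-COMMENT (hypothetical name) defines an immediate,
\ non-nesting comment word that skips text up to a given closing word.
: DELIMITED-COMMENT  ( "opener" "closer" -- )
   CREATE
   PARSE-NAME DUP C,                 \ store the closer as a counted string
   HERE OVER ALLOT SWAP CMOVE
   IMMEDIATE
   DOES>  ( "ccc" -- )
      COUNT                          \ closer: c-addr u
      BEGIN
         PARSE-NAME DUP 0= IF        \ parse area exhausted
            2DROP REFILL 0= IF 2DROP EXIT THEN
         ELSE
            2OVER COMPARE 0= IF 2DROP EXIT THEN   \ closer found
         THEN
      AGAIN ;
```

With this loaded, DELIMITED-COMMENT (== ==) would import code that uses (== ... ==) , and DELIMITED-COMMENT (-- --) would do the same for (-- ... --) .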
> Oh right ... I was thinking the other direction, but as long as it's
> backwardly compatible, then a prelude file for Win32Forth and
> CarbonMacForth could redefine (* *) without worrying about breaking
> any existing code written specifically for those implementations.
Yes.
> That would suggest that if (* *) is standardized, it *ought* to be
> nesting,
I strongly agree. There is no downside to making them nesting, even
if up till now they weren't. When nestable, they are easier to use
and have greater utility.
That is stated in a fairly arrogant way.
For a while I was spending a bit of time bringing gforth and eforthl
up on one-floppy and few-floppy Linux distributions, and in many cases
my editor was e3, because of the lack of problems with binary-library
dependencies and very small binary.
Now I am sure that there are plenty of console-based options that can
wrap text inside a text box of arbitrary characters that will work
with the grotesque vi or the obese emacs, but someone should be able
to edit using whatever text editor they are most comfortable with as a
document editor. And in resource constrained or boot-up contexts, the
most functional editor available may not have all the bells and
whistles.
> The advantage of \ is that it is easy to see what is commented and
> what is not. Programmers in programming languages with only
> bracket-comments often put something on every line to gain that
> advantage, e.g.
>
> (* Slaying entails certain sacrifices,
> * blah,
> * blah,
> * bitty blah,
> * I'm so stuffy,
> * give me a scone. *)
This logic breaks down with your suggestion that we use 0 [IF] ...
[THEN].
Or did you mean to do something like this?:
0 [IF] Slaying entails certain sacrifices,
* blah,
* blah,
* bitty blah,
* I'm so stuffy,
* give me a scone. [THEN]
> So adding another form of comment will make life harder for
> Forth systems that do not store the position of the definition.
Adding a couple of simple definitions doesn't seem like much of a
hardship. Especially when one then realizes the benefits of more
easily dealing with standardized source.
> >That would suggest that if (* *) is standardized, it *ought* to be
> >nesting, because otherwise could interfere with the native PowerMops
> >(* *)
>
> Can you explain your reasoning?
>
> As far as I can see, if all Forth systems that have a (* have it
> non-nesting, then we should standardize on a non-nesting (* or
> standardize on a nesting thing with a different name (but if they are
> all non-nesting, then apparently the nesting feature was not important
> enough in practice).
>
> If there are some that have a nesting (* and some that have a
> non-nesting (*, then we must not standardize (* (just like we did not
> standardize NOT), and have to use a different name for the new comment
> syntax (whether nesting or non-nesting); or we could standardize on (*
> and make the occurrence of (* inside a comment an ambiguous condition.
Here's the way I understand what they're saying.
We want to avoid breaking existing code.
Ideally we would also avoid changing existing Forth compilers, but
that's perhaps too much to ask for.
Existing code that uses a non-nesting (* will have things that look
like:
(* *) (* *)
which will still work with a nesting version.
Existing nesting code will not work with a non-nesting version.
(* (* *) *)
will stop at the first *) and the second will cause an error if
nothing causes one before that.
The case that still might not work for code written with nonnesting (*
is:
(* (* (* (* *)
Now the nesting version will keep everything commented out to the end
of the file, the number of *) will never catch up to get out of
comment mode.
This might not happen very often because the tested code is likely to
have the extra (* removed. But it could happen.
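For comparison, a non-nesting comment word can be sketched as below. The name FLAT(* is hypothetical, chosen only so the sketch can be loaded beside a nesting (* ; PARSE-NAME, REFILL and COMPARE are assumed to be available.

```forth
\ Sketch: FLAT(* (hypothetical name) is a non-nesting block comment.
\ It stops at the first white-space delimited *) no matter how many
\ opening delimiters came before it.
: FLAT(*  ( "ccc" -- )
   BEGIN
      PARSE-NAME DUP 0= IF              \ parse area exhausted
         2DROP REFILL 0= IF EXIT THEN   \ no more input
      ELSE
         S" *)" COMPARE 0= IF EXIT THEN \ first closer ends the comment
      THEN
   AGAIN ; IMMEDIATE

FLAT(* outer FLAT(* inner *) CR .( The comment has already ended here. )
```

A nesting word given the same line would still be inside the outer comment after the first *) and would keep skipping text.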
My intention in having a standard word is to encourage Forth
programmers to use the same syntax for multiline comments, so their
code will port easier. I'd naturally prefer to use the version which
is already used the most. If the standard version is something brand
new then it adds one to the list and might not actually reduce the
variety at all.
So I will try an argument which might be partly bogus, in case it
works. Comments are a vital part of human communication, which is in
turn vital for portability. But nested comments are mostly a
programmer convenience. You comment out big blocks of things that you
don't need at the moment, or that happen to be in the way at the
moment. Good when it's easy to comment them out. But by the time your
code is ready for somebody else to port to a different system, maybe
those nested comments will be mostly gone? You will have deleted the
things that are permanently in the way, and uncommented the ones that
don't need to be commented.
So maybe we can accept a little bit of broken code. Maybe nested
multiline comments are rare enough in polished code that we can go
ahead and use (* (or some other commonly-used expression) even though
current practice isn't all the same about nesting?
>> "J Thomas" <jethom...@gmail.com> writes:
>>> Parse text until the whitespace-delimited string
>>> *) is found or until the current parse area cannot be
>>> refilled. Discard the found *) string.
> > That describes a non-nesting (* comment syntax. iForth implements
> > that.
> I wasn't sure how to say it. Do you have a suggestion?
Parse text until the whitespace-delimited string *) is found or until
the current parse area cannot be refilled. Discard the found *)
string.
If while parsing, the whitespace delimited string (* is found, parse
the following text until the whitespace-delimited string *) is found
or until the current parse area cannot be refilled, discard the found
*) string, and proceed with parsing text.
OR
If while parsing, the whitespace delimited string (* is found, then
repeat the above semantics recursively.
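The recursive wording above could be sketched roughly like this. It is only a sketch, assuming an ANS Forth that provides REFILL and COMPARE (String word set); the depth counter is my own device and is not taken from any of the systems mentioned in this thread. It ends silently when the parse area cannot be refilled.

```forth
\ Nesting (* : skip whitespace-delimited tokens until the matching *)
\ Stack effect: ( "ccc<*)>" -- )
: (*
  1                                   \ current nesting depth
  BEGIN
    BEGIN  BL WORD COUNT  DUP 0= WHILE   \ empty token: parse area exhausted
      2DROP REFILL 0= IF DROP EXIT THEN  \ nothing left to refill: give up quietly
    REPEAT
    2DUP S" (*" COMPARE 0= IF
      2DROP 1+                        \ an inner (* opens one more level
    ELSE
      S" *)" COMPARE 0= IF 1- THEN    \ a *) closes one level
    THEN
    DUP 0=                            \ done when every level is closed
  UNTIL DROP ; IMMEDIATE
```

With this loaded, `1 (* a (* b *) 2 *) 3` should leave 1 and 3 on the stack, because the 2 is still inside the outer comment.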
IN ADDITION, the above is not what has been implemented ... the
implementations have ignored "or until the current parse area cannot
be refilled", under which wording the word should complete silently
when the parse area is exhausted. If the definition is to permit,
CR ." Warning: unterminated block comment."
it would be something along the lines of :
Parse text until the whitespace-delimited string *) is found. Discard
the found *) string.
If while parsing, the whitespace delimited string (* is found, parse
the following text until the whitespace-delimited string *) is found,
discard the found *) string, and proceed with parsing text.
Ambiguous conditions:
The action taken if the current parse area cannot be refilled is
undefined.
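For comparison, here is a rough sketch of a non-nesting (* that takes advantage of that freedom and prints the warning instead of finishing silently. Again this assumes REFILL and COMPARE are available; the choice to warn rather than abort is just one option an implementation might take under an ambiguous condition.

```forth
\ Non-nesting (* that reports an unterminated comment.
\ Stack effect: ( "ccc<*)>" -- )
: (*
  BEGIN
    BEGIN  BL WORD COUNT  DUP 0= WHILE   \ empty token: parse area exhausted
      2DROP REFILL 0= IF
        CR ." Warning: unterminated block comment."
        EXIT
      THEN
    REPEAT
    S" *)" COMPARE 0=                 \ stop at the first *) , ignoring any inner (*
  UNTIL ; IMMEDIATE
```

Here `1 (* a (* b *) 2` leaves 1 and 2 on the stack, since the first *) ends the comment.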
I believe "(*" (or whatever) should be used for comments. Using
(* to remove a block of code is actually a conditional compilation
hack and thus a job for "0 [IF] ... [THEN]". IMHO, when a nesting (*
is needed, the need for [ELSE] will not be very far away.
Maybe those with nestable comments can check when/why they have used
them.
-marcel
What is 0 [IF] conditional on, the zero-ness of 0? If 0 becomes less
zero-ey, the block comment text should be executed? 8-)#
0 [IF] used for a block comment is the conditional compiler hack,
using a conditional compiler word to generate unconditional
non-compilation.
For blocking out code (as opposed to giving a definition in a comment
as a form of communication with someone else who may be using the
code), I *do agree* that 0 [IF] ...[THEN] is best, allowing you to
search and replace "0 [IF]" with "BUGGY? [IF]" if you want to
benchtest whether you are making headway on correctly integrating a
new capability into an existing working code-base.
FALSE TO BUGGY? INCLUDE ...
TRUE TO BUGGY? INCLUDE ...
Mind you, having that search bring up every single actual block
comment in the file could be a nuisance.
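A small sketch of that switch, assuming VALUE (CORE EXT) and [IF] [ELSE] [THEN] (TOOLS EXT) are present. BUGGY? is the name used above; NEW-FEATURE is a made-up word just for illustration.

```forth
FALSE VALUE BUGGY?                  \ set with  TRUE TO BUGGY?  before loading to try the new code

BUGGY? [IF]
  : NEW-FEATURE ( -- n ) 1 ;        \ experimental version under test
[ELSE]
  : NEW-FEATURE ( -- n ) 0 ;        \ known-good version
[THEN]
```

With BUGGY? left FALSE, NEW-FEATURE returns 0; changing the VALUE before the file is included selects the other branch without editing the source.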
<SNIP>
>
>As far as I can see, if all Forth systems that have a (* have it
>non-nesting, then we should standardize on a non-nesting (* or
>standardize on a nesting thing with a different name (but if they are
>all non-nesting, then apparently the nesting feature was not important
>enough in practice).
\ \ Whoever needs block comments doesn't have a good editor.
\ \ The only reason I can imagine for nested block comments is when
\ \ commenting out code. That is for people who don't know how to
\ \ use a version control system properly.
\
\ I defined a key to comment the above out, in under a minute.
\ If I did this more often, I would save my key definitions.
\ (And yes, I have columns, such that a couple of key-strokes
\ does away with it. I just did, then added the comment again.
And of course I can comment it out, presto even nested comments.
Bottom line, IMHO block comments are unforthisch.
And of course we already have a perfectly fine block comment: (.
It only suffers from the same problem all block comments have,
there is a delimiter. But at least you know where the danger is.
The only slightly reasonable block comment symbols I ever encountered
were
DOC
Anything goes!
: OEPS this won't get compiled ;
ENDDOC
I look with amazement at the ever more bizarre tokens that are
proposed to evade problems with ever more bizarre programming
languages.
My two cents.
>
>- anton
>--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
> Bottom line, IMHO block comments are unforthisch.
> And of course we already have a perfectly fine block comment: (.
> It only suffers from the same problem all block comments have,
> there is a delimiter. But at least you know where the danger is.
>
> The only slightly reasonable block comment symbols I ever encountered
> were
>
> DOC
>
> Anything goes!
> : OEPS this won't get compiled ;
>
> ENDDOC
>
> I look with amazement at the ever more bizarre tokens that are
> proposed to evade problems with ever more bizarre programming
> languages.
OK. I noticed block comments in other people's code and I thought if
one version was standard then I wouldn't have to deal with it, it
would just work.
But you point out that it isn't worth a lot of trouble dealing with it
because in each individual case it's very easy.
Suppose we had DOC ENDDOC . Then whenever somebody uses some other
block comment we could just do DOC ENDDOC around theirs, and it ought
to work. Very little effort. Or put a line of \'s across the whole
block.
And if someday we get Forth libraries with 50 files, each of them with
multiple block comments, you can easily do a multi-file global find-
and-replace to get DOC ENDDOC or \'s where you want them. Still
essentially no effort. So we don't need any standards effort, the time
we waste dealing with other people's block comments will never add up
to as much time as we'd spend keeping it from happening.
You make a powerful argument. I'd hoped to increase the percentage of
libraries and other portable code that would simply run with no time
at all spent massaging source code. But maybe it's easier to just fix
the trivial problems as they show up, each time.
I think this is something that is best standardized at the collection
level ... given that Forth200x will provide perfectly good support for
the definition, it is specified in the submission guide, and defined
in the [collection-name]-COMMON wordset.
> Bottom line, IMHO block comments are unforthisch.
Well, of course. We must be as Forthish as possible. And we all know
what that means, surely.
Let's start by simplifying. That's a very Forthish thing to do. We
really don't need the comment words \ or ( or ) or anything like
(( )) or (* *). So we should remove all of those from Forth.
Instead we can just use the 0 [IF] ... [THEN] construct. It will work
fine in all commenting situations. We can use text/blocks-files
editors that automatically insert/remove the 0 [IF] .. [THEN] as
appropriate with a simple command.
For example:
: 6+ 0 [IF] n -- n+6 [THEN]
6 + ;
Simplify, simplify. Now we are free to define better uses for \ and
( and ) .
> : 6+ 0 [IF] n -- n+6 [THEN]
> 6 + ;
THINK!
That should be
: 6+ [ 0 ] [IF] n -- n+6 [THEN] 6 + ;
Hmmm ...
You must make sure that some joker didn't redefine 0 as in : 0 1 ;
There's obviously a fundamental flaw in [IF]. Maybe we should remove
it. (Write no comments at all, they're wrong anyway. Don't use
conditional compilation, it wastes disk space). A win!
-marcel
You could use an advanced editor and make all the comments a different
font and color. Then when you want to compile, with one keystroke the
editor can remove all the comments. So you can have all the literate
programming you want but the Forth is dead-simple!
Similarly, the editor can do the conditional compilation by removing
everything but the text you want to compile.
> THINK!
> That should be
> : 6+ [ 0 ] [IF] n -- n+6 [THEN] 6 + ;
Thank you Marcel. I hadn't had my 2nd cup of coffee yet.
> Similarly, the editor can do the conditional compilation by removing
> everything but the text you want to compile.
Thank you for so indirectly, and elegantly, pointing out one of the
things I like so much about "just plain old Forth" forth. You can get
source written in 72 columns, rewrapped into 60 columns to make it
ready for import into a block file, and then rewrapped again into 20
columns to prepare it for editing in a small screen PDA, and except
for a handful of cases, which you can readily search for as long as
your text editor supports search and replace (LP does!), you will be
close to working code.
Having started with a dim understanding of Forth and a fig-Forth
modified to the C64 by providing 25 rows by 40 character text blocks,
when "open-source code" meant typing in text blocks from fig
Dimensions or the occasional Byte or Dr. Dobb's Forth issue, this is
something that impressed me early.
Of course, the big exceptions are strings that are just too long,
where you either have to have support for wider-than-screen strings in
whatever text editor you have at hand, or else work-around, and "\ ".
The other big exception is parsing words, which is one reason I
prefer to leave parsing words for the active command line and use non-
parsing options, when they are available, inside source files.
Definitely need a [0] for this case, [ 0 ] is too inconvenient to
type. 8-)#
There really is no settling the fight between the vocabulary expanders
and the vocabulary trimmers ... it's like the ongoing fight between the
lumpers and the dividers in the definitions of group/family/species/
subspecies in biology. That pot is going to boil into the indefinite
future.
>Bottom line, IMHO block comments are unforthisch.
>And of course we already have a perfectly fine block comment: (.
>It only suffers from the same problem all block comments have,
>there is a delimiter. But at least you know where the danger is.
We have this terrible habit of including the documentation in
the source code. We mostly do this using \ comments which are
(I hope) Forthish. But we have to add these. So, sometimes we
want to include formulae and telephone numbers such as
+44 (0)23 8063 1441
not to mention text that uses brackets. Hence block comments.
Every "big system" Forth has developed block comments of one
form or another because they are needed. Some people feel the
need to standardise them. Based on common practice rather than
anything else, the unnested (( ... )) or (* ... *) seem to be
the candidates. If a proposer wants to mandate nesting then
choose something else and see who adopts the reference
implementation. If you do not want it in the distribution
of your system, either ignore it or copy and paste the
reference implementation into a file.
Having used the unnested (( ... )) comment for many years,
I find insufficient need to mandate nesting, but more need
to improve editors' syntax colouring.
Stephen
--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads
> On Feb 4, 1:10 pm, "Doug Hoffman" <dhoff...@talkamerica.net> wrote:
>> On Feb 4, 9:57 am, m...@iae.nl (Marcel Hendrix) wrote:
>>
>> > "Doug Hoffman" <dhoff...@talkamerica.net> writes Re: Request for
>> > Discussion -- (( and multiline comments
>> > > : 6+ 0 [IF] n -- n+6 [THEN]
>> > > 6 + ;
>> > THINK!
>> > That should be
>> > : 6+ [ 0 ] [IF] n -- n+6 [THEN] 6 + ;
>>
>> Thank you Marcel. I hadn't had my 2nd cup of coffee yet.
>
>
> Definitely need a [0] for this case, [ 0 ] is too inconvenient to
> type. 8-)#
You could use [ELSE].
: 6+ [ELSE] n -- n+6 [THEN] 6 + ;
will work.
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
> : 6+ [ELSE] n -- n+6 [THEN] 6 + ;
> will work.
And then break, if you should happen to do an actual 0 [IF] ...
[ELSE] ... [THEN] around it!
Having nestable version was suggested on the basis that it is a
handy *debugging* feature and adds little if any cost. That is
also the way it happens to be implemented in [Turbo] Pascal.
You say (* was not meant for conditional compilation. Perhaps
so. However if someone does happen to use it for that purpose
then what's the harm? It is surely no more "inappropriate" than,
say, using 0 [IF] ... [THEN] for comments!
As others have stated the nesting (* is inherently backward
compatible so existing systems and their source are safe.
Updating to a nested (* is as simple as redefining.
Personally I don't mind which version is chosen. I'll use either.
Nevertheless I'd hate to see a feature discarded on the basis of
ideology. Practical use should be the guide.
A minor observation.
When there are no intervening instructions between WHILE REPEAT
it generates a redundant jump. A better construct to use in such cases
is 0= UNTIL.
> BL WORD COUNT can be replaced by PARSE-NAME when the time comes.
When the time does come, let's hope it changes back to the more aptly
named PARSE-WORD.
Correction:
(* is not nestable in Turbo Pascal.
> As others have stated the nesting (* is inherently backward
> compatible so existing systems and their source are safe.
> Updating to a nested (* is as simple as redefining.
> ...
> Personally I don't mind which version is chosen. I'll use either.
> Nevertheless I'd hate to see a feature discarded on the basis of
> ideology. Practical use should be the guide.
I assumed [wrongly] that TP had a nestable comment and therefore
was of practical use. Does anyone know of any popular languages
that have nestable comments?
If not, then perhaps it's a dubious feature after all (?)
Yes, It looks better if you define a word called PAD". But it still destroys
the contents of PAD, which might become a problem in some cases. Your
suggestion with a second buffer and T" is better. However, an even better
solution would be to have S" transparently using more than one buffer,
because then you don't need to think about which buffer to use.
My original point is that there's an inconsistency between the fact that the
standard specifies a word that expects two character strings, which is
intended to be used in interpretation state, and on the other hand allowing
S" to have one string buffer.
Regards,
Stephan
> > As others have stated the nesting (* is inherently backward
> > compatible so existing systems and their source are safe.
> > Updating to a nested (* is as simple as redefining.
> > ...
> > Personally I don't mind which version is chosen. I'll use either.
> > Nevertheless I'd hate to see a feature discarded on the basis of
> > ideology. Practical use should be the guide.
>
> I assumed [wrongly] that TP had a nestable comment and therefore
> was of practical use. Does anyone know of any popular languages
> that have nestable comments?
>
> If not, then perhaps it's a dubious feature after all (?)
Why does it matter what other languages do? What matters is the
effect. What would be the downside of having nestable block
comments? The upside is being able to easily block out a larger chunk
of text without having to worry about blocked out chunks contained
within. Again, what problem is caused by using the (backwards
compatible) nestable type? Btw, PowerMops has been using nestable
(* ... *) for years with great success and they work splendidly.
Again, no downside.
Having said that, I'll also say that I'm not dogmatic on this point.
Nestable or not, let's get on with at least having a set of multiline
block comments as standard.
>My thinking is that such commands are already in widespread use. I'd
>prefer that one such command be used rather than multiple ones. But if
>we're going to get multiple versions anyway, then I may have a use for
>a tool to import them even easier than cut-and-paste.
As the proposer of this, it's up to you to do the survey of
existing Forth practice. I've just surveyed (* ... *) in the
following systems
VFX4 - non nesting
SF3.0 - not present
iForth 2.1 - non-nesting
W32F6.11 - non-nesting
gForth doesn't have it in the three year old version on my box.
We have been told that Mops uses a nesting version. The sensible
conclusion from this limited survey is that if you want nestable
block comments you should choose another name pair.
It's quicker to do the research beforehand than to face the
ensuing firestorm later.
>VFX4 - non nesting
>SF3.0 - not present
>iForth 2.1 - non-nesting
>W32F6.11 - non-nesting
spForth - non-nesting
> It's quicker to do the research beforehand than to face the
> ensuing firestorm later.
I'm not getting my feelings hurt and I didn't expect that to be the
last (successful) draft. I apologise that this is taking so much time
for such a trivial concern.
So, at this point it looks like (* is the single most popular name,
and some people want a nesting multiline comment. But (* has a non-
nesting history just as (( does.
So we should settle on a nonnesting (* (though at least one
implementation does nest) or a nesting version with some other name.
I'm a little concerned about making a nonnesting (* standard because
of the code that might break. How important is nesting with (* in
legacy code? It looks hard to find out.
If you have code that does (* (* (* (* *) and it gets
broken by a nesting (* you can fix it by adding *) *) *) at the *) .
It's a rote process and could be automated. But still, the code is
broken.
If you have code that does (* (* (* *) *) *) and it
gets broken by a nonnesting (* then you can fix it by removing all but
one of the *) . That's also a rote process and could be automated, but
the code is broken until that happens.
It's good to go with a name that's in common use that people like, and
(* is that name. Make up a brand new name and make it standard and
it's only the official standard label that gets people to use it -- if
that doesn't work well enough then the net effect is one *more* name
on top of the others.
So my natural thought is to make (* standard and leave it unspecified
whether it nests or not. I expect that to rarely be an issue -- code
that uses (* for conditional compilation etc is probably not ready to
be released for other people to port. It could stand some cleaning up
first. And if each system that already has it, keeps doing it their
own way then the legacy code keeps working.
That sort of feels like wimping out but it looks to me like the best
solution so far, under the circumstances.
I'll repeat the reasoning:
1. Better to make standard the most popular word.
2. The most popular word has common practice that conflicts in a very
minor way.
3. Better to accept rare, easily-fixed porting problems by not
specifying this minor matter than to specify it in a way that breaks
code, or to propose a standard word that's new to almost everyone.
Does that seem generally acceptable?
I don't get what you are saying. Who showed that (* is inherently
backwards compatible? I think a post showed that the nesting and non-
nesting versions have definite cases where either code will break on
the other system.
(* may be in use for some code, but I expect this is not commonly used
and is a small fraction of existing code.
> Personally I don't mind which version is chosen. I'll use either.
> Nevertheless I'd hate to see a feature discarded on the basis of
> ideology. Practical use should be the guide.
> On 3 Feb 2007 11:34:29 -0800, "J Thomas" <jeth...@gmail.com> wrote:
>
>>My thinking is that such commands are already in widespread use. I'd
>>prefer that one such command be used rather than multiple ones. But if
>>we're going to get multiple versions anyway, then I may have a use for
>>a tool to import them even easier than cut-and-paste.
>
> As the proposer of this, it's up to you to do the survey of
> existing Forth practice. I've just surveyed (* ... *) in the
> following systems
>
> VFX4 - non nesting
> SF3.0 - not present
> iForth 2.1 - non-nesting
> W32F6.11 - non-nesting
>
> gForth doesn't have it in the three year old version on my box.
But bigFORTH has a non-nesting version of it. Including spForth, we have six
systems with a non-nesting (*, and none with a nesting one.
PowerMops was claimed by Doug Hoffman to use a nesting version of (* .
I believe we're approaching something like a consensus that (* should
not be required to nest.
Some here seem to be willing to standardize a potentially broken block
comment in that it does not need to consider the result of commenting
out code which may contain a quoted string with the comment terminator
inside. This may not be a common problem, but when it happens, it is
a PITA. This is one of the things that gets ignored when you define
your problem in terms of an easy implementation, rather than an
effective one.
> I think a post showed that the nesting and non-
> nesting versions have definite cases where either code will break on
> the other system.
In case that didn't get completely clear, here are examples.
(* code that requires nesting block comments.
(* This code will do everything Werty ever claimed, it will turn
Forth into something that writes itself and needs no documentation, a
true DWIM system.
*)
: FOO BEGIN the hell with it, I'll write this tomorrow.
*)
On a nonnesting system, : FOO BEGIN will be compiled and will result
in errors. Even if the code does nothing bad, the extra *) will
probably cause trouble. It doesn't need to be defined as a word so it
can be an error just to interpret it.
(* code that requires nonnesting block comments.
(* Here comes the application, the stuff you wanted to compile. This
is it.
*)
: APPLICATION ..... ;
A nesting (* will not compile anything here until it finds an extra *)
which will probably not happen at all. APPLICATION will not be
compiled.
I don't know how much legacy code there is that uses (* (* with no
intervening *) that might fail one way or the other, but the point is
that there could be some.
> Some here seem to be willing to standardize a potentially broken block
> comment in that it does not need to consider the result of commenting
> out code which may contain a quoted string with the comment terminator
> inside. This may not be a common problem, but when it happens, it is
> a PITA. This is one of the things that gets ignored when you define
> your problem in terms of an easy implementation, rather than an
> effective one.
That's true.
But then, we're talking about comments in portable code. If you want
to comment out code that includes quoted strings that contain your
comment terminator (or in the case of nested comments, may contain
your comment initiator) then you can use 0 [IF]. Or end your comment
on the previous line and use \ for this line and then start again.
Or even, cut-and-paste the offending code into another document. It's
very rare that you need to accept a PITA by keeping code in comments
that gets badly in your way.
It would be possible to write a Forth interpreter to run inside
comments, that checks to make sure that the comment terminators aren't
actually inside of text strings etc. Presumably all its compiled
definitions should go into a special wordset that's only searched
while compiling, and maybe its stack items should be on a special
stack? But what if you have errors in the code you've commented out?
It might be that a lot of the time you have code that's commented out,
you actually do want that code to be ignored until you provide a
comment terminator.
We ignore that problem with traditional comments.
( : FOO S" abcdefghijklmnopq )
The comment ends at the first ) even if that ) happens to be part of
(.) or (interpreter) or ( x y -- z ) .
I'm really not sure what the specs should be for an improved block
comment, but if you specify what you want I might try to code it. But
maybe for a standard effort we'd do better with something closer to
common practice.
As I noted, in an application you will normally have the previous name
of the file stored somewhere, or else you will be storing the new name
of the file somewhere (or both), so this is about benchtesting what
you are doing before that is finished, or fixing something up while
debugging.
And more important:
> However, an even better
> solution would be to have S" transparently using more than one buffer,
> because then you don't need to think about which buffer to use.
I'll thank you very much to leave the size of the minimum string, PAD,
numeric conversion and input buffers alone. It's already a substantial
chunk of real estate in a resource constrained setting.
> My original point is that there's an inconsistency between the fact that the
> standard specifies a word that expects two character strings, which is
> intended to be used in interpretation state, and on the other hand allowing
> S" to have one string buffer.
There would be if the S" buffer was a bottleneck. But since you can
only specify one string at a time on the command line, it's not a
bottleneck ... it's a loading dock. You can implement as many buffers
as you want, of whatever size you want, and use S" to acquire strings
on the command line to put strings in each of those buffers.
The problem is not in the definition of S", which is a perfectly
sufficient facility to provide as many buffers as anyone should need,
given their resource constraints. The problem is the failure to
establish a portable, re-usable library system, so that someone that
is not in a resource constrained setting can say,
S" niclos1.f" REQUIRED
3 4 S" toolbelt" ACQUIRED
and you have T", or FILE1" and FILE2", or whatever is defined in
toolbelt, version 3.4 or greater, at the drop of a hat. Or even
RENAME[ ... ][ ... ]NAME
if that is provided in toolbelt 3.4 or greater.
> I don't know how much legacy code there is that uses (* (* with no
> intervening *) that might fail one way or the other, but the point is
> that there could be some.
That's a concern for language standardization that is not a concern
for a wordset collection ... a wordset collection can simply define it
as nesting (following one existing practice) ... it may also provide a
lint tool to identify files with (* ... (* ... *) code to simplify the
importing of code from those implementations that have non-nesting (*
*).
Since it appears that all the implementations with (( )) define it as
a non-nesting block comment ... that is, as an "outer" ( ) ... which is
indeed an intuitive way to read its name, if (( were to be
standardized, it would be standardized as a non-nesting "outer (".
Since there appear to be implementations that have nesting and non-
nesting (* *), it is easy to rule it out from use as a standard
nesting block comment. (== ==) is still in the running ... it works
for formulas in block comments, which is my main concern for )) ... it
works for interpolating code from most C or Pascal families ... it
supports a distinctive comment box.
And since either (( or (== can be readily defined in standard ANS
Forth, either can be easily standardized at the collection level.
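As a rough illustration of how little is needed, both could share one helper. This is only a sketch under the assumption that REFILL and COMPARE are available; SKIP-UNTIL is a name I made up for the example, and ==) is assumed to be the closing token for (== .

```forth
\ Hypothetical helper: discard input up to and including the
\ whitespace-delimited token c-addr u, refilling as needed.
: SKIP-UNTIL ( c-addr u -- )
  BEGIN
    BEGIN  BL WORD COUNT  DUP 0= WHILE   \ empty token: parse area exhausted
      2DROP REFILL 0= IF 2DROP EXIT THEN \ no more input: end silently
    REPEAT
    2OVER COMPARE 0=                  \ is this token the terminator?
  UNTIL 2DROP ;

\ Non-nesting block comments built on the helper.
: ((   S" ))"  SKIP-UNTIL ; IMMEDIATE
: (==  S" ==)" SKIP-UNTIL ; IMMEDIATE
```

With these loaded, something like `(( tel: +44 (0)23 8063 1441 ))` is skipped whole, and `(== f(x) = (a+b)) ==)` survives the doubled parenthesis that would end a (( comment early if it stood alone as a token.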
Neither really requires standardization in Forth200x, but I would
certainly not oppose the standardization of either ... with a higher
priority on ((, to avoid the accidental definition of (( as a nesting
comment.