January 29, 2007
* Second Draft *
Rationale
=========
Problem
-------
Most Forth users find it useful to have a block comment -- a comment
command that covers multiple lines. There are many names for this
because there is no standard.
Names include (( (* /* ( etc.
One standard portable approach is 0 [IF] ..... [THEN] This works on
every standard Forth that implements [IF] [THEN] but it is ugly. It's
easy to rename 0 [IF] to, say, [DOCUMENTATION] and still end with
[THEN] . This solution is not generally accepted as shown by the
existence and use of so many alternatives.
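As a minimal sketch of that renaming, assuming a system that provides
[IF] and [THEN] from TOOLS EXT: the new name only has to push a false
flag and then run [IF]. The name [DOCUMENTATION] is the one suggested
above; nothing else here is new.

```forth
\ Sketch only: a friendlier name for 0 [IF]. Requires [IF] and [THEN].
: [DOCUMENTATION]  ( -- )  0 POSTPONE [IF] ; IMMEDIATE

[DOCUMENTATION]
   Everything here is skipped word by word,
   up to the matching closing word.
[THEN]
```

The skipped text still must not contain a stray [ELSE] or [THEN] as a
separate word, which is part of why this approach stays awkward for
free-form prose.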
Current Practice
----------------
Many Forth systems provide this capability. I am not yet aware of two
that use the same name.
Solution
--------
Agree on a common name. The more Forth systems that provide a standard
name, the more code will use it and the less trouble it will be
porting that code. Code that uses some other name will still run on
every system that supports it and software can be easily written to
support it elsewhere, but to the extent we use a standard name that
bother can be avoided. All that is needed is that we agree on the
name.
Proposal
========
15.6.2.xxxs (( "comments" TOOLS EXT
Compilation: Perform the execution semantics given below.
Execution: ( "ccc" -- )
Parse text until the word
))
is found or until the current parse area cannot be refilled.
Discard the found )) string. (( is an immediate word.
An ambiguous condition exists if the word )) is not found and the
parse area cannot be refilled.
Typical use
-----------
(( This is a comment, it requires supporting statements.
These statements can go on their own lines.
There might be lots of them. ))
(( ---------------------------------------------------------------
People have lots of styles for commenting.
They like to make their comments stand out as obvious comments.
--------------------------------------------------------------- ))
( ----------------------------------------------------------------)
( They could do that with just ( as traditionally defined. )
( But usually they don't. )
( ----------------------------------------------------------------)
Reference Implementation
------------------------
: MAYBE-REFILL ( S: -- flag ) \ refill if at end of line, 0 if can't
SOURCE NIP >IN @ > DUP 0= IF DROP REFILL THEN ;
: (( ( S: -- ) \ multi-line comment, ended by ))
BEGIN
[CHAR] ) PARSE 2DROP BL WORD COUNT S" )" COMPARE WHILE
MAYBE-REFILL WHILE
REPEAT
." Multiline comment needed )) and never found it -- ambiguous
condition."
CR ." Your system can do anything at this point and still be a
standard system."
THEN ; IMMEDIATE
WORD can be replaced by PARSE-NAME when the time comes.
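A sketch of what that PARSE-NAME variant might look like, assuming
PARSE-NAME ( "<spaces>name" -- c-addr u ) is available as proposed. It
treats )) strictly as a whitespace-delimited word, which is one reading
of the proposal text and not necessarily what the reference
implementation above does in every case.

```forth
\ Sketch only: assumes PARSE-NAME exists on the host system.
: ((  ( "ccc" -- )  \ skip text through a whitespace-delimited end word
   BEGIN
      PARSE-NAME DUP IF                   \ found a word on this line
         S" ))" COMPARE 0= IF EXIT THEN   \ it is the end word: done
      ELSE
         2DROP REFILL 0= IF EXIT THEN     \ line used up: get the next one,
      THEN                                \ or stop at end of input
   AGAIN ; IMMEDIATE

(( f = (a+b)(c*(d+e)) is still inside the comment ))
```

Under this reading a formula that ends in )) does not close the comment,
because its trailing parentheses are not a separate word.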
It would not be difficult to make a CREATE DOES> word which defines
new words like (( with new ending strings different from )) . I may do
that.
Test Cases
----------
(( 123 . )) 456 .
(( 123 .
456 .
789 . )) .( aaa )
: foo (( 123 .
456 . )) 789 . ;
foo
Remarks
-------
I don't care what name we use provided we can agree on one.
I dislike using /* because there might be some use for files where
Forth code is in C comments and vice versa.
/*
Forth code here
((
*/
C code here
/*
))
More Forth code here
Etc.
That isn't a major concern but /* isn't my top choice.
I don't like to just extend ( ) . One typo in the wrong place with a
looming deadline and you could spend 5 minutes tearing your hair out
figuring out what happened. Also, your multiline comment can't include
a ) which greatly reduces its value.
I kind of like Marcel Hendrix's (* . It encourages intense visuals.
(* ******************************************
* This is a bold comment! *
****************************************** *)
(( )) seems bland and inoffensive, something we could all agree on.
If you prefer something else that's fine.
Mishaps
-------
My first day using APL, somebody came by my keyboard while I was
getting a drink of water, and they started the command that gets a
user string. When I came back nothing I did could get the computer to
notice me. I tried to logout. I tried everything I could think of. I
was stuck. Then somebody suggested I end the string.
Extended comments in files or EVALUATE strings are self-limiting. But
not at the keyboard. If you accidentally get into a comment and the
system seems to be frozen, how long will it take for you to remember
to try )) ? Does your system provide another way to get out of that?
Should and could there be a standard way out?
If there is no obvious way out on your system, a word very similar to
(( could become a password. Type LOCK and when you get a drink of
water the system will ignore all keyboard input until you type XLERB
or whatever you choose for a password. Could that be used maliciously?
Could someone sneak in a bit of Forth code that gets executed during
keyboard entry, and your keyboard freezes until you guess the password
or you abort the Forth?
What if input is redirected to some other source, a streaming source,
and somehow it all gets commented out. All input from that source will
be ignored until it sends the end-comment signal or until you break
the connection. The foreign source can't even tell you that they're
done. Your Forth system hears them but it isn't listening. Is this a
problem?
And I would very much appreciate not having it named /( or, indeed,
anything else beginning with forward-slash.
I was trying to rewrite some code to make it more portable. It had
multiline comments. I wrote a quick routine to handle them. No big
deal, just one more routine cluttering up the files and the system, so
you can use multiline comments that are basically just like everybody
else's multiline comments but with a different name.
It looked like a good thing to standardise, not any big deal but it
saves time whenever you start to port a file and it doesn't have yet
another brand new name for multiline comments.
> I already have ---[ and will keep on using it
Maybe I should get to work on that CREATE DOES> routine to make new
multiline comments with a single short line of code. Thank you for
being frank about it.
Forth is capable of keying on
2 or more spaces and
other nonsense stuff , to key on
comments .
[ B_Register45_99 should be zero ]
/ Comments
Notice how i can use [/] for "line above"
Forth parses spaces and that by itself
is a comment , then sees "and ", and its final !
The solution to stop this debate , is to stop
making precise rules and forcing them on
your opponent !
There is no need to keep Forth parsing
so stupid .
Forth only parses TIB when interacting
with humans , not at run time .
Stop killing Forth with rules ...
A programmer makes his own rules
as he writes .
________________________________________
Here's how I do it in my string parsing library, parsing.fs:
: s-input-after ( s -- flag )
(
Advance the input stream to just after the next occurrence of
the string across lines, and leave true if found. If the string
is not found before the end of the file, leave the input stream
positioned there with false. Based on the $> reference in
PARSE-AREA@.
)
( s) 2>r
BEGIN
parse-area@ 2r@ ( area.s s)
search ( area.s false | &found #rem true)
0= WHILE \ not found
( area.s) 2drop
refill 0= IF \ end of file
( s) 2r> 2drop false EXIT THEN
REPEAT \ found
( &found #rem) 2r> ( s) nip /string parse-area! true ;
: (( ( -- ) s" ))" s-input-after drop ; immediate
I've tried a lot of names for a multiline comment word, and
didn't find many of them very aesthetic. My current favorite
is
: --- ( -- ) s" ---" s-input-after drop ; immediate
Here's an example of its use:
: next-input-name ( "<name>" -- word.s ) \ Wil Baden's NEXT-WORD
---
Parse the next word from the input stream across lines. If
word.s is empty, at most whitespace was found. A parsing
implementation of Wil Baden's NEXT-WORD.
NOTE: The implementation does not echo CR's when the lines
in question are being entered interactively at the terminal.
---
BEGIN parse-name dup IF EXIT THEN refill
WHILE 2drop REPEAT ;
I would not vote against (( if there's substantial support for
it.
-- David
Note that ---[ is a different approach ... that each line is a comment
until an explicit \ comment line indicates that Forth code has
restarted. It is essentially for Forth code embedded in text
commentary, rather than for commentary embedded in Forth code.
Extremely ugly.
> It's easy to rename 0 [IF] to, say,
> [DOCUMENTATION] and still end with [THEN] . This solution
> is not generally accepted as shown by the existence and use
> of so many alternatives.
The situation has not been properly factored. There does not
need to be a new block comment word. There needs to be a word to
tell the system to start interpreting or compiling Forth code
and when to tell it to stop. Everything outside these limits
will be comments. This will allow whatever is needed to print a
Forth program as a document with diagrams, mathematical symbols,
and even as a foreign language that does not use the ASCII Latin alphabet.
--
Michael Coughlin m-cou...@comcast.net Cambridge, MA USA
> The situation has not been properly factored. There does not
> need to be a new block comment word. There needs to be a word to
> tell the system to start interpreting or compiling Forth code
> and when to tell it to stop. Everything outside these limits
> will be comments. This will allow whatever is needed to print a
> Forth program as a document with diagrams, mathematical symbols,
> and even as a foreign language that does not use the ASCII Latin alphabet.
To do that without anything new, you could take your block comment
command and simply start parsing the file with it. Then when you're
ready to compile code you back out of the block comment until you're
ready to comment again.
If you want you could make a word similar to INCLUDED which starts out
in comment mode. That looks extremely easy to me too, the only concern
being that some rare source code might actually start out with code.
So if we have a standard syntax, then it makes sense for files to
start out with the command that starts a block comment, and it makes
sense for file interpreters to start out in comment mode.
What is the difference between what you want and what I want?
> To do that without anything new, you could take your block comment
> command and simply start parsing the file with it. Then when you're
> ready to compile code you back out of the block comment until you're
> ready to comment again.
I find it more natural to have a word that fits into the context of
the non-Forth text.
The difference between text/forth mixing words and block-comment words
is essentially one of the extent of non-interpreted text that we have in
mind. A block comment word is trying to embed a comment block in Forth
source ... so it is reasonable to think of the beginning of the block
to be somewhere in view, and even, as with the (* *) naming that I
prefer, with the comment block looking like
(* *****************************
* * If this was a real comment block,
* * this would say something useful
* *)
And note here that in this text/forth mixing style, it is possible to
quote a comment block without confusing anything ... or even code, if
you block indent the code.
> If you want you could make a word similar to INCLUDED which starts out
> in comment mode. That looks extremely easy to me too, the only concern
> being that some rare source code might actually start out with code.
I'm happy with the source starting with the word that is intended to
put it into literate mode ... whether ---[ or <HTML>
OTOH, there is no need to standardise either ---[ or <HTML> ... a
collection of texts that use that can easily include those in its core
toolkit.
> 15.6.2.xxxs (( "comments" TOOLS EXT
[..]
"((" was used in F-PC for something slightly different. AFAIR, it
switched between comments and compiling, and "))" was a word.
The current Win32Forth has "((" like you propose, so that's a plus.
iForth has inherited (( and )) from FysForth, where it is used
as a method to catapult TO for array indexing.
E.g. 12 TO (( 3 )) mydata puts 12 into mydata[3] . I myself do
not object to ignoring this use of (( )) because it is better to
promote the FSL array / matrix notation.
I must point out that your reference implementation does not seem
to allow for nested comments, which can be useful to block out
larger parts of source code (which [if] [then] supports).
Win32Forth has a problem there too:
(( this should (( work )) ))
^^
Error(-13): )) is undefined
BTW: so does iForth :-(
-marcel
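A nestable variant can be sketched by counting depth. This assumes
PARSE-NAME is available and uses the names (* and *) purely as
placeholders, so that code relying on a non-nesting (( is left alone.
It is not the reference implementation.

```forth
\ Sketch only: nestable multi-line comment. Assumes PARSE-NAME.
: (*  ( "ccc" -- )
   1                                        \ current nesting depth
   BEGIN
      PARSE-NAME DUP 0= IF                  \ nothing left on this line
         2DROP REFILL 0= IF DROP EXIT THEN  \ no more input: give up
      ELSE
         2DUP S" (*" COMPARE 0= IF
            2DROP 1+                        \ nested opener: one level deeper
         ELSE
            S" *)" COMPARE 0= IF 1- THEN    \ closer: one level up
         THEN
      THEN
      DUP 0=                                \ stop when the outer level closes
   UNTIL DROP ; IMMEDIATE

(* this should (* work *) and this is still a comment *)
```

Both delimiters have to stand alone as words for the count to stay
right, which is the same whitespace convention as the non-nesting form.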
> I find it more natural to have a word that fits into the context of
> the non-Forth text.
I may be missing something here. It sounds like you're talking about
an esthetic thing, about the form of the word that switches between
text and Forth. We're talking about the exact same functionality and
the difference is a psychological one, something to do with how we
choose to look at it?
> I'm happy with the source starting with the word that is intended to
> put it into literate mode ... whether ---[ or <HTML>
>
> OTOH, there is no need to standardise either ---[ or <HTML> ... a
> collection of texts that use that can easily include those in its core
> toolkit.
I see a big variety of names for something that's essentially the same
functionality. If some name gets declared standard and more people use
it, that reduces the work of porting code.
It isn't necessary that everybody use the same name, but to the extent
that people do settle on one name it makes things a little easier for
porting.
> I must point out that your reference implementation does not seem
> to allow for nested comments, which can be useful to block out
> larger parts of source code (which [if] [then] supports).
Someone else commented on that too. Thank you. I'll fix it.
Too similar to the forth operator */
Anything more than 2 characters becomes a nuisance. Alphabetic
characters raise issues of case. Using the same character e.g. (( is
prone to mistyping as well as not standing out enough.
Of the ones proposed (* is perhaps most recognizable even by
those that don't use Pascal. In forth, it could be said to build upon
the other comment character (
That's why I improved Forth to be always
in the RUN mode and never in a COMPILE
mode . It is impossible to crash .
It allowed me to toss [State] and
many other DEFERRED Words .
You DO NOT TELL Forth to
1) stop
2) slow down
3) defer
4) change the order of a DoCol
5 ) [ Compile]
6) use stack to "compile" Words
No matter what keys or Forth commands
you type , it won't lock up , even if you're
testing code !
Testing code uses the assembler
to create the OpCodes , so if you misspell
a mnemonic , it will assemble the closest
Match and start tracing code .
At High level , you don't know you are
using an assembler !
No excuses ...
Comments are more important than code. This is an idea that
Forth programmers cannot grasp. I know since I try to explain
this all the time and get nowhere.
Now that we have more powerful computers, we can write
program source that is meant to be understood by human beings
and the computing system can quickly skip thru it to execute the
computer code. Doing this when programs were written on punch
cards was very inconvenient. It made the source decks too big to
handle.
(* and <HTML> do not have exactly the same functionality ... <HTML>
with friends allows you to view the same file as an html text that you
load as a Forth source.
(* and ---[ do not have exactly the same functionality ... ---[ allows
you to quote any snippet of Forth code in the text description, with
no restriction on the contents of the code, simply by block indenting. (*
is always sensitive to *)
On the other hand, I'm a big fan of commenting, and if (* *)
encourages more commenting, I'm all for it.
I much prefer (* *) over (( )).
-- David
> > What is the difference between what you want and what I want?
>
> Comments are more important than code. This is an idea that
> Forth programmers cannot grasp. I know since I try to explain
> this all the time and get nowhere.
>
> Now that we have more powerful computers, we can write
> program source that is meant to be understood by human beings
> and the computing system can quickly skip thru it to execute the
> computer code. Doing this when programs were written on punch
> cards was very inconvenient. It made the source decks too big to
> handle.
I see a deep philosophical meaning here. But what operational
difference does it make in the routine to switch between comments and
code?
Not if those characters are the same. For me it's easier to type ((( than (*,
which in fact seems to be tailor-made for american keyboard layouts.
The main point of literate programming is that you never have to look
for the documentation of a source file ... it's in the source file
itself.
With many languages, that requires special tools to mix and unmix the
documentation and the code, but defining a literate programming
wordset that allows Forth code to be embedded in its own documentation
is a one-screener.
Now, if the code is embedded in the documentation, then you do not
want to have the start of the Forth segment determined by some two-
character sequence defined in terms of the name of a block comment
word that will often lie pages further up in the file.
So the literate programming "skip the next stuff" words are not
defined as "opening and close delimiters". If you choose to make
<code> the "start interpreting" token, then you can simply define:
\ Skip interpreting until <code>
: </code> ( --- ) <HTML> ;
and in a case-sensitive forth implementation, you define:
\ the <html> tag is case insensitive
: <HTML> <html> ;
And with the "section title" word, ---[, if you want to have a
"chapter title" and "subsection title" to support automatic conversion
of the plaintext-layout file into html, then you define:
: ----[ ---[ ; \ chapter title
: --[ ---[ ; \ subsection title
The switching into Forth interpretation mode would always be when
there is a "\" comment starting in the first position of the line, and
the name of the section title word doing the work (when INCLUDED) of
passing over the documentation until the Forth code is found does not
matter. That means that the name of that Forth word is able to be
something that is useful for the documentation.
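A rough sketch of such a section-title word, under the assumption that
it simply discards lines until one begins with a backslash in the first
column. The body below is a guess at the behaviour described here, not
the actual ---[ source.

```forth
\ Sketch only: literate-style section title. Skips the rest of the
\ title line and every following line until one starts with \ .
: ---[  ( "text" -- )
   BEGIN
      REFILL 0= IF EXIT THEN                \ end of file: nothing more to do
      SOURCE IF C@ [CHAR] \ = ELSE DROP FALSE THEN
   UNTIL ;                                  \ leave that line to the interpreter

---[ Drawing a box
This paragraph is plain text and is never interpreted.
    : QUOTED ." indented code is skipped as well" ;
\ Code resumes on this line.
: BOX  ( -- )  ." +--+" ;
```

Because the line that restarts interpretation is itself an ordinary \
comment, the interpreter needs no special word to leave text mode.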
On the other hand, an *actual* comment block really is embedded in the
Forth code, and the definition of the word with a matching "reversed
delimiter" is a clear, intuitive way to show where the block ends.
NO ! NO ! No ! Do not agree on common
names , force them to use your No-Text G.U.I.
If they hack your code , it behoves them to
create new icons , as they are hacking !
The Paradigm , is they can't go back to text !
Impossible ! Try it you'll see what i mean .
You CAN argue English ,
but who argues the Textless ,
wavy line on a traffic sign :
~~~~~~~
( dangerous curve ahead )
everyone , even C programmers , slow
down and turn the wheel left , as they should .
My Forth has no text , especially not Buzz Word
Text .
I agree, not least because I'm entirely likely to include mathematical
expressions in code comments:
(( This word implements the function
f = (a+b)(c*(d+e))
for positive integer values of a,b,c,d,e
))
Ed
> Not if those characters are the same. For me it's easier to type ((( than (*,
> which in fact seems to be tailor-made for american keyboard layouts.
But refer to Ed Beroset's comment on the particular pitfalls of
multiple ))'s, for those of us poor souls who may be commenting a
formula.
Sure. Where's the URL?
Though I admit, I have trouble coming up with names sometimes ...
designing an icon for each new word I define is a bit daunting.
-------------------------------------------------
No , because you already have an image
in your brain , of the code frag .
That image can't be shown here , nor
can it be "posted" , because it would necessarily
be translated to your left brain , and that
would take 14 pages , and certainly be buggier
than 1 line of C code .
DON'T translate it , just start doodling
with a pencil . draw images .
I must warn you ! You will fail , time
and time again , because your Left side
will fight you , like a
Trade Unionist !
Are you following this ?
We humans instinctively MUST have
everything avail' quickly , in the Left
side , so we can speak it .
That means every intelligent thought
must already be in the Left side , where
there is insuff' ability to DESCRIBE !!
Because it can't handle value judgements !
Your right side is the only productive
part of your brain , and if you train it,
you will become rich and enjoy a much
higher "peer" group .
Your left side is only trying to figure
how to SPEAK , what the Right side has
accurately mapped out .
I'll make it easy for you ,
lets redefine chars , but not even close
to what they are now , and the rule
is they must be very general VERBS .
> does NOT mean LT , we cant use
it because it would confuse ,
so instead , lets do
/ means the line above or
something above and NOT this line .
\ lower line or lower something ..
! caution , an error , store a number
# a structure a pattern , a array .
The objective is to redefine , so ppl
will NOT struggle nor argue on
the redefine . I don't want them
to stop for 1 millisecond , trying to
decipher it , so it must be a totally
new definition .
Example , if i used ^ to mean [UP]
what would i use for [down] !
No good ...
General , so one can suffix to
narrow the "range"
[ / line 32 ]
Notice how this takes the burden
off the Left side , it must read , present
line to figure if line 32 is
ABOVE or is BELOW the present line .
This is how you learn to program ,
with your right side , by turning OFF
or preventing the left side from
"calculating" anything .
Left side is your worst enemy , it wants
to become a college proff and make $28,000
a year , while your
Right side "see's" a picture of people making $$ ,
this image has linking to show you all the closed
doors , in making that $$ .
Thus all successful ppl , use their R side .
and the left side can "explain" why he
can't get a job .
R side has NO excuses, only plans
I am a Rich E.E., programmer ,
developer , cause i trained
my R side to toss excuses .
Wanna join a "no-excuses" peer
group ?
>15.6.2.xxxs (( "comments" TOOLS EXT
>
> Compilation: Perform the execution semantics given below.
>
> Execution: ( "ccc" -- )
>
> Parse text until the word ))
> is found or until the current parse area cannot be refilled.
> Discard the found )) string. (( is an immediate word.
>
> An ambiguous condition exists if the word )) is not found and the
>parse area cannot be refilled.
I presume that the phrase "the word ))" means that it is white-space
delimited. This matches the MPE implementation, which has 15-20 years
of history.
Stephen
--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads
Don't bother. Non-nestable supports 95+% of the problem.
"0 [if] ... [then]" fixes the rest.
Please note that by convention the )) is white space delimited,
which is rare in formulae.
By "whitespace" do you mean blank or blank|tab|cr|control-character|
etc?
It isn't that rare for a commented formula to end with )) or ))) etc.
For that matter I sometimes include blanks in commented formulas
because that makes them easier for me to read.
You have lots of code with (( and changing its meaning on you might
break some of it.
So if we wind up with a standard nestable multiline comment it should
have a different name.
> So if we wind up with a standard nestable multiline comment it should
> have a different name.
(* .. *)
or
\\ .. \\ (as a personal matter of taste I like this one more)
would be fine with me
Without too much thought, probably okay with me, too.
How about leaving it up to the user, and proposing instead a
defining word (maybe JET already has a DOES> reference version?)
that would be some variation on
: comment: ( "introducer.name" terminator.s -- )
(
Define a word named by the next word in the parse area, which
skips text across lines, and leaves the input stream positioned
just after the ANS Forth string terminator.s, or at the end of
the file if terminator.s is not found.
)
... ;
Syntax example:
s" ))" comment: ((
Or make it an entirely nonparsing word with syntax:
s" ((" s" ))" comment
Or one with entirely parsing syntax:
comment: (( ))
comment: \\ \\
I personally prefer the entirely parsing version, with the
entirely nonparsing version as runner-up.
Then it would be easy to have multiple comment delimiters in the
same document. I don't know what would be the best name for
COMMENT: or COMMENT ...
-- David
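One possible shape for the first syntax, as a sketch only. It assumes
PARSE-NAME is available, matches the terminator as a whitespace-delimited
word rather than as a substring, and uses the hypothetical helper name
SKIP-PAST-WORD.

```forth
\ Sketch only. SKIP-PAST-WORD is a made-up helper name; PARSE-NAME is assumed.
: SKIP-PAST-WORD  ( c-addr u -- )  \ discard input through the word c-addr u
   2>R
   BEGIN
      PARSE-NAME DUP IF
         2R@ COMPARE 0= IF 2R> 2DROP EXIT THEN     \ terminator found
      ELSE
         2DROP REFILL 0= IF 2R> 2DROP EXIT THEN    \ end of input
      THEN
   AGAIN ;

: COMMENT:  ( c-addr u "name" -- )  \ define name as a comment ended by c-addr u
   CREATE IMMEDIATE
   DUP C,  HERE OVER ALLOT  SWAP MOVE  ALIGN       \ keep terminator as counted string
   DOES>  ( "ccc" -- )  COUNT SKIP-PAST-WORD ;

S" ))"  COMMENT: ((
S" ---" COMMENT: ---

(( skipped by the first one ))
--- skipped by the second one ---
```

Keeping the terminator in the data field of each defined word is what
lets several delimiter pairs coexist in one document.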