Stephen
RfD - Enhanced local variable syntax
====================================
Stephen Pelc - 20 August 2006
Problem
=======
1) The current LOCALS| ... | notation explicitly forces all locals
to be initialised from the data stack.
2) The current LOCALS| ... | notation defines locals in reverse
order to the normal stack notation.
3) When programming large applications, especially those interfacing
with a host operating system, there is a frequent need for temporary
buffers.
4) Current implementations show that creation and destruction of
local buffers are much faster than using ALLOCATE (14.6.1.0707)
and FREE (14.6.1.1605).
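Point 2 can be seen with the existing standard word. A minimal sketch, in which the word name diff-locals is made up purely for illustration:

```forth
\ diff-locals is a hypothetical name used only for this illustration.
: diff-locals ( a b -- a-b )
  LOCALS| b a |     \ listed right-to-left: b takes the top of stack, then a
  a b - ;

10 3 diff-locals .  \ prints 7
```

The stack comment reads a b, but the declaration has to be written b a to match it.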
Solution
========
Base version
------------
The following syntax for local arguments and local variables is
proposed. The sequence:
{ ni1 ni2 ... | lv1 lv2 ... -- o1 o2 }
defines local arguments, local variables, and outputs. The local
arguments are automatically initialised from the data stack on
entry, the rightmost being taken from the top of the data stack.
Local arguments and local variables can be referenced by name within
the word during compilation. The output names are dummies to allow
a complete stack comment to be generated.
The items between { and | are local arguments.
The items between | and -- are local variables or buffers.
The items between -- and } are outputs.
Local arguments and variables return their values when referenced,
and must be preceded by TO to perform a store.
Local buffers may be defined in the form:
arr[ <expr> ]
Any name ending in the '[' character will be treated as a buffer;
the expression up to the terminating ']' will be interpreted to
provide the size of the buffer. Local buffers only return their base
address; all operators such as TO generate an ambiguous condition.
In the example below, a and b are local arguments, a+b and a*b are
local variables, and arr[ is a 10 byte local buffer.
: foo { a b | a+b a*b arr[ 10 ] -- }
a b + to a+b
a b * to a*b
cr a+b . a*b .
arr[ 10 erase
s" Hello" arr[ swap cmove
;
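For comparison with point 4 of the Problem section, here is roughly how the same word might look in ANS Forth without local buffers, taking the temporary storage from the heap instead. This is only a sketch: the name foo-heap is hypothetical, and it assumes the Memory-Allocation word set is present.

```forth
\ foo-heap is a hypothetical name; assumes ALLOCATE and FREE are available.
: foo-heap ( a b -- )
  10 allocate throw >r      \ get a 10-byte buffer; keep its address on the return stack
  2dup + cr .               \ print a+b
  * .                       \ print a*b
  r@ 10 erase               \ zero the buffer
  s" Hello" r@ swap cmove   \ copy 5 characters into it
  r> free throw ;           \ release the buffer

3 4 foo-heap
```

The buffer has to be allocated, tracked and released by hand, which is the bookkeeping a local buffer is meant to remove.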
Local types
-----------
Some current Forth systems use indicators to define local variables
of sizes other than a cell. It is proposed that any name ending in a
':' (colon) be reserved for this use.
: foo { a b | F: f1 F: f2 -- c }
...
;
Discussion
==========
The '|' (ASCII 0x7C) character is widely used as the separator
between local arguments and local variables. Other characters
accepted in current Forth implementations are '\' (ASCII 0x5C) and
'¦' (ASCII 0xA6). Since the ANS standard is defined in terms of
7 bit ASCII, and with regard to internationalisation, we propose only
to consider the '|' and '\' characters further. Only recognition of
the '|' separator is mandatory.
The use of local types is contentious as they only become useful
if TO is available for these. In practice, many current systems
permit TO to be used with floats (children of FVALUE) and other
data types. Such systems often provide additional operators such
as +TO (add from stack to item) for children of VALUE and FVALUE.
Standardisation of operators with (for example) floats needs to
be done before the local types extension can be incorporated into
Forth200x. Apart from forcing allocation of buffer space, no
additional functionality is provided by local types that cannot
be obtained using local buffers. More preparatory standardisation
needs to be done before local types.
Apart from { (brace) itself, the proposal introduces one new
word BUILDLV. The definition of this word is designed for future
enhancements, e.g. more local data types, without having to
introduce more new words.
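To illustrate the intended message sequence, the sketch below replaces BUILDLV with a tracing stub that only prints its arguments. It is not the proposed word, and the names demo, sum and buf[ are invented for the example. It shows the calls that { a b | sum buf[ 10 ] -- } would be expected to make, using the mode values from the definition of BUILDLV given later in this proposal.

```forth
\ Tracing stub for illustration only: prints each message instead of
\ building a local.  A real BUILDLV would be supplied by the system.
: BUILDLV ( c-addr u +n mode -- )
  cr ." mode=" .  ." size=" .  ." name=" type ;

: demo ( -- )
  s" a"    1 cells 1 BUILDLV   \ local argument
  s" b"    1 cells 1 BUILDLV   \ local argument
  s" sum"  1 cells 2 BUILDLV   \ local variable
  s" buf[" 10      3 BUILDLV   \ local buffer of 10 address units
  0 0      0       0 BUILDLV ; \ end of declarations

demo
```

Any future data type would only need a new mode value in this scheme, not a new word.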
Forth 200x text
===============
13.6.2.xxxx {
brace LOCAL EXT
Interpretation: Interpretation semantics for this word are undefined.
Compilation:
( "<spaces>arg1" ... "<spaces>argn" | "<spaces>lv1" ... "<spaces>lvn"
-- )
Create up to eight local arguments by repeatedly skipping leading
spaces, parsing arg, and executing 13.6.2.yyyy BUILDLV. The list of
local arguments to be defined is terminated by "|", "--" or "}".
Append the run-time semantics for local arguments given below to the
current definition. If a space delimited '|' is encountered, create
up to eight local variables or buffers by repeatedly skipping
leading spaces, parsing lv, and executing 13.6.2.yyyy BUILDLV. The
list of local variables and buffers to be defined is terminated by
"--" or "}". Append the run-time semantics for local variables and
local buffers given below to the current definition. If "--" has
been encountered, further text between "--" and } is ignored.
Local buffers have names that end in the '[' character. They define
their size by parsing the text string up to the next ']' character,
and passing that string to 7.6.1.1360 EVALUATE to obtain the size
of the storage in address units.
Local argument run-time: ( x1 ... xn -- )
Local variable run-time: ( -- )
Local buffer run-time: ( -- )
Initialize up to eight local arguments as described in 13.6.2.yyyy
BUILDLV. Local argument arg1 is initialized with x1, arg2 with x2 up
to argn from xn, which is on the top of the data stack. When
invoked, each local argument will return its value. The value
of a local argument may be changed using 13.6.1.2295 TO.
Initialize up to eight local variables or local buffers as described
in 13.6.2.yyyy BUILDLV. The initial contents of local variables and
local buffers are undefined. When invoked, each local variable
returns its value. The value of a local variable may be changed
using 13.6.1.2295 TO. The size of a local variable is a cell.
When invoked, each local buffer will return its address. The user
may make no assumption about the order and contiguity of local
variables and buffers in memory.
Ambiguous conditions:
The { ... } text extends over more than one line.
The expression for local buffer size does not return a single
cell.
13.6.2.yyyy BUILDLV
build-l-v LOCAL EXT
Interpretation: Interpretation semantics for this word are undefined.
Execution: ( c-addr u +n mode -- )
When executed during compilation, BUILDLV passes a message to the
system identifying a new local argument whose definition name is
given by the string of characters identified by c-addr u. The size
of the data item is given by +n address units, and the mode
identifies the construction required as follows:
0 - finish construction of initialisation and data storage
allocation code. C-addr and u are ignored. +n is 0
(other values are reserved for future use).
1 - identify a local argument, +n = cell
2 - identify a local variable, +n = cell
3 - identify a local buffer, +n = storage required.
4+ - reserved for future use
-ve - implementation specific values
The result of executing BUILDLV during compilation of a definition
is to create a set of named local arguments, variables and/or
buffers, each of which is a definition name, that only have
execution semantics within the scope of that definition's source.
local argument execution: ( -- x )
Push the local argument's value, x, onto the stack. The local
argument's value is initialized as described in 13.6.2.xxxx { and may
be changed by preceding the local argument's name with TO.
local variable execution: ( -- x )
Push the local variable's value, x, onto the stack. The local
variable is not initialised. The local variable's value may be
changed by preceding the local variable's name with TO.
local buffer execution: ( -- a-addr )
Push the local buffer's address, a-addr, onto the stack. The address
is aligned as in 3.3.3.1. The contents of the buffer are not
initialised.
Note: This word does not have special compilation semantics in the
usual sense because it provides access to a system capability for
use by other user-defined words that do have them. However, the
locals facility as a whole and the sequence of messages passed
defines specific usage rules with semantic implications that are
described in detail in section 13.3.3 Processing locals.
Note: This word is not intended for direct use in a definition to
declare that definition's locals. It is instead used by system or
user compiling words. These compiling words in turn define their
own syntax, and may be used directly in definitions to declare
locals. In this context, the syntax for BUILDLV is defined in terms
of a sequence of compile-time messages and is described in detail
in section 13.3.3 Processing locals.
Note: The LOCAL EXT word set modifies the syntax and semantics of
6.2.2295 TO as defined in the Core Extensions word set.
See: 3.4 The Forth text interpreter
Ambiguous conditions:
a local argument, variable or buffer is executed while in
interpretation state.
Reference implementation
=========================
(currently untested)
: TOKEN \ -- caddr u
\ Get the next space delimited token from the input stream.
BL PARSE
;
: LTERM? \ caddr u -- flag
\ Return true if the string caddr/u is "--" or "}"
2dup s" --" compare 0= >r
s" }" compare 0= r> or
;
: LBSIZE \ -- +n
\ Parse up to the terminating ']' and EVALUATE the expression
\ not including the terminating ']'.
[char] ] parse evaluate
;
: LB? \ caddr u -- flag
\ Return true if the last character of the string is '['.
+ 1 chars - c@ [char} [ =
;
: LSEP? \ caddr u -- flag
\ Return true if the string caddr/u is the separator between
\ local arguments and local variables or buffers.
2dup s" |" compare 0= >r
s" \" compare 0= r> or
;
: { ( -- )
0 >R \ indicate arguments
BEGIN
TOKEN 2DUP LTERM? 0=
WHILE \ -- caddr len
2DUP LSEP? IF \ if '|'
2DROP R> DROP 1 >R \ discard the separator, change to vars and buffers
ELSE
R@ 0= IF \ argument?
CELL 1
ELSE \ variable or buffer
2DUP LB? \ test a copy, keep the name for BUILDLV
IF LBSIZE 3 ELSE CELL 2 THEN
THEN
BUILDLV
THEN
REPEAT
BEGIN
S" }" COMPARE
WHILE
TOKEN
REPEAT
0 0 0 0 BUILDLV
R> DROP
; IMMEDIATE
--
Stephen Pelc, steph...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads
[snip]
Interesting. I've been slowly working on a Forth-based language in
which I acquire local variables using slightly different notation. To
reuse your example, essentially:
: foo
'b 'a
a' b' + 'a+b
a' b' * 'a*b
cr a+b' . a*b' .
;
Such that 'var pops the value off the stack and stores it. And var'
puts the value (or actually a reference to the value) back onto the
stack. Yes, the initial stores are backward but I don't find that too
problematic.
I have not considered the treatment of fixed-size buffers, however.
~Trans.
P.S. I am still relatively new to Forth.
Many thanks Stephen, for a fine proposal!
Fast comments:
1. { is error prone on today's display hardware and given
Forth programmers' poor eyesight; viz. ( { (
Don't believe me?
I noticed this ( } ] ) in your own reference implementation:
> + 1 chars - c@ [char} [ =
2. Why only 8 local parameters and AGAIN 8 local variables or buffers?
We all know where the '8' came from (# registers available on 68000).
This was inappropriate, but the proposed limitations make it worse...
3. IMHO, the definition must be allowed to spread over more than 1 line.
If not, with the maximum of locals and say 2 output values, 18 names
must be fit on a single line, maybe combined with types (block users
won't have problems with it, though :-)
4. Why not guarantee that a local var is initialized to 0? Now even in
trivial cases we may see lots of "0 TO ape 0 TO bar" etc., or the old
custom of : test ( a b -- ) 0 0 { a b c d -- } ... ; i. e. misusing
local parameters.
5. name[ is a bad idea, as it suggests name[ 3 ] @ etc., which is not
allowed. The only allowed syntax is "name[ 10 erase" etc., which is ugly.
6. It seems a local buffer needs a constant argument (!) which is parsed
at compile time (!). I hate that :-) Why not something like: buff_18 or
buff18 or buff(18) (not much worse than the requirement of last char is ']'),
where the actual name of the buffer is "buff" (not buff_18, buff18 or buff(18) ).
Another possibility would be in line with future F: W: etc.: Supply a creating
word BUFFERN: where it is required to define local buffer definers before use.
18 BUFFERN: buffer18:
: test { F: bar | buffer18: foo -- } bar foo f! foo f@ f. ;
7. The above points to a problem with this syntax: it can become confusing
what is on the integer and floating point stacks. At least, it is much less
clear than:
: test ( n1 n2 n3 -- ) ( F: a b c -- d )
LOCALS| m n o |
FLOCALS| c b a | ... ;
Thanks again for your effort,
-marcel
PS: The ASCII $A6 is a halfsized underlined "a". In the Windows character
set $A6 is the broken bar character or pipe symbol "|". I am confused.
Some Forths use curly braces for multiline comments. instead of the
ugly 0 [IF]
I am comfortable with L( ... ) for defining locals. Also it resembles
the stack annotation. And as other posted, braces are hard to read.
I see problems with output locals. In my previous experiments I
noticed that they were seldom of practical use.
: TEST L( a -- o1 o2 )
a to o1
1
a 2 + to o2 ;
0 TEST
Will the output be 1 0 2 or 0 2 1 ?
Andreas
-------
A computer is like an old Greek god, with a lot of rules and no mercy.
> At long last I've started on my Forth200x tasks. Here's the first
> cut of the enhanced local variables using the { ... } notation.
>
> SNIP <
> Discussion
> ==========
> The '|' (ASCII 0x7C) character is widely used as the separator
> between local arguments and local variables. Other characters
> accepted in current Forth implementations are '\' (ASCII 0x5C) and
> '¦' (ASCII 0xA6).. Since the ANS standard is defined in terms of
> 7 bit ASCII, and with regard to internationalistion, we propose only
> to consider the '|' and '\' characters further. Only recognition of
> the '|' separator is mandatory.
Win32Forth supports both '|' and '\'.
regards
Dirk Busch
Coincidentally, I have been working on the locals implementation in the
W32F native code version this last week. Timely; thank you.
>
> Stephen
>
> RfD - Enhanced local variable syntax
> ====================================
> Stephen Pelc - 20 August 2006
>
> Problem
> =======
> 1) The current LOCALS| ... | notation explicitly forces all locals
> to be initialised from the data stack.
> 2) 1) The current LOCALS| ... | notation defines locals in reverse
> order to the normal stack notation.
> 3) When programming large applications, especially those interfacing
> with a host operating system, there is a frequent need for temporary
> buffers.
> 4) Current implementations show that creation and destruction of
> local buffers are much faster than using ALLOCATE (14.6.1.0707)
> and FREE (14.6.1.1605).
Agree with the above, and would add
5) Current implementations of { are diverging in the facilities they
provide.
6) { and } are visually confusing. Perhaps select another character or
string, which would allow existing implementations to do their own
thing unmolested.
>
> Solution
> ========
> Base version
> ------------
> The following syntax for local arguments and local variables is
> proposed. The sequence:
> { ni1 ni2 ... | lv1 lv2 ... -- o1 o2 }
> defines local arguments, local variables, and outputs. The local
> arguments are automatically initialised from the data stack on
> entry, the rightmost being taken from the top of the data stack.
> Local arguments and local variables can be referenced by name within
> the word during compilation. The output names are dummies to allow
> a complete stack comment to be generated.
IMHO the locals should really not replace the stack commentary (and
perhaps it's time there was an RFD for those too that can replace { }
?). The maximal requirement should be
{ ni1 ni2 ... | lv1 lv2 ... }
otherwise dummies may be taken for real values (and I also prefer the
word values to variables above to avoid ambiguity). They're suggestive
of automatic return values, which they are not.
> The items between { and | are local arguments.
> The items between | and -- are local variables or buffers.
> The items between -- and } are outputs.
>
> Local arguments and variables return their values when referenced,
> and must be preceded by TO to perform a store.
That's the bit I hate, but it's too late to argue for local variables I
know.
>
> Local buffers may be defined in the form:
> arr[ <expr> ]
> Any name ending in the '[' character will be treated as an buffer,
> the expression up to the terminating ']' will be interpreted to
> provide the size of the buffer. Local buffers only return their base
> address, all operators such as TO generate an ambiguous condition.
I do like the idea of being able to declare variables as opposed to
values.
I don't like the requirement to have a special token recogniser in the
parser for the [ on grounds of aesthetics if nothing else. Plus, other
syntaxes can be envisaged in which an EVALUATE/QUIT with a
minor modification can handle the string between { and } in compile
mode; DPANS section 3.4 clause "d) If unsuccessful, an ambiguous
condition exists" would be "d) If unsuccessful, declare a local cell
sized value". It also makes f: or float: and other type modifiers
supportable as parsing immediate words.
So, does the [ have to be part of the name? Here's a parsing example;
{ a
[ 10 ] buffer: b
float: f
variable: v
| value: c }
Question: is this
: x [ 128 ] { a arr[ ] } ;
valid? And is this
128 : X { a arr[ ] } ;
ambiguous or invalid?
[A note. In Forths like W32F that have a MOPS based OOP, [ is becoming
overworked as it is also used for late binding. I can see potentially
ambiguous situations should locals be extended to include objects.
That, however, is a problem for another discussion.]
>
> In the example below, a and b are local arguments, a+b and a*b are
> local variables, and arr[ is a 10 byte local buffer.
>
> : foo { a b | a+b a*b arr[ 10 ] -- }
> a b + to a+b
> a b * to a*b
> cr a+b . a*b .
> arr[ 10 erase
> s" Hello" arr[ swap cmove
Ouch. That [ really grates.
> ;
>
> Local types
> -----------
> Some current Forth systems use indicators to define local variables
> of sizes other than a cell. It is proposed that any name ending in a
> ':'
> (colon) be reserved for this use.
>
> : foo { a b | F: f1 F: f2 -- c }
> ...
> ;
>
> Discussion
> ==========
> The '|' (ASCII 0x7C) character is widely used as the separator
> between local arguments and local variables. Other characters
> accepted in current Forth implementations are '\' (ASCII 0x5C) and
> '¦' (ASCII 0xA6).. Since the ANS standard is defined in terms of
> 7 bit ASCII, and with regard to internationalistion, we propose only
> to consider the '|' and '\' characters further. Only recognition of
> the '|' separator is mandatory.
Use of \ should be discouraged on the grounds of ambiguity. It also
gives parsing & colouring editors indigestion. 0xA6 is outside the
range of ASCII.
[snipped]
> 0 0 0 0 BUILDLV
Is 0 0 (LOCALS) an equivalent?
--
Regards
Alex McDonald
>1. { is error prone on todays' display hardware and given
>Forth programmers' poor eyesight; viz. ( { (
True, but brace has been common practise for many years
- we have change notes back to 1994, and MPE did not
originate the notation. Do I really want to break millions
of lines of code? NO.
>> + 1 chars - c@ [char} [ =
Oops! My only excuse is that I changed to a wide LCD monitor a
few weeks ago. Now it's so far away I need new glasses which
have not arrived yet.
>2. Why only 8 local parameters and AGAIN 8 local variables or buffers?
>We all now where the '8' came from (# registers available on 68000).
>This was inappropriate, but the proposed limitations make it worse...
I agree, but can only fight one battle at a time.
>3. IMHO, the definition must be allowed to spread over more than 1 line.
>If not, with the maximum of locals and say 2 output values, 18 names
>must be fit on a single line, maybe combined with types (block users
>won't have problems with it, though :-)
I agree, and we have already modified VFX Forth to do this at
customer request.
>4. Why not guarantee that a local var is initialized to 0? Now even in
>trivial cases we may see lots of "0 TO ape 0 TO bar" etc., or the old
>custom of : test ( a b -- ) 0 0 { a b c d -- } ... ; i. e. misusing
>local parameters.
So that we don't have to initialise local buffers as well. Under
host operating systems, we have seen local buffers of up to 1024
bytes, usually for translating caddr/len strings to z strings.
Personally, I believe that initialisation should be explicit.
>5. name[ is a bad idea, as it suggests name[ 3 ] @ etc., which is not
>allowed. The only allowed syntax is "name[ 10 erase" etc., which is ugly.
No uglier than some FSL stuff and the [ ... ] declaration isn't bad.
I have no intention of breaking existing code. The big advantage of
the notation is that it allows expressions such as
floc[ 18 floats ]
with reasonably readable wordsmithing.
>6. It seems a local buffer needs a constant argument (!) which is parsed
>at compile time (!). I hate that :-) Why not something like: buff_18 or
>buff18 or buff(18) (not much worse than the requirement of last char is ']'),
>where the actual name of the buffer is "buff" (not buff_18, buff18 or buff(18) ).
Existing practise.
>Another possibility would be in line with future F: W: etc.: Supply a creating
>word BUFFERN: where it is required to define local buffer definers before use.
...
But to get there someone has to go through the process of
standardising TO and friends for FLOATS. I'm not prepared to
do that yet. There isn't an FVALUE yet, so over to ... you?
>7. The above points to a problem with this syntax: it can become confusing
>what is on the integer and floating point stacks. At least, it is much less
>clear than:
>
> : test ( n1 n2 n3 -- ) ( F: a b c -- d )
> LOCALS| m n o |
> FLOCALS| c b a | ... ;
There is often considerable resistance to adding new words. The brace
proposal adds the minimum I could get to. I don't see common practise
for F: and friends. Local buffers can provide the functionality
without syntactic elegance.
>PS: The ASCII $A6 is a halfsized underlined "a". In the Windows character
>set $A6 is the broken bar character or pipe symbol "|". I am confused.
That's why we choose to ignore it.
Stephen
>Some Forths use curly braces for multiline comments. instead of the
>ugly 0 [IF]
The interpretation semantics of { are undefined. The use of { for
comments was introduced by Forth Inc after the locals { as far as
I know. Their usage is always during interpretation. As far as
language lawyers are concerned, there is no conflict!
>I am comfortable with L( ... ) for defining locals. Also it resembles
>the stack annotation. And as other posted, braces are hard to read.
Existing practice in VFX Forth, Win32Forth, gForth and others
over many years is convincing.
>I see problems with output locals. In my previous experiments I
>noticed that they were seldom of practical use.
Everything after '--' is discarded and ignored. It's defined
so that formal comments can be used. I'll update the text.
Stephen
>
> Local arguments and variables return their values when referenced,
> and must be preceded by TO to perform a store.
>
If we're having local variables, why can't they leave their address on
the stack so that @ and ! etc can be used as with global variables
instead of TO. I would have thought consistency with globals was a
definite thing to aim for.
Gerry
>Stephen Pelc wrote:
Because that's what happens with LOCALS| ... | and it's common
practice in all current { ... } implementations.
Stephen
> On 21 Aug 2006 06:24:38 -0700, "GerryJ"
> <ge...@jackson9000.fsnet.co.uk> wrote:
>
> >Stephen Pelc wrote:
>
> >> Local arguments and variables return their values when referenced,
> >> and must be preceded by TO to perform a store.
> >
> >If we're having local variables, why can't they leave their address on
> >the stack so that @ and ! etc can be used as with global variables
> >instead of TO. I would have thought consistency with globals was a
> >definite thing to aim for.
>
> Because that's what happens with LOCALS| ... | and it's common
> practice in all current { ... } implementations.
a change in syntax, but yet a requirement to have an easy search and
replace?? umm
any implementation should provide a simple textual replace definition
too to replace all the old standard syntax?
personally i don't need names for locals, just a fixed base or top for
referencing numbers from, as use of too many locals usually indicates
little factorization.
a better solution would maybe be to concentrate on giving words names,
which do not interfere with global names.
eg
: %localfn ... ;
or some other standardized local prefix.
ooh we just love stack frames :-) , so i suppose a C calling convention
would be best :-)
cheers
Consider that in some implementations the stacks are not directly
accessible. For example : EMIT SP@ 1 EMIT DROP ; does not work there.
Value-type locals are portable everywhere.
My 16-bit implementation has a separate section (segment in x86 parlance)
for the three stacks (data, return and locals).
--
Coos
CHForth, 16 bit DOS applications
http://home.hccnet.nl/j.j.haak/forth.html
> So that we don't have to initialise local buffers as well. Under
> host operating systems, we have seen local buffers of up to 1024
> bytes, usually for translating caddr/len strings to z strings.
> Personally, I believe that initialisation should be explicit.
With an ERASE speed of 500 Mbytes/sec, what's the problem?
I suppose you ALLOCATE the buffer, then it's an OS flag?
Explicit initialization is still possible (but unnecessary if 0).
>> 5. name[ is a bad idea, as it suggests name[ 3 ] @ etc., which is not
>> allowed. The only allowed syntax is "name[ 10 erase" etc., which is ugly.
> No uglier than some FSL stuff and the [ ... ] declaration isn't bad.
> I have no intention of breaking existing code. The big advantage of
> the notation is that it allows expressions such as
> floc[ 18 floats ]
> with reasonably readable wordsmithing.
Sorry, but ...
18 floats []floc
... does the same thing, and given your ref. impl. then having floc as the
buffer name is no extra work :-)
Breaking code is a valid concern, but those at the front get shot more easily.
[..]
> But to get there someone has to go through the process of
> standardising TO and friends for FLOATS. I'm not prepared to
> do that yet. There isn't an FVALUE yet, so over to ... you?
I wouldn't live without it, but if I'm the only one, why standardize?
[..]
-marcel
Because @ and ! would *require* local variables to *have* an address.
They could not live in registers then (unless you redefine @ ! +! F@ F!
MOVE ERASE ... ). TO hides this detail. *I* like it :-)
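The same distinction already exists for globals in standard Forth, which may make the point concrete. A small sketch (gvar and gval are hypothetical names chosen for the example):

```forth
VARIABLE gvar        \ executing gvar pushes an address
0 VALUE gval         \ executing gval pushes the value itself

5 gvar !   gvar @ .  \ prints 5
5 TO gval  gval .    \ prints 5
```

Only the VARIABLE form exposes an address to the program; with the VALUE form the system is free to keep the item wherever it likes.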
-marcel
In the great tradition of weighing in where angels fear to,
Re: limit of 8 locals.
Is a word that needs more than 8 a properly factored word?
Re: local fp variables, what is wrong with FLOCALS| ... |
and/or FTO ?
I realize the curly braces notation is a reasonable one, but
I happen to dislike it, first because as eyes age I find it
hard to distinguish reliably from ( ) ; and second, because
it is part of the array protocol used by the Forth Scientific
Library. That is, it would be impossible to use the latter
on a system where } was a word rather than a terminal char-
acter. Some locals implementers might like to make it a word.
Would you insist that only the terminal character method is
legitimate?
--
Julian V. Noble
Professor Emeritus of Physics
University of Virginia
>> No uglier than some FSL stuff and the [ ... ] declaration isn't bad.
Note that the common use of { as the last character of an array in the FSL
is not at all mandatory. It can be present if you like the look of
LIST{ 13 } but nothing stops you from using a different name and having
LIST 13 } and using the name LIST{ for something else entirely.
The same issue might come up with (from the original post
<44e8cb2c....@news.demon.co.uk> )
>> Local types
>> -----------
>> Some current Forth systems use indicators to define local variables
>> of sizes other than a cell. It is proposed that any name ending in a
>> ':'
>> (colon) be reserved for this use.
>>
>> : foo { a b | F: f1 F: f2 -- c }
>> ...
>> ;
If these restrictions are adopted, it should be made very clear in the
docs that they apply only during the parsing of a locals definition. In
the case of [ terminating local block definitions, Marcel has pointed out
several better alternatives, such as the one below:
>> I have no intention of breaking existing code. The big advantage of
>> the notation is that it allows expressions such as
>> floc[ 18 floats ]
>> with reasonably readable wordsmithing.
>
> Sorry, but ...
>
> 18 floats []floc
>
> ... does the same thing, and given your ref. impl. then having floc as
> the buffer name is no extra work :-)
>
> [..]
>
> -marcel
regards cgm
Why not call them local values then? Differences between different
types of variables is inconsistent and that is *bad* in any language.
Explaining to a newcomer to Forth that they have to be handled
differently is likely to confuse and reinforce any impression that
Forth is a silly language.
Anyway, while standards have to be implementable, an implementation
driving the standard is a bit like the tail wagging the dog.
Gerry
> On 21 Aug 2006 06:24:38 -0700, "GerryJ"
> <ge...@jackson9000.fsnet.co.uk> wrote:
>
> >Stephen Pelc wrote:
>
> >> Local arguments and variables return their values when referenced,
> >> and must be preceded by TO to perform a store.
> >
> >If we're having local variables, why can't they leave their address on
> >the stack so that @ and ! etc can be used as with global variables
> >instead of TO. I would have thought consistency with globals was a
> >definite thing to aim for.
>
> Because that's what happens with LOCALS| ... | and it's common
> practice in all current { ... } implementations.
>
I'm well aware of that. I suppose I'm just quibbling about calling them
local variables, calling them local values would remove any
inconsistency with global variables. Why introduce inconsistency if it
can be avoided?
Gerry
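To make the terminology point concrete, here is a minimal sketch in standard Forth, using the existing LOCALS| ... | notation rather than the proposed { ... }. The names gvar, gval and bump are made up for illustration. It shows that a standard local is accessed the way a VALUE is, not the way a VARIABLE is:

```forth
variable gvar            \ a VARIABLE's name leaves its address
10 gvar !
gvar @ .                 \ explicit fetch with @

0 value gval             \ a VALUE's name leaves its contents
20 to gval
gval .                   \ no @ needed

: bump ( n -- n+1 )
  locals| x |            \ a standard local behaves like a VALUE
  x 1+ to x              \ store with TO, as with gval
  x ;                    \ fetch by name alone
30 bump .
```

On that reading "local value" describes the behaviour more closely than "local variable".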
>Why not call them local values then? Differences between different
>types of variables is inconsistent and that is *bad* in any language.
You have a point.
>Anyway, while standards have to be implementable, an implementation
>driving the standard is a bit like the tail wagging the dog.
Not so. At least during the ANS Forth process, it was emphasised
to us by the upper level of ANS that a standard should (where
possible) encapsulate existing common practice. The brace notation
without local buffers has been independently implemented on several
systems.
The notation for local buffers was developed ten or more years
ago for ProForth for Windows and has been stable for many
years. Once local variables/values are available, it is a
simple extension to provide local buffers.
Stephen
> 18 floats []floc
Let's examine what I think you mean:
: foo { a b | 18 floats []floc -- d }
...
;
Now the compiler has to know that 18 and floats are not names.
Redefinition may generate a warning but is not an error. I like
the idea that the expression for the size of the buffer is
syntactically bounded. How would your proposal handle the below?
6 constant poo
8 value goo
: foo { a b | poo floats []floc goo cells []doo -- d }
...
;
I say it's ambiguous (are poo and goo local var names?), whereas
the proposal is not.
: foo { a b | floc[ poo floats ] doo[ goo cells ] -- d }
...
;
>> But to get there someone has to go through the process of
>> standardising TO and friends for FLOATS. I'm not prepared to
>> do that yet. There isn't an FVALUE yet, so over to ... you?
>
>I wouldn't live without it, but if I'm the only one, why standardize?
Chicken and egg - there's no champion. I'm with you on wanting
FVALUE, but we don't have a reference implementation with a long
history. Perhaps someone can explain why it didn't make it into ANS.
Stephen
And;
7) LOCALS| has the following restrictions on the use of the rstack;
13.3.3.2 Syntax restrictions
c) Locals shall not be declared until values previously placed on the
return stack within the definition have been removed;
d) After a definition's locals have been declared, a program may place
data on the return stack. However, if this is done, locals shall not be
accessed until those values have been removed from the return stack;
Can this restriction be lifted with { locals? W32F has no such
restriction, although the current implementation is not optimal. W32F
keeps an rstack base pointer at run time as it supports variable length
locals, something that isn't really required and isn't proposed in this
RfD.
It's only an issue for implementations that use the rstack for locals;
for them, removing this restriction would simply require r> >r 2>r and
2r> to communicate to the locals at compile time what they're doing
with the return stack.
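As a reference point for restriction d), the following sketch (LOCALS| syntax, hypothetical word name) stays within the current rules: data is placed on the return stack after the declaration, but it is removed again before any local is read. The commented-out variant is the kind of code that only works if the restriction is lifted.

```forth
: portable-use ( a b -- a+1 b )
  locals| b a |   \ b is taken from the top of the stack, then a
  a >r            \ data goes onto the return stack after the declaration
  r> 1+           \ ... and comes off again before another local is touched
  b ;             \ the return stack is clean at this access

\ Relies on the restriction being lifted: b is read while a value
\ placed by >R is still on the return stack.
\ : risky ( a b -- ) locals| b a |  a >r  b .  r> . ;
```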
--
Regards
Alex McDonald
> On Mon, 21 Aug 2006 21:03:32 GMT, m...@iae.nl (Marcel Hendrix) wrote:
>> 18 floats []floc
> Let's examine what I think you mean:
> : foo { a b | 18 floats []floc -- d }
> ...
> ;
You got me there, but it can be fixed with
: foo 18 floats { a b | floc -- d } ... ;
A local buffer would support a size that is NOT known at compile time.
This creates an implementation problem. However, I personally would
never put arbitrary size buffers on the locals stack. These would have
to be allocated, and therefore a local buffer is a single cell on the
locals stack.
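A rough sketch of that approach, assuming the Memory-Allocation, Exception and Floating-Point word sets are present; the word name zeroed-floats is invented for the example. The buffer size is only known at run time, and the local holds nothing but the address returned by ALLOCATE:

```forth
: zeroed-floats ( n -- )
  floats dup allocate throw   \ ( u addr ) size computed at run time
  locals| floc u |            \ floc is one cell: the buffer's address
  floc u erase                \ use the buffer
  floc free throw ;           \ release it before leaving the word
18 zeroed-floats
```

The cost is the ALLOCATE/FREE pair on every call, which is the overhead the RfD's local buffers are meant to avoid.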
[..]
>>> But to get there someone has to go through the process of
>>> standardising TO and friends for FLOATS. I'm not prepared to
>>> do that yet. There isn't an FVALUE yet, so over to ... you?
>> I wouldn't live without it, but if I'm the only one, why standardize?
> Chicken and egg - there's no champion. I'm with you on wanting
> FVALUE, but we don't have a reference implementation with a long
> history. Perhaps someone can explain why it didn't make it into ANS.
FVALUE has always been in tForth and iForth. I meant that for a standard
there must at least be two conflicting implementations.
-marcel
> Local types
> -----------
> Some current Forth systems use indicators to define local variables
> of sizes other than a cell. It is proposed that any name ending in a
> ':'
> (colon) be reserved for this use.
>
> : foo { a b | F: f1 F: f2 -- c }
> ...
> ;
IMHO, the typed locals are quite important, and can offer a way to be used
for more complex locals (like structures and other sort of temporary
buffers).
In bigFORTH, I've the following types:
W: normal word, the default if no type is selected
F: floating point number
D: double number
R: Record. Usage R: <record_name> <local>. Records are always uninitialized.
I think this is universal. Whatever buffer you need, you can always define a
structure/record for it, and declare a local to hold it.
--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/
Then everything after -- is comment. IMHO one should not mix
unnecessarily compiler instructions and comments within the same
syntax. You could put any misleading things after --.
>Then everything after -- is comment. IMHO one should not mix
>unnecessarily compiler instructions and comments within the same
>syntax. You could put any misleading things after --.
It is very common practice to put a stack comment on the definition
line of a word.
: foo ( a b c | e f -- d }
\ *G FOO does something useful
Modern systems, e.g. DocGen and Brad Eckert's DexH extension,
can use this line to produce documentation. We do not
want such systems to have to provide two lines with the
same information
: foo \ a b c -- d
\ *G FOO does something useful
( a b c | e f -- d }
Having two definitions of the same thing is a recipe for error.
Enabling the tail after '--' saves keystrokes. Note that '--'
is not mandatory.
>> Agree with the above, and would add
>>
>> 5) Current implementations of { are diverging in the facilities they
>> provide.
It's the function of a standard to encourage people to use the
same names for the same function, but it is not the purpose to
stop new facilities being added to implementations.
>> 6) { and } are visually confusing. Perhaps select another character or
>> string, which would allow existing implementations to do their own
>> thing unmolested.
Common practice is to use { ... }. Is changing a name in common use
because we need new glasses or bigger monitors good enough reason
for change?
>7) LOCALS| has the following restrictions on the use of the rstack;
>
>13.3.3.2 Syntax restrictions
>c) Locals shall not be declared until values previously placed on the
>return stack within the definition have been removed;
Removing this restriction would break code. I will add it.
>d) After a definition's locals have been declared, a program may place
>data on the return stack. However, if this is done, locals shall not be
>accessed until those values have been removed from the return stack;
Do any current systems suffer if this restriction is removed?
Implementers please respond! Current MPE systems will not suffer
unless { ... } is used a second time - should this be an ambiguous
condition?
Similarly, many current systems, including MPE's, will break
if { ... } is used inside a control structure, e.g.
: foo \ ...
... if
{ a b c -- }
...
else
...
then
;
I propose to add this as an ambiguous condition as below.
Ambiguous conditions:
a) The { ... } text extends over more than one line.
b) The expression for local buffer size does not return a single
cell.
c) { ... } shall not be declared until values previously placed
on the return stack within the definition have been removed.
d) { ... } is declared within a control structure.
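One way to stay clear of condition d) is to factor the branch that needs locals into its own word, so that the declaration comes first in that word. A small sketch using the standard LOCALS| notation; scaled-sum is a made-up helper, and foo here only echoes the name used in the example above:

```forth
: scaled-sum ( a b c -- n )
  locals| c b a |        \ declared first, outside any control structure
  a b + c * ;

: foo ( a b c flag -- n )
  if  scaled-sum         \ the branch calls the word that owns the locals
  else  drop 2drop 0     \ the other branch just discards the arguments
  then ;
```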
>FVALUE has always been in tForth and iForth. I meant that for a standard
>there must at least be two conflicting implementations.
Conflict argues against standardisation. Do a proposal for
FVALUE and another for +TO. You may be pleasantly surprised.
I'd prefer that the locals definition must be the first definition
within a word, i.e. appear right after : NAME ..
This can easily be checked by the compiler.
>d) After a definition's locals have been declared, a program may place
>data on the return stack. However, if this is done, locals shall not be
>accessed until those values have been removed from the return stack;
This is an unnecessary and counterproductive restriction. I'd consider
such locals implementations seriously flawed.
Andreas
-------
1 + 1 = 3, for large values of 1.
> In the great tradition of weighing in where angels fear to,
>
> Re: limit of 8 locals.
>
> Is a word that needs more than 8 a properly factored word?
I agree that 8 is usually more than enough for passed-in arguments, but
perhaps not if you also want to keep track of a few truly "local" data
items as well.
At least two FSL files use 9 or 10 locals in a couple of words, as I
recall - one of the few non-standard usages that I encountered in the
whole library. Of course, many of the FSL words are not what I would
call "well-factored" ;-)
gaussj.seq
SVD.seq
It can't be parsed for as it would break a lot of code. It can be
detected only if { can ascertain that there's been no code generated
between : and { . Would that not give optimising compilers some
difficulties?
--
Regards
Alex McDonald
> >FVALUE has always been in tForth and iForth. I meant that for a standard
> >there must at least be two conflicting implementations.
> Conflict argues against standardisation. Do a proposal for
> FVALUE and another for +TO. You may be pleasantly surprised.
I may be wrong, but I had the idea that he was making the cynical
observation that until there are two conflicting implementations no one
will put in the effort to try to make one of them standard.
> > I'd prefer that the locals definition must be the first definition
> > within a word, i.e. appear right after : NAME ..
> > This can easily be checked by the compiler.
> It can't be parsed for as it would break a lot of code. It can be
> detected only if { can ascertain that there's been no code generated
> between : and { . Would that not give optimising compilers some
> difficulties?
If you want to check for that, you could have { be something that :
:NONAME and DOES> check. Any other time it doesn't work or gives an
error message.
It wouldn't break code for the standard to say not to define locals
elsewhere -- code which does that would be just as portable as it is
now. Any implementor who wants to let people define locals elsewhere
could do so. The question is whether it would be good to require
implementors to allow locals definition elsewhere. And if so, which
places should locals definition be allowed?
Good optimising compilers will put locals in available registers and
will only move them out of the registers when the registers must be
used for something else -- for example a called word that uses locals,
or a word in another task that uses locals.
But less-optimised Forths will tend to put locals on the return stack,
and if they do there's a question what other return-stack uses might
interfere with locals or vice versa. What I remember for that is direct
use of the return stack ( >R etc), loops, and calls. Calls are no
problem, you won't use your locals inside a word you call and the
return stack will be clean when that word returns.
Implementations that call a deeper routine when they switch to compile
state are already nonstandard so that shouldn't be a problem.
: ]FOO .... local1 ] local2 .... ;
: TEST [ do-something ]FOO ; ought to work just fine.
[ snip ]
>>>> But to get there someone has to go through the process of
>>>> standardising TO and friends for FLOATS. I'm not prepared to
>>>> do that yet. There isn't an FVALUE yet, so over to ... you?
>
>>> I wouldn't live without it, but if I'm the only one, why standardize?
>
>> Chicken and egg - there's no champion. I'm with you on wanting
>> FVALUE, but we don't have a reference implementation with a long
>> history. Perhaps someone can explain why it didn't make it into ANS.
>
> FVALUE has always been in tForth and iForth. I meant that for a standard
> there must at least be two conflicting implementations.
>
> -marcel
>
I note that Win32Forth has FVALUE and FTO . Gforth does not.
I doubt whether Win32Forth's version conflicts with Marcel's in
its usage.
> On 21 Aug 2006 14:15:00 -0700, "GerryJ"
> <ge...@jackson9000.fsnet.co.uk> wrote:
>
> >Why not call them local values then? Differences between different
> >types of variables is inconsistent and that is *bad* in any language.
>
> You have a point.
>
> >Anyway, while standards have to be implementable, an implementation
> >driving the standard is a bit like the tail wagging the dog.
>
> Not so. At least during the ANS Forth process, it was emphasised
> to us by the upper level of ANS that a standard should (where
> possible) encapsulate existing common practice. The brace notation
> without local buffers has been independently implemented on several
> systems.
>
This misses the point of what I was trying to say. I was responding to
a statement by Marcel Hendrix which said:
> Because @ and ! would *require* local variables to *have* an address.
> They could not live in registers then (unless you redefine @ ! +! F@ F!
> MOVE ERASE ... ). TO hides this detail. *I* like it :-)
He states that because some implementations of locals put them in
registers, which cannot have an address, local variables that leave
their address on the stack must not be included in the standard. I
disagree with that.
As other implementations implement locals in memory, their local
variables could leave an address on the stack. Notwithstanding the aim
to encapsulate existing common practice I think assuming locals will
be implemented in registers is going too far if it means a useful bit
of functionality cannot be included in the standard. After all if local
variables leaving an address on the stack were part of the standard,
compilers that would otherwise allocate them to registers could place
them in memory and the programmer would have to take the performance
hit. Presumably compilers that use registers for locals must have a
mechanism for using memory anyway as they must run out of registers in
some cases.
Anyway if you start calling them local values in your proposal the
problem goes away and everybody is happy.
Gerry
> It is very common practice to put a stack comment on the definition
> line of a word.
> : foo ( a b c | e f -- d }
> \ *G FOO does something useful
I see that your glasses still didn't arrive :-)
> Modern systems, e.g. DocGen and Brad Eckert's DexH extension,
> can use this line to produce documentation. We do not
> want such systems to have to provide two lines with the
> same information
> : foo \ a b c -- d
> \ *G FOO does something useful
> ( a b c | e f -- d }
And another one ...
-marcel
> On 21 Aug 2006 16:04:53 -0700, "Alex McDonald"
> <alex...@btopenworld.com> wrote:
>> d) After a definition's locals have been declared, a program may place
>> data on the return stack. However, if this is done, locals shall not be
>> accessed until those values have been removed from the return stack;
> Do any current systems suffer if this restriction is removed?
> Implementers please respond! Current MPE systems will not suffer
> unless { ... } is used a second time - should this be an ambiguous
> condition?
No problem in iForth:
FORTH> : test locals| a b | a . b . 22 >R a . b . R> . ; ok
FORTH> 111 222 test 222 111 222 111 22 ok
FORTH> : test2 88 >R locals| a b | a . b . R@ . a . b . R> . ; ok
FORTH> 111 222 test2 222 111 88 222 111 88 ok
FORTH> : test3 locals| b | 33 >R locals| a | a . b . R@ . a . b . R> . ; ok
FORTH> 111 222 test3 111 222 33 111 222 33 ok
SwiftForth crashes badly (couldn't copy and paste the result),
Win32F 6.10.04 doesn't compile test3 and aborts on test2,
VFX doesn't compile test3 and aborts on test2,
gForth-fast:
: test locals| a b | a . b . 22 >R a . b . R> . ; ok
: test2 88 >R locals| a b | a . b . R@ . a . b . R> . ; ok
: test3 locals| b | 33 >R locals| a | a . b . R@ . a . b . R> . ; ok
111 222 test 222 111 222 111 22 ok
111 222 test2 222 111 88 222 111 88 ok
111 222 test3 111 222 33 111 222 33 ok
-marcel
>I see that your glasses still didn't arrive :-)
...
>And another one ...
A cut and waste error?
Exactly. FVALUE is an extension. If it is not picked up, why standardize it?
Especially for FP ( which always has SF@ DF@ F@, and sometimes XF@ DD@ etc. ),
it would be nice to have source code that works regardless of the size of a
float. Here FVALUE (and a smart TO) can help.
-marcel
> "J Thomas" <jeth...@gmail.com> writes Re: RfD - Enhanced local variable syntax (long)
>
> > Stephen Pelc wrote:
> >> On Tue, 22 Aug 2006 01:37:36 GMT, m...@iae.nl (Marcel Hendrix) wrote:
>
> >> >FVALUE has always been in tForth and iForth. I meant that for a standard
> >> >there must at least be two conflicting implementations.
>
> >> Conflict argues against standardisation. Do a proposal for
> >> FVALUE and another for +TO. You may be pleasantly surprised.
>
> > I may be wrong, but I had the idea that he was making the cynical
> > observation that until there are two conflicting implementations no one
> > will put in the effort to try to make one of them standard.
>
> Exactly. FVALUE is an extension. If it is not picked up, why standardize it?
right on dude ;-)
> Especially for FP ( which always has SF@ DF@ F@, and sometimes XF@ DD@ etc. ),
> it would be nice to have source code that works regardless of the size of a
> float. Here FVALUE (and a smart TO) can help.
less core , more midget word sets!!
For Win32F 6.11.09 I get
111 222 test2 222 111 111 222 111 111
EXCEPTION 0xC0000005 ACCESS_VIOLATION
so it gets b instead of 88 from the return stack and aborts on UNNESTP
(compiled by ;) since the 88 is still on the R stack so it tries to
return there.
For test3 I get
: test3 locals| b | 33 >R locals| a | a . b . R@ . a . b . R> . ;
^^^^^^^
Error(-300): LOCALS| locals defined twice
Which is what I'd expect (the standard says no executable code should
be compiled between starting and ending the locals declaration so it's
clearly incorrect since the 33 >R is in the middle).
>
> VFX doesn't compile test3 and aborts on test2,
>
I.e. the same as W32F
>
> -marcel
George Hubert
I would not use locals, except in case of a word that implements
an algorithm, where factoring can be extremely cumbersome.
(Try Brent's zero finder.)
Then limiting locals to a number of 8 could be well ... limiting.
(And there is the situation where you copy an algorithm
from some other language without really understanding it. )
<SNIP>
>--
>Julian V. Noble
>Professor Emeritus of Physics
>University of Virginia
Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- like all pyramid schemes -- ultimately falters.
alb...@spenarnc.xs4all.nl http://home.hccnet.nl/a.w.m.van.der.horst
Gforth has a separate locals stack.
BTW: While you *can* use TO with Gforth's locals, it's not seen as good
usage. Locals in Gforth should have single assignments. For that, we ensure
that locals have a lifetime with the "come through" concept - a local lives
only when all possible paths backward through the program (word) can reach
its declaration. So if you make a loop where a local will change each time,
you just define
: foo ( n -- ) BEGIN { a } ... a WHILE ... a 1- REPEAT ;
and this will make sure that a counts down to zero. When REPEAT jumps back
to BEGIN, the compiler realizes that at BEGIN, a is not alive, and adjusts
the locals stack accordingly. But a is still alive after the REPEAT, since
this point can only be reached through WHILE, and at that point, a is
clearly alive.