RfD: c-addr/len
Newsgroups: comp.lang.forth
From: Peter Knaggs <p...@bcs.org.uk>
Date: Fri, 11 Sep 2009 20:08:40 +0100
Local: Fri, Sep 11 2009 3:08 pm
Subject: RfD: c-addr/len
c-add/len
=========

2009-09-09  Rendered into RfD form, added Forth200x words
1999-06-22  Original Text by John Rible

Problem
=======
A large number of words use "c-add u" to indicate the address of a
string (c-addr) and its length (u) on the stack.  With the
introduction of variable width characters, it is not clear if "u" is
referring to the number of characters or address units.

Solution
========
Introduce a new pseudo-type ("len") into the document of these words
to clarify the intent.  Replacing the "u" with a "len" should improve
the documentation of these words.  The words effected are:

    6.1.0040   #>
    6.1.0570   >NUMBER
    6.1.0980   COUNT
    6.1.1345   ENVIRONMENT?
    6.1.1360   EVALUATE
    6.1.1540   FILL
    6.1.2165   S"
    6.1.2216   SOURCE
    6.1.2310   TYPE
    6.2.2008   PARSE
    6.2.xxxx   PARSE-NAME
11.6.1.1010   CREATE-FILE
11.6.1.1190   DELETE-FILE
11.6.1.1718   INCLUDED
11.6.1.1970   OPEN-FILE
11.6.1.2080   READ-FILE
11.6.1.2090   READ-LINE
11.6.1.2165   S"
11.6.1.2480   WRITE-FILE
11.6.1.2485   WRITE-LINE
11.6.2.1524   FILE-STATUS
12.6.1.0558   >FLOAT
11.6.2.2130   RENAME-FILE
11.6.2.xxxx   REQUIRED
12.6.1.2143   REPRESENT
13.6.1.0086   (LOCAL)
16.6.1.2192   SEARCH-WORDLIST
17.6.1.0170   -TRAILING
17.6.1.0245   /STRING
17.6.1.0780   BLANK
17.6.1.0910   CMOVE
17.6.1.0920   CMOVE>
17.6.1.0935   COMPARE
17.6.1.2191   SEARCH
17.6.1.2212   SLITERAL

Proposal
========

1. Add the following to table 3.1 - Data Types

     len         character-string length               1 cell

2. Add the following to 3.1.1 Data-type relationships

     len => u => x

3. Replace "u" with "len" in 3.1.4.2 Character strings:

     A string is specified by a cell pair (c-addr len) representing
     its starting address and length in characters.

4. Add the following to table 3.5 - Environmental Query Strings:

     /CHARACTER-STRING   n  yes  maximum size of len in characters

5. Change "u" to "len" in the stack description, definition and
    rationale of the words listed under the Solution.

6. Replace "u" with "len" in section A.3.1.3.4 Counted Strings.

7. Change "u" to "len" in the rationale for A.6.2.0855 C".

Author
======
Peter Knaggs, P.J.Kna...@exeter.ac.uk


 
Newsgroups: comp.lang.forth
From: Josh Grams <j...@qualdan.com>
Date: Sat, 12 Sep 2009 12:29:07 GMT
Local: Sat, Sep 12 2009 8:29 am
Subject: Re: RfD: c-addr/len

Peter Knaggs wrote: <4AAAA038.2050...@bcs.org.uk>

> c-add/len

c-addr

>=========

> 2009-09-09  Rendered into RfD form, added Forth200x words
> 1999-06-22  Original Text by John Rible

> Problem
>=======
> A large number of words use "c-add u" to indicate the address of a

"c-addr u"

> string (c-addr) and its length (u) on the stack.  With the
> introduction of variable width characters, it is not clear if "u" is
> referring to the number of characters or address units.

Er...unless I missed a decision to do away with the distinction between
"1 CHARS" and "address units", isn't the ambiguity between "variable
width characters" and "characters"?  I don't see that this proposal
actually clarifies that.

At any rate, I think the definition at 3.1.4.2 Character strings makes
it clear that "c-addr u" as a unit means something special, so I don't
see any reason to replace the "u" with "len".

> Solution
>========
> Introduce a new pseudo-type ("len") into the document of these words
> to clarify the intent.  Replacing the "u" with a "len" should improve
> the documentation of these words.  The words effected are:

affected

> 3. Replace "u" with "len" in 3.1.4.2 Character strings:

>      A string is specified by a cell pair (c-addr len) representing
>      its starting address and length in characters.

In 2.1 Definitions of Terms, we have:

character:
    Depending on context, either 1) a storage unit capable of holding a
        character; or 2) a member of a character set.

I think that the presence of an address (i.e. the location of some
storage) makes it pretty clear that sense 1 is meant here, but if people
are confused by that, you might want to clarify.

----

Instead of adopting this (and that "pchar" rename proposal), I think it
would make much more sense to clarify things by leaving the existing
"char" and "character" alone, and instead adopting new terminology for
variable width characters.

As I see it, there's no reason to go changing terminology on people when
you could instead just adopt new terminology for the new concept.  Much
less potential for confusion that way.

--Josh


 
Newsgroups: comp.lang.forth
Followup-To: comp.lang.forth
From: Bernd Paysan <bernd.pay...@gmx.de>
Date: Sat, 12 Sep 2009 21:10:23 +0200
Local: Sat, Sep 12 2009 3:10 pm
Subject: Re: RfD: c-addr/len

Josh Grams wrote:
> Instead of adopting this (and that "pchar" rename proposal), I think
> it would make much more sense to clarify things by leaving the
> existing "char" and "character" alone, and instead adopting new
> terminology for variable width characters.

We are pretty much there - the extended characters are called "extended
characters", or "xchars" for short.  An xchar in memory may consist of
several characters (primitive characters, that is).  I think it's easier
to deal with the name "pchar" when the "storage unit" is meant than name
it "character", but outside the xchar proposal, the terminology is not
needed.

The c-addr/len makes life easier as it definitely states that the length
is meant to be in characters (pchars), i.e. the storage unit as meant in
2.1. character 1).

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/


 
Newsgroups: comp.lang.forth
From: m_l_g3 <m_l...@yahoo.com>
Date: Sun, 13 Sep 2009 11:59:51 -0700 (PDT)
Local: Sun, Sep 13 2009 2:59 pm
Subject: Re: RfD: c-addr/len
Peter Knaggs wrote:

> Problem
> =======
> A large number of words use "c-add u" to indicate the address of a
> string (c-addr) and its length (u) on the stack.  With the
> introduction of variable width characters, it is not clear if "u" is
> referring to the number of characters or address units.

> Solution
> ========
> Introduce a new pseudo-type ("len") into the document of these words
> to clarify the intent.

Sorry, but I do not see from your proposal what sort of length "len"
denotes: is it "length in characters", "length in logical (multi-byte)
characters", or "length in address units"?

>  Replacing the "u" with a "len" should improve
> the documentation of these words.

In fact, in many cases words are commented as taking and/or
leaving ( addr len ) rather than ( c-addr u ), so there is existing
practice.

IMO replacing "u" with "len" does improve readability, but does
not resolve the "which length" puzzle.

au-length, log-length, c-length ?

CHARS ( c-length -- au-length )
and so on...


 
Newsgroups: comp.lang.forth
Followup-To: comp.lang.forth
From: Bernd Paysan <bernd.pay...@gmx.de>
Date: Sun, 13 Sep 2009 22:43:03 +0200
Local: Sun, Sep 13 2009 4:43 pm
Subject: Re: RfD: c-addr/len

m_l_g3 wrote:
> IMO replacing "u" with "len" does improve readability, but does
> not resolve the "which length" puzzle.

> au-length, log-length, c-length ?

> CHARS ( c-length -- au-length )
> and so on...

CHARS, as it is now, is:

6.1.0898 CHARS
( n1 -- n2 )
n2 is the size in address units of n1 characters.

IMHO, the stack effect is at least misleading.  I find it difficult to
get a correct stack effect - we want -1 CHARS to be used to step through
strings backwards, so we want the sign.  I.e. "len" is not the right
left side of this stack effect (len is a subtype of u, no sign).  But we
basically use CHARS to convert +-len into a +-c-addr offset.  Works fine
on two's complement, might cause problems on one's complement ;-).
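
A minimal sketch of the backward stepping mentioned above, assuming a
standard system with CHARS and C@.  The word name "rindex" is
hypothetical and not part of any proposal; it walks a string from its
end using "-1 CHARS +" and returns the index, in chars, of the last
occurrence of a character:

```forth
\ Hypothetical helper: search a string backwards for char.
\ Returns the 0-based index in chars, or -1 if char is absent.
: rindex ( c-addr len char -- n )
  >r                          \ c-addr len            R: char
  dup chars rot +             \ len c-addr-end  (one char past the end)
  swap                        \ c-addr-end len
  begin dup while             \ len = chars still to examine
    swap -1 chars + swap      \ step the address back by one char
    1-                        \ now also the index of the current char
    over c@ r@ = if nip r> drop exit then
  repeat
  2drop r> drop -1 ;
```

Note that the offset is signed while the length on the stack stays
unsigned, which is the mismatch in the stack effect described above.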

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/


 
Newsgroups: comp.lang.forth
From: Peter Knaggs <p...@bcs.org.uk>
Date: Mon, 14 Sep 2009 00:01:11 +0100
Local: Sun, Sep 13 2009 7:01 pm
Subject: Re: RfD: c-addr/len

m_l_g3 wrote:

> Sorry, but I do not see from your proposal what sort of length "len"
> denotes: is it "length in characters", "length in logical (multi-byte)
> characters", or "length in address units"?

length in primitive characters (bytes).

> In fact, in many cases words are commented as taking and/or
> leaving ( addr len ) rather than ( c-addr u ), so there is existing
> practice.

Not in the standards document, hence the change.

--
Peter Knaggs


 
Newsgroups: comp.lang.forth
From: Aleksej Saushev <a...@inbox.ru>
Date: Mon, 14 Sep 2009 10:27:01 +0400
Local: Mon, Sep 14 2009 2:27 am
Subject: Re: RfD: c-addr/len

Peter Knaggs <p...@bcs.org.uk> writes:
> m_l_g3 wrote:

>> Sorry, but I do not see from your proposal what sort of length "len"
>> denotes: is it "length in characters", "length in logical (multi-byte)
>> characters", or "length in address units"?

> length in primitive characters (bytes).

No, length in address units. Byte length is what is returned by "1 chars",
consider 4-bit address unit.

--
CE3OH...


 
Newsgroups: comp.lang.forth
From: Peter Knaggs <p...@bcs.org.uk>
Date: Wed, 16 Sep 2009 13:02:23 +0100
Local: Wed, Sep 16 2009 8:02 am
Subject: Re: RfD: c-addr/len

m_l_g3 wrote:

> Sorry, but I do not see from your proposal what sort of length "len"
> denotes: is it "length in characters", "length in logical (multi-byte)
> characters", or "length in address units"?

Would it help if we replace item 1, the definition of "len" with:

   len      length of a character-string in address units       1 cell

--
Peter Knaggs


 
Newsgroups: comp.lang.forth
From: "David N. Williams" <willi...@umich.edu>
Date: Wed, 16 Sep 2009 08:41:13 -0400
Local: Wed, Sep 16 2009 8:41 am
Subject: Re: RfD: c-addr/len

Peter Knaggs wrote:
> m_l_g3 wrote:

>> Sorry, but I do not see from your proposal what sort of length "len"
>> denotes: is it "length in characters", "length in logical (multi-byte)
>> characters", or "length in address units"?

> Would it help if we replace item 1, the definition of "len" with:

>   len      length of a character-string in address units       1 cell

Shouldn't that be in characters?  (3.1.4.2)

-- David


 
Newsgroups: comp.lang.forth
From: Peter Knaggs <p...@bcs.org.uk>
Date: Wed, 16 Sep 2009 13:49:33 +0100
Local: Wed, Sep 16 2009 8:49 am
Subject: Re: RfD: c-addr/len

David N. Williams wrote:
> Peter Knaggs wrote:
>> m_l_g3 wrote:

>>> Sorry, but I do not see from your proposal what sort of length "len"
>>> denotes: is it "length in characters", "length in logical (multi-byte)
>>> characters", or "length in address units"?

>> Would it help if we replace item 1, the definition of "len" with:

>>   len      length of a character-string in address units       1 cell

> Shouldn't that be in characters?  (3.1.4.2)

Which type of character?  Primitive characters (3.1.3) possibly but you
could also interpret characters to be extended characters (XChar) which
include variable width characters, which is precisely what we are trying
to get away from.

 
Newsgroups: comp.lang.forth
From: "David N. Williams" <willi...@umich.edu>
Date: Wed, 16 Sep 2009 09:31:35 -0400
Local: Wed, Sep 16 2009 9:31 am
Subject: Re: RfD: c-addr/len
Peter Knaggs wrote:

 > David N. Williams wrote:
 >> Peter Knaggs wrote:
 >>> m_l_g3 wrote:

 >>>>
 >>>> Sorry, but I do not see from your proposal what sort of length "len"
 >>>> denotes: is it "length in characters", "length in logical (multi-byte)
 >>>> characters", or "length in address units"?
 >>>
 >>> Would it help if we replace item 1, the definition of "len" with:
 >>>
 >>>   len      length of a character-string in address units       1 cell
 >>
 >> Shouldn't that be in characters?  (3.1.4.2)
 >
 > Which type of character?  Primitive characters (3.1.3) possibly but you
 > could also interpret characters to be extended characters (XChar) which
 > include variable width characters, which is precisely what we are trying
 > to get away from.

I guess whatever character you meant in this:

   3. Replace "u" with "len" in 3.1.4.2 Character strings:

       A string is specified by a cell pair (c-addr len) representing
       its starting address and length in characters.

It would be a substantial change if it were to be address units,
since 1 CHARS is not necessarily one address unit.

I'm unclear what you intend.  Is the meaning of "character
string" in the above being changed to allow for extended
characters?

-- David


 
Newsgroups: comp.lang.forth
From: Peter Knaggs <p...@bcs.org.uk>
Date: Wed, 16 Sep 2009 14:54:04 +0100
Local: Wed, Sep 16 2009 9:54 am
Subject: Re: RfD: c-addr/len

David N. Williams wrote:

> I guess whatever character you meant in this:

>   3. Replace "u" with "len" in 3.1.4.2 Character strings:

>       A string is specified by a cell pair (c-addr len) representing
>       its starting address and length in characters.

The part of the X:key-ekey proposal which was accepted at the Exeter
meeting included the following:

   3.1.2 Character types
     Characters shall have the following properties:
       – at least one address unit wide;
       – contain at least eight bits;
       – be of fixed width;
       – have a size less than or equal to cell size;
       – be unsigned.

   3.1.2.3 Primitive Character
     A primitive character (pchar) is a character with no restrictions
     on its contents. Unless otherwise stated, a “character” refers to
     a primitive character.

Thus item 3 should be changed to refer to the "length in primitive
characters".  In this case I feel it probably is worth spelling out.

> It would be a substantial change if it were to be address units,
> since 1 CHARS is not necessarily one address unit.

This is part of the problem, what does u mean in CMOVE?  According to
its definition "copy u consecutive characters", while most people
believe it refers to address units.

> I'm unclear what you intend.  Is the meaning of "character
> string" in the above being changed to allow for extended
> characters?

No, but once extended characters are introduced there is the potential
for confusion, hence the introduction of a primitive character. Extended
characters will always be referenced as "extended character" or xchar,
while a "character" is a primitive character or pchar.

 
Newsgroups: comp.lang.forth
From: "David N. Williams" <willi...@umich.edu>
Date: Wed, 16 Sep 2009 10:06:50 -0400
Local: Wed, Sep 16 2009 10:06 am
Subject: Re: RfD: c-addr/len
Peter Knaggs wrote:

 > David N. Williams wrote:
 >>
 >> I guess whatever character you meant in this:
 >>
 >>   3. Replace "u" with "len" in 3.1.4.2 Character strings:
 >>
 >>       A string is specified by a cell pair (c-addr len) representing
 >>       its starting address and length in characters.
 >
 > The part of the X:key-ekey proposal which was accepted at the Exeter
 > meeting included the following:
 >
 >   3.1.2 Character types
 >     Characters shall have the following properties:
 >       – at least one address unit wide;
 >       – contain at least eight bits;
 >       – be of fixed width;
 >       – have a size less than or equal to cell size;
 >       – be unsigned.
 >
 >   3.1.2.3 Primitive Character
 >     A primitive character (pchar) is a character with no restrictions
 >     on its contents. Unless otherwise stated, a “character” refers to
 >     a primitive character.
 >
 > Thus item 3 should be changed to refer to the "length in primitive
 > characters".  In this case I feel it probably is worth spelling out.

Me, too!

 >> It would be a substantial change if it were to be address units,
 >> since 1 CHARS is not necessarily one address unit.
 >
 > This is part of the problem, what does u mean in CMOVE?  According to
 > its definition "copy u consecutive characters", while most people
 > believe it refers to address units.

Not me!  :-)  MOVE is for that.

 >> I'm unclear what you intend.  Is the meaning of "character
 >> string" in the above being changed to allow for extended
 >> characters?
 >
 > No, but once extended characters are introduced there is the potential
 > for confusion, hence the introduction of a primitive character. Extended
 > characters will always be referenced as "extended character" or xchar,
 > while a "character" is a primitive character or pchar.

Good!  I can probably wrap my head around that.

-- David


 
Newsgroups: comp.lang.forth
From: an...@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Mon, 14 Sep 2009 14:51:25 GMT
Local: Mon, Sep 14 2009 10:51 am
Subject: Re: RfD: c-addr/len

Aleksej Saushev <a...@inbox.ru> writes:
>Peter Knaggs <p...@bcs.org.uk> writes:

>> m_l_g3 wrote:

>>> Sorry, but I do not see from your proposal what sort of length "len"
>>> denotes: is it "length in characters", "length in logical (multi-byte)
>>> characters", or "length in address units"?

>> length in primitive characters (bytes).

>No, length in address units.

This proposal replaces "u" with "len" in words where "u" denotes the
number of characters.

A change to let this parameter specify a number of address units would
break existing standard programs.  Granted, there are only a few
standard programs that don't have an environmental dependency on
1 CHARS = 1, and all maintained systems support these programs, so
there would be little problem with such a change, but I see little
point in having such a change.  Better propose standardizing
1 CHARS = 1.
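
As a small illustration of that dependency (the word names here are
hypothetical, not from the standard or the proposal): a program that
indexes a string with plain "+" assumes 1 CHARS = 1, while scaling the
index with CHARS avoids the assumption.  On every system where
1 CHARS = 1 the two fetch words behave identically:

```forth
\ True if this system has one char per address unit.
: chars=aus? ( -- flag )  1 chars 1 = ;

\ Portable: scale the index from chars to address units first.
: nth-char ( c-addr i -- char )  chars + c@ ;

\ Environmentally dependent: treats the index as address units,
\ so it is only correct where 1 CHARS = 1.
: nth-char-dependent ( c-addr i -- char )  + c@ ;
```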

>Byte length is what is returned by "1 chars",
>consider 4-bit address unit.

Yes, nibble-addressed hardware was the original rationale for
differentiating between aus and chars, but in 15 years there have been
no Forth-94 systems for nibble-addressed hardware, so I consider CHARS
a good solution for a problem that does not exist in practice.

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2009: http://www.euroforth.org/ef09/


 
Newsgroups: comp.lang.forth
From: an...@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Sat, 12 Sep 2009 12:59:41 GMT
Local: Sat, Sep 12 2009 8:59 am
Subject: Re: RfD: c-addr/len

Josh Grams <j...@qualdan.com> writes:
>Instead of adopting this (and that "pchar" rename proposal), I think it
>would make much more sense to clarify things by leaving the existing
>"char" and "character" alone, and instead adopting new terminology for
>variable width characters.

Yes.  And the variable-width characters have a new name: xchars.

>As I see it, there's no reason to go changing terminology on people when
>you could instead just adopt new terminology for the new concept.  Much
>less potential for confusion that way.

Apparently some people are confused because a member of the (extended)
character set need not fit into a char, and they think that renaming
chars into pchars will help avoid that confusion.  I am not convinced
of that, but I can live with pchars (although I fear that we will make
mistakes in the renaming, which will increase the confusion rather
than reducing it).

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2009: http://www.euroforth.org/ef09/


 
Newsgroups: comp.lang.forth
From: an...@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Sat, 12 Sep 2009 12:33:55 GMT
Local: Sat, Sep 12 2009 8:33 am
Subject: Re: RfD: c-addr/len

Peter Knaggs <p...@bcs.org.uk> writes:
>c-add/len
>=========

>2009-09-09  Rendered into RfD form, added Forth200x words
>1999-06-22  Original Text by John Rible

>Problem
>=======
>A large number of words use "c-add u" to indicate the address of a
>string (c-addr) and its length (u) on the stack.  With the
>introduction of variable width characters, it is not clear if "u" is
>referring to the number of characters or address units.

Variable-width characters are introduced in the xchars proposal, they
are called xchars there (and can consist of one or more fixed-width
chars in memory).  Variable-width characters don't exist in the
current standard document, and chars don't become variable-width in
xchars.  It's clear in all words that deal with chars that u refers to
the number of chars.

It definitely does not refer to address units in these words (only in
MOVE and ERASE, which don't deal with chars), although given that 1
chars = 1 au in all maintained systems, that distinction is of no
consequence.  Every word that refers to chars says so explicitly, and
every word that refers to aus says so explicitly, and if any word in
the xchars proposal refers to a number of xchars, it will say so
explicitly, too (but I don't think there is such a word).

Examples:

From 17.6.1.0910 CMOVE:
|[...] copy u consecutive characters [...]

From 6.1.1900 MOVE:
|[...]
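
A minimal sketch of the difference, assuming the buffers and word names
below (all hypothetical): CMOVE takes its count in characters, while
MOVE takes its count in address units, so a character count has to be
scaled with CHARS before it is passed to MOVE:

```forth
\ Hypothetical three-character source and destination buffers.
create src  char a c,  char b c,  char c c,
create dst  3 chars allot

: copy-chars ( -- )  src dst 3 cmove ;        \ u counts characters
: copy-aus   ( -- )  src dst 3 chars move ;   \ u counts address units
```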

>4. Add the following to table 3.5 - Environmental Query Strings:

>     /CHARACTER-STRING   n  yes  maximum size of len in characters

What's the point of that?

Any system that cannot deal with strings of the length of the longest
data memory region that can be had from the system is broken.  And
that's not just IMO, but also in Forth-94.

So if the point of that query is to allow systems to not process some
of the strings that can be created, then existing standard programs
would become non-standard.  Such a restriction requires a two-step
process of first declaring the feature obsolescent, and eventually
removing it.  Moreover, I see no point in introducing such a
restriction.

If that's not the point of the query, then I see no point in it.  If
we can process all strings we can create, there is no point in
querying for the maximum size.

Otherwise the proposal looks fine.

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2009: http://www.euroforth.org/ef09/


 
Newsgroups: comp.lang.forth
From: "Ed" <nos...@invalid.com>
Date: Fri, 18 Sep 2009 21:43:39 +1000
Local: Fri, Sep 18 2009 7:43 am
Subject: Re: c-addr/len

I must have missed it.  When did "u most significant digits of
the significand" [of a number] become the length of a string?

 
Newsgroups: comp.lang.forth
From: Josh Grams <j...@qualdan.com>
Date: Fri, 18 Sep 2009 11:48:31 GMT
Local: Fri, Sep 18 2009 7:48 am
Subject: Re: RfD: c-addr/len

That's pretty much how I feel about it...

--Josh


 
Newsgroups: comp.lang.forth
From: an...@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Fri, 18 Sep 2009 13:46:26 GMT
Local: Fri, Sep 18 2009 9:46 am
Subject: Re: c-addr/len

"Ed" <nos...@invalid.com> writes:
>> Solution
>> ========
>> Introduce a new pseudo-type ("len") into the document of these words
>> to clarify the intent.  Replacing the "u" with a "len" should improve
>> the documentation of these words.  The words effected are:

>> ...
>> 12.6.1.2143   REPRESENT

>I must have missed it.  When did "u most significant digits of
>the significand" [of a number] become the length of a string?

u has always been the length of the buffer in characters in REPRESENT.
That's the only interpretation of the specification that makes any
sense.

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2009: http://www.euroforth.org/ef09/


 
Newsgroups: comp.lang.forth
From: Albert van der Horst <alb...@spenarnc.xs4all.nl>
Date: 18 Sep 2009 18:03:03 GMT
Local: Fri, Sep 18 2009 2:03 pm
Subject: Re: RfD: c-addr/len
In article <4AAAA038.2050...@bcs.org.uk>, Peter Knaggs  <p...@bcs.org.uk> wrote:

I use the word "sc" in my documentation for the pair.
It means string-constant. It implies that the word
using it must not reach through to the "c-addr" and change
characters there. (So e.g. /STRING is okay.)

Anyway, I'm in favour of using a single indication of the pair
whenever they cannot be logically separated.
This allows for a full explanation of "sc" in one place, instead of
limited explanations regarding address units/character units in
several places. Maybe a distinction between "sc" and "xsc" is in
order.

>Peter Knaggs, P.J.Kna...@exeter.ac.uk

Groetjes Albert

--
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst


 
Newsgroups: comp.lang.forth
From: Aleksej Saushev <a...@inbox.ru>
Date: Sun, 20 Sep 2009 01:16:21 +0400
Local: Sat, Sep 19 2009 5:16 pm
Subject: Re: RfD: c-addr/len

You're not consistent in your opinion that we should use UNICODE:
either 1 CHARS = 1, and you use one-octet encodings on octet-addressing
platforms, or 1 CHARS may be any other value, and you return to address
units, which are octets in many cases. The third way is decoupling Forth
from hardware in full, so that you don't deal with real CPU address units
at all.

--
CE3OH...


 
Bernd Paysan  
Newsgroups: comp.lang.forth
Followup-To: comp.lang.forth
From: Bernd Paysan <bernd.pay...@gmx.de>
Date: Sat, 19 Sep 2009 23:41:10 +0200
Local: Sat, Sep 19 2009 5:41 pm
Subject: Re: RfD: c-addr/len

Aleksej Saushev wrote:
> You're not consistent in your opinion that we should use UNICODE:
> either 1 CHARS = 1, and you use one-octet encodings on
> octet-addressing platforms, or 1 CHARS may be any other value, and you
> return to address units, which are octets in many cases. The third way
> is decoupling Forth from hardware in full, so that you don't deal with
> real CPU address units at all.

"Unicode" is not just one encoding.  You can have an ASCII-compatible
byte-encoding like UTF-8 (which is what I recommend for Forth with
Unicode), or UTF-16, which is still a variable length encoding (one or
two 16-bit words make a character, i.e. you still need the XCHAR wordset
to work with UTF-16), or UCS4, which will be fixed-size, but is quite
wasteful.

Except for a few experiments, all Forth systems have 1 CHARS = 1.  Most
programs rely on that as well (i.e. they don't use CHARS where they
should; often they also use 1+ or the like instead of CHAR+).
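
An illustrative sketch, not part of the original post: a loop that
steps through a string with CHAR+ rather than 1+, so it does not depend
on 1 CHARS = 1. COUNT-BLANKS is a hypothetical name; the code assumes
only standard CORE and CORE EXT words.

```forth
\ Sketch only: COUNT-BLANKS is a hypothetical name, not a standard word.
\ Counts the spaces in a string of len chars starting at c-addr.
: count-blanks ( c-addr len -- n )
   0 rot rot                   \ running total goes under the string
   0 ?do                       \ len iterations, none if len = 0
      dup c@ bl = if swap 1+ swap then
      char+                    \ portable step; 1+ would assume 1 CHARS = 1
   loop
   drop ;
```

By the same reasoning, a buffer for len chars would be reserved with
"len CHARS ALLOT" rather than "len ALLOT".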

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/


 
Aleksej Saushev  
Newsgroups: comp.lang.forth
From: Aleksej Saushev <a...@inbox.ru>
Date: Sun, 20 Sep 2009 15:45:16 +0400
Local: Sun, Sep 20 2009 7:45 am
Subject: Re: RfD: c-addr/len

Again internal inconsistency. If you want 1 CHARS = 1 always, then you
should get rid of it and assume that you address bytes/characters or
octets, whatever you decide. You return to the way C took.
Then you won't need any conversion of code to use wide characters &c.

So, what is the point in dragging this "CHARS" stuff?

This brings another problem of Standard Forth: lack of internal consistency.
You either have overengineered parts, impractical parts, or lack of standard
tools to solve every day practical tasks (like reading non-textual streams).

Could you and Anton decide for yourself what you really want and stick to it?
Because as for now you easily jump from 1 CHARS being able to hold a byte,
i.e. real character be it 32-bit wide or octet-wide, to 1 CHARS being
addressable unit like it is in C.

Each variant has right to exist and has its own consequences.
If you decide 1 CHARS = 1, then how I access address units? Octets?
If you decide 1 CHARS to be byte width, how do I read non-textual file?

P.S. Most of UNIX text processing programs use "char" and don't care of
locales still, but there's some kind of general consensus that they
should be converted. So what's your argument about? I don't understand it.

Again, you overengineer the standard in a domain nobody has much experience
with, and skip fixing defects that affect practical everyday tasks.

--
CE3OH...


 
Anton Ertl  
Newsgroups: comp.lang.forth
From: an...@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Sun, 20 Sep 2009 18:51:11 GMT
Local: Sun, Sep 20 2009 2:51 pm
Subject: Re: RfD: c-addr/len

Aleksej Saushev <a...@inbox.ru> writes:
>an...@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>> Granted, there are only few
>> standard programs that don't have an environmental dependency on
>> 1 CHARS = 1, and all maintained systems support these programs, so
>> there would be little problem with such a change, but I see little
>> point in having such a change.  Better propose standardizing
>> 1 CHARS = 1.

>You're not consistent in your opinion that we should use UNICODE:
>either 1 CHARS = 1, and you use one-octet encodings on octet-addressing
>platforms,

Yes, that's the way things work without xchars.  With xchars, you can
use variable-width encodings like UTF-8, and UTF-8 is compatible with
8-bit chars.

>or 1 CHARS may be any other value, and you return to address
>units, which are octets in many cases.

And?  The words where u refers to the number of characters still deal
with u chars, not u address units.

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html


 
Anton Ertl  
Newsgroups: comp.lang.forth
From: an...@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Sun, 20 Sep 2009 19:15:27 GMT
Local: Sun, Sep 20 2009 3:15 pm
Subject: Re: RfD: c-addr/len

Aleksej Saushev <a...@inbox.ru> writes:
>If you want 1 CHARS = 1 always, then you
>should get rid of it

rid of what?

> and assume that you address bytes/characters or
>octets, whatever you decide.

On word-addressed machines 1 CHARS = 1, but a character is not a byte
or octet, but a word.

>So, what is the point in dragging this "CHARS" stuff?

It's in the current standard and nobody (not even you) has submitted
an RfD for making it obsolescent.

>This brings another problem of Standard Forth: lack of internal consistency.
>You either have overengineered parts, impractical parts, or lack of standard
>tools to solve every day practical tasks (like reading non-textual streams).

>Could you and Anton decide for yourself what you really want and stick to it?
>Because as for now you easily jump from 1 CHARS being able to hold a byte,
>i.e. real character be it 32-bit wide or octet-wide, to 1 CHARS being
>addressable unit like it is in C.

I can only guess what you mean here, but maybe the following can clear
things up: A char is a fixed-width memory unit, and on byte-addressed
machines it is a byte in all maintained systems.  There are also
xchars (in the xchars proposal); they have a variable-width
representation in memory, i.e., each xchar is stored in one or more
chars.  The "len" in this proposal always refers to the number of
chars, not to the number of xchars.
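
An illustrative sketch, not part of the original post, to make the
char/xchar distinction concrete. It assumes 8-bit chars and well-formed
UTF-8, and counts xchars by skipping UTF-8 continuation bytes.
UTF8-XCHARS is a hypothetical name, not a word from the xchars proposal.

```forth
decimal

\ Sketch only: UTF8-XCHARS is a hypothetical name.
\ Assumes 8-bit chars and a well-formed UTF-8 string.
: utf8-xchars ( c-addr len -- n )
   0 rot rot                    \ running total goes under the string
   0 ?do                        \ one iteration per char (len of them)
      dup c@ 192 and 128 <>     \ true unless a continuation byte (10xxxxxx)
      if swap 1+ swap then
      char+
   loop
   drop ;
```

For the three bytes 195 169 97 (UTF-8 for "é" followed by "a"), "len"
is 3, while the sketch reports 2 xchars.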

>Each variant has right to exist and has its own consequences.
>If you decide 1 CHARS = 1, then how I access address units?

Easy in that case: c@ and c!

> Octets?

No octets in the standard yet.  If you have a Forth system on a
word-addressed machine, you have to use system-specific code to deal
with octets.

>If you decide 1 CHARS to be byte width, how do I read non-textual file?

Use BIN.
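
An illustrative sketch, not part of the original post, assuming the
File-Access and Exception word sets are present. READ-RAW and RAW-BUF
are hypothetical names; BIN, OPEN-FILE, READ-FILE and CLOSE-FILE are
the standard words.

```forth
\ Sketch only: READ-RAW and RAW-BUF are hypothetical names.
\ R/O BIN selects binary (not line-oriented) read-only access.
create raw-buf 16 chars allot

: read-raw ( c-addr len -- u )      \ c-addr len: file name; u: chars read
   r/o bin open-file throw          \ leaves fileid
   >r
   raw-buf 16 r@ read-file throw    \ leaves u, at most 16
   r> close-file throw ;
```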

A more interesting case is word-addressed machines: How should they
deal with BIN?  But I guess if the people implementing and programming
on such systems feel the need for standardization in this regard, they
will come forward and start discussing it.

>P.S. Most of UNIX text processing programs use "char" and don't care of
>locales still, but there's some kind of general consensus that they
>should be converted.

Converted to what?  Consensus among whom?

> So what's your argument about? I don't understand it.

Which argument?

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html


 