Words consuming arguments, was [Re: Is there a better way?]
Mark Wills  
Newsgroups: comp.lang.forth
From: Mark Wills <forthfr...@gmail.com>
Date: Tue, 13 Nov 2012 00:58:18 -0800 (PST)
Local: Tues, Nov 13 2012 3:58 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
On Nov 13, 8:50 am, Mark Wills <forthfr...@gmail.com> wrote:

On second thoughts, perhaps it's just me not reading the standard in
enough detail. In fairness, the stack picture given in S" does clearly
say:

c-addr u

That is *c-addr* - i.e. the address of a *character*. If I interpret
that correctly, then S" *must* return the address of a *character*,
not the address of a "thing" (for example, the address of entry into
an array of string pointers).

So, whilst the format of a Forth string isn't explicitly described in
English in the standard in the description of S" it *is* there. It's
in the stack sig. The devil is in the details, as they say.


 
Elizabeth D. Rather
Newsgroups: comp.lang.forth
From: "Elizabeth D. Rather" <erat...@forth.com>
Date: Mon, 12 Nov 2012 23:17:43 -1000
Local: Tues, Nov 13 2012 4:17 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
On 11/12/12 10:58 PM, Mark Wills wrote:

A couple of important points:

The definition of 6.1.0980 COUNT that you cite explicitly states that it
is for a "counted string" which is defined in 2.1 Definitions of Terms:
"counted string: A data structure consisting of one character containing
a length followed by zero or more contiguous data characters. Normally,
counted strings contain text."

The definition of S" however makes *no mention* of counted strings, and
specifies absolutely nothing about its storage format. The similar word
C" *does* return the address of a counted string; that is the difference
between them.

Counted strings have been around Forth for a long time (since 1970),
because they're a very efficient format that's useful for a wide variety
of things, and is still widely used internally. But this format is not
*mandated* except in specific circumstances such as C" above.
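
The difference between the two words can be sketched in a few lines of standard Forth. This is only an illustration: the word names `hello-s` and `hello-c` are made up here, and both strings are wrapped in colon definitions because C" is only defined for use during compilation.

```forth
\ S" leaves an address/length pair that TYPE can use directly.
: hello-s ( -- c-addr u )  s" Hello" ;

\ C" leaves a single address: that of a counted string.
: hello-c ( -- c-addr )    c" Hello" ;

hello-s type cr          \ prints Hello
hello-c count type cr    \ COUNT converts to c-addr u first; prints Hello
```

Nothing in this sketch depends on how the system stores the text that S" compiled; only the C" result is promised to begin with a length character.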

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather   (US & Canada)   800-55-FORTH
FORTH Inc.                         +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================


 
Andrew Haley
Newsgroups: comp.lang.forth
From: Andrew Haley <andre...@littlepinkcloud.invalid>
Date: Tue, 13 Nov 2012 04:26:21 -0600
Local: Tues, Nov 13 2012 5:26 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]

I'm responding to what I quoted; nothing more, nothing less.  If you
disagree with the part you wrote that I responded to, feel free to say
so.

> BTW, this is the original definition of COUNT:

> "      COUNT         addr1  ---  addr2  n                    L0
>               Leave the byte address addr2 and byte count n of a
>               message text beginning at addr1.  It is presumed that
>               the first byte at addr1 contains the text byte count
>               and the actual text starts with the second byte.
>               Typically COUNT is followed by TYPE."

> It's also more accurate.

In what way is it more accurate than the standard definition?

> It doesn't assume counted strings.

Yes it does.  Read it again.

Andrew.


 
Albert van der Horst
Newsgroups: comp.lang.forth
From: alb...@spenarnc.xs4all.nl (Albert van der Horst)
Date: 13 Nov 2012 13:44:31 GMT
Local: Tues, Nov 13 2012 8:44 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
In article <6a42ffc1-3ac7-4099-905d-e5e89bccd...@b12g2000vbg.googlegroups.com>,
Mark Wills  <forthfr...@gmail.com> wrote:

Well, no. My forth implementation revolves around strings with a
cell count. I.e. even on a 64 bit system
   ' APE >NFA @
points to a ciforth-regular string stored in memory.
So a subsequent fetch ( @ ) gives the length of the string.
A $@ gives an address and length that can be passed to TYPE.

COUNT is aliased as $@-BD .
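
A string with a cell-sized count can be sketched in standard Forth as follows. The names `cell$!`, `cell$@` and `ape-name` are hypothetical and chosen for this example; this is not ciforth's source, only an illustration of the layout described above (a length cell followed by the text).

```forth
\ Store the string c-addr u at addr, preceded by a cell-sized length.
: cell$! ( c-addr u addr -- )
  2dup !                  \ the length goes into the first cell
  cell+ swap move ;       \ the text follows the length cell

\ Fetch an address/length pair suitable for TYPE.
: cell$@ ( addr -- c-addr u )
  dup cell+ swap @ ;

create ape-name 1 cells 32 chars + allot   \ one cell plus 32 characters
s" APE" ape-name cell$!
ape-name @ .                \ a plain fetch gives the length: prints 3
ape-name cell$@ type cr     \ prints APE
```

With a cell-sized count the length limit of a one-character count disappears, at the cost of a few extra bytes per string.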

I feel not much constrained by the Standard, although
WORD and FIND have become loadable extensions.

Hell no. I would have had a lot of trouble using sensible strings
in the core of my Forth, had the standard prescribed this.

Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst


 
Anton Ertl
Newsgroups: comp.lang.forth
From: an...@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Tue, 13 Nov 2012 13:53:13 GMT
Local: Tues, Nov 13 2012 8:53 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]

At least for counted strings, which I don't recommend using.

It is interesting, though, that some people write about COUNT as if
its specification was more abstract than it is.

>There is no clue to the format of a string in memory in the ANS
>definition of S" :

>6.1.2165 S"
>s-quote CORE
[...]
>Return c-addr and u describing a string consisting of the characters
>ccc. A program shall not alter the returned string.

The format of a c-addr u string is just that the first char is at
c-addr, the next char is at c-addr char+ etc.  Hmm, but is that
anywhere in the standard text?

>That's disappointing. If the format of a string is mandatory, then the
>appropriate place to describe it (or refer to it) is within the
>definition of the word S" not COUNT.

COUNT is for counted strings, S" is for c-addr u strings, and whether
it stores the string as a counted string is up to the implementation (a
high-quality S" can deal with arbitrary-length strings, where counted
strings are not an option).  Neither says anything about the
arrangement of the characters themselves.
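
The usual reading of a c-addr u string (first character at c-addr, the next at c-addr CHAR+ , and so on) can be sketched like this. The word `count-char` is a made-up name for this example, and the code assumes the ascending-address layout just described.

```forth
\ Count how often character c occurs in the string c-addr u.
: count-char ( c-addr u c -- n )
  0 2swap                   \ c n c-addr u
  0 ?do                     \ c n c-addr
    dup c@ 3 pick =         \ does the current character match c?
    if swap 1+ swap then    \ yes: bump the running count n
    char+                   \ step to the next character
  loop
  drop nip ;                \ leave only n

s" banana" char a count-char .   \ prints 3
```

Every standard string word that walks a string in this way relies on the same layout, whether or not the text of the standard spells it out.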

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2012: http://www.euroforth.org/ef12/


 
Alex McDonald
Newsgroups: comp.lang.forth
From: Alex McDonald <b...@rivadpm.com>
Date: Tue, 13 Nov 2012 06:39:27 -0800 (PST)
Local: Tues, Nov 13 2012 9:39 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
On Nov 13, 2:06 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

The arrangement is specified. There's the usual Western ASCII right to
left bias specifically in the normative "Terms, notation, and
references" section of the standard.

character string: Data space that is associated with a sequence of
consecutive character-aligned addresses. Character strings usually
contain text. Unless otherwise indicated, the term “string” means
“character string”.

Counted strings get a mention too;

counted string: A data structure consisting of one character
containing a length followed by zero or more contiguous data
characters. Normally, counted strings contain text.


 
Mark Wills
Newsgroups: comp.lang.forth
From: Mark Wills <forthfr...@gmail.com>
Date: Tue, 13 Nov 2012 07:55:27 -0800 (PST)
Local: Tues, Nov 13 2012 10:55 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
On Nov 13, 2:06 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

Of course. You're right. Elizabeth too. I really should pay more
attention. Somehow I never made the mental *dis*connect between
counted strings and c-addr u strings, which are very different from
each other.

I implemented them both in my system and they work just fine. If I do
S" HELLO" I get 5 and an address on the stack. On the other hand, when
I use file streams, I get a counted string back (by design):

PAD myFile #GET ABORT" Can't read from the file"
PAD COUNT TYPE

Yet somehow, I'd never really considered them to be different, just
the same, but in different states: c-addr u is for carrying around on
the stack when you want to do work with them. Counted strings on the
other hand is how they are stored in memory.
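
The two views (c-addr u on the stack, counted string in memory) can be converted into each other with a few words. The following is a sketch in standard Forth; `store-counted` and `buf` are names invented for this example (many systems provide a similar word called PLACE), and the code assumes the string fits in a one-character count.

```forth
\ Store the string c-addr u at dest as a counted string:
\ one length character followed by the text.
: store-counted ( c-addr u dest -- )
  2dup c!                 \ write the length into the first character
  char+ swap move ;       \ copy the text right after it

create buf 32 chars allot \ room for the count plus up to 31 characters

s" HELLO" buf store-counted   \ stack form -> memory form
buf count type cr             \ memory form -> stack form; prints HELLO
```

Going from memory to the stack is just COUNT; going the other way needs a copy, which is one reason the c-addr u form is often preferred for passing strings around.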

I wonder if my version of S" is a hybrid of both techniques? It's the
classic state-smart implementation as far as I'm aware. If used in
interpretation state, it places the string in a transitory/temporary
memory area and pushes len addr to the stack. If compiled, it compiles
(S") len <s t r i n g> to memory. At run time, len is pushed by (S")
and the address is *derived* (by (S")) by examining the Forth VM IP.

Interesting subject.


 
Andrew Haley
Newsgroups: comp.lang.forth
From: Andrew Haley <andre...@littlepinkcloud.invalid>
Date: Tue, 13 Nov 2012 10:20:04 -0600
Local: Tues, Nov 13 2012 11:20 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]

Mark Wills <forthfr...@gmail.com> wrote:
> The definition of COUNT is quite interesting. It actually indirectly,
> and presumably un-intentionally mandates the format of a string in
> Forth.

It mandates (assumes?) the format of a counted string, and quite
deliberately so.

Andrew.


 
Brad Eckert
Newsgroups: comp.lang.forth
From: Brad Eckert <hwfw...@gmail.com>
Date: Tue, 13 Nov 2012 08:50:42 -0800 (PST)
Local: Tues, Nov 13 2012 11:50 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]

On Tuesday, November 13, 2012 1:50:25 AM UTC-7, M.R.W Wills wrote:
> That's disappointing. If the format of a string is mandatory, then the
> appropriate place to describe it (or refer to it) is within the
> definition of the word S" not COUNT.

There's no reason S" can't store the string length as a long. ANS made an effort to promote the ( c-addr ulength ) string format and also support legacy counted strings.

I don't use S" and ." in my larger apps anyway, because of internationalization needs. In any given application domain, you can probably throw out half of ANS and not miss it.


 
Anton Ertl
Newsgroups: comp.lang.forth
From: an...@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Tue, 13 Nov 2012 18:02:27 GMT
Local: Tues, Nov 13 2012 1:02 pm
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]

Alex McDonald <b...@rivadpm.com> writes:
>On Nov 13, 2:06 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>wrote:
>> Neither says anything about the
>> arrangement of the characters themselves.

>The arrangement is specified. There's the usual Western ASCII right to
>left bias specifically in the normative "Terms, notation, and
>references" section of the standard.

>character string: Data space that is associated with a sequence of
>consecutive character-aligned addresses. Character strings usually
>contain text. Unless otherwise indicated, the term “string” means
>“character string”.

I don't see a clear specification of the arrangement.  It says
"sequence of consecutive character-aligned addresses", but that does
not say anything about the order of the characters.  I don't see any
right-to-left bias here, either.

Of course, given that there has not been a question on the order of
characters since Forth-94 came out, it is obviously unnecessary to
specify the order of characters in more detail, at least for the main
purpose of the standard.

>Counted strings get a mention too;

>counted string: A data structure consisting of one character
>containing a length followed by zero or more contiguous data
>characters. Normally, counted strings contain text.

This clearly specifies the count concretely, but the other characters
still have no specified order.

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2012: http://www.euroforth.org/ef12/


 
Alex McDonald
Newsgroups: comp.lang.forth
From: Alex McDonald <b...@rivadpm.com>
Date: Tue, 13 Nov 2012 12:10:34 -0800 (PST)
Local: Tues, Nov 13 2012 3:10 pm
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
On Nov 13, 6:09 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

3.1.4.2 Character strings
A string is specified by a cell pair (c-addr u) representing its
starting address and length in characters.

For non-Western orderings, this is possible if c-addr points at the
starting address (c-addr+u) and the string extends (conceptually)
leftwards or through decreasing addresses. But then;

17.6.1.0245 /STRING “slash-string” STRING
( c-addr1 u1 n -- c-addr2 u2 )
Adjust the character string at c-addr1 by n characters. The resulting
character string, specified by c-addr2 u2, begins at c-addr1 plus n
characters and is u1 minus n characters long.

makes such an ordering impossible to implement. Of course, the string
could be held backwards ("sdrawkcab"), but then that introduces a
whole set of other issues; for instance, how would we interpret the
result of 1 /STRING ? How should a SEARCH for the character 'a' in the
example given operate?
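
For comparison, this is how /STRING and SEARCH behave on a string stored at ascending addresses. It is a small sketch in standard Forth (both words are in the STRING word set); the word `sample` is a made-up name that just returns a test string.

```forth
: sample ( -- c-addr u )  s" backwards" ;

sample 1 /string type cr      \ drops the first character; prints ackwards

\ SEARCH returns the rest of the string starting at the match, plus a flag.
sample s" war" search         \ -- c-addr3 u3 flag
drop type cr                  \ prints wards
```

Both results only make sense if "the string begins at c-addr1 plus n characters" means moving toward higher addresses.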

> Of course, given that there has not been a question on the order of
> characters since Forth-94 came out, it is obviously unnecessary to
> specify the order of characters in more detail, at least for the main
> purpose of the standard.

Many (most?) standards have this issue, so I wouldn't consider it a
defect.

> >Counted strings get a mention too;

> >counted string: A data structure consisting of one character
> >containing a length followed by zero or more contiguous data
> >characters. Normally, counted strings contain text.

> This clearly specifies the count concretely, but the other characters
> still have no specified order.

As above, since the result of COUNT must be treatable by /STRING;
therefore they must have the order left to right at ascending
addresses.


 
Rod Pemberton
Newsgroups: comp.lang.forth
From: "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
Date: Wed, 14 Nov 2012 06:29:08 -0500
Local: Wed, Nov 14 2012 6:29 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
"Andrew Haley" <andre...@littlepinkcloud.invalid> wrote in message

news:7vOdnf3g2cvQvT_NnZ2dnUVZ8mGdnZ2d@supernews.com...
> Rod Pemberton <do_not_h...@notemailnotz.cnm> wrote:

...

> > BTW, this is the original definition of COUNT:

> > "      COUNT         addr1  ---  addr2  n                    L0
> >               Leave the byte address addr2 and byte count n of a
> >               message text beginning at addr1.  It is presumed that
> >               the first byte at addr1 contains the text byte count
> >               and the actual text starts with the second byte.
> >               Typically COUNT is followed by TYPE."

> > It's also more accurate.

> [How] is it more accurate than the standard definition?

It doesn't assume counted strings.

> > It doesn't assume counted strings.

> Yes it does.  Read it again.

No it doesn't.  Read it again.

This time look for the word "presumed".
Look up the definition for "presumed".

Rod Pemberton


 
Rod Pemberton
Newsgroups: comp.lang.forth
From: "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
Date: Wed, 14 Nov 2012 06:30:22 -0500
Local: Wed, Nov 14 2012 6:30 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
"Elizabeth D. Rather" <erat...@forth.com> wrote in message
news:mdqdnYH8wqQ6BjzNnZ2dnUVZ_rWdnZ2d@supernews.com...

What does "presumed" mean to you?

Rod Pemberton


 
Rod Pemberton
Newsgroups: comp.lang.forth
From: "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
Date: Wed, 14 Nov 2012 06:34:56 -0500
Local: Wed, Nov 14 2012 6:34 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
"Mark Wills" <forthfr...@gmail.com> wrote in message

news:2ae8f57b-f995-4732-be93-6ffb90491b0f@q1g2000vbx.googlegroups.com...
...

> Of course. You're right. Elizabeth too. I really should pay more
> attention. Somehow I never made the mental *dis*connect between
> counted strings and c-addr u strings, which are very different from
> each other.

Why?  Why are they different?  Why should they be different?

I have but one string format for Forth.  It's not a counted string format.

Why would I implement two string formats?

An address describes a string adequately.  An address and length does so
too.

> Yet somehow, I'd never really considered them to be different, just
> the same, but in different states: c-addr u is for carrying around on
> the stack when you want to do work with them. Counted strings on the
> other hand is how they are stored in memory.

I see your confusion as resulting from a lack of familiarity
with C's string model.  Ms. Rather has demonstrated similar
confusion in the past when discussing the merits of null
terminated strings of C versus counted strings of Forth and PL/1.

Rod Pemberton


 
Rod Pemberton
Newsgroups: comp.lang.forth
From: "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
Date: Wed, 14 Nov 2012 06:42:19 -0500
Local: Wed, Nov 14 2012 6:42 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
"Mark Wills" <forthfr...@gmail.com> wrote in message

news:6a42ffc1-3ac7-4099-905d-e5e89bccd13b@b12g2000vbg.googlegroups.com...
...

> The definition of COUNT is quite interesting. It actually indirectly,
> and presumably un-intentionally mandates the format of a string in
> Forth.

> [This is at least second time the ANS COUNT definition
> was posted in this thread ...]

...

Yes.  The ANS COUNT definition defines a string as counted, or more
precisely assumes a counted string.  As long as the stack arguments are
functionally correct, the verbal definition is irrelevant.  fig-Forth by
using the word "presumes" doesn't define a string as counted.  It allows
for non-counted string implementations also.

> Note the last line: u *is* the contents of the character at c-addr1...

Yes.  Also note that Ms. Rather has stated in the past that the count isn't
required to be a character in size for Forth.  IIRC, she suggested a word
(16-bits) be used for the count of a counted string.

> So, there's no way around it in actual fact. If you wanted to
> implement strings in a different way under the covers, you'd be
> prevented from doing so.

Wrong, or it should be wrong if it isn't.  Officially, the ANS Forth
specifications don't support a machine model.  Defining a string format
requires a machine model to be defined in part.  Numerous Forth
"experts" here have even stated ANS doesn't define a machine model.  Earlier
Forth specifications did define a machine model.  Supposedly, those
specifications were very problematic because the sizes of integers
and addresses were hardcoded and inflexible.  Well, string formats are no
different and would suffer the same problem.  I.e., you have to take the ANS
COUNT definition requiring counted strings as "wrong".

> [more ANS stuff]

ANS Forth specification has a variety of errors in it.  E.g., "immediacy" is
not required for ; semicolon.

> If the format of a string is mandatory, then the
> appropriate place to describe it (or refer to it) is within the
> definition of the word S" not COUNT.

The appropriate place is *before* definitions, but the string format
shouldn't be defined.  If it is, then the specification hasn't been
fully abstracted from the machine model, i.e., the Forth specification
authors failed in their jobs.

Rod Pemberton


 
Andrew Haley
Newsgroups: comp.lang.forth
From: Andrew Haley <andre...@littlepinkcloud.invalid>
Date: Wed, 14 Nov 2012 06:03:23 -0600
Local: Wed, Nov 14 2012 7:03 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]

In which case it would be less accurate; if your claim were true,
which it isn't.

FWIW, this isn't the "original" definition of COUNT .  COUNT dates
from before fig-FORTH, and we'd need Elizabeth's help to find its
earliest definition.

>> > It doesn't assume counted strings.

>> Yes it does.  Read it again.

> No it doesn't.  Read it again.

> This time look for the word "presumed".
> Look up the definition for "presumed".

It's the past participle of the verb "to presume", which means
variously, to assume to be true, to take for granted, to suppose, etc.

fig-FORTH isn't a standard; it is defined by its implementation.  The
fig-FORTH implementation of COUNT is

: COUNT    DUP 1+ SWAP C@  ;

Andrew.


 
Mark Wills
Newsgroups: comp.lang.forth
From: Mark Wills <forthfr...@gmail.com>
Date: Wed, 14 Nov 2012 05:01:27 -0800 (PST)
Local: Wed, Nov 14 2012 8:01 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
On Nov 14, 11:30 am, "Rod Pemberton" <do_not_h...@notemailnotz.cnm>
wrote:

> "Mark Wills" <forthfr...@gmail.com> wrote in message

> news:2ae8f57b-f995-4732-be93-6ffb90491b0f@q1g2000vbx.googlegroups.com...
> ...

> > Of course. You're right. Elizabeth too. I really should pay more
> > attention. Somehow I never made the mental *dis*connect between
> > counted strings and c-addr u strings, which are very different from
> > each other.

> Why?  Why are they different?  Why should they be different?

They're different because they're, well... different!

S" Hello" results in c-addr u

C" Hello" results in addr and requires COUNT to convert it to c-addr
u.

The advantage of the latter is it is more convenient to carry about on
the stack; you only have to carry the address around, not the address
*and* the length. When you want the length, COUNT will get it for you.

They are different. Clearly. Though the *storage format* (how it is
stored in memory) is probably the same. A C" string, when executed,
will push the address of the count cell. A S" string, when executed
will push the address of the first character, and the length.
*Internally* they are probably stored in memory in the same way, so,
yes, I can see why you might say they are the same!

I say *probably* stored in the same way, because they don't have to
be.

In the early days, my system compiled a string like this:

S" hello"

LIT addr LIT 5 branch xxx  h e l l o _

Where the branch jumps over the string payload and _ is an alignment
padding byte. This is probably how most beginners approach it. Later,
I modified it to store a counted string, so S" hello" now compiles:

(S") 5 h e l l o

while C" hello" compiles (C") 5 h e l l o

Stored in the same way, but different effects at run time.

That's what I was getting at, though I admit probably not explained
very well.

> I have but one string format for Forth.  It's not a counted string format.

> Why would I implement two string formats?

You don't have to. Just implement counted strings. COUNT converts a
counted string to a c-addr u string that words like TYPE need.

> I see your confusion as resulting from a lack of familiarity
> with C's string model.  Ms. Rather has demonstrated similar
> confusion in the past when discussing the merits of null
> terminated strings of C versus counted strings of Forth and PL/1.

> Rod Pemberton

<troll>
And don't get me started on C's crack-smoking "bunch o' bytes with a /
0 at the end"! Pants method of string storage! For reasons well
trodden in previous CLF threads.
</troll>

 
Stephen Pelc
Newsgroups: comp.lang.forth
From: stephen...@mpeforth.com (Stephen Pelc)
Date: Wed, 14 Nov 2012 14:14:12 GMT
Local: Wed, Nov 14 2012 9:14 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
On Wed, 14 Nov 2012 06:42:19 -0500, "Rod Pemberton"

<do_not_h...@notemailnotz.cnm> wrote:
>Yes.  Also note that Ms. Rather has stated in the past that the count isn't
>required to be a character in size for Forth.  IIRC, she suggested a word
>(16-bits) be used for the count of a counted string.

In order to test the ANS model, JaxForth used 16 bit characters. As
a consequence, the unit of COUNT was 16 bit items on a byte-addressed
machine.

: count   ( addr1 -- addr2 len )
  dup 2 + swap w@    \ addr2 is 2 bytes on; len is the 16-bit count at addr1
;

There is common practice in some Forth shops to use COUNT to step
through memory. To resolve this, and to cope with multi-byte
character sets including UTF-8, the Forth200x document treats the
word "character" as meaning a primitive character, usually a byte,
from which wide characters and multibyte characters are derived.

COUNT now refers to a byte count followed by primitive characters.
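
The "step through memory" idiom mentioned above can be sketched as follows. COUNT is used purely as "fetch one character and advance the address"; the word name `sum-chars` is invented for this example, and the expected value assumes an ASCII system.

```forth
\ Add up the character codes of the string c-addr u.
: sum-chars ( c-addr u -- n )
  0 swap 0 ?do        \ c-addr sum
    >r count r> +     \ fetch one character, advance, add it to the sum
  loop nip ;          \ drop the final address, leave the sum

s" AB" sum-chars .    \ prints 131 (65 + 66)
```

The idiom only works if COUNT steps by exactly one primitive character, which is why tying COUNT to the primitive character size matters for code written this way.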

Counted strings are just a storage mechanism, in the same way
that zero terminated strings are a storage mechanism. Serious
string libraries work in terms of objects or structures whose
internal format is unlikely to be either 8-bit counted or zero
terminated.

These days, in order to write internationalised applications
for OS X and Windows, the programmer is likely to standardise
on UTF-16. See http://site.icu-project.org/ for an example
library. As a result, the relevance of counted and zero
terminated strings in larger applications is really a kernel
issue rather than an application issue.

Stephen

--
Stephen Pelc, stephen...@mpeforth.com
MicroProcessor Engineering Ltd - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, fax: +44 (0)23 8033 9691
web: http://www.mpeforth.com - free VFX Forth downloads


 
Anton Ertl
Newsgroups: comp.lang.forth
From: an...@mips.complang.tuwien.ac.at (Anton Ertl)
Date: Wed, 14 Nov 2012 15:28:21 GMT
Local: Wed, Nov 14 2012 10:28 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]

stephen...@mpeforth.com (Stephen Pelc) writes:
>On Wed, 14 Nov 2012 06:42:19 -0500, "Rod Pemberton"
><do_not_h...@notemailnotz.cnm> wrote:

>>Yes.  Also note that Ms. Rather has stated in the past that the count isn't
>>required to be a character in size for Forth.  IIRC, she suggested a word
>>(16-bits) be used for the count of a counted string.

>In order to test the ANS model, JaxForth used 16 bit characters. As
>a consequence, the unit of COUNT was 16 bit items on a byte-addressed
>machine.

>: count   ( addr1 -- addr2 len )
>  dup 2 + swap w@
>;

A standard implementation of COUNT is:

: count ( c-addr1 -- c-addr2 u )
  dup char+ swap c@ ;

This works on all standard Forth systems, even on JaxForth, and if
COUNT in JaxForth were defined in high-level Forth, I would be surprised
if it did not use a standard-compliant definition of COUNT.

In any case, COUNT in every Forth standard I know of is required to
use a character-sized count.  In Forth-94 characters can be wider than
8 bits, however.
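
As a small usage sketch (GREETING and SHOW are made-up names for illustration, not from this thread), the count cell of a counted string is read with C@, and COUNT turns the counted string into the c-addr u pair that TYPE expects:

```forth
\ Illustrative only: GREETING and SHOW are hypothetical names.
: greeting ( -- c-addr )  c" hello" ;    \ C" (CORE EXT) compiles a counted string
: show     ( -- )  greeting count type ; \ COUNT yields c-addr u, ready for TYPE
show   \ displays: hello
```

Nothing here depends on the character width, since only C@ and CHAR+ touch the data.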

>COUNT now refers to a byte count followed by primitive characters.

Or, more generally, a (p)char count followed by (p)chars.

>These days, in order to write internationalised applications
>for OS X and Windows, the programmer is likely to standardise
>on UTF-16.

UTF-16 is the dead-end extension of UCS-2, which became obsolete with
Unicode 2.0.  It is present in some systems that were designed around
1990, like Windows NT and Java, but even there it's not universal.
Even if you have to interface with UTF-16-based Windows API functions,
I would recommend designing your interfaces such that they continue to
work if you switch to UTF-8-based API functions (switching to stuff
like Big5 and GB then is probably no additional effort).

- anton
--
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2012: http://www.euroforth.org/ef12/


 
Albert van der Horst  
Newsgroups: comp.lang.forth
From: alb...@spenarnc.xs4all.nl (Albert van der Horst)
Date: 14 Nov 2012 16:11:08 GMT
Local: Wed, Nov 14 2012 11:11 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
In article <257b37f9-69a9-4b6c-89e6-5528b89fa...@h16g2000vby.googlegroups.com>,
Mark Wills  <forthfr...@gmail.com> wrote:

The main advantage shouldn't be missed. Using c-addr u consistently,
an implementation can interpret a buffer without copying things around
all the time. If you use FIND you're almost obliged to.
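
A rough illustration of working on a buffer in place, assuming the optional String word set is present for /STRING (DEMO is a hypothetical name):

```forth
\ Illustrative only: DEMO is a hypothetical name.
: demo ( -- )
  s" forth: hello"   \ c-addr u describing the whole buffer
  7 /string          \ step past "forth: " in place; nothing is copied
  type ;
demo   \ displays: hello
```

The substring is just a new address and a shorter length into the same memory, which is what a counted-string interface such as FIND cannot offer without copying.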

Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst


 
Bernd Paysan  
Newsgroups: comp.lang.forth
From: Bernd Paysan <bernd.pay...@gmx.de>
Date: Wed, 14 Nov 2012 17:54:09 +0100
Local: Wed, Nov 14 2012 11:54 am
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]

Anton Ertl wrote:
>>These days, in order to write internationalised applications
>>for OS X and Windows, the programmer is likely to standardise
>>on UTF-16.

> UTF-16 is the dead-end extension of UCS-2, which became obsolete with
> Unicode 2.0.  It is present in some systems that were designed around
> 1990, like Windows NT and Java, but even there it's not universal.
> Even if you have to interface with UTF-16-based Windows API functions,
> I would recommend designing your interfaces such that they continue to
> work if you switch to UTF-8-based API functions (switching to stuff
> like Big5 and GB then is probably no additional effort).

Given that both Cocoa and Win32 like "zero terminated strings", you
had better not use their strings as native objects in Forth, but convert
on the fly when you call Cocoa or Win32.  We don't have that in libcc.fs
now, but I suggest we should have a datatype string, which is addr len
in Forth and a 0-terminated char* in C, and is converted on call/return.
For UTF-16 there would be a string16 type (which is still UTF-8 on the
Forth side).
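
To make the conversion on call concrete, here is a minimal sketch in plain standard Forth. It is not libcc.fs code; ZBUF, /ZBUF and >ZSTRING are hypothetical names, and the single fixed buffer (with silent truncation) is an assumption made only to keep the example short:

```forth
\ Hypothetical helper, not part of libcc.fs or any standard.
256 constant /zbuf                 \ assumed buffer size in chars
create zbuf /zbuf chars allot      \ one shared scratch buffer

: >zstring ( c-addr u -- z-addr )
  /zbuf 1- min                     \ truncate so the terminator always fits
  >r  zbuf r@ chars move           \ copy u chars into the scratch buffer
  0 zbuf r> chars + c!             \ append the zero terminator
  zbuf ;                           \ address to hand to the C function
```

A real binding would presumably allocate per call or per argument; the point is only that the Forth side keeps addr len and the zero terminator exists just for the duration of the call.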

It should be noted that Cocoa strings are a rather complex object, and
when you feed data in or get data out, you can select the encoding -
both UTF-8 and UTF-16 are first class encodings.  Once it's inside
Cocoa's string class, you access it through Objective-C methods, and
don't care about internal representation.

It's a lot better with Xlib.  There, strings are represented as addr len
entities (yes, *the* addr len you use in Forth anyway, number of
bytes/pchars for UTF-8, no zero termination needed), with UTF-8 as the
preferred first-class encoding.

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/


 
Elizabeth D. Rather  
Newsgroups: comp.lang.forth
From: "Elizabeth D. Rather" <erat...@forth.com>
Date: Wed, 14 Nov 2012 09:06:54 -1000
Local: Wed, Nov 14 2012 2:06 pm
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
On 11/14/12 1:30 AM, Rod Pemberton wrote:

From dictionary.com:

pre·sume [pri-zoom] verb, pre·sumed, pre·sum·ing.
verb (used with object)
1. to take for granted, assume, or suppose: I presume you're tired after
your drive.
2. Law. to assume as true in the absence of proof to the contrary.

In other words, it's a synonym of 'assume', so all definitions of COUNT
both assume and presume a counted string format.

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather   (US & Canada)   800-55-FORTH
FORTH Inc.                         +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================


 
Elizabeth D. Rather  
Newsgroups: comp.lang.forth
From: "Elizabeth D. Rather" <erat...@forth.com>
Date: Wed, 14 Nov 2012 09:36:46 -1000
Local: Wed, Nov 14 2012 2:36 pm
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
On 11/14/12 3:01 AM, Mark Wills wrote:

They're different in that a "counted string" is a storage format, and
'c-addr u' is a stack notation providing the address and length of a
string independent of its storage format.

Address alone cannot define a string; you need some way to know how long
the string is, either by knowing its storage format (e.g., 'counted
string' or null-terminated) or by having a length on the stack in
addition to the address.
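
A short sketch of that distinction: the same two characters stored in two formats, each reduced to the same 'c-addr u' stack form. ZCOUNT, CHI and ZHI are hypothetical names, and ZCOUNT assumes one address unit per character:

```forth
\ Illustrative only: ZCOUNT is not a standard word.
\ Assumes 1 CHARS = 1 address unit, so the address difference is the length.
: zcount ( c-addr -- c-addr u )
  dup begin dup c@ while char+ repeat  over - ;

create chi  2 c,  char h c,  char i c,      \ counted storage: length first
create zhi  char h c,  char i c,  0 c,      \ zero-terminated storage

chi count  type    \ displays: hi
zhi zcount type    \ displays: hi
```

After COUNT or ZCOUNT the consumer (TYPE here) no longer needs to know how the string was stored.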

Cheers,
Elizabeth

--
==================================================
Elizabeth D. Rather   (US & Canada)   800-55-FORTH
FORTH Inc.                         +1 310.999.6784
5959 West Century Blvd. Suite 700
Los Angeles, CA 90045
http://www.forth.com

"Forth-based products and Services for real-time
applications since 1973."
==================================================


 
humptydumpty  
Newsgroups: comp.lang.forth
From: humptydumpty <ouat...@gmail.com>
Date: Wed, 14 Nov 2012 12:19:22 -0800 (PST)
Local: Wed, Nov 14 2012 3:19 pm
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]

Yes, in `c-addr u' case I have *explicitly* a zone of memory that will be
treated as a `string'.  *No more information* is needed.

In `a' case I need *more information* to describe how to treat this address.

Have a nice day,
humptydumpty


 
Hugh Aguilar  
Newsgroups: comp.lang.forth
From: Hugh Aguilar <hughaguila...@yahoo.com>
Date: Wed, 14 Nov 2012 15:01:25 -0800 (PST)
Local: Wed, Nov 14 2012 6:01 pm
Subject: Re: Words consuming arguments, was [Re: Is there a better way?]
On Nov 14, 7:15 am, stephen...@mpeforth.com (Stephen Pelc) wrote:

> There is common practice in some Forth shops to use COUNT to step
> through memory. To resolve this, and to cope with multi-byte
> character sets including UTF-8, the Forth200x document treats the
> word "character" as meaning a primitive character, usually a byte,
> from which wide characters and multibyte characters are derived.

> COUNT now refers to a byte count followed by primitive characters.

Even when I was 18 and programming the Vic-20, I knew better than to
use COUNT for stepping through an array of chars. For one thing, it
won't work if chars are assumed to be 2 bytes and/or the count is 2
bytes, which was possible even on a 6502 (especially the count being 2
bytes, although chars were nearly always 1 byte in those days). For
another thing, it makes for unreadable code, as the reader has to
wonder: "Why is COUNT being used? What is being counted?". I wrote a
word that did the same thing --- I think I called it c@c+ and there
was also w@w+ that was for stepping through an array of words --- or
something like that. I knew about the concept of abstraction way back
then when most of the Forth community was doing things like using
COUNT to step through char arrays or using 2+ for word arrays on the
assumption that words were inherently 2 bytes in size, and so forth.
Back in the 1980s, there seemed to be a lot of Forthers who didn't
understand basic programming concepts such as abstraction --- and,
that is true today too.
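
A reconstruction of what such a word might look like in standard Forth; this is a guess at the idea, not the original Vic-20 code. The @+ cell variant and EMIT-ALL are hypothetical additions, and @+ is not the w@w+ mentioned above, since W@ is not a standard word:

```forth
\ Sketch only: same stack effect as COUNT, but named for what it does.
: c@c+ ( c-addr1 -- c-addr2 char )  dup char+ swap c@ ;

\ Hypothetical cell-sized counterpart, using CELL+ instead of a literal 2 +.
: @+   ( a-addr1 -- a-addr2 x )     dup cell+ swap @ ;

\ Usage: walk a string one character at a time.
: emit-all ( c-addr u -- )  0 ?do c@c+ emit loop drop ;
```

Because the step size comes from CHAR+ and CELL+, the code makes no assumption about chars or cells being a particular number of bytes.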

I don't think that I will support counted strings at all in Straight
Forth (I mean, strings with the count stored in the 0'th array
element). I will only support adr,len strings (the address of the char
array and the size of the array on the stack). I have a doubles stack
that is distinct from the parameter stack. This is for double numbers
and also for adr,len strings. I will also have a float stack that is
distinct from the parameter stack and is for floating-point numbers. I
may actually have two float stacks, one for low-precision and one for
high-precision floats. Modern processors have beaucoup registers, so
there is no need to mix data types together on the parameter stack,
which results in a lot of ugly stack-juggling --- each data type will
have its own stack in Straight Forth --- each stack will have a
register dedicated as its stack-pointer.

BTW Stephen --- I downloaded your VFX evaluation Forth system. So far
all I have done is cycle through the "tip of the day." When I get time
however, I will try to compile and run the novice package. If VFX
works correctly, this will be the first time ever. In all of the other
Forth systems that I have tried (SwiftForth, Gforth, Win32Forth and
FICL), doing this revealed bugs in the Forth system, and I had to
rewrite some portion of the novice package to work-around the bug
(FICL had so many problems that I didn't support it, but the others
did get supported).


 