assembler code in gforth

RLN37

Apr 11, 2010, 1:20:21 PM
I'm trying to run syscalls386.fs in gforth. The first defined word in
that file starts with:
code syscall0
.d di ) ax mov


I get an immediate hangup on that initial .d (which I know is trying
to indicate double-word notation).
But gforth says "unrecognized word" on that .d

Can anyone give me a clue to why this assembly code is not accepted?
I know millions of others run this file successfully; where am I going
wrong?

Yes, I'm pretty much a newbie to this level of forth.
Grateful for any help you can furnish.

Baseball Bob

Bernd Paysan

Apr 11, 2010, 1:37:26 PM
RLN37 wrote:

> I'm trying to run syscalls386.fs in gforth. The first defined word in
> that file starts with:
> code syscall0
> .d di ) ax mov
>
>
> I get an immediate hangup on that initial .d (which I know is trying
> to indicate double-word notation).
> But gforth says "unrecognized word" on that .d
>
> Can anyone give me a clue to why this assembly code is not accepted?
> I know millions of others run this file successfully; where am I going
> wrong?

Works here - but this is limited to x86 platforms, and the assembler
must be loaded (that should be the case by default, but maybe it isn't
in your case).

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://www.jwdt.com/~paysan/

Krishna Myneni

Apr 12, 2010, 7:02:37 PM

Are you using Gforth on a 32-bit system? The assembly code will not
work on 64-bit Gforth, since the instructions are 32 bit.

Krishna Myneni

RLN37

Apr 12, 2010, 8:40:33 PM

Thanks for your comments, Bernd. I continue to struggle with this. I
have gforth loaded on a Windows Vista system, and that system seems to
handle syscalls386.fs just fine; that is, the assembly-code compilation
seems to work as expected. But on my Ubuntu system, I still get the
problem I first reported. It makes me wonder which assembler is loaded
in the gforth running under Ubuntu. Do you know how to be sure the 386
assembler is the one that is active? Also, when I type WORDS (with
ASSEMBLER being the first wordlist listed by ORDER), the system just
comes back with OK; no words get listed. That makes me wonder whether
ANY assembler is loaded. How can I force the system to load the 386
assembler?
Thanks for your help on my behalf.
Baseball Bob

RLN37

Apr 13, 2010, 9:36:15 AM

Krishna,

Thanks for your comments. I am using a 32-bit system with Ubuntu. On
the Vista system mentioned above, all works fine, but not on the Ubuntu
system. On the Vista system, if I type ASSEMBLER WORDS, I get a whole
raft of assembly words typed out. But on the Ubuntu system, I just get
'ok'. It sure looks like there is no assembler vocabulary present. How
does a person force the 386 assembler to get loaded? The gforth user
manual says quite a lot about assemblers, but never says how to get one
loaded. It indicates there are several assemblers available, for
different processor types, but never says where to get them, or how to
load them. Maybe there is some other information resource I'm not aware
of? Thanks for any guidance you might provide.
Baseball Bob

Anton Ertl

Apr 13, 2010, 10:08:25 AM
RLN37 <Rl...@YAHOO.COM> writes:
>Thanks for your comments. I am using a 32-bit system with Ubuntu. On
>the Vista system mentioned above,
>all works fine, but not on the Ubuntu system. On the Vista system, if
>I type ASSEMBLER WORDS, I get
>a whole raft of assembly words typed out. But on the Ubuntu system, I
>just get 'ok'. Sure looks
>like there is no assembler vocabulary present. How does a person
>force the 386 assembler to
>get loaded?

include arch/386/asm.fs

> The gforth user manual says quite a lot about assemblers,
>but never says how to
>get one loaded.

It's normally loaded already on systems built on the 386 architecture.
The Ubuntu packager(s) must have done something that prevented that.
It's certainly pre-loaded on the Gforth (0.6.2) on Debian Lenny on the
386 architecture.

> It indicates there are several assemblers available,
>for different processor
>types, but never says where to get them, or how to load them.

Of course, it pre-loads the assembler for the architecture you run on.
If you want to know what assemblers came with the package, try

ls /usr/share/gforth/*/arch/*/asm.fs

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2009: http://www.euroforth.org/ef09/

Charles G Montgomery

Apr 13, 2010, 7:52:22 PM
Anton Ertl an...@mips.complang.tuwien.ac.at wrote:

> RLN37 <Rl...@YAHOO.COM> writes:
>> ...


>> The gforth user manual says quite a lot about assemblers,
>>but never says how to
>>get one loaded.
>
> It's normally loaded already on systems built on the 386 architecture.
> The Ubuntu packager(s) must have done something that prevented that.
> It's certainly pre-loaded on the Gforth (0.6.2) on Debian Lenny on the
> 386 architecture.

> ...
>
> - anton

On my Lenny (Debian stable) system:

/u01/home/cgm/QPO/wavea [514]$ gforth
Gforth 0.6.2, Copyright (C) 1995-2003 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit
assembler ok
words
ok
bye
/u01/home/cgm/QPO/wavea [515]$ dpkg -s gforth
Package: gforth
Status: install ok installed
Priority: optional
Section: interpreters
Installed-Size: 2432
Maintainer: Eric Schwartz (Skif) <emsc...@debian.org>
Architecture: i386
Version: 0.6.2-7.3
Depends: libc6 (>= 2.7-1)
Conffiles:
/etc/emacs/site-start.d/50gforth.el 48d183a5587a3caa2bfef088ee08ece8
/etc/emacs/site-start.d/50gforth.el newconffile
Description: GNU Forth Language Environment ...

On the other hand, on my Squeeze (Debian testing) system the Version is
0.7.0-ds1-5, and the assembler words are indeed loaded correctly.
(The Maintainer and the Depends fields in the dpkg description are also
different for the Squeeze package.)

Just fyi.

regards cgm

Krishna Myneni

Apr 13, 2010, 9:07:23 PM


I think this problem may offer you a good opportunity to learn how to
obtain the gforth source package from CVS and build it from scratch on
your system. The pre-packaged Gforth under Ubuntu also appears to be
crippled in speed, as I recall. The only hard part in building from
source is installing all of the necessary development tools --- Ubuntu
does not install most of the necessary tools by default, and you will
have to use the Synaptic package manager to install them (or use
command line installation, via "sudo apt-get install packagename"). I
don't have a ready list of the packages necessary to build gforth from
scratch (maybe Anton or Bernd can help with this), but you will find
out the missing packages when you try to build. Obtaining the sources
from CVS is relatively straightforward. See

http://www.complang.tuwien.ac.at/forth/gforth/cvs-public/

One advantage of building from source is that, once you have built
gforth, you will be easily able to build other Forth systems from
source (bigforth, pfe, kforth, etc) for comparison. Ubuntu may not
offer up to date packages for these systems -- I know the kForth
package is hopelessly out of date.

Cheers,
Krishna

Josh Grams

Apr 14, 2010, 7:46:24 AM
Krishna Myneni wrote:
>
> I think this problem may offer you a good opportunity to learn how to
> obtain the gforth source package from CVS and build it from scratch on
> your system. The pre-packaged Gforth under Ubuntu also appears to be
> crippled in speed, as I recall. The only hard part in building from
> source is installing all of the necessary development tools --- Ubuntu
> does not install most of the necessary tools by default, and you will
> have to use the Synaptic package manager to install them (or use
> command line installation, via "sudo apt-get install packagename"). I
> don't have a ready list of the packages necessary to build gforth from
> scratch

`apt-get build-dep gforth` should theoretically install all the build
dependencies...

--Josh

Anton Ertl

Apr 14, 2010, 8:04:15 AM
Charles G Montgomery <c...@physics.utoledo.edu> writes:

>Anton Ertl an...@mips.complang.tuwien.ac.at wrote:
>> It's certainly pre-loaded on the Gforth (0.6.2) on Debian Lenny on the
>> 386 architecture.
>> ...
>>
>> - anton
>
>On my Lenny (Debian stable) system:
>
>/u01/home/cgm/QPO/wavea [514]$ gforth
>Gforth 0.6.2, Copyright (C) 1995-2003 Free Software Foundation, Inc.
>Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
>Type `bye' to exit
>assembler ok
>words

Sorry for the misinformation. The system I tried this on, where the
assembler is present, is a Debian 3.1 (Sarge?) system, not Lenny. We
have not installed or upgraded an i386 system for several years.

>On the other hand, on my Squeeze (Debian testing) system the Version is
>0.7.0-ds1-5, and the assembler words are indeed loaded correctly.

Which makes it likely that they are present in a newer version of
Ubuntu, too (maybe 9.10, very likely 10.04).

Anton Ertl

Apr 14, 2010, 8:46:27 AM
Krishna Myneni <krishna...@ccreweb.org> writes:
>I think this problem may offer you a good opportunity to learn how to
>obtain the gforth source package from CVS and build it from scratch on
>your system. The pre-packaged Gforth under Ubuntu also appears to be
>crippled in speed, as I recall. The only hard part in building from
>source is installing all of the necessary development tools --- Ubuntu
>does not install most of the necessary tools by default, and you will
>have to use the Synaptic package manager to install them (or use
>command line installation, via "sudo apt-get install packagename"). I
>don't have a ready list of the packages necessary to build gforth from
>scratch (maybe Anton or Bernd can help with this)

Well, here's the list that the Debian maintainer uses:

http://packages.debian.org/source/sid/gforth

You don't need quilt, but you need cvs for building from the Gforth
CVS.

Yesterday I built Gforth on a freshly installed (i.e., only basic
stuff present) Debian Squeeze armel system, and I installed:

gcc autoconf automake libtool cvs gforth libffi-dev gdb

You don't need cvs and gforth if you install from the release tarball.

If you are looking for maximal speed, gcc <=4.3 is better than gcc-4.4
(get the gcc-4.3 package, and then configure with "./configure
CC=gcc-4.3").

>Obtaining the sources
>from CVS is relatively straightforward. See
>
>http://www.complang.tuwien.ac.at/forth/gforth/cvs-public/

- anton

Krishna Myneni

Apr 14, 2010, 11:31:53 PM


I tried your suggestion on my 12 year-old PII system, running xubuntu
9.04, and it worked in getting a current gforth executable. Here's
what I did:
--

1) Following your suggestion above, I obtained a list of packages to
install and was prompted to install them:

--
$ sudo apt-get build-dep gforth
...
The following NEW packages will be installed:
autoconf automake autotools-dev build-essential debhelper dpkg-dev
gettext html2text intltool-debian libmail-sendmail-perl
libsys-hostname-long-perl m4 patch po-debconf
0 upgraded, 14 newly installed, 0 to remove and 0 not upgraded.
Need to get 4924kB of archives.
After this operation, 16.9MB of additional disk space will be used.
Do you want to continue [Y/n]?
--

2) Next, I installed the xubuntu gforth and cvs packages, realizing
those will be necessary to build gforth from source obtained via CVS:

$ sudo apt-get install gforth
$ sudo apt-get install cvs

3) Next, the most recent sources for gforth were obtained from its CVS
repository:

$ cvs -d :pserver:anon...@c1.complang.tuwien.ac.at:/nfs/unsafe/cvs-repository/src-master co gforth

4) Changing directory to the newly created "gforth" directory, the
BUILD-FROM-SCRATCH command was issued:

$ ./BUILD-FROM-SCRATCH

The build went a good way before conking out on the "makeinfo"
command. However, the gforth executables had been built by then.

5) I tried out the newly built gforth:
--
$ ./gforth
Gforth 0.7.0-20090215, Copyright (C) 1995-2009 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit

--

6) The assembler words were present:

assembler ok
words
makeflag YET BUT ?DO REPEAT AGAIN UNTIL WHILE DO BEGIN ELSE
AHEAD THEN IF >offset ~cond PREFETCHT2 PREFETCHT1 PREFETCHT0
PREFETCHNTA SFENCE PSWABD PFPNACC PFNACC PF2IW PI2FW PSHUFW PSADBW
...

--

Sadly, this latest version of gforth on an x86 system was unable to
run the contrib/terminal.fs program, which exercises the serial.fs,
syscalls386.fs, and associated files:
--
$ ../gforth terminal.fs
redefined ekey with EKEY redefined clear-line

Type 'term' to start a 9600 baud terminal on COM1 configured with 8N1.

Gforth 0.7.0-20090215, Copyright (C) 1995-2009 Free Software Foundation, Inc.
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
Type `bye' to exit

term

:1: Invalid memory address
>>>term<<<
Backtrace:
$8B04C783
$B7199C5C open
$B719B700 serial_open
$B719BC0C terminal

--

However, the prebuilt gforth package provided by xubuntu 9.04, which
was gforth version 0.6.2, succeeded in running the terminal.fs program
without any obvious errors.

I don't understand why the most recent gforth is unable to run the
code.

Krishna

Krishna Myneni

Apr 14, 2010, 11:35:35 PM
On Apr 14, 7:46 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

> Krishna Myneni <krishna.myn...@ccreweb.org> writes:
> >I think this problem may offer you a good opportunity to learn how to
> >obtain the gforth source package from CVS and build it from scratch on
> >your system. The pre-packaged Gforth under Ubuntu also appears to be
> >crippled in speed, as I recall. The only hard part in building from
> >source is installing all of the necessary development tools --- Ubuntu
> >does not install most of the necessary tools by default, and you will
> >have to use the Synaptic package manager to install them (or use
> >command line installation, via "sudo apt-get install packagename"). I
> >don't have a ready list of the packages necessary to build gforth from
> >scratch (maybe Anton or Bernd can help with this)
>
> Well, here's the list that the Debian maintainer uses:
>
> http://packages.debian.org/source/sid/gforth
>
> You don't need quilt, but you need cvs for building from the Gforth
> CVS.
>
> Yesterday I built Gforth on a freshly installed (i.e., only basic
> stuff present) Debian Squeeze armel system, and I installed:
>
> gcc autoconf automake libtool cvs gforth libffi-dev gdb
> ...

In connection with the problems experienced by the OP, please see my
response to Josh Grams. I was able to execute the contrib/terminal.fs
program, which relies on serial.fs and syscalls386.fs, using gforth
0.6.2. However, the same code would not run with the latest gforth
from its cvs repository, and the problem appears to be related to
syscalls386.fs. Has the x86 assembler changed significantly since
gforth 0.6.2?

Krishna

Anton Ertl

Apr 15, 2010, 9:49:42 AM
Krishna Myneni <krishna...@ccreweb.org> writes:
>However, the same code would not run with the latest gforth
>from its cvs repository, and the problem appears to be related to
>syscalls386.fs. Has the x86 assembler changed significantly since
>gforth 0.6.2?

My guess is that the register allocation is different; it can vary
from installation to installation, even with the same Gforth version,
and then assembly code will not run.

A recently introduced feature (i.e., after 0.7.0) does not have this
problem: ABI-CODE is similar to CODE, but the code is for a function
according to the ABI of the platform; sp and fp are passed as
parameters and returned in a struct. Since the ABI stays the same,
words defined using ABI-CODE should be portable within the platform.

Another option (and the one I would recommend) would be to use the C
interface (either the new libcc interface, or the old lib.fs
interface) to perform the system calls. That (especially if you use
libcc) would be portable to any platform that has these system calls.
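
As a rough C-level illustration of that portability argument (a sketch
only, not the libcc-based Gforth code; it assumes a Linux system with
glibc, and the helper names pid_via_raw_syscall and pid_via_libc are
invented for this example): the same kernel service can be reached
through the generic syscall() entry point or through the named libc
wrapper. The wrapper hides the system call number and the register
conventions, both of which differ between i386 and x86-64.

```c
#define _GNU_SOURCE        /* makes glibc declare syscall() in <unistd.h> */
#include <sys/syscall.h>   /* SYS_getpid */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* getpid(), syscall() */

/* Ask the kernel via the generic entry point. SYS_getpid expands to the
   call number of the architecture being compiled for, so even this
   "raw" form leans on the C headers for the platform details. */
long pid_via_raw_syscall(void)
{
    return syscall(SYS_getpid);
}

/* Ask through the ordinary libc wrapper; no call number is visible. */
long pid_via_libc(void)
{
    return (long)getpid();
}
```

Both helpers should return the same process ID. Hand-written assembly
for the same call would additionally have to know which registers the
surrounding Forth system keeps its stack pointers in.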

Krishna Myneni

Apr 15, 2010, 7:40:57 PM
On Apr 15, 8:49 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

> Krishna Myneni <krishna.myn...@ccreweb.org> writes:
> >However, the same code would not run with the latest gforth
> >from its cvs repository, and the problem appears to be related to
> >syscalls386.fs. Has the x86 assembler changed significantly since
> >gforth 0.6.2?
>
> My guess is that the register allocation is different; it can vary
> from installation to installation, even with the same Gforth version,
> and then assembly code will not run.
>

Do you mean that the register containing the stack pointer can be
different on different computers using the same hardware (x86, for
example) and gforth version? If so, I don't see the point in providing
an assembler with gforth.

> A recently introduced feature (i.e., after 0.7.0) does not have this
> problem: ABI-CODE is similar to CODE, but the code is for a function
> according to the ABI of the platform; sp and fp are passed as
> parameters and returned in a struct.  Since the ABI stays the same
> words defined using ABI-CODE should be portable within the platform.
>

Do you have an example of how one would load the sp and fp into regs
on an x86 system, using the new ABI-CODE interface? This seems
needlessly complicated. Why not ensure sp and fp are provided in fixed
registers for a CODE definition on a given platform?

> Another option (and the one I would recommend) would be to use the C
> interface (either the new libcc interface, or the old lib.fs
> interface) to perform the system calls.  That (especially if you use
> libcc) would be portable to any platform that has these system calls.
>

My two cents on Forth philosophy:
The syscalls386.fs interface is one of the simplest examples of using
assembly code in Forth to access low-level functions. Forth is
supposed to exemplify agility in using the appropriate level of
abstraction for a given problem, all the way from assembly code to
high-level abstraction (e.g. object-oriented/lists) approach. The
assembly language approach is the most appropriate for this problem.


Krishna

Anton Ertl

Apr 16, 2010, 8:13:43 AM
Krishna Myneni <krishna...@ccreweb.org> writes:
>On Apr 15, 8:49 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>wrote:
>> Krishna Myneni <krishna.myn...@ccreweb.org> writes:
>> >However, the same code would not run with the latest gforth
>> >from its cvs repository, and the problem appears to be related to
>> >syscalls386.fs. Has the x86 assembler changed significantly since
>> >gforth 0.6.2?
>>
>> My guess is that the register allocation is different; it can vary
>> from installation to installation, even with the same Gforth version,
>> and then assembly code will not run.
>>
>
>Do you mean that the register containing the stack pointer can be
>different on different computers using the same hardware (x86, for
>example) and gforth version?

Yes.

However, on the 386 architecture, if Gforth is built with explicit
register allocation and gcc-2.95 or later, you get sp in %esi (but the
TOS is more volatile).

> If so, I don't see the point in providing
>an assembler with gforth.

I guess the main point was to comply with Forth tradition, and to
provide something for those who need it, even though the portability
was less than ideal. And now, with ABI-CODE, there is even
portability within a platform (actually within an ABI, so for
cross-platform ABIs we can have portability even between platforms).

>> A recently introduced feature (i.e., after 0.7.0) does not have this
>> problem: ABI-CODE is similar to CODE, but the code is for a function
>> according to the ABI of the platform; sp and fp are passed as
>> parameters and returned in a struct. Since the ABI stays the same
>> words defined using ABI-CODE should be portable within the platform.
>>
>
>Do you have an example of how one would load the sp and fp into regs
>on an x86 system, using the new ABI-CODE interface?

I'll prepare one for the documentation soon (probably today).

> This seems
>needlessly complicated.

Complicated? Somewhat. Needless? Not unless we want to give up the
portability advantage that using gcc as portable assembler gives us.

> Why not ensure sp and fp are provided in fixed
>registers for a CODE definition on a given platform?

This may be possible, but I see a few difficulties:

1) It's not clear that we can get gcc to move them into fixed
registers without disrupting gcc's register allocation (and we have
to go through gcc, because only gcc knows where sp and fp are).

2) We would have to add the registers for sp and fp to all the
architectures where we want to have this feature.

3) It would be unclear which of the other registers are unused. One
probably would have to preserve them all (in contrast, with the
ABI-based interface we don't need to preserve caller-saved and
argument registers).

I think that, because of 3), the advantages of this approach are not
big enough to justify the costs.

>> Another option (and the one I would recommend) would be to use the C
>> interface (either the new libcc interface, or the old lib.fs
>> interface) to perform the system calls. That (especially if you use
>> libcc) would be portable to any platform that has these system calls.
>>
>
>My two cents on Forth philosophy:
>The syscalls386.fs interface is one of the simplest examples of using
>assembly code in Forth to access low-level functions. Forth is
>supposed to exemplify agility in using the appropriate level of
>abstraction for a given problem, all the way from assembly code to
>high-level abstraction (e.g. object-oriented/lists) approach. The
>assembly language approach is the most appropriate for this problem.

I disagree. I think the C interface approach is the most appropriate
one, because it lets you port the code to more platforms: other Gforth
installations, other CPUs, and other Unix implementations. I replaced
your syscalls386 code with a libcc-based implementation (untested),
and the result is shorter. The libcc implementation requires a lot
more machinery in the background, though. You can find it on

http://www.complang.tuwien.ac.at/viewcvs/cgi-bin/viewcvs.cgi/gforth/contrib/syscalls386.fs?rev=1.2&view=auto

OTOH, there is certainly a Forth tradition that sees complications for
the sake of portability as needless complexity. The current CODE can
be seen as being in that tradition; the assembly code written for that
does not necessarily port to another Gforth installation, but isn't
that just needless complexity?

Albert van der Horst

Apr 16, 2010, 9:39:38 AM
In article <2010Apr1...@mips.complang.tuwien.ac.at>,
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>Krishna Myneni <krishna...@ccreweb.org> writes:
>>However, the same code would not run with the latest gforth
>>from its cvs repository, and the problem appears to be related to
>>syscalls386.fs. Has the x86 assembler changed significantly since
>>gforth 0.6.2?
>
>My guess is that the register allocation is different; it can vary
>from installation to installation, even with the same Gforth version,
>and then assembly code will not run.
>
>A recently introduced feature (i.e., after 0.7.0) does not have this
>problem: ABI-CODE is similar to CODE, but the code is for a function
>according to the ABI of the platform; sp and fp are passed as
>parameters and returned in a struct. Since the ABI stays the same
>words defined using ABI-CODE should be portable within the platform.

This is of course, as far as it goes. You don't want just code,
you want to add interesting instructions that don't duplicate
what Forth can handle on its own.

Case in point (I raised it on this forum some time ago):
the real-time counter on the Pentium, RDTSC (or some such),
involves the D and A registers of the Pentium.
There is no Forth way to make an assembler usable if one cannot
find out whether or not the D and A registers are used internally
for some reason.
(Of course one can do it the C-way and save all registers that
are not used.)
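
The register question can be made concrete with a C sketch (assuming
GCC or Clang extended asm; read_tsc is a made-up name, and the non-x86
branch is only a placeholder so the file still compiles elsewhere).
RDTSC writes its result to EDX:EAX, and the output constraints are how
the programmer tells the compiler that both registers get overwritten.
A Forth CODE word needs the same kind of information about the host
system, but has no such channel if the register assignment is
undocumented.

```c
#include <stdint.h>

/* Read the x86 time-stamp counter.
   RDTSC leaves the low 32 bits in EAX and the high 32 bits in EDX;
   "=a" and "=d" declare those two registers as outputs, so the compiler
   knows not to keep anything live in them across the instruction. */
uint64_t read_tsc(void)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi;
    __asm__ volatile ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    return 0;  /* no RDTSC on this target; placeholder value only */
#endif
}
```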

Another point I hate is that I need to learn C to use Forth.
I happen to know C, but it may lead some to say "let's not bother,
use C straight away".

>
>Another option (and the one I would recommend) would be to use the C
>interface (either the new libcc interface, or the old lib.fs
>interface) to perform the system calls. That (especially if you use
>libcc) would be portable to any platform that has these system calls.

Replacing an inherently simple OS interface (push 3 registers, call
a gate) with a C interface is something Forthers (at least me)
hate.
My word OSX is a breeze to use and leads to portable file words
from Linux 1.2 to BSD. Even the binary code (!) of OSX itself
runs unaltered between 32- and 64-bit Linux systems.
What holds it back is only documentation: about how system calls
work exactly, about symbolic codes, and, in some Forths, about
register allocation.

A small trick goes a long way and maybe gForth could do an equivalent
of the following:

This is what I do in my generic source before macro processing:
"

FORTH REGISTERS
The names under FORTH are used in the generic source.

FORTH   8088    FORTH PRESERVATION RULES
-----   ----    ----- ------------ -----
dnl Note how the following is to pass through m4
{HIP}   HIP     High level Interpreter Pointer. Must be preserved
                across FORTH words.

{WOR}   WOR     Working register. When entering a word
                via its code field the DEA is passed in {WOR}.

{SPO}   SPO     Parameter stack pointer. Must be preserved
                across FORTH words.

{RPO}   RPO     Return stack pointer. Must be preserved across
                FORTH words.

"
The macro processor expands the HIP name to the register
actually used, but not when it is quoted, i.e. between curly brackets.
(The dnl is a comment to the generic source that is removed by
the macro processor.)
(There is only one source file, so it is not hard to know
where to look.)

The resulting file contains the expanded source for the Forth
user to inspect and is the source delivered in accordance
with the GPL.

I sympathize with the difficulty of documenting a large system
that comes in so many configurations. I have been experimenting
with a system of conditional documentation.

>
>- anton

Groetjes Albert

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Krishna Myneni

Apr 17, 2010, 9:29:33 AM
On Apr 16, 7:13 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)

I think this is an unfortunate state of affairs for gforth. As a Forth
user, I do not expect a CODE ... END-CODE definition to be portable to
other Forth systems running on the same platform, and certainly not to
different platforms, but I do expect it to be portable to the same
Forth running on another system with the same underlying hardware and
OS. If not, CODE and END-CODE really have no use.


> >> A recently introduced feature (i.e., after 0.7.0) does not have this
> >> problem: ABI-CODE is similar to CODE, but the code is for a function
> >> according to the ABI of the platform; sp and fp are passed as
> >> parameters and returned in a struct. Since the ABI stays the same
> >> words defined using ABI-CODE should be portable within the platform.
>
> >Do you have an example of how one would load the sp and fp into regs
> >on an x86 system, using the new ABI-CODE interface?
>
> I'll prepare one for the documentation soon (probably today).
>
> > This seems
> >needlessly complicated.
>
> Complicated? Somewhat.  Needless?  Not unless we want to give up the
> portability advantage that using gcc as portable assembler gives us.
>

I look forward to your example, although with some trepidation. The
ability to intermix Forth source and assembly language source using
CODE and END-CODE is one of the stellar and distinguishing features
(from other programming languages) we have come to expect from a
usable Forth system. To effectively do away with this ability by
adding extra complexity in implementing CODE definitions, for the sake
of simplification in making the Forth system portable, is IMO a very
bad system design decision, especially since there are alternatives.


> > Why not ensure sp and fp are provided in fixed
> >registers for a CODE definition on a given platform?
>
> This may be possible, but I see a few difficulties:
>
> 1) It's not clear that we can get gcc to move them into fixed
>    registers without disrupting gcc's register allocation (and we have
>    to go through gcc, because only gcc knows where sp and fp are).
>

Maybe not with gcc alone. You would likely have to add some platform
specific assembly code, for each desired target platform, to make this
happen. But it would be well worth it to do so.


> 2) We would have to add the registers for sp and fp to all the
>    architectures where we want to have this feature.
>

A judiciously selective approach is warranted, starting with the
widely used platforms for gforth, and then documenting what has been
done.


> 3) It would be unclear which of the other registers are unused.  One
>    probably would have to preserve them all (in contrast, with the
>    ABI-based interface we don't need to preserve caller-saved and
>    argument registers).
>

A small interface module written in the platform's native assembly
instructions, as suggested above, will allow registers to be
preserved and restored appropriately.

> I think that, because of 3), the advantages of this approach are not
> big enough to justify the costs.
>

There is a cost, but choosing portability over one of the central
features expected of a Forth system, and making that feature unusable,
is like throwing the baby out with the bath water.


> >> Another option (and the one I would recommend) would be to use the C
> >> interface (either the new libcc interface, or the old lib.fs
> >> interface) to perform the system calls. That (especially if you use
> >> libcc) would be portable to any platform that has these system calls.
>
> >My two cents on Forth philosophy:
> >The syscalls386.fs interface is one of the simplest examples of using
> >assembly code in Forth to access low-level functions. Forth is
> >supposed to exemplify agility in using the appropriate level of
> >abstraction for a given problem, all the way from assembly code to
> >high-level abstraction (e.g. object-oriented/lists) approach. The
> >assembly language approach is the most appropriate for this problem.
>
> I disagree.  I think the C interface approach is the most appropriate
> one, because it lets you port the code to more platforms: other Gforth
> installations, other CPUs, and other Unix implementations.  I replaced
> your syscalls386 code with a libcc-based implementation (untested),
> and the result is shorter.  The libcc implementation requires a lot
> more machinery in the background, though.  You can find it on
>

Better yet, make SYSCALL an intrinsic word. However, there may be a
portability issue.

> http://www.complang.tuwien.ac.at/viewcvs/cgi-bin/viewcvs.cgi/gforth/c...


>
> OTOH, there is certainly a Forth tradition that sees complications for
> the sake of portability as needless complexity.  The current CODE can
> be seen as being in that tradition; the assembly code written for that
> does not necessarily port to another Gforth installation, but isn't
> that just needless complexity?
>
> - anton

Krishna

Krishna Myneni

Apr 17, 2010, 10:05:36 AM
On Apr 16, 8:39 am, Albert van der Horst <alb...@spenarnc.xs4all.nl>
wrote:
...
> ... You don't want just code,
> you want to add interesting instructions that don't duplicate
> what Forth can handle on its own.
>

Yes, for example spin locks cannot be implemented by Forth source, or
even C source. They require assembly level instructions.


...


> Replacing an inherently simple OS interface (push 3 registers, call
> a gate) with a C interface is something Forthers (at least me)
> hate. ...

Echo.


> A small trick goes a long way and maybe gForth could do an equivalent
> of the following: ...

Relying on gcc for everything appears to be the source of the problem.
It is likely that some hardware specific assembly code must be
introduced into gforth to make CODE and END-CODE usable, and I'm not
downplaying the difficulty of doing this.


> I sympathize with the difficulty of documenting a large system
> that comes in so many configurations. ...

Yes, me too. Given the enormous task of maintaining the system for
many platforms, the reluctance to add platform-specific code into the
main application is understandable.


Krishna


Krishna Myneni

Apr 17, 2010, 12:32:29 PM
On Apr 16, 7:13 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:


> I disagree.  I think the C interface approach is the most appropriate
> one, because it lets you port the code to more platforms: other Gforth
> installations, other CPUs, and other Unix implementations.  I replaced
> your syscalls386 code with a libcc-based implementation (untested),
> and the result is shorter.  The libcc implementation requires a lot
> more machinery in the background, though.  You can find it on
>

> http://www.complang.tuwien.ac.at/viewcvs/cgi-bin/viewcvs.cgi/gforth/c...
>

Uggh.. while your new syscalls386.fs file is shorter, I can't make it
work on my system without installing and configuring a lot of extra
stuff:

==============================
$ ../gforth terminal.fs
sh: libtool: not found

in file included from *OS command line*:-1
in file included from terminal.fs:24
syscalls386.fs:43: libtool compile failed
>>>end-c-library<<<
Backtrace:
$B7111B2C throw
$B7134320 c(abort")
$B7134A80 compile-wrapper-function1
==============================

Ok. I installed libtool. Next,

==============================
../gforth terminal.fs

libltdl is not configured
in file included from *OS command line*:-1
in file included from terminal.fs:24
syscalls386.fs:43: open-lib failed
>>>end-c-library<<<
Backtrace:
$B719CB2C throw
$B71BF508 c(abort")
$B71BFA80 compile-wrapper-function1
================================

What does this mean? System calls, which are about the lowest level
interface available to applications, should not require a lot of extra
machinery.


Krishna

Anton Ertl

Apr 17, 2010, 1:53:20 PM
Krishna Myneni <krishna...@ccreweb.org> writes:
>Ok. I installed libtool. Next,
>
>==============================

> ../gforth terminal.fs
>
>libltdl is not configured
...
>What does this mean?

You have installed libtool, but either not libltdl(-dev), or you did
not reconfigure and rebuild Gforth afterwards.

Krishna Myneni

Apr 17, 2010, 3:21:02 PM
On Apr 17, 12:53 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)

wrote:
> Krishna Myneni <krishna.myn...@ccreweb.org> writes:
...

> > ../gforth terminal.fs
>
> >libltdl is not configured
> ...
> >What does this mean?
>
> You have installed libtool, but either not libltdl(-dev), or you did
> not reconfigure and rebuild Gforth afterwards.
>
> - anton

I reconfigured and rebuilt Gforth. It now executes the terminal.fs
program with no apparent problems, which indicates that your new
implementation (non-assembly version) of syscalls386.fs works. You
should probably rename it simply to syscalls.fs, since it is no longer
dependent on the 386 architecture.

Krishna

Anton Ertl

Apr 17, 2010, 5:35:25 PM
Krishna Myneni <krishna...@ccreweb.org> writes:
>I look forward to your example

The example is:

abi-code my+ ( n1 n2 -- n )
\ eax, edx, ecx are caller-saved
4 sp d) ax mov \ sp into return reg
ax ) cx mov \ tos
cx 4 ax d) add \ sec = sec+tos
4 # ax add \ update sp (pop)
ret \ return from my+
end-code

Note that I changed ABI-CODE this evening to make programming with it
nicer, so you will have to "cvs update && make" before trying it.

Below you find the whole rewritten section on assembler definitions.

- anton

5.26.1 Definitions in assembly language
---------------------------------------

Gforth provides ways to implement words in assembly language (using
`abi-code'...`end-code'), and also ways to define defining words with
arbitrary run-time behaviour (like `does>'), where (unlike `does>') the
behaviour is not defined in Forth, but in assembly language (with
`;code').

However, the machine-independent nature of Gforth poses a few
problems: First of all, Gforth runs on several architectures, so it can
provide no standard assembler. It does provide assemblers for several
of the architectures it runs on, though. Moreover, you can use a
system-independent assembler in Gforth, or compile machine code
directly with `,' and `c,'.

Another problem is that the virtual machine registers of Gforth (the
stack pointers and the virtual machine instruction pointer) depend on
the installation and engine. Which registers are free to use also
depends on the installation and engine. So any code written to run
in the context of the Gforth virtual machine is essentially limited to
the installation and engine it was developed for (it may run elsewhere,
but you cannot rely on that).

Fortunately, you can define `abi-code' words in Gforth that are
portable to any Gforth running on the same ABI (typically the same
architecture/OS combination, sometimes crossing OS boundaries).

`assembler' - tools-ext "assembler"
A vocabulary: Replaces the wordlist at the top of the search order
with the assembler wordlist.

`init-asm' - gforth "init-asm"
Pushes the assembler wordlist on the search order.

`abi-code' "name" - colon-sys gforth "abi-code"
Start a native code definition that is called using the platform's
ABI conventions corresponding to the C-prototype:
Cell *function(Cell *sp, Float **fpp);
The FP stack pointer is passed in by providing a reference to a
memory location containing the FP stack pointer and is passed out by
storing the changed FP stack pointer there (if necessary).

`end-code' colon-sys - gforth "end-code"
End a code definition. Note that you have to assemble the return
from the ABI call (for `abi-code') or the dispatch to the next VM
instruction (for `code' and `;code') yourself.

`code' "name" - colon-sys tools-ext "code"
Start a native code definition that runs in the context of the
Gforth virtual machine (engine). Such a definition is not portable
between Gforth installations, so we recommend using `abi-code' instead
of `code'. You have to end a `code' definition with a dispatch to the
next virtual machine instruction.

`;code' compilation. colon-sys1 - colon-sys2 tools-ext "semicolon-code"
The code after `;code' becomes the behaviour of the last defined
word (which must be a `create'd word). The same caveats apply as for
`code', but Gforth does not have a `;abi-code' yet. As a workaround,
you can use `does> foo ;' instead, where `foo' is defined with
`abi-code'.

`flush-icache' c-addr u - gforth "flush-icache"
Make sure that the instruction cache of the processor (if there is
one) does not contain stale data at c-addr and u bytes afterwards.
`END-CODE' performs a `flush-icache' automatically. Caveat:
`flush-icache' might not work on your installation; this is usually the
case if direct threading is not supported on your machine (take a look
at your `machine.h') and your machine has a separate instruction cache.
In such cases, `flush-icache' does nothing instead of flushing the
instruction cache.

If `flush-icache' does not work correctly, `abi-code' words etc.
will not work (reliably), either.

The typical usage of these words can be shown most easily by analogy
to the equivalent high-level defining words:

    : foo                              abi-code foo
      <high-level Forth words>             <assembler>
    ;                                  end-code

    : bar                              : bar
      <high-level Forth words>           <high-level Forth words>
      CREATE                             CREATE
         <high-level Forth words>           <high-level Forth words>
      DOES>                              ;code
         <high-level Forth words>           <assembler>
    ;                                  end-code

For using `abi-code', take a look at the ABI documentation of your
platform to see how the parameters are passed (so you know where you
get the stack pointers) and how the return value is passed (so you know
where the data stack pointer is returned). The ABI documentation also
tells you which registers are saved by the caller (caller-saved), so
you are free to destroy them in your code, and which registers have to
be preserved by the called word (callee-saved), so you have to save
them before using them, and restore them afterwards. More
reverse-engineering oriented people can also find out about the passing
and returning of the stack pointers through `see abi-call'.

Most ABIs pass the parameters through registers, but some (in
particular the 386 (aka IA-32) architecture) pass them on the
architectural stack. The usual ABIs all pass the return value in a
register.

One other thing you need to know for using `abi-code' is that both
the data and the FP stack grow downwards (towards lower addresses) in
Gforth.

Here's an example of using `abi-code' on the 386 architecture:

abi-code my+ ( n1 n2 -- n )
\ eax, edx, ecx are caller-saved
4 sp d) ax mov \ sp into return reg
ax ) cx mov \ tos
cx 4 ax d) add \ sec = sec+tos
4 # ax add \ update sp (pop)
ret \ return from my+
end-code
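
For readers more at home in C than in the 386 assembler syntax, the
calling convention described above can be modelled roughly as follows.
This is only a sketch under assumptions: `Cell' and `Float' are
stand-in typedefs (Gforth has its own definitions), and `my_plus' is an
illustrative name, not something Gforth provides. It mirrors what the
my+ example does: read tos, add it to the second item, and return the
popped stack pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed stand-ins for Gforth's Cell and Float types. */
typedef long Cell;
typedef double Float;

/* Same shape as the prototype quoted in the manual text:
 *     Cell *function(Cell *sp, Float **fpp);
 * The data stack grows towards lower addresses, so sp[0] is the top of
 * stack (n2) and sp[1] is the second item (n1). */
Cell *my_plus(Cell *sp, Float **fpp)
{
    assert(sp != NULL);
    (void)fpp;          /* this word does not touch the FP stack */
    sp[1] += sp[0];     /* sec = sec + tos */
    return sp + 1;      /* pop one cell; the new sp is the return value */
}
```

The returned pointer plays the role of the value left in ax by the
assembly version.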

Anton Ertl

Apr 18, 2010, 12:45:07 PM
Krishna Myneni <krishna...@ccreweb.org> writes:
>I reconfigured and rebuilt Gforth. It now executes the terminal.fs
>program with no apparent problems, which indicates that your new
>implementation (non-assembly version) of syscalls386.fs works.

Thanks for the feedback.

>You
>should probably rename it simply to syscalls.fs, since it is no longer
>dependent on the 386 architecture.

The medium-term plan is to have files that give you all the POSIX
functions; these would supersede this file.

Krishna Myneni

Apr 18, 2010, 1:03:29 PM
On Apr 18, 11:45 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

> Krishna Myneni <krishna.myn...@ccreweb.org> writes:
> >I reconfigured and rebuilt Gforth. It now executes the terminal.fs
> >program with no apparent problems, which indicates that your new
> >implementation (non-assembly version) of syscalls386.fs works.
>
> Thanks for the feedback.
>

The terminal.fs program also runs on a x86_64 linux system, now. I
haven't checked its operation yet, but it does start up without error.
Your point about the revised syscalls file, using the library
interface, being more general is certainly valid. There are revised
versions of serial.fs and terminal.fs, which use the new syscalls.fs
at

ftp://ccreweb.org/software/gforth/

> >You
> >should probably rename it simply to syscalls.fs, since it is no longer
> >dependent on the 386 architecture.
>
> The medium-term plan is to have files that give you all the POSIX
> functions; these would supersede this file.
>

Sounds good. Maybe syscalls.fs can serve as an interim solution.


Krishna

Anton Ertl

Apr 18, 2010, 12:47:40 PM
Krishna Myneni <krishna...@ccreweb.org> writes:
>On Apr 16, 7:13 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>wrote:
>> Krishna Myneni <krishna.myn...@ccreweb.org> writes:
[...]
>> I guess the main point was to comply with Forth tradition; and to
>> provide something for those who need it, even though the portability
>> was less than ideal.  And now, with ABI-CALL, there is even
>> portability within a platform (actually ABI, so for cross-platform
>> ABIs we can have portability even between platforms).
>>
>
>I think this is an unfortunate state of events for gforth. My
>expectation, as a Forth user, is that if I write a CODE ... END-CODE
>definition, I do not expect it to be portable to other Forth systems
>running on the same platform, and certainly not to different
>platforms, but that it should be portable to the same Forth running on
>an another system within the same underlying hardware and OS.

If you want that, use ABI-CODE...END-CODE, now that we have them.

>If not,
>CODE and END-CODE really have no use.

There may be some specialized uses, but I have deemphasized them in
the documentation.

>The
>ability to intermix Forth source and assembly language source using
>CODE and END-CODE is one of the stellar and distinguishing features
>(from other programming languages) we have come to expect from a
>usable Forth system. To effectively do away with this ability by
>adding extra complexity in implementing CODE definitions, for the sake
>of simplification in making the Forth system portable, is IMO a very
>bad system design decision, especially since there are alternatives.

If you can accept neither the lack of portability of CODE nor the
overhead of ABI-CODE, then I guess you should go with the
alternatives. It's unclear to me what they are, though. I don't know
any highly portable Forth system that offers a portable CODE.

>> > Why not ensure sp and fp are provided in fixed
>> >registers for a CODE definition on a given platform?
>>
>> This may be possible, but I see a few difficulties:
>>
>> 1) It's not clear that we can get gcc to move them into fixed
>>    registers without disrupting gcc's register allocation (and we have
>>    to go through gcc, because only gcc knows where sp and fp are).
>>
>
>Maybe not with gcc alone. You would likely have to add some platform
>specific assembly code, for each desired target platform, to make this
>happen. But it would be well worth it to do so.

I don't see how to do it with assembly code alone. Assembly code does
not know where gcc put the stack pointers. That's the problem we
started with.

I have been thinking along the lines of using either explicit register
allocation for the interface (and assigning from Gforth's sp and fp to
the explicitly allocated code_sp and code_fp), or using gcc's
extended asm feature to get access to the sp and fp variables in the
assembly code that copies them into the fixed registers. However, it
is possible that gcc either cannot allocate registers with the
resulting restrictions (resulting in gcc not building Gforth), or
produces a bad register allocation that slows down ordinary Forth code
significantly.

>> 3) It would be unclear which of the other registers are unused.  One
>>    probably would have to preserve them all (in contrast, with the
>>    ABI-based interface we don't need to preserve caller-saved and
>>    argument registers).
>>
>
>A small interface module written in the platform's native assembly
>instructions, as suggested above, will allow registers to be
>preserved and restored appropriately.

Sounds like ABI-CODE.

>Better yet, making SYSCALL an intrinsic word. However, there may be a
>portability issue.

A primitive? The problem is (as it has been with C calls): How does
SYSCALL know how many parameters and return values there are?

Interestingly, even though Unix was developed in C and was designed to
work with programs written in C, C does not have a system call
intrinsic, either. C programs call system calls through the C
library, which contains wrappers around the system calls that are
compatible with the C calling conventions (somewhat parallel to what
we do with libcc).
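
To make the point about wrappers concrete, here is a small sketch
assuming Linux with glibc. Note that `syscall()' is itself a C library
function rather than a language intrinsic, which is exactly the
situation described above. The two function names are illustrative
only.

```c
#define _GNU_SOURCE        /* for the syscall() declaration on glibc */
#include <assert.h>
#include <sys/syscall.h>   /* SYS_getpid */
#include <sys/types.h>
#include <unistd.h>        /* getpid(), syscall() */

/* Ask the kernel for the process ID through the generic syscall() entry,
 * passing the system call number explicitly. */
long pid_via_raw_syscall(void)
{
    long pid = syscall(SYS_getpid);
    assert(pid > 0);       /* getpid has no failure case */
    return pid;
}

/* Ask for the same thing through the ordinary C library wrapper. */
long pid_via_wrapper(void)
{
    return (long)getpid();
}
```

Both routes end in the same kernel entry; the wrapper just hides the
system call number and the argument passing.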

Krishna Myneni

Apr 18, 2010, 5:38:04 PM
On Apr 18, 11:47 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)

wrote:
> Krishna Myneni <krishna.myn...@ccreweb.org> writes:
> >On Apr 16, 7:13 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
> >wrote:
> >> Krishna Myneni <krishna.myn...@ccreweb.org> writes:
> [...]
> >> I guess the main point was to comply with Forth tradition; and to
> >> provide something for those who need it, even though the portability
> >> was less than ideal.  And now, with ABI-CALL, there is even
> >> portability within a platform (actually ABI, so for cross-platform
> >> ABIs we can have portability even between platforms).
>
> >I think this is an unfortunate state of events for gforth. My
> >expectation, as a Forth user, is that if I write a CODE ... END-CODE
> >definition, I do not expect it to be portable to other Forth systems
> >running on the same platform, and certainly not to different
> >platforms, but that it should be portable to the same Forth running on
> >an another system within the same underlying hardware and OS.
>
> If you want that, use ABI-CODE...END-CODE, now that we have them.
>

I haven't had a chance yet to do anything but take a quick look at
your ABI-CODE example. It seems to me, in essential features, like
CODE, with the exception of not loading sp into a register. How is an
update of the fp pointer handled?


> >If not,
> >CODE and END-CODE really have no use.
>
> There may be some specialized uses, but I have deemphasized them in
> the documentation.
>

Maybe it would be better to rename ABI-CODE to simply CODE.


> >The
> >ability to intermix Forth source and assembly language source using
> >CODE and END-CODE is one of the stellar and distinguishing features
> >(from other programming languages) we have come to expect from a
> >usable Forth system. To effectively do away with this ability by
> >adding extra complexity in implementing CODE definitions, for the sake
> >of simplification in making the Forth system portable, is IMO a very
> >bad system design decision, especially since there are alternatives.
>
> If you can accept neither the lack of portability of CODE nor the
> overhead of ABI-CODE, then I guess you should go with the
> alternatives.  It's unclear to me what they are, though.  I don't know
> any highly portable Forth system that offers a portable CODE.
>

I don't know of any highly portable Forth systems with a portable
CODE, either. But the assemblers are not really portable either. If
you want portability of CODE, it will probably have to be defined in
the assembler itself, and not be a primitive.

> >> > Why not ensure sp and fp are provided in fixed
> >> >registers for a CODE definition on a given platform?
>
> >> This may be possible, but I see a few difficulties:
>
> >> 1) It's not clear that we can get gcc to move them into fixed
> >>    registers without disrupting gcc's register allocation (and we have
> >>    to go through gcc, because only gcc knows where sp and fp are).
>
> >Maybe not with gcc alone. You would likely have to add some platform
> >specific assembly code, for each desired target platform, to make this
> >happen. But it would be well worth it to do so.
>
> I don't see how to do it with assembly code alone.  Assembly code does
> not know where gcc put the stack pointers.  That's the problem we
> started with.
>

I haven't examined the gforth code lately. How do you refer to the sp
and fp pointers from the C code? Why can't assembly code access those
same pointers and load them into registers? Does it have to do with sp
and fp being register variables?

> I have been thinking along the lines of using either explicit register
> allocation for the interface (and assigning from Gforth's sp and fp to
> the explicitly allocated code_sp and code_fp), or using gcc's
> extended asm feature to get access to the sp and fp variables in the
> assembly code that copies them into the fixed registers.  However, it
> is possible that gcc either cannot allocate registers with the
> resulting restrictions (resulting in gcc not building Gforth), or
> produces a bad register allocation that slows down ordinary Forth code
> significantly.
>


Are we going to let gcc push us around? :)

Seriously, for a usable assembly interface, at some point you will
have to wrest control away from gcc, and then hand it back.


> >> 3) It would be unclear which of the other registers are unused.  One
> >>    probably would have to preserve them all (in contrast, with the
> >>    ABI-based interface we don't need to preserve caller-saved and
> >>    argument registers).
>
> >A small interface module written in the platform's native assembly
> >instructions, as suggested above, will allow registers to be
> >preserved and restored appropriately.
>
> Sounds like ABI-CODE.
>

Maybe.

> >Better yet, making SYSCALL an intrinsic word. However, there may be a
> >portability issue.
>
> A primitive?  The problem is (as has been with C calls): How does
> SYSCALL know how many parameters and return values there are.
>


Below is the implementation of SYSCALL from kForth (vmc.c). It handles
a varying number of parameters, up to 3 for now, but it can be
extended. There is only one return value. All other returns are
through address arguments.

int C_syscall ()
{
    /* stack: ( arg1 ... arg_n nargs nsyscall -- err | 0 <= n <= 3) */

    int nargs, nsyscall, i, args[3];

    DROP
    nsyscall = TOS;
    DROP
    nargs = TOS;
    if (nargs > 3) nargs = 3;  // this should be an error
    for (i = 0; i < nargs; i++)
    {
        DROP
        args[i] = TOS;
    }

    switch (nargs)
    {
      case 0:
        TOS = syscall(nsyscall);
        break;
      case 1:
        TOS = syscall(nsyscall, args[0]);
        break;
      case 2:
        TOS = syscall(nsyscall, args[1], args[0]);
        break;
      case 3:
        TOS = syscall(nsyscall, args[2], args[1], args[0]);
        break;
      default:
        ;  // Illegal number of args
    }
    DEC_DSP
    STD_IVAL

    return 0;
}


And here's how it's used in syscalls.4th:

: syscall0 0 swap syscall ;
: syscall1 1 swap syscall ;
: syscall2 2 swap syscall ;
: syscall3 3 swap syscall ;

> Interestingly, even though Unix was developed in C and was designed to
> work with programs written in C, C does not have a system call
> intrinsic, either.  C programs call system calls through the C
> library, which contains wrappers around the system calls that are
> compatible with the C calling conventions (somewhat parallel to what
> we do with libcc).
>

The mmap system call appears to have 6 args, so the function above
would need to be extended to up to 6 args. I haven't done an
exhaustive study of syscalls to see what is the largest number of
args.
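
One way the extension to 6 args might look, written as a standalone C
sketch rather than against kForth's stack macros. Everything here is
an assumption for illustration: `syscall_n' and MAX_SYSCALL_ARGS are
made-up names, args[0] is taken to be the first argument (unlike the
reversed order that popping from the Forth stack gives), and it relies
on Linux ignoring arguments that a given system call does not take.
Unlike the function above, it reports a bad count as an error instead
of clamping it.

```c
#define _GNU_SOURCE          /* for the syscall() declaration on glibc */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#define MAX_SYSCALL_ARGS 6   /* assumed upper bound, per the mmap remark */

/* Issue system call number nsyscall with nargs arguments taken from
 * args[0..nargs-1].  Returns the result of syscall(), or -1 with errno
 * set to EINVAL if nargs is out of range. */
long syscall_n(long nsyscall, int nargs, const long *args)
{
    long a[MAX_SYSCALL_ARGS] = {0, 0, 0, 0, 0, 0};
    int i;

    if (nargs < 0 || nargs > MAX_SYSCALL_ARGS) {
        errno = EINVAL;      /* an error rather than silent clamping */
        return -1;
    }
    assert(nargs == 0 || args != NULL);
    for (i = 0; i < nargs; i++)
        a[i] = args[i];

    /* Unused slots are passed as zero dummies, which avoids the switch
     * on the argument count. */
    return syscall(nsyscall, a[0], a[1], a[2], a[3], a[4], a[5]);
}
```

With something like this, the Forth-level syscall0 ... syscall3 words
could stay as they are and syscall4 to syscall6 would follow the same
pattern.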


Krishna

Krishna Myneni

Apr 18, 2010, 9:27:16 PM
On Apr 18, 11:47 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

> >> > Why not ensure sp and fp are provided in fixed
> >> >registers for a CODE definition on a given platform?
>
> I don't see how to do it with assembly code alone.  Assembly code does
> not know where gcc put the stack pointers.  That's the problem we
> started with.
>
> I have been thinking along the lines of using either explicit register
> allocation for the interface (and assigning from Gforth's sp and fp to
> the explicitly allocated code_sp and code_fp), or using gcc's
> extended asm feature to get access to the sp and fp variables in the
> assembly code that copies them into the fixed registers.  However, it
> is possible that gcc either cannot allocate registers with the
> resulting restrictions (resulting in gcc not building Gforth), or
> produces a bad register allocation that slows down ordinary Forth code
> significantly.

Would the following work?

Looking at Gforth's engine/main.c, I see that you have global
variables which contain the stack pointers:

Cell *gforth_SP;
Float *gforth_FP;
Cell *gforth_RP;
Address gforth_LP;

So, when a CODE word is defined, CODE could insert header code which
calls C code that copies the current register based stack pointers,
apparently stored in sp, fp, rp, and lp, into the global pointers
above. Then, architecture-specific instructions can copy from the
global variables into the machine-specific registers, e.g. on x86,

movl gforth_SP, %ebx

etc.

There's some performance penalty for copying the registers to memory
and reloading them from memory, of course, but that may be the price
you have to pay for a portable design.
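
A toy model of the idea in plain C, to show the data flow only. The
names are hypothetical: `spilled_sp' stands in for a global like
gforth_SP, `code_word_dup' for a CODE word that can only see the
global, and `run_code_word' for the engine, whose local `sp' would
normally live in whatever register gcc picked.

```c
#include <assert.h>
#include <stddef.h>

typedef long Cell;          /* assumed stand-in for Gforth's Cell */

static Cell *spilled_sp;    /* hypothetical global, in the spirit of gforth_SP */

/* Stands in for a CODE word: it has no access to the engine's local sp,
 * only to the global copy. */
static void code_word_dup(void)
{
    Cell *sp = spilled_sp;  /* the "movl gforth_SP, %ebx" step */
    assert(sp != NULL);
    sp[-1] = sp[0];         /* DUP: copy tos one cell lower */
    spilled_sp = sp - 1;    /* write the updated pointer back */
}

/* Stands in for the engine: spill the local sp, run the word, reload. */
Cell *run_code_word(Cell *sp)
{
    spilled_sp = sp;        /* spill register copy to memory */
    code_word_dup();
    return spilled_sp;      /* reload after the word returns */
}
```

A single global like this is not reentrant, so a real implementation
would need more care than the sketch shows.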


Krishna

Albert van der Horst

Apr 19, 2010, 7:42:36 AM
<SNIP>

>
>I have been thinking along the lines of using either explicit register
>allocation for the interface (and assigning from Gforth's sp and fp to
>the explicitly allocated code_sp and code_fp), or using gcc's
>extended asm feature to get access to the sp and fp variables in the
>assembly code that copies them into the fixed registers. However, it
>is possible that gcc either cannot allocate registers with the
>resulting restrictions (resulting in gcc not building Gforth), or
>produces a bad register allocation that slows down ordinary Forth code
>significantly.

Can you accommodate people like me?
I *accept* beforehand that assembler code is non-portable.
I *accept* beforehand that assembler code is non-portable even to the next
release of Gforth.

I just want to know what the hell the register allocation is,
but I want it endorsed, i.e. a somewhat official statement what
they are. Or a somewhat official statement about how to find them
from the sources.

<SNIP>

>
>Interestingly, even though Unix was developed in C and was designed to
>work with programs written in C, C does not have a system call
>intrinsic, either. C programs call system calls through the C
>library, which contains wrappers around the system calls that are
>compatible with the C calling conventions (somewhat parallel to what
>we do with libcc).

Wasn't the old idea of C to provide a simple language and do the
rest with library calls? Methinks the system call mechanism
is fully in line with that.
On my Linux system the C interface is very similar to the lina interface:
there are functions _syscall0 _syscall1 ... _syscall6
where the number indicates the number of parameters.
All functions of manual chapter 2 (e.g. man 2 unlink) are
implemented using that mechanism. The number of the unlink
call to the dispatchers in the kernel is 10, and it is passed
as one of those parameters.
[Those _syscallx are "static inline" and defined by inline assembly,
but all that is neither here nor there. They are C functions.]

lina has something similar to _syscallx:
OSX (3 parameters) and OSX5. It relies on dummy parameters,
as in fact does the assembly call.
Adding OSX as a wrapper for _syscall3 would go a long way towards
making system calls available in gforth. (Those with more parameters
are not needed often, and those with fewer can make do with dummies.)

Groetjes Albert

Anton Ertl

Apr 19, 2010, 9:45:01 AM
Krishna Myneni <krishna...@ccreweb.org> writes:
>On Apr 18, 11:47 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>wrote:
>> Krishna Myneni <krishna.myn...@ccreweb.org> writes:
>> >On Apr 16, 7:13 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>> >wrote:
>> >> Krishna Myneni <krishna.myn...@ccreweb.org> writes:
>> [...]

>I haven't had a chance yet to do anything but take a quick look at
>your ABI-CODE example. It seems to me, in essential features, like
>CODE, with the exception of not loading sp into a register.

That parameters (including sp) are passed in in memory is a property
of the Intel ABI for the 386. Most other ABIs pass the parameters in
registers. E.g., here's the MY+ example for the AMD64:

abi-code my+ ( n1 n2 -- n3 )
\ SP passed in di, returned in ax, address of FP passed in si
8 di d) ax lea \ compute new sp in result reg
di ) dx mov \ get old tos
dx ax ) add \ add to new tos
ret
end-code

>How is an
>update of the fp pointer handled?

We don't pass fp, but the address of a memory cell containing fp; if
you change fp, you have to store it back there. This is more
complicated than we would like, but most ABIs don't support returning
more than one cell in registers.

Example using fp on 386:

abi-code my-f+ ( r1 r2 -- r )
8 sp d) cx mov \ load address of fp
cx ) dx mov \ load fp
.fl dx ) fld \ r2
8 # dx add \ update fp
.fl dx ) fadd \ r1+r2
.fl dx ) fstp \ store r
dx cx ) mov \ store new fp
4 sp d) ax mov \ sp into return reg
ret \ return from my-f+
end-code
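
The same FP-stack protocol, modelled in C for comparison. This is a
sketch with assumed names (`Cell', `Float' and `my_fplus' are
stand-ins, not Gforth's definitions); the point is only the
load-through-pointer and store-back steps that the assembly above
performs with cx and dx.

```c
#include <assert.h>
#include <stddef.h>

typedef long Cell;      /* assumed stand-in */
typedef double Float;   /* assumed stand-in */

/* Same shape as Cell *function(Cell *sp, Float **fpp): the FP stack
 * pointer arrives indirectly and must be stored back if it changes. */
Cell *my_fplus(Cell *sp, Float **fpp)
{
    Float *fp;
    Float r2;

    assert(fpp != NULL && *fpp != NULL);
    fp = *fpp;          /* load fp through the pointer */
    r2 = fp[0];         /* r2 is on top of the FP stack */
    fp += 1;            /* pop one float (the stack grows downwards) */
    fp[0] += r2;        /* r = r1 + r2, left as the new top */
    *fpp = fp;          /* store the changed fp back */
    return sp;          /* data stack pointer is unchanged */
}
```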

>Maybe it would be better to rename ABI-CODE to simply CODE.

That would break existing uses of CODE.

One interesting aspect about ABI-CODE is that it could be portable
across Forth systems on the same platform; of course, other Forth
systems also won't change their existing CODE, because they don't like
to break existing uses of CODE, either. One other thing that would be
needed for that would be a common assembler syntax, but an application
could achieve that by supplying the assembler words itself, or by
defining the ABI-CODE words with binary code ($4c c, $12345 , ...).

>I don't know of any highly portable Forth systems with a portable
>CODE, either. But the assemblers are not really portable either. If
>you want portability of CODE, it will probably have to be defined in
>the assembler itself, and not be a primitive.

I don't know what you mean here.

>> I don't see how to do it with assembly code alone.  Assembly code does
>> not know where gcc put the stack pointers.  That's the problem we
>> started with.
>>
>
>I haven't examined the gforth code lately. How do you refer to the sp
>and fp pointers from the C code?

I write "sp" and "fp".

> Why can't assembly code access those
>same pointers and load them into registers? Does it have to do with sp
>and fp being register variables?

Yes, if everything goes well, gcc keeps them in registers, and there
is no automatic way to find out in which ones.

>Are we going to let gcc push us around? :)
>
>Seriously, for a usable assembly interface, at some point you will
>have to wrest control away from gcc, and then hand it back.

Depending on the level you are talking about:

* If you talk about control flow at run-time: That's what CODE and
ABI-CODE do, in different ways.

* If you talk about forcing a particular register allocation on gcc:
We try that, but there is no guarantee that gcc then still compiles
the code and compiles it correctly. And if it does not, we compile
without forcing a register allocation, because, as essential as CODE
may seem to you, most people consider Gforth without portable CODE
words preferable to no Gforth.

>> A primitive?  The problem is (as has been with C calls): How does
>> SYSCALL know how many parameters and return values there are.
>>
>
>
>Below is the implementation of SYSCALL from kForth (vmc.c). It handles
>different number of parameters, up to 3 for now, but it can be
>extended. There is only one return value. All other returns are
>through address arguments.
>
>int C_syscall ()
>{

> /* stack: ( arg1 ... arg_n nargs nsyscall -- err | 0 <= n <= 3) */

So you pass the number of arguments in a count. Hmm, that may be
possible with system calls (which don't accept FP arguments), even
though it's not general enough for general C calls. Still, some
parameters or return values may be "long longs", which complicates
matters quite a lot.

One interesting system call that you have (and that I always use as
example for explaining the automatic type conversion of libcc) is
lseek(), because a 32-bit off_t is often not enough.

So, a system-call savvy programmer (few are) might know that he needs
to call _llseek in 32-bit Linux (don't know how it's solved in other
OSs); and he would need to pass a buffer for the result, and read from
that. And he would incur several portability problems: The system
call is Linux-specific; and fetching the result from the buffer is
cell-size-dependent. And he would have to make sure to call open with
the option O_LARGEFILE, otherwise it won't even open files that are
too big for 32-bit off_t.

Alternatively, he could just use libcc, define

\c #define _FILE_OFFSET_BITS 64
\c #include <sys/types.h>
\c #include <unistd.h>
c-function lseek n d n -- d

and the result will be portable between 32-bit and 64-bit platforms
and between OSs (even to OSs that don't support 64-bit off_t, but of
course there the program will not work on large files).
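
For comparison, the C side of that declaration can be sketched as follows. The helper name seek_past_4g and the use of a temporary file are assumptions made for illustration; the point is only that, with _FILE_OFFSET_BITS set to 64 before the headers, off_t and lseek() handle offsets beyond 4 GiB without the caller touching _llseek or O_LARGEFILE.

```c
#define _FILE_OFFSET_BITS 64   /* must come before any system header */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* With the define above, off_t is expected to be 64 bits wide even on
   32-bit Linux. */
_Static_assert(sizeof(off_t) == 8, "off_t should be 64 bits wide");

/* Hypothetical helper: seek to an offset beyond 4 GiB in a temporary file
   and return the resulting position, or -1 on error. */
long long seek_past_4g(void)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    off_t target = (off_t)5 << 30;   /* 5 GiB, does not fit in 32 bits */
    off_t pos = lseek(fileno(f), target, SEEK_SET);
    fclose(f);                       /* also removes the temporary file */
    return (long long)pos;
}
```

Seeking past the end of a file does not allocate any space, so the sketch does not actually write 5 GiB.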

My point is: If we want to interface to an OS that sees the C library
as the primary way of accessing system calls and the basic system call
interface only as something to be used by legacy binaries (which some
of them find ok to break after 2 years
<http://lwn.net/Articles/374367/>), it's probably smarter to go
through the C library if we can (and with libcc, we can).

Anton Ertl

Apr 19, 2010, 11:14:50 AM
Krishna Myneni <krishna...@ccreweb.org> writes:
>Would the following work?
>
>Looking at Gforth's engine/main.c, I see that you have global
>variables which contain the stack pointers:
>
>Cell *gforth_SP;
>Float *gforth_FP;
>Cell *gforth_RP;
>Address gforth_LP;
>
>So, when a CODE word is defined, CODE could insert header code which
>calls C code that copies the current register based stack pointers,
>apparently stored in sp, fp, rp, and lp, into the global pointers
>above. Then, architecture-specific instructions can copy from the
>global variables into the machine-specific registers, e.g. on x86,
>
> movl gforth_SP, %ebx
>
>etc.
>
>There's some performance penalty for copying the registers to memory
>and reloading them from memory, of course, but that may be the price
>you have to pay for a portable design.

Yes, something along such lines could work (there are some
complications, e.g., getting the Forth system to know the addresses of
the memory locations, and how to deal with reentrancy, if we want
that, but they can be solved). Then there is still the problem of
which registers are free to use and which have to be saved.

But, overall, it does not appear that this approach offers any
advantage over ABI-CODE; it's a relatively similar idea with similar
(probably slightly worse) overhead, and it requires much more
implementation effort.
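
The spill-and-reload idea can be sketched in plain C. This is only a model under stated assumptions: gforth_SP is the global quoted above, while code_plus and run_plus are hypothetical stand-ins for a CODE word and for the engine, and the stack is assumed to grow downward with the top of stack in memory.

```c
#include <stddef.h>

typedef long Cell;

/* Global copy of the data stack pointer, named after the variable quoted
   from Gforth's engine/main.c; everything else here is hypothetical. */
Cell *gforth_SP;

/* Stand-in for a CODE word: it only knows the global, not the engine's
   register variable.  Implements + ( n1 n2 -- n3 ). */
static void code_plus(void)
{
    Cell *sp = gforth_SP;      /* prefix: load stack pointer from memory */
    sp[1] = sp[1] + sp[0];     /* add top of stack into second item */
    sp++;                      /* drop one cell */
    gforth_SP = sp;            /* postfix: store updated pointer back */
}

/* Engine side: sp is a local (ideally kept in a register by the compiler);
   it is spilled before the call and reloaded after it. */
Cell run_plus(Cell a, Cell b)
{
    Cell stack[8];
    Cell *sp = stack + 8;      /* empty stack */
    *--sp = a;
    *--sp = b;
    gforth_SP = sp;            /* spill */
    code_plus();
    sp = gforth_SP;            /* reload */
    return sp[0];              /* new top of stack */
}
```

The two stores and two loads around the call are the overhead mentioned above; the sketch says nothing about which machine registers a real CODE word may clobber.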

Anton Ertl

Apr 19, 2010, 11:23:14 AM
Albert van der Horst <alb...@spenarnc.xs4all.nl> writes:
>In article <2010Apr1...@mips.complang.tuwien.ac.at>,
>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
><SNIP>
>>
>>I have been thinking along the lines of using either explicit register
>>allocation for the interface (and assigning from Gforth's sp and fp to
>>the explicitly allocated code_sp and code_fp), or using gcc's
>>extended asm feature to get access to the sp and fp variables in the
>>assembly code that copies them into the fixed registers. However, it
>>is possible that gcc either cannot allocate registers with the
>>resulting restrictions (resulting in gcc not building Gforth), or
>>produces a bad register allocation that slows down ordinary Forth code
>>significantly.
>
>Can you accommodate people like me?
>I *accept* beforehand that assembler code is non-portable.
>I *accept* beforehand that assembler code is non-portable even to the next
>release of Gforth.

What about the next installation of Gforth?

>I just want to know what the hell the register allocation is,
>but I want it endorsed, i.e. a somewhat official statement what
>they are. Or a somewhat official statement about how to find them
>from the sources.

Well, the released documentation (official enough?) discusses the
issue at length:

http://www.complang.tuwien.ac.at/forth/gforth/Docs-html-history/0.7.0/Code-and-_003bcode.html

and a little bit more in:

http://www.complang.tuwien.ac.at/forth/gforth/Docs-html-history/0.7.0/Common-Assembler.html

The most reliable approach IMO is to use reverse engineering using
SEE. Use:

* SEE NOOP to find out IP and how to dispatch
* SEE DROP and SEE NIP to find out SP and whether TOS is in memory
* SEE FDROP to find out FP and whether FTOS is in memory.
* SEE RDROP to find out RP.

You can then check words like @ + F+ ;S I to confirm your results.

If you see any primitive writing some temporary value to a register,
that register is obviously free to be used by primitives.

Krishna Myneni

Apr 20, 2010, 8:05:01 AM
On Apr 19, 10:14 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:

> Krishna Myneni <krishna.myn...@ccreweb.org> writes:
> >Would the following work?
>
> >Looking at Gforth's engine/main.c, I see that you have global
> >variables which contain the stack pointers:
>
> >Cell *gforth_SP;
> >Float *gforth_FP;
> >Cell *gforth_RP;
> >Address gforth_LP;
>
> >So, when a CODE word is defined, CODE could insert header code which
> >calls C code that copies the current register based stack pointers,
> >apparently stored in  sp, fp, rp, and lp, into the global pointers
> >above. Then, architecture-specific instructions can copy from the
> >global variables into the machine-specific registers, e.g. on x86,
>
> >   movl gforth_SP, %ebx
>
> >etc.
>
> >There's some performance penalty for copying the registers to memory
> >and reloading them from memory, of course, but that may be the price
> >you have to pay for a portable design.
>
> Yes, something along such lines could work (there are some
> complications, e.g., getting the Forth system to know the addresses of
> the memory locations, and how to deal with reentrancy, if we want
> that, but they can be solved).  Then there is still the problem of
> which registers are free to use and which have to be saved.
>

If you can't determine which registers need to be saved, save them all
(isn't there a push-all instruction, PUSHA/PUSHAD, on the x86?) in the prefix code for
the CODE word, and restore them all in the postfix code.

> But, overall, it does not appear that this approach offers any
> advantage over ABI-CODE; it's a relatively similar idea with similar
> (probably slightly worse) overhead, and it requires much more
> implementation effort.
>

Maybe not to the system implementor, but this approach offers a huge
advantage to the user, who has the expectation of knowing which
registers to use when he/she writes a CODE definition, without any
extra manipulation on their part.


Krishna

Krishna Myneni

Apr 20, 2010, 8:26:17 AM
On Apr 19, 8:45 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)

I expect, as with syscalls386.fs, any existing use of CODE is broken
already.


> One interesting aspect about ABI-CODE is that it could be portable
> across Forth systems on the same platform; of course, other Forth
> systems also won't change their existing CODE, because they don't like
> to break existing uses of CODE, either.  One other thing that would be
> needed for that would be a common assembler syntax, but an application
> could achieve that by supplying the assembler words itself, or by
> defining the ABI-CODE words with binary code ($4c c, $12345 , ...).
>
> >I don't know of any highly portable Forth systems with a portable
> >CODE, either. But the assemblers are not really portable either. If
> >you want portability of CODE, it will probably have to be defined in
> >the assembler itself, and not be a primitive.
>
> I don't know what you mean here.
>

Since CODE will have to insert prefix code which is specific to the
architecture, its definition can be placed in the architecture-
specific assembler. We do this in kForth, for example, in the source
for the asm-x86 assembler (asm-x86.4th):


VARIABLE CODE-STACK-PTR

: SIZED-CODE ( n -- )
    ALSO ASSEMBLER
    CREATE IMMEDIATE ?allot ASM-TO
    TCELL # EBX ADD,
  DOES>
    POSTPONE LITERAL
    POSTPONE CALL
    CODE-STACK-PTR
    POSTPONE LITERAL
    POSTPONE a@
    POSTPONE SP! ;

: END-CODE
    EBX CODE-STACK-PTR #@ MOV,  \ update stack ptr
    RET,
    ASM-RESET PREVIOUS ;

...

: SMALL-CODE SMALLCODESIZE SIZED-CODE ;

...

: CODE SMALL-CODE ; \ default for code provides 256 bytes

=======

CODE inserts the necessary prefix instructions, and END-CODE inserts
the necessary postfix instructions, which are machine specific, so
that the CODE definition can run properly within the Forth
environment. Note that CODE also resets the Forth stack pointer using
SP! after execution of the machine code.


> >> I don't see how to do it with assembly code alone.  Assembly code does
> >> not know where gcc put the stack pointers.  That's the problem we
> >> started with.
>
> >I haven't examined the gforth code lately. How do you refer to the sp
> >and fp pointers from the C code?
>
> I write "sp" and "fp".
>
> > Why can't assembly code access those
> >same pointers and load them into registers? Does it have to do with sp
> >and fp being register variables?
>
> Yes, if everything goes well, gcc keeps them in registers, and there
> is no automatic way to find out in which ones.
>
> >Are we going to let gcc push us around?  :)
>
> >Seriously, for a usable assembly interface, at some point you will
> >have to wrest control away from gcc, and then hand it back.
>
> Depending on the level you are talking about:
>
> * If you talk about control flow at run-time: That's what CODE and
>   ABI-CODE do, in different ways.
>
> * If you talk about forcing a particular register allocation on gcc:
>   We try that, but there is no guarantee that gcc then still compiles
>   the code and compiles it correctly.  And if it does not, we compile
>   without forcing a register allocation, because, as essential as CODE
>   may seem to you, most people consider Gforth without portable CODE
>   words preferable to no Gforth.
>

No, the whole point is to take gcc out of the picture and make its
register allocation irrelevant for CODE definitions. Only worry about
saving and restoring certain registers, which are guaranteed to be
usable by the assembly programmer.

Since syscall() is implemented on GNU systems, it is unlikely that the
C interface will ever break. The question is just one of how it is
used by the Forth system. I haven't encountered a system call yet that
would break the approach that I have used; however, I don't rule it
out. Since you mention the possibility of long long or fp arguments,
the Forth interface could be changed to ( ... caddr u -- ), where the
arguments, which could be on the data stack or fp stack, precede a
type specifier string, much like the method you use for C library
function import. One need not even specify the number of args then,
since it can be deduced from the arg type specifier string.
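
A rough C sketch of deducing the argument count from such a specifier string. The format is an assumption, loosely modeled on the type letters used in libcc declarations (n, a, d, r); the function name count_args is hypothetical.

```c
#include <stddef.h>

/* Hypothetical specifier format: letters separated by spaces, where
   n = single cell, a = address (single cell), d = double cell, r = float.
   On success returns 0 and reports how many data stack cells and FP stack
   items the call consumes; returns -1 on an unknown letter. */
int count_args(const char *spec, size_t len, int *cells, int *floats)
{
    *cells = 0;
    *floats = 0;
    for (size_t i = 0; i < len; i++) {
        switch (spec[i]) {
        case ' ': break;                  /* separator */
        case 'n':
        case 'a': *cells += 1; break;     /* one data stack cell */
        case 'd': *cells += 2; break;     /* double-cell value */
        case 'r': *floats += 1; break;    /* one FP stack item */
        default:  return -1;              /* unknown type letter */
        }
    }
    return 0;
}
```

The (string, length) pair matches the ( ... caddr u -- ) stack effect suggested above, so the string need not be NUL-terminated.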


> My point is: If we want to interface to an OS that sees the C library
> as the primary way of accessing system calls and the basic system call
> interface only as something to be used by legacy binaries (which some
> of them find ok to break after 2 years
> <http://lwn.net/Articles/374367/>), it's probably smarter to go
> through the C library if we can (and with libcc, we can).
>

Maybe, but I think it's unlikely that the syscall() interface would be
broken, since its purpose appears to be to avoid the type of
situation you mention.


Krishna

Assad Ebrahim

Apr 1, 2014, 8:58:38 PM
On Saturday, April 17, 2010 10:35:25 PM UTC+1, Anton Ertl wrote:
This looks good (from April 2010).

Did this ever get released?

- Assad