On Tue, 22 May 2018 12:48:11 GMT
Kip Ingram <kipi...@kips-air.attlocal.net> wrote:
> First off, I've made this system work in Linux and on MacOS, but
> I haven't tried it in Windows. I don't know for sure there's not
> some sort of show-stopper there. But here are some more details.
My Forth is only working for DOS, with two C compilers, OpenWatcom and
DJGPP. I have some hardcoded offsets for the dictionary that need
adjusting for 64-bit Linux.
> For the initial construction of the thing the "primitives" were
> written in C - most of them were very short "one liners." The key
> Forth registers were housed in C variables which were directly
> manipulated by the primitives.
I worked from C first, then migrated, transformed, rewrote, to the
point that the primitives are in C and the high-level words are in
Forth.
Unlike virtually every other Forth in C, I don't use a switch()
statement as an "interpreter", but use an actual ITC interpreter coded
in C. The address/inner interpreter is coded in C using W, IP etc. The
outer/text interpreter is a high-level Forth word, pre-"compiled" in C to
match the same layout as a high-level colon-def.
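
For readers who have not seen the distinction, here is a minimal sketch of
an indirect-threaded (ITC) inner interpreter in C that uses no switch().
It is an illustration only, not the code from either Forth discussed in
this thread. The cell union, the w_* word names, the tiny fixed-size
stacks, and itc_demo() are all assumptions made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* One dictionary cell: a literal, a code pointer, or an execution token. */
typedef union cell {
    intptr_t     n;             /* inline literal                         */
    void       (*code)(void);   /* code field: address of a C primitive   */
    union cell  *xt;            /* thread entry: address of a code field  */
} cell;

static intptr_t stack[32];  static int sp;   /* data stack (no bounds checks) */
static cell    *rstack[32]; static int rp;   /* return stack of saved IPs     */
static cell    *ip;                          /* IP: next cell in the thread   */
static cell    *w;                           /* W: code field of current word */
static int      running;

static void do_lit(void)  { stack[sp++] = (ip++)->n; }          /* push inline literal */
static void do_dup(void)  { stack[sp] = stack[sp - 1]; sp++; }
static void do_add(void)  { sp--; stack[sp - 1] += stack[sp]; }
static void do_exit(void) { ip = rstack[--rp]; }                /* return from colon-def */
static void do_bye(void)  { running = 0; }
static void docol(void)   { rstack[rp++] = ip; ip = w + 1; }    /* enter colon-def */

/* Primitives: a code field only. */
static cell w_lit[]  = { { .code = do_lit  } };
static cell w_dup[]  = { { .code = do_dup  } };
static cell w_add[]  = { { .code = do_add  } };
static cell w_exit[] = { { .code = do_exit } };
static cell w_bye[]  = { { .code = do_bye  } };

/* : DOUBLE  DUP + ;   laid out by hand, like a pre-"compiled" colon-def. */
static cell w_double[] = {
    { .code = docol }, { .xt = w_dup }, { .xt = w_add }, { .xt = w_exit }
};

/* Top-level thread: 21 DOUBLE BYE */
static cell w_main[] = {
    { .xt = w_lit }, { .n = 21 }, { .xt = w_double }, { .xt = w_bye }
};

/* The inner interpreter: NEXT is two indirections, with no opcode decoding. */
static intptr_t run(cell *thread)
{
    sp = rp = 0;
    ip = thread;
    running = 1;
    while (running) {
        w = (ip++)->xt;   /* fetch the code field address from the thread */
        w->code();        /* jump through the code field                  */
    }
    assert(sp > 0);
    return stack[sp - 1];
}

intptr_t itc_demo(void) { return run(w_main); }
```

Calling itc_demo() runs the thread and returns the top of the data stack.
The point of the layout is that colon definitions and primitives are
dispatched the same way, through the code field, so the loop never needs
to know which kind of word it is executing.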
> The primary system lived in a memory block I allocated with malloc.
Ditto.
My allocation amount is completely arbitrary at this point, since I
don't know how large the dictionary will grow. I took a guess at the
average Forth word size and quantity of words in the dictionary.
> The structure was pretty much standard FIG: name string, link field,
> code field, parameter region.
Ditto.
Except, I moved the LFA to the start of the dictionary header. This
was originally so I could move through the dictionary quickly in C
using a linked list. The C code no longer accesses the dictionary
directly, as all of the high-level words are now in Forth.
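
As a rough sketch of why a link-first header is convenient from C: with the
link field at offset 0, a dictionary search is an ordinary linked-list walk
that never has to step over a variable-length name. The struct below is a
hypothetical layout for illustration (fixed 31-byte name field, no flag
bits), not the actual header format of either system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical header with the LFA first. */
typedef struct header {
    struct header *link;          /* LFA: previously defined word, NULL at the end */
    unsigned char  name_len;      /* NFA: length byte ...                          */
    char           name[31];      /* ... followed by the name characters           */
    void         (*code)(void);   /* CFA: address of the C routine for this word   */
    /* the parameter field would follow here */
} header;

/* Search from the most recent definition backwards, as FIND would. */
const header *dict_find(const header *latest, const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);
    for (const header *h = latest; h != NULL; h = h->link)
        if (h->name_len == len && memcmp(h->name, name, len) == 0)
            return h;
    return NULL;
}

/* A three-word demo dictionary; the newest DUP shadows the oldest one. */
header h_dup_old = { NULL,       3, "DUP",  NULL };
header h_swap    = { &h_dup_old, 4, "SWAP", NULL };
header h_dup_new = { &h_swap,    3, "DUP",  NULL };
```

Searching from h_dup_new finds the newer DUP first, which is the usual
Forth redefinition behavior.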
> The code field held the address of a primitive or some other bit of
> code that was written in C. It was all very standard.
Ditto.
I have the standard CFA routines, except for DOUSER.
> In that situation I could not create new *code* words, but I could
> make new colon definitions in a completely straightforward way - I
> just added the new header and definition onto the end of the growing
> image.
Ditto.
Yes, I hadn't thought about that, but the primitives (or "code" words)
must be compiled in for mine as well, as the low-level is in C and
there is no assembler. An assembler in Forth would have register
incompatibilities with the C code.
> Since then I've made progress toward getting EVERYTHING into the image
> buffer. First step there is to use a library call at the outset to
> make that image have executable permissions. I have confirmed that I
> can add new code words to the system by manually poking the
> appropriate bytes into the image - it works.
And, earlier,
> These primitives of course resided
> wherever gcc put them in the program code.
You've gone further than I have here. My primitives and their
dictionary headers are wherever the C compiler places them in the
executable image. The dictionary is in malloc'd space. I.e., the
primitives and high-level colon-def words are not stored together. I'm not sure
that copying the primitives or relocating them to the malloc'd
dictionary space would have any advantage. I would need some C code to
patch up their field addresses for the relocation.
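
To give an idea of what that patch-up code might involve, here is a small
sketch. It assumes, purely for illustration, that the compiled-in headers
sit in one contiguous C array and that every link points inside that array.
Neither assumption is guaranteed for headers placed individually by the
compiler. The header struct, relocate_headers(), and the builtin table are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct header {
    struct header *link;          /* LFA: must be rebased after the copy          */
    char           name[8];
    void         (*code)(void);   /* CFA: still points into the executable image  */
} header;

/* Copy n headers into malloc'd space and rebase their link fields.
 * Assumes every non-NULL link points into src[0..n-1]. */
header *relocate_headers(const header *src, size_t n)
{
    assert(src != NULL && n > 0);
    header *dst = malloc(n * sizeof *dst);
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, n * sizeof *dst);
    for (size_t i = 0; i < n; i++)
        if (src[i].link != NULL)
            dst[i].link = dst + (src[i].link - src);  /* same index, new base */
    return dst;
}

static void prim_nop(void) { }    /* stand-in for a real primitive */

/* Demo table standing in for the compiled-in primitive headers. */
header builtin[3] = {
    { NULL,        "DUP",  prim_nop },
    { &builtin[0], "SWAP", prim_nop },
    { &builtin[1], "DROP", prim_nop },
};
```

Note that only the headers move in this sketch. The code fields are copied
unchanged because the machine code of the primitives stays where the
compiler put it, which is one reason the relocation may not buy much.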
> Then I created (in C) a "primitive wrapper" that moves the C
> variables of the Forth VM into registers, jumps to an in-image
> address that is expected to house machine code, and then when that
> code passes control back it moves the VM regs back into the C
> variables. This of course is not efficient and no good for a final
> working point, but it does allow me to develop Forth source code for
> primitives that would function eventually in a fully final way.
Well, you've progressed further toward a standalone, OS-like Forth, as
you've transferred execution directly into the image. You did
mention working with embedded environments. It's also interesting that
you're saving and restoring the registers similar to DOS DPMI calls.
I could do these things, but I don't see the point for a generic,
non-embedded Forth that starts from an existing OS. Why does your
Forth need to live entirely within a self-contained image which,
apparently, can be saved and restarted? Is this just for embedded use?
Or do you intend to use it as an OS too, like early Forths?
> That's where I am right now - this is just a hobby project and I don't
> have a lot of time to work on it.
I haven't worked on mine in a few years either; it's also a hobby
project. I started it about a decade ago. I have a small set of
high-level Forth words left to complete before it fully passes the Hayes
core tests. It was a challenge for me as I'm a C coder, not a Forth
programmer.
> But the next step is to get all of
> the primitives (all things that require C code be called) implemented
> in-image,
Ok.
> and fully "cut the cord" to C.
How is that possible? You still need C to be compiled to initially
produce the binary code for the primitives or other low-level code
words, yes? Do you plan to re-implement this code from Forth assembly?
> After that, I plan to learn enough about executable file formats to
> be able to write my image out to a file that can be run stand-alone.
This reminds me of how I created my OS. I started with C code for
DOS. DOS has virtually no OS or hardware protections. Then, I started
my OS image from DOS, and worked backwards to booting the image from a
boot loader.
> So I have a game plan for eventually removing the C "scaffolding"
> that's been required to bring the thing into existence - when I'm done
> it would be a fully native Forth.
Ok.
> Finally, I want to work toward
> Forth source for the whole system, so that it can "recompile itself."
I started doing that with my Forth, but backed away from that path.
I.e., I started re-implementing the low-level primitives and outer
interpreter in high-level Forth, but realized that the code would still
call some low-level code, e.g., primitives or inner interpreter, which
was coded in C. I would still prefer to bootstrap the system from a
lower level than it is presently, but that appeared to be much more
complicated than having a few extra words created using C.