On Nov 30, 6:34 am, Hugh Aguilar <hughaguila...@yahoo.com> wrote:
> On Nov 29, 2:26 am, Mark Wills <forthfr...@gmail.com> wrote:
>
> > On Nov 29, 5:56 am, Hugh Aguilar <hughaguila...@yahoo.com> wrote:
> > > Mark: Since your TI Forth system is ITC, why don't you take a stab at
> > > writing a single-step source-level debugger? As I mentioned, I wrote
> > > one for my 65c02 system. It is not as difficult as you might suppose.
> > > I did it with screen-file source-code. It can be done with seq-file
> > > source-code though, I would suppose. I don't think that a debugger is
> > > all that useful, but writing one is pretty interesting --- and your
> > > users will be impressed. :-)
>
> This suggestion was a bad idea. I wasn't thinking straight when I said
> that.
>
Oh. Well. Now you've gone and thrown the gauntlet down, haven't
you?! ;-)
> I was able to write a source-level debugger because my 65c02 Forth was
> a cross-compiler and it was running on an MS-DOS machine (it was
> written in UR/Forth). The compiler needs to generate a large data
> structure containing the addresses of every word compiled in every
> definition. When the single-stepper is running, it stops at every one
> of these (a BRK instruction in my system, although in an ITC system
> DOCOLON will stop on every word). This address is looked up and the
> corresponding source-code is displayed. The host computer has to have
> a lot of memory for that gigantic data-structure, and it has to be
> pretty fast. You don't want to try this on a 1980s vintage TI99/4A
> computer --- you don't have the memory or the speed to do this --- it
> taxed the limits of the 80386 computer that I was using as a host.
>
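That big data structure is really just a map from compiled cells to source positions. A hypothetical sketch in Python (the addresses, screen numbers and helper names are all made up for illustration, not taken from Hugh's UR/Forth cross-compiler):

```python
# Hypothetical sketch of the debugger's lookup: while cross-compiling, the
# host records where the source text for every compiled cell lives; when
# the target hits a breakpoint, the host looks the address up and shows
# the matching source. All concrete numbers below are invented.

debug_map = {}   # target address -> (screen, line, column)

def record(addr, screen, line, col):
    debug_map[addr] = (screen, line, col)

# while compiling : SQUARE DUP * ; the compiler might record:
record(0x8100, 12, 3, 9)    # DUP
record(0x8102, 12, 3, 13)   # *

def on_break(addr):          # called when the target stops at a BRK
    screen, line, col = debug_map[addr]
    return "screen %d line %d col %d" % (screen, line, col)

print(on_break(0x8102))   # -> screen 12 line 3 col 13
```

One entry per compiled cell is exactly why the table gets gigantic on the host.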
Well, I have already written a simple debugger. It's not a single
stepper, though. It works like this:
You load the debugger and it modifies : and ; such that each
subsequently defined colon definition calls into a word (I can't
remember what it's called) that displays the name of the executing
word and the data stack.
As it runs, the depth of the return stack is measured and used to
indent the on-screen display, so one can see how one's program nests
and un-nests.
A new word is defined, BREAK, which stops the program, returning to
the command line and giving a full return-stack dump. You can scatter
BREAKs around your code at points where you think it may be going awry.
For example:
: test swap break ;
: harry 3 test ;
: dick 2 harry ;
: tom 1 dick ;
tom
And you'd get output that looked like this:
>tom (1) 1
>dick (2) 1 2
>harry (3) 1 2 3
>test (3) 1 2 3
BREAK in test in harry in dick in tom
Without a break, if you just let the program run, you'd get:
>tom (1) 1
>dick (2) 1 2
>harry (3) 1 2 3
>test (3) 1 2 3
<test (3) 1 3 2
<harry (3) 1 3 2
<dick (3) 1 3 2
<tom (3) 1 3 2
It's fairly simple to extend the above into a single-step debugger. An
on-screen display showing the definition currently being executed,
with a cursor pointing to the current word, is less trivial, but it's
possible. Again, I have a starting point: I already have SEE for my
system, so I already have code to de-compile a word. It's possible,
and doesn't require a large list/table in memory; it's just a
different technique. In fact it would be an interesting exercise!
There's just one little itsy bitsy problem: Since I wrote my TRACER
program, I've used it once. And that was to demo it to someone else.
And the only reason I was showing it was to show them that facilities
such as a tracer/debugger can be written in Forth itself and the Forth
environment simply augmented with the functionality (they were
suitably impressed). I don't think I've touched it since. I just debug
at the command line.
In fact, I rarely use SEE. I only use SEE if I'm debugging a compiling
word. The last time I used SEE was a couple of weeks ago when I was
implementing your MACRO: idea (duly implemented as a loadable
extension and working beautifully - thank you for the inspiration!).
SEE has limitations (at least on my system) because some subroutines
in my system are headerless (they don't have dictionary entries), so
they display as a ? when de-compiled. It's no problem for me, since I
know what's going on, but a newbie would be left wondering.
Unfortunately I don't have the ROM space available to allow headers
for everything. For example, DOES> compiles a DODOES, but DODOES is
headerless. This would be a problem in a single stepping debugger,
because it would not be possible to display the names for headerless
words. This is a limitation of my system due to memory constraints. I
only have 16K. My Forth system is implemented as a plug in cartridge:
http://turboforth.net/about_turboforth.html
>
> For your old 16-bit computer, DTC should help to speed up the system.
> It will also make the programs slightly larger. Instead of a pointer
> in front of each colon word, you have a chunk of code. It is true that
> NEXT should be smaller, so every primitive will be slightly smaller,
> but this won't reduce the size of the system very much --- overall,
> more memory will be needed.
>
I had a look at this yesterday and got myself tied up in knots. I
couldn't work out how to bootstrap the thing; to get it started. How
does the 'interpreter' for a high-level definition execute the words
in the thread? I couldn't figure it out in my lunch break and had to
junk what I had done. I need more time to concentrate; I was missing
something very fundamental. I was using : SQUARE DUP * ; as my
target but didn't get anywhere.
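For what it's worth, the bootstrap puzzle can be shown in miniature. Below is a hypothetical Python simulation of ITC (nothing here is actual TMS9900 or TurboForth code): a word is a cell list whose first cell is its code field; NEXT fetches the cell IP points at and jumps through its code field; DOCOLON's only job is to save IP and re-point it at the body; EXIT pops IP back. The whole system "starts" by pointing IP at a tiny thread and spinning NEXT:

```python
# ITC in miniature. A "word" is [code_field, body...]; the code field is
# the machine code run for that word.

stack, rstack = [], []
ip = None            # (thread, index): the instruction pointer
running = True

def next_():         # the address interpreter: fetch and execute one cell
    global ip
    thread, i = ip
    word = thread[i]
    ip = (thread, i + 1)
    word[0](word)    # jump through the word's code field

def docolon(word):   # code field shared by ALL colon definitions
    global ip
    rstack.append(ip)
    ip = (word[1:], 0)        # nest: point IP at this word's thread

def do_exit(word):   # compiled by ; -- un-nest back to the caller
    global ip
    ip = rstack.pop()

def do_dup(word):  stack.append(stack[-1])
def do_star(word): stack.append(stack.pop() * stack.pop())
def do_halt(word):
    global running
    running = False

DUP, STAR, EXIT, HALT = [do_dup], [do_star], [do_exit], [do_halt]
SQUARE = [docolon, DUP, STAR, EXIT]     # : SQUARE DUP * ;

def execute(word):   # bootstrap: a two-cell thread, then spin NEXT
    global ip, running
    ip, running = ([word, HALT], 0), True
    while running:
        next_()

stack.append(7)
execute(SQUARE)
print(stack)   # -> [49]
```

The thing that is easy to miss is that DOCOLON never "calls" anything: it only moves IP, and NEXT does all the actual dispatching.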
> A better way to boost the speed, is with some optimization. In many
> cases, there are pairs of producers and consumers. For example, LIT is
> a producer because it produces some data for the parameter stack, and
> + is a consumer because it consumes some data from the parameter
> stack. These pairs are inefficient because the producer pushes data
> onto the stack, and the consumer immediately pops that data off the
> stack. The solution is to combine them into a single word. For
> example, write a primitive LIT_+ that combines what LIT and + do. It
> would hold the literal value in a register rather than push it onto
> the stack and then pop it off again.
This is an excellent suggestion. Perhaps an easier way (at least, in
terms of performing optimisations) is to make every word in the
dictionary immediate. Then, every word can 'look ahead' and see what
is about to be compiled and intervene accordingly. It would be very
difficult to produce a standard Forth with such a system though! I
wonder if anyone has previously experimented with such a technique?
>
> Even with ITC, it is possible to optimize pairs like this. Make your
> compiler smart enough to remember what the last word it compiled was.
> When it is ready to compile the next word, it checks what the last
> word was and, if they are an optimizable pair, it compiles the combo
> instead. For example, if the last thing you did was LIT, when you are
> about to compile + your compiler will instead back up and get rid of
> the LIT and replace it with LIT_+. This kind of peephole-optimization
> not only makes your program faster, but smaller as well.
>
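The quoted scheme boils down to the compiler remembering the last thing it laid down and splicing in a combined primitive when a known pair turns up. A hypothetical Python sketch (LIT_+ and LIT_! are invented combo names; TurboForth's compiler does not actually work this way):

```python
# Peephole pairing at compile time: if the word being compiled forms a
# known producer/consumer pair with the previous LIT, back up and emit
# the combined primitive with the literal inline.

PAIRS = {("LIT", "+"): "LIT_+", ("LIT", "!"): "LIT_!"}   # known combos

code = []   # the threaded code being compiled, e.g. ["LIT", 5, "+"]

def compile_literal(n):
    code.append("LIT")
    code.append(n)          # LIT carries its operand inline

def compile_word(w):
    # was the previously compiled word a LIT, and do (LIT, w) combine?
    if len(code) >= 2 and (code[-2], w) in PAIRS:
        combo, operand = PAIRS[(code[-2], w)], code[-1]
        code[-2:] = [combo, operand]   # back up, splice in the combo
    else:
        code.append(w)

# compiling: 5 +   ...collapses to a single LIT_+ cell plus operand
compile_literal(5)
compile_word("+")
print(code)   # -> ['LIT_+', 5]
```

Note the win Hugh describes: one cell fewer in the thread *and* one NEXT fewer at run time, so it is smaller and faster at once.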
I'll add your peephole suggestion to my "things to look at in the next
version" list. The next version (V2.0) is the version that I tell
myself I'm *not* going to write, but I know I 99.9% probably will.
It's like a bloody drug. It's the classic symptom of wanting to start
with a clean sheet, to implement all the 'lessons learned' that you
spent blood, sweat and tears learning on the first implementation.
There are many aspects of V1.x that have been re-written a couple of
times as I learned (from other Forthers, some here on this list) or
simply discovered (as part of the Forth awakening process) better
ways to do things.
[ and to the nay-sayers: I *do* write Forth code too. Not just a
compiler. But my Forth coding is for fun. I'm still learning. I write
stuff like this:
http://turboforth.net/tutorials/darkstar.html ]
I also want to spend some time looking at Smalltalk though (a project
for 2013) - not writing a Smalltalk system, just learning the
language. It's the OOP equivalent of Forth. It's beautiful (though
very slow, I believe) and looks very interesting indeed to me. Despite
being pure OO, it shares the terseness, brevity, and total simplicity
that Forth has.
Things for 2013:
* VFX (I want to do some simple SCADA stuff using serial and IP comms)
* Smalltalk
* TurboForth V2.0 (maybe - it'll be a part-time, when-I-feel-like-it
thing)
> You are right though, that DTC is low-hanging fruit, and much easier
> to implement. Peephole-optimization is somewhat more difficult, but
> not unreasonably difficult. You can do the peephole-optimization on a
> piece-meal basis. Start with + and make it smart enough to combine
> with all the likely producers, then do ! and +! and so forth --- you
> don't have to do everything at once, just doing + should boost the
> speed significantly, and you can go from there.
>
Yeah. You've got me thinking now! I need a lot more memory to do this
than I currently have, though. Still, I plan to make the next version
a 64K EPROM, but I can go up to 128K if I need to. That's 16 8K pages,
which is a PITA, but doable.
> BTW: I'm switching from DTC to ITC on my system. This is because I
> realize (from reading this thread!), that with DTC the DOCOLON code is
> scattered all around and won't be in the code cache, whereas with ITC
> the whole VM should be in the code cache.
Well, if your high-level definitions make a CALL to DOCOLON then
there's no reason why DOCOLON would not be in the cache. The expense
of the call might be less than the delay induced by a cache miss.
However, I'd urge you to take a step back and a deep breath. I was
reading your post on the x86 group where you are discussing caching
etc. However, I have to point out that if the Forth you are intending
to produce is primarily for embedded systems then the chances of the
embedded system running on an x86 processor are quite low. It's much
more likely to be an ARM variant. In other words, don't allow key
decisions about the architecture of your system to be guided by
relatively unimportant architectural constraints of a particular
processor family.
I'd urge you to get out a notebook and pencil. Sit down somewhere
quiet and write a list of key things that you want the system to do.
Design goals. Then put the list away. Reflect on it for a couple of
days and go back and make changes. Iterate. Eventually your thoughts/
ideas/requirements will coalesce. There's your plan/design goals. When
you've got it done, pin it up on the wall above your computer. Let it
be your guide as you develop the *project*, and when you feel a knee-
jerk step-change coming on, consult the plan again! Don't be swayed.
Stick to the plan. Have faith in the design decisions you made
earlier, even if you've had a bright idea.
I failed to make a plan/design and ended up with many, many more
iterations/builds/bugs/teeth-gnashing/wailing than I should have. It's
okay for me, because it's a hobby system and is given away for free.
Your aspirations are somewhat higher, though, wanting a good system for
embedded targets. So I'd urge due consideration and diligence!
Just my two cents, FWIW!
Mark