OMIT and PARSE


The Beez

Apr 8, 2013, 12:57:05 PM
to 4tH-compiler
Hi 4tH-ers!

These are some of the oldest opcodes in the 4tH VM. So far they
required a terminator (a NULL byte) to see whether the end of the
buffer was reached. And that was OK for a long time. In ANSBLOCK.4TH I
sneaked in an extra byte to terminate the only buffer, so EVALUATE did
work. However, more and more we rely on address/count pairs,
including the one held in SOURCE. Take the new MULTIBLK.4tH block
driver, for example.

I tried to fix it, but the murky code only became murkier (though it
did work). When I wrote it I was more focused on replicating Forth
behavior than actually understanding what I was doing. Time to fix it
once and for all. I've tried to make the code as understandable (and
portable) as possible as well as making the loop itself as fast as I
know how. Consequently, the startup and shutdown code is a bit
bulkier.

I don't have the habit of publishing C code here, but since this is in
the core of your VM (the part that executes YOUR code) I felt
compelled to break this rule this once. First the old code, then its
replacement. Comments are welcome, of course. I will hold on to the code
a little while longer to ensure this nicely tested code also works in
the real world.

Hans Bezemer

---8<---
< CODE (OMIT) DSIZE (1); a = Vars [VTIB] + Vars [VTIBS] + 1;
< do t = fetch (Vars [VTIB] + ((*In)++));
< while ((t == (unit) DS (1)) && (t)
< && (Vars [VTIB] + (*In) < a));
< --(*In); DDROP; NEXT;
< CODE (PARSE) DSIZE (1); DFREE (1); a = DS (1); b = 0L;
< c = Vars [VTIB] + Vars [VTIBS] + 1;
< DS (1) = Vars [VTIB] + (*In);
< do { /* note >IN is incremented */
< t = fetch (Vars [VTIB] + ((*In)++)); b++;
< } /* before it is tested! */
< while ((t != (unit) a) && (t)
< && (Vars [VTIB] + (*In) < c));
< if ((! t) || (Vars [VTIB] + (*In) == c)) --(*In);
< DPUSH (--b); NEXT;
---
> CODE (OMIT) DSIZE (1); a = DPOP; b = Vars [VTIB];
> c = b + Vars [VTIBS];
> while (b + (*In) < c) {
> t = fetch (b + (*In));
> if ((t != (unit) a) || (! t)) break;
> else (*In)++;
> }
> NEXT;
> CODE (PARSE) DSIZE (1); DFREE (1); a = DS (1); b = Vars [VTIB];
> c = b + Vars [VTIBS];
> DS (1) = b + (*In); DPUSH ((*In));
> while (b + (*In) < c) {
> t = fetch (b + (*In));
> if ((t == (unit) a) || (! t)) break;
> else (*In)++;
> }
> DS (1) = (*In) - DS (1); NEXT;
---8<---

The Beez

Apr 10, 2013, 5:34:06 AM
to 4tH-compiler


On 8 apr, 18:57, The Beez <the.beez.spe...@gmail.com> wrote:
Well, it didn't take too long until I found an annoying error in
PARSE. Of course the delimiter itself may never be part of the result,
so I had to fix that. On the other hand, you want parsing to STOP at
the end of the buffer, whether that end is reached because the count
is exhausted or because a NULL byte is encountered. Subsequent parses
shouldn't increment >IN anymore, nor return anything but an empty
string.
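
To make the intended PARSE behavior concrete, here is a minimal standalone
sketch in plain C. It is not the 4tH source: the source_t struct and the
parse() function are hypothetical stand-ins for the TIB, its size and >IN,
and it assumes that a delimiter is stepped over (as in standard Forth)
while a NULL byte is not.

```c
#include <stddef.h>

/* Hypothetical stand-in for the input source: address, count and >IN. */
typedef struct {
    const char *tib;   /* start of the input buffer                   */
    size_t      len;   /* number of valid characters, no terminator   */
    size_t      in;    /* current parse offset, playing the role of >IN */
} source_t;

/* Parse up to delim. Returns the length of the parsed string and stores
   its start address in *start. The delimiter is never part of the result. */
static size_t parse(source_t *s, char delim, const char **start)
{
    size_t begin = s->in;

    *start = s->tib + begin;
    while (s->in < s->len) {
        char t = s->tib[s->in];

        if (t == '\0') break;            /* NULL byte: stop, stay put   */
        if (t == delim) {                /* delimiter found:            */
            size_t n = s->in - begin;    /*   count excludes it,        */
            s->in++;                     /*   but the offset skips it   */
            return n;
        }
        s->in++;
    }
    return s->in - begin;                /* stopped at end or NULL byte */
}
```

With a 7-character buffer holding "foo bar" and no terminator, successive
calls with a space as delimiter give "foo", then "bar", and after that only
empty strings, with the offset staying at the end of the buffer.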

OMIT doesn't have that problem, because it simply stops at the first
non-matching character - in other words, since it only ever skips
matching characters, there is nothing else it needs to skip.
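
The same point as a small standalone C sketch. The omit() function and its
parameters are hypothetical, chosen for illustration only; it assumes the
buffer is described by an address and a count, and that a NULL byte always
stops the scan, as in the VM code posted earlier.

```c
#include <stddef.h>

/* Skip leading delim characters, starting at offset in.
   Returns the new offset; it never moves past len or past a NULL byte. */
static size_t omit(const char *tib, size_t len, size_t in, char delim)
{
    while (in < len) {
        char t = tib[in];

        if (t != delim || t == '\0') break;   /* first non-match: stop */
        in++;
    }
    return in;
}
```

Because the offset only moves while the current character matches, calling
it again at the end of the buffer returns the same offset.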

At the same time I fixed an "error" in Accept(). I say "error",
because the standard simply states that when ACCEPT gets a count of 0,
the result is undetermined. 4tH returned a character in that case. Now
it returns a NULL string, a much more sane result.
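
A rough sketch of the zero-count case, again in standalone C and not taken
from the 4tH source. The accept_line() function is hypothetical; it assumes
input comes from a stdio stream and that a line ends at a newline.

```c
#include <stdio.h>

/* Read at most max characters of one line from fp into buf.
   Returns the number of characters stored; no terminator is added. */
static size_t accept_line(FILE *fp, char *buf, size_t max)
{
    size_t n = 0;
    int ch;

    if (max == 0) return 0;   /* zero count: empty result, nothing is read */

    while (n < max && (ch = getc(fp)) != EOF && ch != '\n')
        buf[n++] = (char) ch;

    return n;
}
```

The explicit check up front makes the zero case obvious: the stream is not
touched and the buffer is left unchanged.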

Lately, I've ironed out a lot of these bugs. Most likely, you will
never encounter them, because you feed 4tH with sane programming, but
nonetheless it should react in a controlled way when it is confronted
with INSANE programming.

Hans Bezemer