
Is there an advantage to Forth dictionary restructuring?


Rod Pemberton

Oct 10, 2017, 4:23:49 AM

Is there an advantage to Forth dictionary restructuring?

Hugh Aguilar (yes, him...), in the "hash-tables in Forth" thread, said:

HA> Words such as DUP and SWAP are going to be looked up a lot more
HA> often than words such as >R and R@ etc.. We can expect BYE to only
HA> be accessed once per session, [...]

So, the idea of a self-sorting dictionary, based on Forth word
execution frequency, to improve Forth word look-ups, comes to mind.
Has this been done before? I know that some people have used hash
values to speed up dictionary searches.

How would you restructure the historical Forth dictionary format to be
sortable by NFA? E.g., another linked list made up of the sorted
NFAs, a search count to sort by, and a pointer to the remaining portion
of the original dictionary entry, e.g., CFA, LFA, PFA?
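
(For illustration only, that layout might look roughly like the C sketch
below; every name in it is invented here, not taken from any particular
Forth.)

/* Hypothetical side chain for frequency-ordered searching.  Each node
   shadows one ordinary dictionary entry, which itself stays unchanged. */
struct freq_node {
    struct freq_node *next;          /* next node in search-count order   */
    unsigned long     search_count;  /* bumped on every successful lookup */
    char             *nfa;           /* the word's name field (NFA)       */
    void             *rest;          /* remainder of the original entry:
                                        LFA, CFA, PFA                     */
};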

Or, would you do something else instead, like use hash values, or
perhaps a binary tree, maybe self-balancing, as Hugh might be doing?


Rod Pemberton
--
"The duty of the American president is ... not [to] cater to one
segment of a political base ...," said Diane Feinstein. That's exactly
what Obama did for eight years.

Helmar Wodtke

Oct 10, 2017, 4:34:58 AM
I did this a long time ago, when memory for a hash table was too much for the machine. The idea was simply to put the last-found word at the top of the linked list that will be searched next time. Frequent words end up near the top of the list, as do freshly defined words. This can speed up linear search in linked lists on very slow computers. But if you have a hash table... I don't know if the effort is worth it.
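
(Helmar's scheme is essentially a move-to-front list. A minimal C sketch
of the idea follows; it is not his code, and the entry layout and names
are assumed.)

#include <string.h>

struct entry {
    struct entry *next;   /* link to the next (older) definition */
    const char   *name;
    /* ... code field, body, etc. ... */
};

/* Search head-first; on a hit, unlink the entry and splice it in at the
   head, so the next search for the same word succeeds immediately. */
struct entry *find_mtf(struct entry **head, const char *name)
{
    struct entry *prev = NULL;
    for (struct entry *e = *head; e != NULL; prev = e, e = e->next) {
        if (strcmp(e->name, name) == 0) {
            if (prev != NULL) {          /* not already at the head */
                prev->next = e->next;
                e->next = *head;
                *head = e;
            }
            return e;
        }
    }
    return NULL;
}

(In a classic Forth the link order usually doubles as the definition
order, which things like FORGET rely on, so in practice this is often
done on a separate search chain rather than on the real dictionary
links.)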

Regards,
-Helmar

Wolf Wejgaard

Oct 10, 2017, 5:07:01 AM
Holonforth uses a separate "alfa-link" where all words are arranged in
alphabetical sequence, independent of any structuring (wordlists, etc.).
There is a table that points to the first word in the alfa-link that starts
with the initial char of the name you search. This is my "alfa-hash": only
search through the words with that initial char.
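
(As I read it, the alfa-hash is a table indexed by a name's first
character, pointing into one alphabetically ordered chain. A rough C
sketch of the idea follows; it is not Holonforth's code, and all names
are invented.)

#include <string.h>

struct word {
    struct word *alfa_next;   /* next word in alphabetical order */
    const char  *name;
    /* ... */
};

/* One bucket per possible initial character; each points at the first
   word in the alphabetical chain whose name starts with that character. */
static struct word *alfa_table[256];

struct word *alfa_find(const char *name)
{
    unsigned char c = (unsigned char)name[0];
    for (struct word *w = alfa_table[c];
         w != NULL && (unsigned char)w->name[0] == c;
         w = w->alfa_next) {
        if (strcmp(w->name, name) == 0)
            return w;
    }
    return NULL;   /* no word with this name */
}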

You need global word names, of course, but that's good anyway.

But note, in Holonforth the dictionary is created by the editor, so there is
ample time to insert an alfa-link for new words. The compiler merely inserts
code pointers. I like having the dictionary available all the time, for all
words, independent of code state.

Wolf Wejgaard

Alex

Oct 10, 2017, 9:22:28 AM
On 10-Oct-17 09:25, Rod Pemberton wrote:
>
> Is there an advantage to Forth dictionary restructuring?
> [...]

There was a very long conversation as to why Hugh's method does not work
in practice (and Hugh never did the basic testing to prove otherwise):
https://groups.google.com/d/msg/comp.lang.forth/w3tHPsTaZvE/dlIQX4nEXSwJ


--
Alex

dxf...@gmail.com

Oct 10, 2017, 9:04:47 PM
IIRC LMI PC/Forth employed a hash table of recently used words to speed
up searches. It still maintained a conventional dictionary, however.
This was back in the days of slow PCs.
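
(Perhaps something along these lines; this is purely a sketch of a
recently-used cache in front of a linear search, not LMI's actual
scheme, and every name in it is made up.)

#include <string.h>

struct entry {
    struct entry *next;
    const char   *name;
};

struct entry *latest;                  /* head of the conventional dictionary */

/* The ordinary linear dictionary search. */
static struct entry *dict_find(const char *name)
{
    for (struct entry *e = latest; e != NULL; e = e->next)
        if (strcmp(e->name, name) == 0)
            return e;
    return NULL;
}

#define CACHE_SLOTS 64
static struct entry *cache[CACHE_SLOTS];

static unsigned slot_of(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31 + (unsigned char)*s++;
    return h % CACHE_SLOTS;
}

/* Check the recently-used cache first; fall back to the full search and
   remember the hit, so common words are found without walking the list. */
struct entry *cached_find(const char *name)
{
    unsigned i = slot_of(name);
    if (cache[i] != NULL && strcmp(cache[i]->name, name) == 0)
        return cache[i];
    struct entry *e = dict_find(name);
    if (e != NULL)
        cache[i] = e;
    return e;
}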

Mark Wills

Nov 13, 2017, 8:13:26 AM
On Tuesday, 10 October 2017 09:23:49 UTC+1, Rod Pemberton wrote:
> Is there an advantage to Forth dictionary restructuring?
>
In 1981, on a TRS-80 compiling from blocks on floppy? Maybe. Dictionary
search times would grow as the program got larger, but the floppy disk
access times would still be the bottleneck for compile speed.

In 2017, who cares? If it takes an extra millisecond, will you care? :-)

foxaudio...@gmail.com

Nov 13, 2017, 9:56:41 AM
On desktop machines it is not a big deal, but I bet it would matter on an MSP430, and for sure on our TI-99s. :-)

a...@littlepinkcloud.invalid

Nov 13, 2017, 10:30:37 AM
Mark Wills <markwi...@gmail.com> wrote:
> On Tuesday, 10 October 2017 09:23:49 UTC+1, Rod Pemberton wrote:
>> Is there an advantage to Forth dictionary restructuring?
>>
> In 1981, on a TRS80 compiling from blocks on floppy? Maybe.

It did. I remember the polyFORTH compilation speed was considerably
better than fig-FORTH's, even with floppies.

And that's not really surprising: step time was maybe 12 ms per track
and the rotational latency perhaps 100 ms. To search a dictionary of
16k you'd have to traverse 6k (too much?) of headers, and each byte
would take maybe four instructions to scan. So each search would take
maybe 10 ms. Say 150 words to search for on each block: that's more
than a second per block for dictionary searches alone. That's how I
remember it, and it's a whole order of magnitude slower than the disk.

Take the polyFORTH hashed dictionary, with much more efficient
scanning, and you can cut that search time by a factor of ten.
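
(Putting those figures together, all of them rough guesses from the post
rather than measurements:)

#include <stdio.h>

int main(void)
{
    double ms_per_search   = 10.0;   /* assumed cost of one linear search       */
    double words_per_block = 150.0;  /* assumed lookups while compiling a block */
    double hash_speedup    = 10.0;   /* claimed gain from the hashed dictionary */

    printf("linear search time per block: ~%.1f s\n",
           ms_per_search * words_per_block / 1000.0);
    printf("hashed search time per block: ~%.2f s\n",
           ms_per_search * words_per_block / hash_speedup / 1000.0);
    return 0;
}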

Andrew.

c...@forthworks.com

Nov 13, 2017, 1:45:19 PM
I've never found the speed of dictionary lookups to be an issue.

In my Forth, lookups are costly. The dictionary is a linked list, and
I do a full comparison against each name. Additionally, this runs on a
virtual machine emulating a MISC instruction set, with no special focus
on performance.

With `dup` as the first entry, and 100,000 words with thirty-character
names, finding and executing `dup` takes 0.74 seconds on my system. I
can live with that, especially as I'm unlikely to ever actually have a
dictionary that big in anything other than a benchmark test.
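
(For what it's worth, here is a stand-alone C approximation of that worst
case; it has nothing to do with crc's actual implementation, the sizes
and names are simply taken from the description above, and on native
hardware it will of course run far faster than the 0.74 s measured under
the MISC emulator.)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

struct word {
    char name[32];
    struct word *prev;   /* link to the previously defined word */
};

int main(void)
{
    /* "dup" is defined first, so it ends up at the far end of the search. */
    struct word *dup_entry = calloc(1, sizeof *dup_entry);
    strcpy(dup_entry->name, "dup");
    struct word *latest = dup_entry;

    /* 100,000 filler words with thirty-character names. */
    for (int i = 0; i < 100000; i++) {
        struct word *w = calloc(1, sizeof *w);
        snprintf(w->name, sizeof w->name, "filler-word-%018d", i);
        w->prev = latest;
        latest = w;
    }

    clock_t t0 = clock();
    struct word *found = NULL;
    for (struct word *w = latest; w != NULL; w = w->prev)
        if (strcmp(w->name, "dup") == 0) { found = w; break; }
    clock_t t1 = clock();

    printf("found %s in %.4f s\n", found ? found->name : "(nothing)",
           (double)(t1 - t0) / CLOCKS_PER_SEC);
    return 0;
}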

-- crc