Kernel cache and an interesting paper "Efficient Symbolic Computation via Hash Consing"

0 views
Skip to first unread message

Qian Yun

unread,
May 9, 2026, 10:47:41 PM (3 hours ago) May 9
to fricas-devel
https://arxiv.org/abs/2509.20534

Well, using "hash consing" to eliminate common sub expression
is not a new idea, in fact, fricas is doing a better job via
Kernel than the authors described in their paper.
(Although they fail to realize that in section 5.3).

(Off topic a little: it seems that Reduce also has the notion
of Kernel, I wonder if this idea came from Reduce first,
or Scratchpad first, or both individually.)

Now, what is interesting to me from this paper is,
"weak-reference hash table".

If an entry in "weak-reference hash table" is not referenced
elsewhere, then garbage collector can recycle that.

I believe this can help to solve this issue for fricas:
we can't GC kernels from cache. Once a kernel is entered and
cached into the system, it stays there forever.
So for a long computation session, the kernel cache keeps
growing, and can slowdown the following computation.
(For example, we compute a thousand integrals in one session.
We can work around that by issuing ")clear", but let GC
taking care of it automatically is better and the correct
thing to do.)

Luckily, SBCL provides :weakness to "make-hash-table".


Our current Kernel cache implementation, if I understand correctly,
it was a linear list at frist, then Waldek optimize to Array to
allow binary search, then Kurt and I find that there's no
well defined order so binary search is unsound, so linear search
is back.

So changing current Array implementation to Lisp hashtable
should not have much performance impact, if any, it should
be faster.


It surprises me how well the abstraction and encapsulation
is provided by fricas. To change such a key component
is very quick and painless. This is just a proof of concept,
the "keys" call should be optimized into "with-hash-table-iterator".
And 10 tests are failed, probably due to the order of
kernels/expression is not the same as before.


So with the patch in attachment, I can confirm that unused kernels
(for unused kernels, I mean stuff like [sin x for x in 1..1000])
can be recycled by GC, in *spad* code (compiler) but not in interpreter.
I've already disabled history by ")set history off". I wonder
are there variables reference values computed in interpreter?

- Qian
poc-hashtable-kernel.patch

Waldek Hebisch

unread,
12:42 AM (1 hour ago) 12:42 AM
to fricas...@googlegroups.com
It is a bit more complicated: we keep things "as ordered as
we can" and in case of cache hit we usually should get result
from binary search. Linear search is only used to insert things
and in case of troubles with ordering.

> So changing current Array implementation to Lisp hashtable
> should not have much performance impact, if any, it should
> be faster.

Correcntess and speed very much depend on hash function used.
To fully and correctly use hash table we need a hash
function which respect mathematical equality. I was unable
to find such hash function.

I do not see how Lisp EQ hash table could help us: mathematicaly
equal expressions may be EQ but also may be not. So if we do
not get hit we need to do linear search. While cache hits
may be faster, I would expect much more cases when we need
linear search (actually, it is not clear to me if we will
get _any_ hits).

For purpose of garbage collecting kernels if would rather
investigate weak references.

> It surprises me how well the abstraction and encapsulation
> is provided by fricas. To change such a key component
> is very quick and painless. This is just a proof of concept,
> the "keys" call should be optimized into "with-hash-table-iterator".
> And 10 tests are failed, probably due to the order of
> kernels/expression is not the same as before.

I interrupted 'jet.input' after 16 minutes, I do not know if it
was just slow and would eventually finish (normally on the same
machine the whole testsuite finishes after 25 seconds of real
time).

Changed order does not look good, while we do not have total
order, some ordering relations were prdictible and apparantly
our code depends on them.

In 'integ' input the change apparently broke convertion of
'%iint' to 'polylog'.

--
Waldek Hebisch

Qian Yun

unread,
1:06 AM (27 minutes ago) 1:06 AM
to fricas...@googlegroups.com
On 5/10/26 12:42 PM, Waldek Hebisch wrote:
>
> I do not see how Lisp EQ hash table could help us: mathematicaly
> equal expressions may be EQ but also may be not. So if we do
> not get hit we need to do linear search. While cache hits
> may be faster, I would expect much more cases when we need
> linear search (actually, it is not clear to me if we will
> get _any_ hits).
>

The "EQ hashtable" is a little misleading: the insertion
of new kernels is still a linear search, just like current
situation. The comparison are done via "kerEqual", not EQ.


>
> I interrupted 'jet.input' after 16 minutes, I do not know if it
> was just slow and would eventually finish (normally on the same
> machine the whole testsuite finishes after 25 seconds of real
> time).
>

Did you do a fresh build? What's your SBCL version?
On my side, "make check" returns quickly with 10 failed tests,
no hangup.

- Qian

Reply all
Reply to author
Forward
0 new messages