Kernel cache and an interesting paper "Efficient Symbolic Computation via Hash Consing"

30 views
Skip to first unread message

Qian Yun

unread,
May 9, 2026, 10:47:41 PMMay 9
to fricas-devel
https://arxiv.org/abs/2509.20534

Well, using "hash consing" to eliminate common sub expression
is not a new idea, in fact, fricas is doing a better job via
Kernel than the authors described in their paper.
(Although they fail to realize that in section 5.3).

(Off topic a little: it seems that Reduce also has the notion
of Kernel, I wonder if this idea came from Reduce first,
or Scratchpad first, or both individually.)

Now, what is interesting to me from this paper is,
"weak-reference hash table".

If an entry in "weak-reference hash table" is not referenced
elsewhere, then garbage collector can recycle that.

I believe this can help to solve this issue for fricas:
we can't GC kernels from cache. Once a kernel is entered and
cached into the system, it stays there forever.
So for a long computation session, the kernel cache keeps
growing, and can slowdown the following computation.
(For example, we compute a thousand integrals in one session.
We can work around that by issuing ")clear", but let GC
taking care of it automatically is better and the correct
thing to do.)

Luckily, SBCL provides :weakness to "make-hash-table".


Our current Kernel cache implementation, if I understand correctly,
it was a linear list at frist, then Waldek optimize to Array to
allow binary search, then Kurt and I find that there's no
well defined order so binary search is unsound, so linear search
is back.

So changing current Array implementation to Lisp hashtable
should not have much performance impact, if any, it should
be faster.


It surprises me how well the abstraction and encapsulation
is provided by fricas. To change such a key component
is very quick and painless. This is just a proof of concept,
the "keys" call should be optimized into "with-hash-table-iterator".
And 10 tests are failed, probably due to the order of
kernels/expression is not the same as before.


So with the patch in attachment, I can confirm that unused kernels
(for unused kernels, I mean stuff like [sin x for x in 1..1000])
can be recycled by GC, in *spad* code (compiler) but not in interpreter.
I've already disabled history by ")set history off". I wonder
are there variables reference values computed in interpreter?

- Qian
poc-hashtable-kernel.patch

Waldek Hebisch

unread,
May 10, 2026, 12:42:36 AMMay 10
to fricas...@googlegroups.com
It is a bit more complicated: we keep things "as ordered as
we can" and in case of cache hit we usually should get result
from binary search. Linear search is only used to insert things
and in case of troubles with ordering.

> So changing current Array implementation to Lisp hashtable
> should not have much performance impact, if any, it should
> be faster.

Correcntess and speed very much depend on hash function used.
To fully and correctly use hash table we need a hash
function which respect mathematical equality. I was unable
to find such hash function.

I do not see how Lisp EQ hash table could help us: mathematicaly
equal expressions may be EQ but also may be not. So if we do
not get hit we need to do linear search. While cache hits
may be faster, I would expect much more cases when we need
linear search (actually, it is not clear to me if we will
get _any_ hits).

For purpose of garbage collecting kernels if would rather
investigate weak references.

> It surprises me how well the abstraction and encapsulation
> is provided by fricas. To change such a key component
> is very quick and painless. This is just a proof of concept,
> the "keys" call should be optimized into "with-hash-table-iterator".
> And 10 tests are failed, probably due to the order of
> kernels/expression is not the same as before.

I interrupted 'jet.input' after 16 minutes, I do not know if it
was just slow and would eventually finish (normally on the same
machine the whole testsuite finishes after 25 seconds of real
time).

Changed order does not look good, while we do not have total
order, some ordering relations were prdictible and apparantly
our code depends on them.

In 'integ' input the change apparently broke convertion of
'%iint' to 'polylog'.

--
Waldek Hebisch

Qian Yun

unread,
May 10, 2026, 1:06:49 AMMay 10
to fricas...@googlegroups.com
On 5/10/26 12:42 PM, Waldek Hebisch wrote:
>
> I do not see how Lisp EQ hash table could help us: mathematicaly
> equal expressions may be EQ but also may be not. So if we do
> not get hit we need to do linear search. While cache hits
> may be faster, I would expect much more cases when we need
> linear search (actually, it is not clear to me if we will
> get _any_ hits).
>

The "EQ hashtable" is a little misleading: the insertion
of new kernels is still a linear search, just like current
situation. The comparison are done via "kerEqual", not EQ.


>
> I interrupted 'jet.input' after 16 minutes, I do not know if it
> was just slow and would eventually finish (normally on the same
> machine the whole testsuite finishes after 25 seconds of real
> time).
>

Did you do a fresh build? What's your SBCL version?
On my side, "make check" returns quickly with 10 failed tests,
no hangup.

- Qian

Waldek Hebisch

unread,
May 10, 2026, 8:31:10 AMMay 10
to fricas...@googlegroups.com
On Sun, May 10, 2026 at 01:06:44PM +0800, Qian Yun wrote:
> On 5/10/26 12:42 PM, Waldek Hebisch wrote:
> >
> > I do not see how Lisp EQ hash table could help us: mathematicaly
> > equal expressions may be EQ but also may be not. So if we do
> > not get hit we need to do linear search. While cache hits
> > may be faster, I would expect much more cases when we need
> > linear search (actually, it is not clear to me if we will
> > get _any_ hits).
> >
>
> The "EQ hashtable" is a little misleading: the insertion
> of new kernels is still a linear search, just like current
> situation. The comparison are done via "kerEqual", not EQ.

Well, currently in 'kernelEnterInCache' we first try binary
search. AFAICS this handles majority of calls. Only when
this fails we do linear search. Doing linear search
always is likely to slow down things. It is not clear to
me if smaller cache is enough to compensate this.

Also, there are potential traps when doing linear search
in a hash table. Namely, comparisons may insert new kernels.
Current code contains special cases so that insertion
douring search is handled correctly.

> > I interrupted 'jet.input' after 16 minutes, I do not know if it
> > was just slow and would eventually finish (normally on the same
> > machine the whole testsuite finishes after 25 seconds of real
> > time).
> >
>
> Did you do a fresh build?

Yes.

> What's your SBCL version?

Debian sbcl 2.2.9

> On my side, "make check" returns quickly with 10 failed tests,
> no hangup.

--
Waldek Hebisch

Qian Yun

unread,
May 10, 2026, 10:50:10 AMMay 10
to fricas...@googlegroups.com
On 5/10/26 8:31 PM, Waldek Hebisch wrote:
>
> Well, currently in 'kernelEnterInCache' we first try binary
> search. AFAICS this handles majority of calls. Only when
> this fails we do linear search. Doing linear search
> always is likely to slow down things. It is not clear to
> me if smaller cache is enough to compensate this.
>
Yes, for new kernels, both methods need to linear search
all cache. For existing kernels, current method is much faster.

A benchmark: src/input/mapleok.input, contains 248 integrals.
Current time: 13.94s ; hashtable: 13.39s.
Number of kernel cache drops from 4876 to 355.
")trace SCACHE )timer )nt" shows that over 8 seconds is spent
in SCACHE, is that accurate?!

For even longer sessions, hashtable method may have advantage.

But we can have the best of both worlds: when the array cache
needs to expand (or hitting some threshold, or triggered manually),
use sb-ext:make-weak-pointer on the kernels, do a GC,
create the smaller cache again. I'll report back.

- Qian

Qian Yun

unread,
May 10, 2026, 11:11:09 PMMay 10
to fricas...@googlegroups.com
On 5/10/26 10:50 PM, Qian Yun wrote:
>
> But we can have the best of both worlds: when the array cache
> needs to expand (or hitting some threshold, or triggered manually),
> use sb-ext:make-weak-pointer on the kernels, do a GC,
> create the smaller cache again. I'll report back.
>
> - Qian
>

Naively using weak pointer will get trouble, this might also
explain the errors met by the previous hashtable approach:
say you are doing integrals, the program may just construct
a useful kernel, and enter it to cache, but it is not referenced
by any expression, then it can get recycled by GC, causing
failures later.

I wonder if it is safe to recycled kernels at top level
interpreter, manually, between 2 commands. If so, we can
automate the process by running this kernel recycle function
via something like a hook.

In the attachment is my draft version, I haven't tested it
extensively yet.

- Qian
poc-recycle-kernel-cache-v2.patch

Waldek Hebisch

unread,
May 10, 2026, 11:43:18 PMMay 10
to fricas...@googlegroups.com
On Mon, May 11, 2026 at 11:11:04AM +0800, Qian Yun wrote:
> On 5/10/26 10:50 PM, Qian Yun wrote:
> >
> > But we can have the best of both worlds: when the array cache
> > needs to expand (or hitting some threshold, or triggered manually),
> > use sb-ext:make-weak-pointer on the kernels, do a GC,
> > create the smaller cache again. I'll report back.
> >
> > - Qian
> >
>
> Naively using weak pointer will get trouble, this might also
> explain the errors met by the previous hashtable approach:
> say you are doing integrals, the program may just construct
> a useful kernel, and enter it to cache, but it is not referenced
> by any expression, then it can get recycled by GC, causing
> failures later.

I do not think this is a problem. Either kernel is reachable
from some variable (possibly local variable in a function) or
is useless. GC should not collect things reachable from
variables. Of course, weak pointer should be used for
reference from kernel cache to actual kernel. All other
places should just work with kernels.

A different problem needs to be handled: GC can happen
essentially at any time and cause kernel to "vanish".
But binary search uses comparisons so needs something
to work correctly (linear search could just skip
dangling weak pointers).

--
Waldek Hebisch

Qian Yun

unread,
May 11, 2026, 12:28:31 AMMay 11
to fricas...@googlegroups.com
OK, maybe you are right. The failure I saw happens in
"insertInCache". Maybe during "insertInCache", GC happens
and "compactCache" is called and modifies cache array,
causing "insertInCache" to fail.

I'll try your "allow dangling weak pointers" idea.

- Qian

Qian Yun

unread,
May 11, 2026, 12:53:56 AMMay 11
to fricas...@googlegroups.com
On 5/11/26 12:28 PM, Qian Yun wrote:
>
> OK, maybe you are right. The failure I saw happens in
> "insertInCache". Maybe during "insertInCache", GC happens
> and "compactCache" is called and modifies cache array,
> causing "insertInCache" to fail.
>
> I'll try your "allow dangling weak pointers" idea.
>
> - Qian
>

This should be the root cause, the timing of this kernel
cache recycle should not happen during kernel insertion.

See my attachment, this patch does not break any regression
test, and speeds up src/input/mapleok.input by 25%!

And there is room for improvement, the parameters can be
tweaked.

- Qian
poc-recycle-kernel-cache-v3.patch

Waldek Hebisch

unread,
May 14, 2026, 10:03:51 AMMay 14
to fricas...@googlegroups.com
It looks promising. I think it needs more stress testing,
for example cases when there is really a lot of kernels.
And portability to other Lisp-s.

--
Waldek Hebisch

Qian Yun

unread,
May 16, 2026, 7:59:06 AMMay 16
to fricas-devel
Good news!

Answer to myself:

In interpreter, all the computed value is stored in BOOT variable
$frameRecord, it is used to implement ")undo".
(This is found out by sb-ext:search-roots.)

I use ")undo" very little. At least we should have a setting
to toggle the ability of ")undo", just like ")history".

So when this mechanism is disabled, we can further decrease the
number of kernels in cache during a session.

- Qian

Qian Yun

unread,
May 18, 2026, 10:48:13 AM (12 days ago) May 18
to fricas-devel
On 5/16/26 7:59 PM, Qian Yun wrote:
> Good news!
>
> Answer to myself:
>
> In interpreter, all the computed value is stored in BOOT variable
> $frameRecord, it is used to implement ")undo".
> (This is found out by sb-ext:search-roots.)
>
> I use ")undo" very little. At least we should have a setting
> to toggle the ability of ")undo", just like ")history".
>
> So when this mechanism is disabled, we can further decrease the
> number of kernels in cache during a session.

It can be disabled by:

)boot $undoFlag := false

- Qian
Reply all
Reply to author
Forward
0 new messages