Ageing, checkpoint, and OTEntry scans

Jonathan S. Shapiro

Mar 6, 2026, 1:45:11 PM
to cap-talk
A thought occurred to me during my last note responding to William.

One of the problems with ageing is that you don't get a lot of signals about which objects are still needed. Once you get them into memory, and any mappings are set up, they tend to stay around if you have sized the depend tables correctly and guessed right on the number of mapping tables you need. You can see activity when capabilities are explicitly invoked, but there is very little visibility into load and store behavior. You end up with (very roughly) three patterns:
  1. Short-lived stuff where a spacebank gets destroyed and you see objects explicitly freed.
  2. Long-lived stuff that is actively used and does its best to stay resident.
  3. Long-lived stuff that is seldom used, but occupies DRAM it doesn't need until it is aged out from the last ageing pool.
Coyotos now has three related background passes, and I am prompted to wonder if they can be merged. The first is ageing, the second is checkpoint, and the third is the background unswizzler. All of them are triggered by in-bound I/O rate; checkpoint can also be triggered by a watchdog timer.

In KeyKOS, the checkpoint pass involved a synchronous pass to mark all the dirty objects immutable and invalidate mapping tables. This cost is linear with DRAM size, and it induces background I/O. In Coyotos all of that is incremental. Once you make it into the kernel, the snapshot step is about ten instructions to flip the generation number and an IPI to dump the address space register contents on all CPUs to force a reload.
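
To make the constant-time claim concrete, here is a minimal C sketch of a generation-flip snapshot. It is an illustration under stated assumptions, not Coyotos source: the names ObHeader, cur_gen, take_snapshot, and prepare_for_write are hypothetical, and the IPI that forces the address space reload is omitted. The snapshot only bumps a counter; the per-object work is deferred to the first write after the flip.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-object header; field names are assumptions. */
typedef struct {
    uint32_t gen;    /* generation in which the object was last written */
    bool     dirty;  /* object has been modified since it was loaded */
} ObHeader;

/* Current snapshot generation (hypothetical global). */
uint32_t cur_gen = 0;

/* Snapshot step: constant time regardless of DRAM size.
 * A real kernel would also interrupt the other CPUs here. */
void take_snapshot(void)
{
    cur_gen++;
}

/* Called on the first write to an object after a fault.
 * Returns true when the object is dirty from an earlier generation,
 * meaning its old contents belong to the snapshot and must be
 * preserved (for example by copying) before the write proceeds. */
bool prepare_for_write(ObHeader *ob)
{
    bool must_preserve = ob->dirty && ob->gen != cur_gen;
    ob->gen = cur_gen;
    ob->dirty = true;
    return must_preserve;
}
```

In this model the linear pass disappears because an object is only examined when something actually tries to write it.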

It occurred to me just now that taking an in-memory snapshot doesn't mean that you have to write down a background checkpoint. I suspect that with very little revision you could decide to open a new checkpoint every two snapshots. Or every four. Or every eight.
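
A rough sketch of that decoupling, in C. The ratio SNAPSHOTS_PER_CHECKPOINT and the names SnapStats and on_snapshot are hypothetical, chosen only to illustrate the idea: every snapshot is counted, but only every Nth one opens a checkpoint that gets written down.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed ratio: open a checkpoint on every 4th snapshot. */
#define SNAPSHOTS_PER_CHECKPOINT 4u

/* Hypothetical bookkeeping for the two counters. */
typedef struct {
    uint64_t snapshots;    /* in-memory snapshots taken so far */
    uint64_t checkpoints;  /* snapshots that also opened a checkpoint */
} SnapStats;

/* Record one snapshot. Returns true when this snapshot should also
 * open a new on-disk checkpoint. */
bool on_snapshot(SnapStats *s)
{
    s->snapshots++;
    bool open_checkpoint = (s->snapshots % SNAPSHOTS_PER_CHECKPOINT) == 0;
    if (open_checkpoint)
        s->checkpoints++;
    return open_checkpoint;
}
```

With the assumed ratio of 4, eight snapshots yield two checkpoints; the other six are "snapshot only".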

The thing about the in-memory snapshot is that it's a really good way to find out what objects are being actively used.

Which has me wondering if ageing shouldn't be handled by a "snapshot only" checkpoint...
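
One way this might look, as a C sketch under loud assumptions: every snapshot invalidates mappings, so the first touch afterwards faults and can stamp the object with the current generation. The names AgedObject, snap_gen, snapshot_only, note_touch, and is_eviction_candidate are hypothetical, and the threshold EVICT_AGE is arbitrary.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed threshold: untouched for this many snapshots means "cold". */
#define EVICT_AGE 4u

/* Hypothetical per-object ageing state. */
typedef struct {
    uint32_t last_touch_gen;  /* snapshot generation of the last fault-in */
} AgedObject;

/* Snapshot generation counter (hypothetical global). */
uint32_t snap_gen = 0;

/* A "snapshot only" pass: flip the generation, write nothing to disk. */
void snapshot_only(void)
{
    snap_gen++;
}

/* Called from the fault path when an object is touched again after
 * its mappings were invalidated by a snapshot. */
void note_touch(AgedObject *ob)
{
    ob->last_touch_gen = snap_gen;
}

/* An object is an eviction candidate once it has gone EVICT_AGE
 * snapshots without being touched. */
bool is_eviction_candidate(const AgedObject *ob)
{
    return (snap_gen - ob->last_touch_gen) >= EVICT_AGE;
}
```

The cost of this signal is exactly the re-faulting of mappings after each snapshot, so whether it is worth it depends on how expensive that turns out to be in practice.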


Jonathan

William ML Leslie

Mar 6, 2026, 4:47:33 PM
to cap-...@googlegroups.com
On Sat, 7 Mar 2026 at 04:45, Jonathan S. Shapiro <jonathan....@gmail.com> wrote:
> It occurred to me just now that taking an in-memory snapshot doesn't mean that you have to write down a background checkpoint. I suspect that with very little revision you could decide to open a new checkpoint every two snapshots. Or every four. Or every eight.

I worry about time spent re-establishing mappings with the same misguided fervor that C programmers worry about garbage collection.  I'd like to measure it, but I don't know that I want to commit to it.

--
William ML Leslie
A tool for making incorrect guesses and generating large volumes of plausible-looking nonsense.  Who is this very useful tool for?

Jonathan S. Shapiro

Mar 7, 2026, 12:57:50 AM
to cap-...@googlegroups.com
On Fri, Mar 6, 2026 at 1:47 PM William ML Leslie <william.l...@gmail.com> wrote:
> I worry about time spent re-establishing mappings with the same misguided fervor that C programmers worry about garbage collection. I'd like to measure it, but I don't know that I want to commit to it.

That would be my main concern as well. The concern is that if you don't whack the mappings you have no data on what is actually being used. How long something has been in memory is not a great proxy for how actively it is being used. On the other hand, the reason we whack the mappings on generation six is because it gives the hot stuff time to tell us it is still being used.

I still have this itchy feeling that we could usefully merge the scan passes, but that may not be right.


Jonathan