I'm still mostly just taking stabs in the dark, but here goes ...
As part of this thread, I previously suggested:
>> * You could run pstat or swapinfo from time to see how much
>> swap space is in use.
to which Louis Epstein <l...@main.lekno.ws> replied:
> The "dashboard" on synth includes a steadily updating
> percentage of how much swap space is in use...I also
> run top on another terminal session at times to see
> the figure there.
I misread that for a long time. I read "steadily decreasing"
rather than what you actually wrote: steadily updating.
So, is free swap space decreasing as the build progresses? If so,
does the usage reported by df /tmp steadily increase to match? If
it does, check what's filling up /tmp(fs).
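If it helps, something like the rough Python sketch below could log both
numbers side by side, so the two trends can be compared after a crash. It
assumes that swapinfo -k prints a header line followed by one line per swap
device, plus a final "Total" line when there is more than one device. The
function names and the 30-second interval are my own choices, and I haven't
tried this on your system.

```python
import shutil
import subprocess
import time


def parse_swapinfo(text):
    """Return (used_kb, total_kb) from `swapinfo -k` output.

    Assumes a header line, then one line per device, with a final
    "Total" line when more than one device is configured.
    """
    rows = [ln.split() for ln in text.strip().splitlines()[1:] if ln.split()]
    if not rows:
        return (0, 0)  # no swap devices listed
    last = rows[-1]  # the "Total" line, or the only device
    return (int(last[2]), int(last[1]))


def sample(path="/tmp"):
    """Take one reading of swap use and of space used on `path`."""
    out = subprocess.run(["swapinfo", "-k"], capture_output=True,
                         text=True, check=True).stdout
    swap_used, swap_total = parse_swapinfo(out)
    tmp_used_kb = shutil.disk_usage(path).used // 1024
    return swap_used, swap_total, tmp_used_kb


def watch(interval=30, path="/tmp"):
    """Print a timestamped reading every `interval` seconds until killed."""
    while True:
        used, total, tmp_used = sample(path)
        print(time.strftime("%H:%M:%S"),
              f"swap {used}/{total} KB",
              f"{path} {tmp_used} KB", flush=True)
        time.sleep(interval)
```

Calling watch() in a spare terminal, with the output redirected to a file
on a real disk rather than on /tmp, should leave a record that survives
the reboot.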
Louis followed up what I quoted above by adding:
> Further experimentation has revealed that while synth is
> still running before the system crashes, doing a kill -9
> rather than plain kill saves the shutdown program from
> becoming unable to complete, and doing umount -a before a
> shutdown after it HAS crashed doesn't save the shutdown
> program or remove the /tmpfs entries from df,but it DOES
> avoid all the fsck'ing of the drives after the power-cycling.
Since part of shutting down is syncing the vnodes, if it
doesn't get that far, it's not surprising that a fsck could be
needed. That's probably why the umount helps. I'm a little
surprised that umount /tmp(fs) wouldn't remove the df entry,
though.
After rebooting, can you resume the build from the point of the
crash (and does it complete?), or do you start over?
I note that /sbin/shutdown runs /usr/bin/wall and execs
/sbin/reboot, /sbin/halt, etc. In those cases where your
system is out of swap space, failing to start one of those
programs may contribute to shutdown not working. Even a
subroutine call that needs heap space and can't get it could
cause trouble.
But, several articles ago, you wrote:
>>>> And killing the rust build when 7% of swap space is in
>>>> use still leads to an unable to get swap space message
>>>> just as killing it when 99.9% of swap space is in use
>>>> before the shutdown.
So then it's not strictly an out-of-swap-space issue.
This from "man tmpfs" might be relevant:
    Metadata, including the directory content, is never
    swapped out by the current implementation. Keep this
    in mind when planning the mount limits, especially when
    expecting to place many small files on a tmpfs mount.
I can imagine a case where a machine with less RAM fills up
from the combination of locked memory, active processes, and
lots of tmpfs file metadata, even when there's unused swap
space. As such a situation develops, I would think there'd be
a *lot* of swapping of whichever processes are still swappable,
making things really slow.
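To see whether the build really is leaving lots of small files on the
tmpfs mount, a sketch along these lines might help. It counts entries and
bytes under each top-level item in /tmp, so a directory with a huge entry
count but little data would stand out. The names summarize and report are
made up for illustration, and the entry count is only a rough stand-in for
metadata use, not a measurement of it.

```python
import os


def summarize(root="/tmp"):
    """Return {top_level_name: (entry_count, total_bytes)} for `root`."""
    summary = {}
    for entry in os.scandir(root):
        count, size = 1, 0  # count the top-level entry itself
        try:
            if entry.is_dir(follow_symlinks=False):
                for dirpath, dirnames, filenames in os.walk(entry.path):
                    count += len(dirnames) + len(filenames)
                    for name in filenames:
                        try:
                            size += os.lstat(os.path.join(dirpath, name)).st_size
                        except OSError:
                            pass  # file vanished mid-scan; skip it
            else:
                size = entry.stat(follow_symlinks=False).st_size
        except OSError:
            continue  # entry vanished or is unreadable
        summary[entry.name] = (count, size)
    return summary


def report(root="/tmp", top=10):
    """Print the `top` top-level entries holding the most files and dirs."""
    rows = sorted(summarize(root).items(),
                  key=lambda kv: kv[1][0], reverse=True)
    for name, (count, size) in rows[:top]:
        print(f"{count:>10} entries {size:>14} bytes  {name}")
```

Running report() a few times while the build is going, and watching which
line grows, would show whether the entry count climbs much faster than the
byte total.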
Trying to help, but not confident I'm succeeding, :)
-WBE