Hard crashes and swap issues

Louis Epstein

Aug 20, 2023, 10:40:29 PM

For some time now, when I run synth, the lang/rust build
never completes: it runs out of swap space, and over a
dozen other ports that wait for it to get built are
skipped.

In the last few days, however, running out of swap space
(I have 19G) has not only caused a failure of rust or
a termination of synth, but made other applications terminate
("unable to recover memory") and logged me out, and when
I log in I discover that programs such as httpd and sendmail
are no longer running...

Furthermore, when I try a shutdown to reboot, it doesn't
complete the process; it just sits after syncer is finished
and never powers down from shutdown -p
(and despite all the controlled shutdowns that it has
gone through, when I manually power down and up I have
to fsck everything).

-=-=-
The World Trade Center towers MUST rise again,
at least as tall as before...or terror has triumphed.

Louis Epstein

Aug 22, 2023, 2:06:55 PM

Louis Epstein <l...@main.lekno.ws> wrote:
> For some time now when I run synth builds lang/rust
> never completes,running out of swap space and making
> over a dozen other ports that wait for it to get
> built be skipped.
>
> In the last few days,however,the swap space (I have 19G)
> running out has not only caused a failure of rust or
> a termination of synth,but made other applications terminate
> ("unable to recover memory") and logged me out,and when
> I log in I discover that programs such as httpd and sendmail
> are no longer running...
>
> and furthermore,when I try a shutdown to reboot,it doesn't
> complete the process,just sits after syncer is finished
> and never powers down from shutdown -p
> (and despite all the controlled shutdowns that it has
> gone through,when I manually power down and up I have
> to fsck everything).

shutdown -r also stops cold after sync-ing.

Louis Epstein

Aug 23, 2023, 2:14:40 PM

Louis Epstein <l...@main.lekno.ws> wrote:
> Louis Epstein <l...@main.lekno.ws> wrote:
>> For some time now when I run synth builds lang/rust
>> never completes,running out of swap space and making
>> over a dozen other ports that wait for it to get
>> built be skipped.
>>
>> In the last few days,however,the swap space (I have 19G)
>> running out has not only caused a failure of rust or
>> a termination of synth,but made other applications terminate
>> ("unable to recover memory") and logged me out,and when
>> I log in I discover that programs such as httpd and sendmail
>> are no longer running...
>>
>> and furthermore,when I try a shutdown to reboot,it doesn't
>> complete the process,just sits after syncer is finished
>> and never powers down from shutdown -p
>> (and despite all the controlled shutdowns that it has
>> gone through,when I manually power down and up I have
>> to fsck everything).
>
> shutdown -r also stops cold after sync-ing.

And killing the rust build before the shutdown when 7% of
swap space is in use still leads to an "unable to get swap
space" message, just as killing it when 99.9% of swap space
is in use does.

The swp_pager message comes after the "all buffers synced"
message, and I note a message above that saying that rc.shutdown
had an abnormal termination and that it's going to single-user
mode, but then it never gets there.

Winston

Aug 24, 2023, 2:03:55 AM

Louis Epstein <l...@main.lekno.ws> wrote:
>>> In the last few days,however,the swap space (I have 19G)
>>> running out has not only caused a failure of rust or
>>> a termination of synth,but made other applications terminate
>>> ("unable to recover memory") and logged me out,and when
>>> I log in I discover that programs such as httpd and sendmail
>>> are no longer running...

and later added:

> And killing the rust build when 7% of swap space is in use
> still leads to an unable to get swap space message just
> as killing it when 99.9% of swap space is in use before
> the shutdown.

Wild stab in the dark ...

If you're using tmpfs and it's an unknown tmpfs bug,
perhaps disable tmpfs and see if that makes a difference?
-WBE

Louis Epstein

Sep 23, 2023, 2:13:41 AM

Will have to consider that; tmpfs is certainly
shown on a pre-shutdown df with lots of synth material.

The problem had gone away for a while but returned recently.

Louis Epstein

Oct 14, 2023, 1:07:35 AM

Now it may indeed be gone...after a rust config-option
deletion.

Louis Epstein

Oct 27, 2023, 8:28:07 PM

Louis Epstein <l...@main.lekno.ws> wrote:
> Louis Epstein <l...@main.lekno.ws> wrote:
>> Winston <w...@ubeblock.psr.com.invalid> wrote:
>>> Louis Epstein <l...@main.lekno.ws> wrote:
>>>>>> In the last few days,however,the swap space (I have 19G)
>>>>>> running out has not only caused a failure of rust or
>>>>>> a termination of synth,but made other applications terminate
>>>>>> ("unable to recover memory") and logged me out,and when
>>>>>> I log in I discover that programs such as httpd and sendmail
>>>>>> are no longer running...
>>>
>>> and later added:
>>>> And killing the rust build when 7% of swap space is in use
>>>> still leads to an unable to get swap space message just
>>>> as killing it when 99.9% of swap space is in use before
>>>> the shutdown.
>>>
>>> Wild stab in the dark ...
>>>
>>> If you're using tmpfs and it's an unknown tmpfs bug,
>>> perhaps disable tmpfs and see if that makes a difference?
>>> -WBE
>>
>> Will have to consider that,tmpfs is certainly
>> shown on a pre-shutdown df with lots of synth material.
>>
>> The problem had gone away for a while but returned recently.
>
> Now it may indeed be gone...after a rust config-option
> deletion.

Now it seems to have returned...

bob prohaska

Oct 27, 2023, 11:17:17 PM

Louis Epstein <l...@main.lekno.ws> wrote:
>
> Now it seems to have returned...
>

Is there a description of your setup somewhere? I didn't see it.

bob prohaska

Louis Epstein

Oct 29, 2023, 1:43:16 PM

What details do you need?
I'm running 13.2-RELEASE-p4 on an Intel i7-3770K.
8G DDR3 RAM and 19G swap that gets swallowed by these crashes.

bob prohaska

Oct 29, 2023, 8:42:00 PM

Louis Epstein <l...@main.lekno.ws> wrote:
> bob prohaska <b...@www.zefox.net> wrote:
>> Louis Epstein <l...@main.lekno.ws> wrote:
>>>
>>> Now it seems to have returned...
>>>
>>
>> Is there a description of your setup somewhere? I didn't see it.
>
> What details do you need?
> I'm running 13.2-RELEASE-p4 on an Intel i7-3770K.
> 8G DDR3 RAM and 19G swap that gets swallowed by these crashes.
>

Oops, I won't do you much good. The problems described sounded a bit
like me trying to compile large ports (think www/chromium) on Raspberry
Pi systems. I thought perhaps there might be some similarities, but
that seems most unlikely.

Apologies for intruding,

bob prohaska

Winston

Oct 30, 2023, 10:09:52 AM

Louis Epstein <l...@main.lekno.ws> wrote:
>>>>> In the last few days,however,the swap space (I have 19G)
>>>>> running out has not only caused a failure of rust or
>>>>> a termination of synth,but made other applications terminate
>>>>> ("unable to recover memory") and logged me out,and when
>>>>> I log in I discover that programs such as httpd and sendmail
>>>>> are no longer running...

and later added:
>>> And killing the rust build when 7% of swap space is in use
>>> still leads to an unable to get swap space message just
>>> as killing it when 99.9% of swap space is in use before
>>> the shutdown.

to which I replied:

>> Wild stab in the dark ...

>> If you're using tmpfs and it's an unknown tmpfs bug,
>> perhaps disable tmpfs and see if that makes a difference?

Louis Epstein <l...@main.lekno.ws> then replied:

> Will have to consider that,tmpfs is certainly
> shown on a pre-shutdown df with lots of synth material.

> The problem had gone away for a while but returned recently.

Did you ever try the experiment of disabling tmpfs to see if
that helps?

FWIW, your issue sounds similar to the title of bug 219399
from 2017 which is still Open:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399

In that bug, though, the problem was seen with an early AMD Ryzen
processor, and (skimming through the discussion), it seems like
the problem was a hardware problem where temperature caused
various RAM and CPU errors.
-WBE

Louis Epstein

Oct 31, 2023, 12:50:31 PM

Winston <w...@ubeblock.psr.com.invalid> wrote:
> Louis Epstein <l...@main.lekno.ws> wrote:
>>>>>> In the last few days,however,the swap space (I have 19G)
>>>>>> running out has not only caused a failure of rust or
>>>>>> a termination of synth,but made other applications terminate
>>>>>> ("unable to recover memory") and logged me out,and when
>>>>>> I log in I discover that programs such as httpd and sendmail
>>>>>> are no longer running...
>
> and later added:
>>>> And killing the rust build when 7% of swap space is in use
>>>> still leads to an unable to get swap space message just
>>>> as killing it when 99.9% of swap space is in use before
>>>> the shutdown.
>
> to which I replied:
>>> Wild stab in the dark ...
>
>>> If you're using tmpfs and it's an unknown tmpfs bug,
>>> perhaps disable tmpfs and see if that makes a difference?
>
> Louis Epstein <l...@main.lekno.ws> then replied:
>> Will have to consider that,tmpfs is certainly
>> shown on a pre-shutdown df with lots of synth material.
>
>> The problem had gone away for a while but returned recently.
>
> Did you ever try the experiment of disabling tmpfs to see if
> that helps?

Given that using tmpfs is supposed to speed things up, and
the synth builds already take hours and hours when the
batch is large, and the swap-swallows prevent using the
computer for anything else, I was cautious about this.

Is there a way to dismount the /tmpfs entries of df
that are there when I kill synth before the swap-swallow
has crashed the system, so I know if they play a part in
the failure of the shutdown program to execute completely?
shutdown -h and shutdown -p both end up in a circle.

> FWIW, your issue sounds similar to the title of bug 219399
> from 2017 which is still Open:
>
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=219399
>
> In that bug, though, the problem was seen with an early AMD Ryzen
> processor, and (skimming through the discussion), it seems like
> the problem was a hardware problem where temperature caused
> various RAM and CPU errors.
> -WBE

Winston

Nov 1, 2023, 10:40:54 AM

As part of the thread, I asked:

>> Did you ever try the experiment of disabling tmpfs to see if
>> that helps?

to which Louis Epstein <l...@main.lekno.ws> responded:

> Given that using tmpfs is supposed to speed things up and
> the synth builds already take hours and hours when the
> batch is large and the swap-swallows prevent using the
> computer for anything else,I was cautious about this.

> Is there a way to dismount the /tmpfs entries of df
> that are there when I kill synth before the swap-swallow
> has crashed the system, so I know if they play a part in
> the failure of the shutdown program to execute completely?
> shutdown -h and shutdown -p both end up in a circle.

I'm not confident I understand what you're asking, but
perhaps one or more of the following address your question.

While I would think that disabling tmpfs is the only
definitive way to test whether it's a tmpfs bug, if you
don't wish to disable tmpfs:

* You could run pstat or swapinfo from time to time to see
how much swap space is in use.

* If you have >1 swap area, you could consider suspending
the build and then using swapoff to remove one of them.
It'll take a while to move the active content to the
remaining swap space. Then use swapon to re-add it.
Hopefully, tmpfs handles swap space addition and removal
well.

See if the reported swap space used changes significantly.
If so, maybe it's a tmpfs bug (not freeing no-longer-used
swap space).

* protect(1) can be used to keep processes from being killed
when swap space runs out, but, as the man page warns,
that can also lead to the system hanging instead of
crashing.

* Maybe check temperatures from time to time while the
build is running, too.

* Periodic logging: maybe write a trivial shell loop to run
every few seconds or so to write the output of swapinfo,
df -h /tmp, temperatures, etc. to a file which you could
examine after rebooting.
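
A minimal sketch of the periodic-logging idea in plain sh follows. It
is only an illustration under assumptions: the names LOGFILE, INTERVAL,
run_if_present, log_snapshot and log_loop are made up here, the log
path is arbitrary (it just needs to live on a disk filesystem rather
than tmpfs so that it survives the reboot), and the temperature sysctl
assumes coretemp(4) or amdtemp(4) is loaded.

```sh
#!/bin/sh
# Sketch of a periodic resource logger. All names below are illustrative.

LOGFILE="${LOGFILE:-/var/tmp/synth-resources.log}"   # keep this off tmpfs
INTERVAL="${INTERVAL:-10}"                           # seconds between samples

# Run a command if it is installed; never let a failure stop the logger.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1 || echo "($1 exited with an error)"
    else
        echo "($1 not available)"
    fi
}

# Append one timestamped snapshot to $LOGFILE.
log_snapshot() {
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') ==="
        run_if_present swapinfo -h
        run_if_present df -h /tmp
        # Assumes coretemp(4) (Intel) or amdtemp(4) (AMD) is loaded.
        run_if_present sysctl dev.cpu.0.temperature
    } >> "$LOGFILE"
}

# Take $1 snapshots, or loop until interrupted when $1 is 0 or omitted.
log_loop() {
    count="${1:-0}"
    taken=0
    while [ "$count" -eq 0 ] || [ "$taken" -lt "$count" ]; do
        log_snapshot
        taken=$((taken + 1))
        sleep "$INTERVAL"
    done
}

# Start logging only when the script is invoked with the argument "run".
if [ "${1:-}" = "run" ]; then
    log_loop
fi
```

Saved under any name (say swaplog.sh, a hypothetical name) and started
in a second terminal with "sh swaplog.sh run" before the build, it
leaves a log whose last few entries show how swap and /tmp usage were
moving just before the machine went down.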

If swap space doesn't grow gradually, it starts to look
more like a bit error or really-hard-to-find bug.

In the case of bug 219399, small increases in voltages
reduced, but didn't eliminate, bit errors caused by heat.

HTH,
-WBE

Winston

Nov 1, 2023, 1:13:34 PM

Also in the "in case it's a hardware problem" vein,
maybe also record voltages if you're able to.
-WBE

Louis Epstein

Nov 4, 2023, 12:21:33 PM

Winston <w...@ubeblock.psr.com.invalid> wrote:
> As part of the thread, I asked:
>>> Did you ever try the experiment of disabling tmpfs to see if
>>> that helps?
>
> to which Louis Epstein <l...@main.lekno.ws> responded:
>> Given that using tmpfs is supposed to speed things up and
>> the synth builds already take hours and hours when the
>> batch is large and the swap-swallows prevent using the
>> computer for anything else,I was cautious about this.
>
>> Is there a way to dismount the /tmpfs entries of df
>> that are there when I kill synth before the swap-swallow
>> has crashed the system, so I know if they play a part in
>> the failure of the shutdown program to execute completely?
>> shutdown -h and shutdown -p both end up in a circle.
>
> I'm not confident I understand what you're asking, but
> perhaps one or more of the following address your question.
>
> While I would think that disabling tmpfs is the only
> definitive way to test whether it's a tmpfs bug, if you
> don't wish to disable tmpfs:
>
> * You could run pstat or swapinfo from time to time to see
> how much swap space is in use.

The "dashboard" on synth includes a steadily updating
percentage of how much swap space is in use...I also
run top on another terminal session at times to see
the figure there.

> * If you have >1 swap area, you could consider suspending
> the build and then using swapoff to remove one of them.

I don't think synth takes an interrupt.

> It'll take a while to move the active content to the
> remaining swap space. Then use swapon to re-add it.
> Hopefully, tmpfs handles swap space addition and removal
> well.
>
> See if the reported swap space used changes significantly.
> If so, maybe it's a tmpfs bug (not freeing no-longer-used
> swap space).
>
> * protect(1) can be used to keep processes from being killed
> when swap space runs out, but, as the man page warns,
> that can also lead to the system hanging instead of
> crashing.

But would this solve what makes shutdown unable to complete
after I have killed synth between errors but before the
whole system has gone down?

> * Maybe check temperatures from time to time while the
> build is running, too.
>
> * Periodic logging: maybe write a trivial shell loop to run
> every few seconds or so to write the output of swapinfo,
> df -h /tmp, temperatures, etc. to a file which you could
> examine after rebooting.
>
> If swap space doesn't grow gradually, it starts to look
> more like a bit error or really-hard-to-find bug.
>
> In the case of bug 219399, small increases in voltages
> reduced, but didn't eliminate, bit errors caused by heat.
>
> HTH,
> -WBE

Louis Epstein

Nov 21, 2023, 12:17:09 AM

Further experimentation has revealed that while synth is still
running before the system crashes, doing a kill -9 rather than
plain kill saves the shutdown program from becoming unable to
complete, and doing umount -a before a shutdown after it HAS
crashed doesn't save the shutdown program or remove the /tmpfs
entries from df, but it DOES avoid all the fsck'ing of the drives
after the power-cycling.

Winston

Nov 25, 2023, 4:31:12 PM

I'm still mostly just taking stabs in the dark, but here goes ...

As part of this thread, I previously suggested:

>> * You could run pstat or swapinfo from time to time to see
>> how much swap space is in use.

to which Louis Epstein <l...@main.lekno.ws> replied:

> The "dashboard" on synth includes a steadily updating
> percentage of how much swap space is in use...I also
> run top on another terminal session at times to see
> the figure there.

I misread that for a long time. I read "steadily decreasing"
rather than what you actually wrote: steadily updating.

So, is free swap space decreasing as the build progresses? If so,
does df /tmp steadily increase to match? If it does, check
what's filling up /tmp(fs).

Louis followed up what I quoted above by adding:

> Further experimentation has revealed that while synth is
> still running before the system crashes, doing a kill -9
> rather than plain kill saves the shutdown program from
> becoming unable to complete, and doing umount -a before a
> shutdown after it HAS crashed doesn't save the shutdown
> program or remove the /tmpfs entries from df,but it DOES
> avoid all the fsck'ing of the drives after the power-cycling.

Since part of shutting down is syncing the vnodes, if it
doesn't get that far, it's not surprising that a fsck could be
needed. That's probably why the umount helps. I'm a little
surprised that umount /tmp(fs) wouldn't remove the df entry,
though.

After rebooting, can you resume the build from the point of the
crash (and does it complete?), or do you start over?

I note that /sbin/shutdown runs /usr/bin/wall and execs
/sbin/reboot, /sbin/halt, etc. In those cases where your
system is out of swap space, that may contribute to shutdown
not working, ... Even a subroutine call that needs heap space
and can't get it could cause trouble.

But, several articles ago, you wrote:
>>>> And killing the rust build when 7% of swap space is in
>>>> use still leads to an unable to get swap space message
>>>> just as killing it when 99.9% of swap space is in use
>>>> before the shutdown.

So then it's not strictly an out-of-swap-space issue.

This from the "man tmpfs" might be relevant:

Metadata, including the directory content, is never
swapped out by the current implementation. Keep this
in mind when planning the mount limits, especially when
expecting to place many small files on a tmpfs mount.

I can imagine a case where a smaller RAM is full from the
combination of locked memory, active processes, and lots of
tmpfs file metadata, even when there's unused swap space. As
such a situation develops, I would think there'd be a *lot* of
swapping of whichever processes are still swappable, making
things really slow.

Trying to help, but not confident I'm succeeding, :)
-WBE

Louis Epstein

Dec 4, 2023, 12:02:07 PM

Winston <w...@ubeblock.psr.com.invalid> wrote:
> I'm still mostly just taking stabs in the dark, but here goes ...
>
> As part of this thread, I previously suggested:
>>> * You could run pstat or swapinfo from time to time to see
>>> how much swap space is in use.
>
> to which Louis Epstein <l...@main.lekno.ws> replied:
>> The "dashboard" on synth includes a steadily updating
>> percentage of how much swap space is in use...I also
>> run top on another terminal session at times to see
>> the figure there.
>
> I misread that for a long time. I read "steadily decreasing"
> rather than what you actually wrote: steadily updating.
>
> So, is swap space decreasing as the build progresses? If so,
> does df /tmp steadily increase to match? If it does, check
> what's filling up /tmp(fs).

Not quite linearly but it was ultimately reaching 99.9% swap
in use before the crashes.

The df has shown tons of things in tmpfs.

I just got rust to build overnight after increasing the
swap space on the machine.

> Louis followed up what I quoted above by adding:
>> Further experimentation has revealed that while synth is
>> still running before the system crashes, doing a kill -9
>> rather than plain kill saves the shutdown program from
>> becoming unable to complete, and doing umount -a before a
>> shutdown after it HAS crashed doesn't save the shutdown
>> program or remove the /tmpfs entries from df,but it DOES
>> avoid all the fsck'ing of the drives after the power-cycling.
>
> Since part of shutting down is syncing the vnodes, if it
> doesn't get that far, it's not surprising that a fsck could be
> needed. That's probably why the umount helps. I'm a little
> surprised that umount /tmp(fs) wouldn't remove the df entry,
> though.

I had not tried that, but umount -a removes the filesystems
listed in /etc/fstab, which do not include the /tmpfs entries.
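
Since the leftover tmpfs entries are not in /etc/fstab, they would have
to be found from the live mount table instead. The following is a rough
sketch of that, not something tried on this machine: it assumes the
leftovers show up in mount(8) output as ordinary "tmpfs on <path>"
lines, the function names are invented for illustration, and it only
prints the umount commands rather than running them. Mount points
containing spaces are not handled.

```sh
#!/bin/sh
# Sketch: find tmpfs mounts from the live mount table and show how they
# could be unmounted. Function names are illustrative.

# Extract the mount points of tmpfs entries from mount(8)-style output
# on stdin. Lines are expected to look like:
#   tmpfs on /some/path (tmpfs, local)
tmpfs_mount_points() {
    awk '$1 == "tmpfs" && $2 == "on" { print $3 }'
}

# Print (not run) one umount command per tmpfs mount, deepest paths
# first, so a tmpfs mounted inside another tmpfs is listed before its
# parent.
print_umount_commands() {
    tmpfs_mount_points | sort -r | while read -r mp; do
        echo "umount $mp"
    done
}

# Dry run against the live system when invoked with the argument "show".
if [ "${1:-}" = "show" ]; then
    mount -t tmpfs | print_umount_commands
fi
```

If one of the printed commands fails with "Device busy" when run by
hand, something is still using that mount (a process with open files,
or another filesystem mounted beneath it), which would itself be a
useful clue about why the shutdown stalls.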

So far the increased swap space has met the immediate need:
getting lang/rust to complete the build process.

I'm now doing a general synth prepare-system to see if it
follows through and builds all the things that would always
wait on rust to build.

Winston

Dec 5, 2023, 1:49:26 AM

Louis Epstein <l...@main.lekno.ws> wrote:
> I just got rust to build overnight after increasing the
> swap space on the machine.

> So far the increased swap space has done the immediate need,
> get lang/rust to complete the build process.

So, just to wrap things up ... Once the build was finished,
the swap space all reappeared or was accounted for by tmpfs
files?
-WBE

Louis Epstein

Dec 18, 2023, 7:48:07 PM

Synth runs now continue past the building of rust,
and complete the rebuilding of the local repository.

Winston

Dec 19, 2023, 9:44:40 AM

Louis Epstein <l...@main.lekno.ws> wrote:
>>> I just got rust to build overnight after increasing the
>>> swap space on the machine.

>>> So far the increased swap space has done the immediate need,
>>> get lang/rust to complete the build process.

prompting me to ask:

>> So, just to wrap things up ... Once the build was finished,
>> the swap space all reappeared or was accounted for by tmpfs
>> files?

Louis Epstein <l...@main.lekno.ws> replied:

> Synth runs now continue past the building of rust,
> and complete the rebuilding of the local repository.

OK, but my question (above) was more about whether you saw
swap + tmpfs space lost along the way (bug), or was
swap + tmpfs space all back to normal afterward (no bug)?
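
One way to check that would be to record swap and tmpfs usage before
the build and again after it has finished, then compare the two. The
sketch below assumes nothing beyond swapinfo, df and diff; the function
names and file locations are invented for illustration.

```sh
#!/bin/sh
# Sketch: record swap and tmpfs usage before and after a build and
# compare the two records. Function names are illustrative.

# Write the current swap and tmpfs usage to the file named in $1.
usage_snapshot() {
    {
        echo "## swap"
        if command -v swapinfo >/dev/null 2>&1; then
            swapinfo -k 2>&1 || echo "(swapinfo failed)"
        else
            echo "(swapinfo not available)"
        fi
        echo "## tmpfs"
        df -k -t tmpfs 2>&1 || echo "(no tmpfs mounts reported)"
    } > "$1"
}

# Show what changed between two snapshots. diff exits non-zero whenever
# the files differ, which is expected here, so that status is ignored.
compare_snapshots() {
    diff -u "$1" "$2" || true
}
```

For example: usage_snapshot /var/tmp/before.txt just before starting
synth, usage_snapshot /var/tmp/after.txt once it has finished and the
machine is idle, then compare_snapshots on the two files. Some swap
staying in use afterward is not by itself proof of a leak, since pages
that were swapped out are only brought back when needed; tmpfs mounts
that remain with data in them after synth has exited would be the more
telling sign.
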
-WBE

Louis Epstein

Dec 25, 2023, 3:15:24 AM

I don't think I checked the tmpfs closely enough to answer.