[PATCH 1/1] issue warning on problematic system configuration

5 views
Skip to first unread message

Felix Moessbauer

unread,
Jan 18, 2024, 9:21:16 AM1/18/24
to kas-...@googlegroups.com, adriaan...@siemens.com, jan.k...@siemens.com, Felix Moessbauer
This patch adds a basic sanity check of the system configuration. For
now, we only check if the coredump files are created with a non-default
name. Later on, more checks can be added.

Signed-off-by: Felix Moessbauer <felix.mo...@siemens.com>
---
This check should help to identify hard to debug issues around
coredumps: During the build, often coredumps are written as part
of the toolings feature probing (e.g. by CMake or autotools). These
are created in the current folder and might conflict with equally
named files or directories. As this unfortunately is the default config
of Linux, we warn about it and give a hint on how to change it.


kas/libcmds.py | 28 ++++++++++++++++++++++++++++
1 file changed, 28 insertions(+)

diff --git a/kas/libcmds.py b/kas/libcmds.py
index 3c194da..2b92f23 100644
--- a/kas/libcmds.py
+++ b/kas/libcmds.py
@@ -29,6 +29,7 @@ import shutil
import os
import pprint
import configparser
+from pathlib import Path
from .libkas import (ssh_cleanup_agent, ssh_setup_agent, ssh_no_host_key_check,
get_build_environ, repos_fetch, repos_apply_patches)
from .includehandler import IncludeException
@@ -200,6 +201,32 @@ class SetupHome(Command):
shutil.copy(os.environ['AWS_WEB_IDENTITY_TOKEN_FILE'],
webid_token_file)

+ def _check_coredump_is_named(self):
+ """
+ The linux default name for coredumps is just 'core'. This likely
+ conflicts with files or directories from the image build which
+ are also named core, leading to hard to debug errors.
+ """
+ sysfs_kernel = Path('/proc/sys/kernel')
+ if not all([os.path.exists(sysfs_kernel / entry)
+ for entry in ['core_pattern', 'core_uses_pid']]):
+ logging.warning('[system check] could not check coredump pattern')
+ return
+
+ with open(sysfs_kernel / 'core_pattern', 'r') as f:
+ CORE_PATTERN = f.readline().strip()
+ with open(sysfs_kernel / 'core_uses_pid', 'r') as f:
+ CORE_USES_PID = int(f.readline().strip())
+ if CORE_PATTERN == 'core' and CORE_USES_PID == 0:
+ logging.warning(
+ '[system check] your system uses the default coredump '
+ 'name \'core\'. This may conflict with equally named files. '
+ 'See \'man 5 core\' on how to change this.')
+ return
+
+ def _check_environment(self):
+ self._check_coredump_is_named()
+
def execute(self, ctx):
if os.environ.get('NETRC_FILE', False):
shutil.copy(os.environ['NETRC_FILE'],
@@ -225,6 +252,7 @@ class SetupHome(Command):
+ os.environ.get('GIT_CREDENTIAL_USEHTTPPATH')
+ '\n')

+ self._check_environment()
self._setup_aws_creds()

ctx.environ['HOME'] = self.tmpdirname
--
2.39.2

Florian Bezdeka

unread,
Jan 18, 2024, 9:35:14 AM1/18/24
to Felix Moessbauer, kas-...@googlegroups.com, adriaan...@siemens.com, jan.k...@siemens.com
On Thu, 2024-01-18 at 15:21 +0100, 'Felix Moessbauer' via kas-devel
wrote:
> This patch adds a basic sanity check of the system configuration. For
> now, we only check if the coredump files are created with a non-default
> name. Later on, more checks can be added.
>
> Signed-off-by: Felix Moessbauer <felix.mo...@siemens.com>
> ---
> This check should help to identify hard to debug issues around
> coredumps: During the build, often coredumps are written as part
> of the toolings feature probing (e.g. by CMake or autotools). These
> are created in the current folder and might conflict with equally
> named files or directories. As this unfortunately is the default config
> of Linux, we warn about it and give a hint on how to change it.

This could / should be part of the patch description, no?
No python expert, but this looks platform (Linux) specific. I'm quite
sure python has some helpers to append / build filesystem paths on all
archs.

Seems to apply below as well.

> + for entry in ['core_pattern', 'core_uses_pid']]):
> + logging.warning('[system check] could not check coredump pattern')
> + return
> +
> + with open(sysfs_kernel / 'core_pattern', 'r') as f:
> + CORE_PATTERN = f.readline().strip()
> + with open(sysfs_kernel / 'core_uses_pid', 'r') as f:
> + CORE_USES_PID = int(f.readline().strip())
> + if CORE_PATTERN == 'core' and CORE_USES_PID == 0:
> + logging.warning(
> + '[system check] your system uses the default coredump '
> + 'name \'core\'. This may conflict with equally named files. '
> + 'See \'man 5 core\' on how to change this.')

You say that the default value is insufficient, which means that all
systems with default configuration will trigger a warning now. Thinking
of all the CI systems (that typically fail on (new) warnings) this is a
significant change.

> + return
> +
> + def _check_environment(self):
> + self._check_coredump_is_named()
> +
> def execute(self, ctx):
> if os.environ.get('NETRC_FILE', False):
> shutil.copy(os.environ['NETRC_FILE'],
> @@ -225,6 +252,7 @@ class SetupHome(Command):
> + os.environ.get('GIT_CREDENTIAL_USEHTTPPATH')
> + '\n')
>
> + self._check_environment()
> self._setup_aws_creds()
>
> ctx.environ['HOME'] = self.tmpdirname
> --
> 2.39.2
>
> --
> You received this message because you are subscribed to the Google Groups "kas-devel" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to kas-devel+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/kas-devel/20240118142100.719533-1-felix.moessbauer%40siemens.com.

Jan Kiszka

unread,
Jan 18, 2024, 9:42:53 AM1/18/24
to Felix Moessbauer, kas-...@googlegroups.com, adriaan...@siemens.com
On 18.01.24 15:21, Felix Moessbauer wrote:
> This patch adds a basic sanity check of the system configuration. For
> now, we only check if the coredump files are created with a non-default
> name. Later on, more checks can be added.
>
> Signed-off-by: Felix Moessbauer <felix.mo...@siemens.com>
> ---
> This check should help to identify hard to debug issues around
> coredumps: During the build, often coredumps are written as part
> of the toolings feature probing (e.g. by CMake or autotools). These
> are created in the current folder and might conflict with equally
> named files or directories. As this unfortunately is the default config
> of Linux, we warn about it and give a hint on how to change it.
>

Is that really best detected at kas level? I'm not yet sure.

Jan
Siemens AG, Technology
Linux Expert Center

MOESSBAUER, Felix

unread,
Jan 18, 2024, 9:44:14 AM1/18/24
to Bezdeka, Florian, kas-...@googlegroups.com, Schmidt, Adriaan, Kiszka, Jan
On Thu, 2024-01-18 at 15:35 +0100, Florian Bezdeka wrote:
> On Thu, 2024-01-18 at 15:21 +0100, 'Felix Moessbauer' via kas-devel
> wrote:
> > This patch adds a basic sanity check of the system configuration.
> > For
> > now, we only check if the coredump files are created with a non-
> > default
> > name. Later on, more checks can be added.
> >
> > Signed-off-by: Felix Moessbauer <felix.mo...@siemens.com>
> > ---
> > This check should help to identify hard to debug issues around
> > coredumps: During the build, often coredumps are written as part
> > of the toolings feature probing (e.g. by CMake or autotools). These
> > are created in the current folder and might conflict with equally
> > named files or directories. As this unfortunately is the default
> > config
> > of Linux, we warn about it and give a hint on how to change it.
>
> This could / should be part of the patch description, no?

I have no strong feelings about this. I will move it in case a v2 is
required.
Yes, this is exactly how you do it (by using the pathlib.Path module).

>
> > +                    for entry in ['core_pattern',
> > 'core_uses_pid']]):
> > +            logging.warning('[system check] could not check
> > coredump pattern')
> > +            return
> > +
> > +        with open(sysfs_kernel / 'core_pattern', 'r') as f:
> > +            CORE_PATTERN = f.readline().strip()
> > +        with open(sysfs_kernel / 'core_uses_pid', 'r') as f:
> > +            CORE_USES_PID = int(f.readline().strip())
> > +        if CORE_PATTERN == 'core' and CORE_USES_PID == 0:
> > +            logging.warning(
> > +                '[system check] your system uses the default
> > coredump '
> > +                'name \'core\'. This may conflict with equally
> > named files. '
> > +                'See \'man 5 core\' on how to change this.')
>
> You say that the default value is insufficient, which means that all
> systems with default configuration will trigger a warning now.
> Thinking
> of all the CI systems (that typically fail on (new) warnings) this is
> a
> significant change.

Yes and no. The return code of the script does not change in case the
warning is triggered. By that, the CI would need to parse the output of
kas to have a behavior change, which I doubt anyone is doing. IMHO it
is the other way round: We had it multiple times in the past that
builds broke *somewhere* because of this issue and people had a really
hard time figuring out the root cause. This also happened to CI runners
on gitlab cloud CI that had different configurations for this (e.g.
some runners had proper coredump patterns, others not).

I'm actually thinking if it makes sense to provide a strict mode (as
kas parameter) where these warnings will make the build fail instantly.

Opinions?

Felix

MOESSBAUER, Felix

unread,
Jan 18, 2024, 9:48:24 AM1/18/24
to Kiszka, Jan, kas-...@googlegroups.com, Schmidt, Adriaan
On Thu, 2024-01-18 at 15:42 +0100, Jan Kiszka wrote:
> On 18.01.24 15:21, Felix Moessbauer wrote:
> > This patch adds a basic sanity check of the system configuration.
> > For
> > now, we only check if the coredump files are created with a non-
> > default
> > name. Later on, more checks can be added.
> >
> > Signed-off-by: Felix Moessbauer <felix.mo...@siemens.com>
> > ---
> > This check should help to identify hard to debug issues around
> > coredumps: During the build, often coredumps are written as part
> > of the toolings feature probing (e.g. by CMake or autotools). These
> > are created in the current folder and might conflict with equally
> > named files or directories. As this unfortunately is the default
> > config
> > of Linux, we warn about it and give a hint on how to change it.
> >
>
> Is that really best detected at kas level? I'm not yet sure.

Where else? My understanding of KAS is that is provides an isolated
environment for the build (as good as possible). By that, it should
also warn about misconfigurations it cannot change by itself.

Felix

Jan Kiszka

unread,
Jan 18, 2024, 11:04:46 AM1/18/24
to Moessbauer, Felix (T CED OES-DE), kas-...@googlegroups.com, Schmidt, Adriaan (T CED EDC-DE)
On 18.01.24 15:48, Moessbauer, Felix (T CED OES-DE) wrote:
> On Thu, 2024-01-18 at 15:42 +0100, Jan Kiszka wrote:
>> On 18.01.24 15:21, Felix Moessbauer wrote:
>>> This patch adds a basic sanity check of the system configuration.
>>> For
>>> now, we only check if the coredump files are created with a non-
>>> default
>>> name. Later on, more checks can be added.
>>>
>>> Signed-off-by: Felix Moessbauer <felix.mo...@siemens.com>
>>> ---
>>> This check should help to identify hard to debug issues around
>>> coredumps: During the build, often coredumps are written as part
>>> of the toolings feature probing (e.g. by CMake or autotools). These
>>> are created in the current folder and might conflict with equally
>>> named files or directories. As this unfortunately is the default
>>> config
>>> of Linux, we warn about it and give a hint on how to change it.
>>>
>>
>> Is that really best detected at kas level? I'm not yet sure.
>
> Where else? My understanding of KAS is that is provides an isolated
> environment for the build (as good as possible). By that, it should
> also warn about misconfigurations it cannot change by itself.

The respective build system?

Also, it still takes another bit, a file/directory called "core" at some
source topdir, to actually cause the troubles you were seeing. So we may
also issue warnings needlessly this way.

Jan

MOESSBAUER, Felix

unread,
Jan 18, 2024, 11:23:04 AM1/18/24
to Kiszka, Jan, kas-...@googlegroups.com, Schmidt, Adriaan
Well... Then we could argue that this needs to be checked either by
bitbake, cmake/autotools or the individual feature probing script that
triggers the core dump. But no one would assume that a feature probing
script also checks the coredump config of the kernel. IMHO things like
that should be checked as early as possible, by also accepting false
positives.

>
> Also, it still takes another bit, a file/directory called "core" at
> some
> source topdir, to actually cause the troubles you were seeing. So we
> may
> also issue warnings needlessly this way.

One example which bit us was DynamoRIO [1]. It took hours to find the
root cause as we had no direct access to the CI system. The name core
is just too generic and widely used to accept the risk of colliding
with the coredump. In the end it is a trade-off between warning too
much and too little. But the cost of suffering from the problematic
config without knowing is REALLY high, so I personally prefer to rather
have this warning.

[1] https://github.com/DynamoRIO/dynamorio/issues/6126

Felix

Jan Kiszka

unread,
Jan 18, 2024, 12:09:32 PM1/18/24
to Moessbauer, Felix (T CED OES-DE), kas-...@googlegroups.com, Schmidt, Adriaan (T CED EDC-DE)
That's why I consider this a build system topic: Don't compile you code
inside the source directory, then you don't run into this.

MOESSBAUER, Felix

unread,
Jan 19, 2024, 3:49:18 AM1/19/24
to Kiszka, Jan, kas-...@googlegroups.com, Schmidt, Adriaan
That does not help. The build-trees also contain directories which are
created by the build system on-demand. CMake for instance creates these
directories in the build dir before a rule in the corresponding source
dir is to be run. Also the feature probing runs in the build dir, hence
we have a conflict there if the src dir contains a 'core' directory in
its top-level (which is not rare) and a feature probe segfaults (which
is somehow expected for the probes).

For details and the exact error pattern, please have a look in the
linked issue from above.

On desktop systems you usually don't run into this issue, as systemd-
coredump is installed, which configures the pattern. However, on a CI
that has no graphical environment installed usually systemd-coredump is
not installed and by that the core_pattern is the default (except if an
admin manually changed it).

Felix
Reply all
Reply to author
Forward
0 new messages