[PATCH v5 0/7] Introducing drmgr hooks for LPM

21 views
Skip to first unread message

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Aug 24, 2022, 4:57:11 AM8/24/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
This series introduce a hooks called by drmgr when it is performing
LPM. The hook mechanism is generic and should apply later for any other
operation.

This series is only providing hooks for LPM.

There are 4 phases defined:
- check: called before the operation is initiated
- undocheck: called in the case the check phase has failed
- pre: called just before the operation is really started
- post: called once the operation is done (with or without success)

The return value of the hook is ignored except for the check phase.

The hooks can report error up to the end user through STDOUT and STDERR.

There is no timeout, so a hook may block the overall operation. But each
hook run is logged in the drmgr's log so it should be easy to identify the
culprit.

Hooks are stored in the directory /etc/drmgr.d<DRC TYPE>, and run with at
least the following environment variables set: $DRC_TYPE and $PHASE

This allows a same hook to be registered for multiple DRC types if needed.

The 3 first patches of this series are doing some cleanup.

Changes in v5:
- Remove deprecated call to 'refrsrc IBM.ManagementServer' (patch 2/7)
- Fix typo in the DRC name table (patch 4/7)
Changes in v4:
- Use environment variables to pass arguments to the hook.
- Use fork & exec instead of system to control the environment.
- Clear all environment inherited by the hook process
- Addresses Nathan and Tyrel's comments
Changes in v3:
- Manage the phase parameter in the hook framework (patches 5-6/7)
- Merge the newly "acc" DRC type introduce (patches 4-5/7)
Changes in v2:
- Don't return error when /etc/drmgr.d is not present*** SUBJECT HERE ***

Laurent Dufour (7):
drmgr/pmig: remove unused code
drmgr/pmig: remove deprecated call to refrsrc IBM.ManagementServer
drmgr: prevent file descriptor to be inherited when execing a child
drmgr: introduce a DRC type name table
drmgr: introducing the hook framework
drmgr: introducing the LPM hooks
drmgr: add the drmgr-hooks man file.

Makefile.am | 1 +
man/drmgr-hooks.8 | 82 ++++++++++++++++
man/drmgr.8 | 1 +
src/drmgr/common.c | 184 +++++++++++++++++++++++++++++++-----
src/drmgr/dr.h | 4 +
src/drmgr/drmig_chrp_pmig.c | 40 ++++----
6 files changed, 266 insertions(+), 46 deletions(-)
create mode 100644 man/drmgr-hooks.8

--
2.37.2

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Aug 24, 2022, 4:57:12 AM8/24/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
The hook framework run in a sequence any executable file found in the
relevant directory (/etc/drmgr.d/<DRC TYPE NAME>/)

The hook are run according to the versionsort()'s output order.

The hook inherits from drmgr its standard I/O streams. All others file
descriptor should have the close on exec flag set to ensure they will be
closed when executing an hook.

The hooks are run with no arguments, arguments are passed through
environment variable.

The inherited environment is cleaned and 2 environment variables
are set:
- DRC_TYPE containing the DRC type string
- PHASE containing the current phase

There are 4 known phases: check, undocheck, pre and post.

The hook's run is recorded in the drmgr's log, so blocking hook could be
identified.

Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
src/drmgr/common.c | 142 +++++++++++++++++++++++++++++++++++++++++++++
src/drmgr/dr.h | 4 ++
2 files changed, 146 insertions(+)

diff --git a/src/drmgr/common.c b/src/drmgr/common.c
index 12af7566a048..1d1becbaf969 100644
--- a/src/drmgr/common.c
+++ b/src/drmgr/common.c
@@ -49,6 +49,8 @@ char *remove_slot_fname = REMOVE_SLOT_FNAME;

#define SYSFS_DLPAR_FILE "/sys/kernel/dlpar"

+#define DR_SCRIPT_DIR "/etc/drmgr.d"
+
static int dr_lock_fd = 0;
static long dr_timeout;

@@ -67,6 +69,13 @@ static char *drc_type_str[] = {
[DRC_TYPE_ACC] = "acc",
};

+static char *hook_phase_name[] = {
+ [HOOK_CHECK] = "check",
+ [HOOK_UNDOCHECK] = "undocheck",
+ [HOOK_PRE] = "pre",
+ [HOOK_POST] = "post",
+};
+
/**
* set_output level
* @brief Common routine to set the output level
@@ -1546,3 +1555,136 @@ enum drc_type to_drc_type(const char *arg)
return DRC_TYPE_NONE;
}

+static int run_one_hook(enum drc_type drc_type, enum hook_phase phase,
+ const char *name)
+{
+ int rc;
+ pid_t child;
+
+ fflush(NULL);
+ child = fork();
+ if (child == -1) {
+ say(ERROR, "Can't fork to run a hook: %s\n", strerror(errno));
+ return -1;
+ }
+
+ if (child) {
+ /* Father side */
+ if (waitpid(child, &rc, 0) == -1) {
+ say(ERROR, "waitpid error: %s\n", strerror(errno));
+ return -1;
+ }
+
+ if (WIFSIGNALED(rc)) {
+ say(INFO, "hook '%s' terminated by signal %d\n",
+ name, WTERMSIG(rc));
+ rc = 1;
+ } else {
+ rc = WEXITSTATUS(rc);
+ say(INFO, "hook '%s' exited with status %d\n",
+ name, rc);
+ }
+ return rc;
+ }
+
+
+ /* Child side */
+ say(DEBUG, "Running hook '%s' for phase %s (PID=%d)\n",
+ name, hook_phase_name[phase], getpid());
+
+ if (chdir("/")) {
+ say(ERROR, "Can't change working directory to / : %s\n",
+ strerror(errno));
+ exit(255);
+ }
+
+ if (clearenv() ||
+ setenv("DRC_TYPE", drc_type_str[drc_type], 1) ||
+ setenv("PHASE", hook_phase_name[phase], 1)) {
+ say(ERROR, "Can't set environment variables: %s\n",
+ strerror(errno));
+ exit(255);
+ }
+
+ execl(name, name, (char *)NULL);
+ say(ERROR, "Can't exec hook %s : %s\n", strerror(errno));
+ exit(255);
+}
+
+static int is_file_or_link(const struct dirent *entry)
+{
+ if ((entry->d_type == DT_REG) || (entry->d_type == DT_LNK))
+ return 1;
+ return 0;
+}
+
+/*
+ * Run all executable hooks found in a given directory.
+ * Return 0 if all run script have returned 0 status.
+ */
+int run_hooks(enum drc_type drc_type, enum hook_phase phase)
+{
+ int rc = 0, fdd, num, i;
+ DIR *dir;
+ struct dirent **entries = NULL;
+
+ /* Sanity check */
+ if (drc_type <= DRC_TYPE_NONE || drc_type >= ARRAY_SIZE(drc_type_str)) {
+ say(ERROR, "Invalid DRC TYPE detected (%d)\n", drc_type);
+ return -1;
+ }
+
+ if (phase < HOOK_CHECK || phase > HOOK_POST) {
+ say(ERROR, "Invalid hook phase %d\n", phase);
+ return -1;
+ }
+
+ dir = opendir(DR_SCRIPT_DIR);
+ if (dir == NULL) {
+ if (errno == ENOENT)
+ return 0;
+ say(ERROR, "Can't open %s: %s\n", DR_SCRIPT_DIR,
+ strerror(errno));
+ return -1;
+ }
+
+ fdd = dirfd(dir);
+ num = scandirat(fdd, drc_type_str[drc_type], &entries,
+ is_file_or_link, versionsort);
+ closedir(dir);
+
+ for (i = 0; i < num; i++) {
+ struct stat st;
+ struct dirent *entry = entries[i];
+ char *name;
+
+ if (asprintf(&name, "%s/%s/%s", DR_SCRIPT_DIR,
+ drc_type_str[drc_type], entry->d_name) == -1) {
+ say(ERROR,
+ "Can't allocate filename string (%zd bytes)\n",
+ strlen(DR_SCRIPT_DIR) + 1 +
+ strlen(drc_type_str[drc_type]) + 1 +
+ strlen(entry->d_name) + 1);
+ rc = 1;
+ free(entry);
+ continue;
+ }
+
+ /*
+ * Report error only in the case the hook itself fails.
+ * Any other error (file is not executable etc.) is ignored.
+ */
+ if (stat(name, &st))
+ say(WARN, "Can't stat file %s: %s\n",
+ name, strerror(errno));
+ else if (S_ISREG(st.st_mode) && (st.st_mode & S_IXUSR) &&
+ run_one_hook(drc_type, phase, name))
+ rc = 1;
+
+ free(name);
+ free(entry);
+ }
+
+ free(entries);
+ return rc;
+}
diff --git a/src/drmgr/dr.h b/src/drmgr/dr.h
index 58fdb5c137e6..5526c29a9460 100644
--- a/src/drmgr/dr.h
+++ b/src/drmgr/dr.h
@@ -70,6 +70,8 @@ enum drc_type {DRC_TYPE_NONE, DRC_TYPE_PCI, DRC_TYPE_SLOT, DRC_TYPE_PHB,
DRC_TYPE_CPU, DRC_TYPE_MEM, DRC_TYPE_PORT,
DRC_TYPE_HIBERNATE, DRC_TYPE_MIGRATION, DRC_TYPE_ACC};

+enum hook_phase {HOOK_CHECK, HOOK_UNDOCHECK, HOOK_PRE, HOOK_POST};
+
extern enum drmgr_action usr_action;
extern int display_capabilities;
extern int usr_slot_identification;
@@ -133,6 +135,8 @@ void print_dlpar_capabilities(void);

void set_output_level(int);

+int run_hooks(enum drc_type drc_type, enum hook_phase phase);
+
#define DR_BUF_SZ 256

int drslot_chrp_slot(void);
--
2.37.2

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Aug 24, 2022, 4:57:12 AM8/24/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
When a file descriptor is opened and remain opened, the O_CLOEXEC should be
set so execed children are not inheriting it.

There is no need for file descriptor opened and closed immediately, like in
probe_cpu().

Reviewed-by: Nathan Lynch <nat...@linux.ibm.com>
Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
src/drmgr/common.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/drmgr/common.c b/src/drmgr/common.c
index 393a997a51b0..622cfaf597f8 100644
--- a/src/drmgr/common.c
+++ b/src/drmgr/common.c
@@ -174,7 +174,7 @@ inline int dr_init(void)
}


- log_fd = open(DR_LOG_PATH, O_RDWR | O_CREAT | O_APPEND,
+ log_fd = open(DR_LOG_PATH, O_RDWR | O_CREAT | O_APPEND | O_CLOEXEC,
S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
if (log_fd == -1) {
log_fd = 0;
@@ -314,7 +314,7 @@ int dr_lock(void)
mode_t old_mode;

old_mode = umask(0);
- dr_lock_fd = open(DR_LOCK_FILE, O_RDWR | O_CREAT,
+ dr_lock_fd = open(DR_LOCK_FILE, O_RDWR | O_CREAT | O_CLOEXEC,
S_IRUSR | S_IRGRP | S_IROTH);
if (dr_lock_fd < 0)
return -1;
@@ -1496,7 +1496,7 @@ int do_kernel_dlpar_common(const char *cmd, int cmdlen, int silent_error)

/* write to file */
if (fd == -1) {
- fd = open(SYSFS_DLPAR_FILE, O_WRONLY);
+ fd = open(SYSFS_DLPAR_FILE, O_WRONLY | O_CLOEXEC);
if (fd < 0) {
say(ERROR,
"Could not open %s to initiate DLPAR request\n",
--
2.37.2

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Aug 24, 2022, 4:57:13 AM8/24/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
There are 3 hooks run when an LPM is performed:
1. check before the LPM is really initiated
1 bis. undocheck if check failed.
2. pre just before entering the switch over
3. post at the end of the LPM operation

Only the check hook's return status is taken in account. If the check hook
return value is different from 0, the LPM is aborted and the outputs of the
check hook are reported to the end user through the HMC.

In the case at least one check hook returned a non zero status, the
undocheck event is run (for all the hooks), and the pre and post events are
not triggered.

The post event is triggered even if the LPM operation has failed.

Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
src/drmgr/drmig_chrp_pmig.c | 32 +++++++++++++++++++-------------
1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/src/drmgr/drmig_chrp_pmig.c b/src/drmgr/drmig_chrp_pmig.c
index f78ff15adc32..f169fa500316 100644
--- a/src/drmgr/drmig_chrp_pmig.c
+++ b/src/drmgr/drmig_chrp_pmig.c
@@ -691,11 +691,15 @@ void post_mobility_update(void)
devtree_update();
}
}
-
+
int drmig_chrp_pmig(void)
{
int rc;
uint64_t stream_val;
+ enum drc_type drc_type = DRC_TYPE_NONE;
+
+ if (usr_action == MIGRATE)
+ drc_type = DRC_TYPE_MIGRATION;

/* Ensure that this partition is migratable/mobile */
if (! pmig_capable()) {
@@ -704,14 +708,11 @@ int drmig_chrp_pmig(void)
return -1;
}

- /* Today we do no pre-checks for migratability. The only check
- * we could do is whether the "ibm,suspend-me" RTAS call exists.
- * But if it doesn't, the firmware level doesn't support migration,
- * in which case why the heck are we being invoked anyways.
- */
- if (strcmp(usr_p_option, "check") == 0) {
- say(DEBUG, "check: Nothing to do...\n");
- return 0;
+ if (usr_action == MIGRATE && (strcmp(usr_p_option, "check") == 0)) {
+ rc = run_hooks(drc_type, HOOK_CHECK);
+ if (rc)
+ run_hooks(drc_type, HOOK_UNDOCHECK);
+ return rc;
}

/* The only other command is pre, any other command is invalid */
@@ -734,6 +735,9 @@ int drmig_chrp_pmig(void)

/* Now do the actual migration */
do {
+ if (usr_action == MIGRATE)
+ run_hooks(drc_type, HOOK_PRE);
+
if (usr_action == MIGRATE)
rc = do_migration(stream_val);
else if (usr_action == HIBERNATE)
@@ -748,10 +752,12 @@ int drmig_chrp_pmig(void)

syslog(LOG_LOCAL0 | LOG_INFO, "drmgr: %s rc %d\n",
(usr_action == MIGRATE ? "migration" : "hibernation"), rc);
- if (rc)
- return rc;

- post_mobility_update();
+ if (!rc)
+ post_mobility_update();

- return 0;
+ /* Post hook is called even if the migration has failed */
+ if (usr_action == MIGRATE)
+ run_hooks(drc_type, HOOK_POST);
+ return rc;
}
--
2.37.2

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Aug 24, 2022, 4:57:13 AM8/24/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
So that there is only one place to convert the name of the drc to the type
of drc and vice versa.

Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
src/drmgr/common.c | 48 +++++++++++++++++++---------------------------
1 file changed, 20 insertions(+), 28 deletions(-)

diff --git a/src/drmgr/common.c b/src/drmgr/common.c
index 622cfaf597f8..12af7566a048 100644
--- a/src/drmgr/common.c
+++ b/src/drmgr/common.c
@@ -52,6 +52,21 @@ char *remove_slot_fname = REMOVE_SLOT_FNAME;
static int dr_lock_fd = 0;
static long dr_timeout;

+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+static char *drc_type_str[] = {
+ [DRC_TYPE_NONE] = "unknwon",
+ [DRC_TYPE_PCI] = "pci",
+ [DRC_TYPE_SLOT] = "slot",
+ [DRC_TYPE_PHB] = "phb",
+ [DRC_TYPE_CPU] = "cpu",
+ [DRC_TYPE_MEM] = "mem",
+ [DRC_TYPE_PORT] = "port",
+ [DRC_TYPE_HIBERNATE] = "phib",
+ [DRC_TYPE_MIGRATION] = "pmig",
+ [DRC_TYPE_ACC] = "acc",
+};
+
/**
* set_output level
* @brief Common routine to set the output level
@@ -1521,35 +1536,12 @@ int do_kernel_dlpar_common(const char *cmd, int cmdlen, int silent_error)

enum drc_type to_drc_type(const char *arg)
{
- if (!strncmp(arg, "pci", 3))
- return DRC_TYPE_PCI;
-
- if (!strncmp(arg, "slot", 4))
- return DRC_TYPE_SLOT;
-
- if (!strncmp(arg, "phb", 3))
- return DRC_TYPE_PHB;
-
- if (!strncmp(arg, "cpu", 3))
- return DRC_TYPE_CPU;
-
- if (!strncmp(arg, "mem", 3))
- return DRC_TYPE_MEM;
+ enum drc_type i;

- if (!strncmp(arg, "port", 4))
- return DRC_TYPE_PORT;
-
- if (!strncmp(arg, "phib", 4))
- return DRC_TYPE_HIBERNATE;
-
- if (!strncmp(arg, "pmig", 4))
- return DRC_TYPE_MIGRATION;
-
- /*
- * Accelerator
- */
- if (!strncmp(arg, "acc", 3))
- return DRC_TYPE_ACC;
+ for (i = DRC_TYPE_NONE + 1; i < ARRAY_SIZE(drc_type_str); i++) {
+ if (!strcmp(arg, drc_type_str[i]))
+ return i;
+ }

return DRC_TYPE_NONE;
}
--
2.37.2

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Aug 24, 2022, 4:57:13 AM8/24/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
As Nathan reported while reviewing this series:

This RSCT doc indicates that the IBM.ManagementServer resource class has
been superseded:

https://www.ibm.com/docs/en/rsct/3.2?topic=security-management-domain-configuration

Removing that deprecated call in this series because it is conflicting with
a following patch. The system returned value is stored in rc to prevent
compilation error (due to warning if system returned value is not read).

Suggested-by: Nathan Lynch <nat...@linux.ibm.com>
Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
src/drmgr/drmig_chrp_pmig.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/src/drmgr/drmig_chrp_pmig.c b/src/drmgr/drmig_chrp_pmig.c
index 94cb49afbdab..f78ff15adc32 100644
--- a/src/drmgr/drmig_chrp_pmig.c
+++ b/src/drmgr/drmig_chrp_pmig.c
@@ -753,8 +753,5 @@ int drmig_chrp_pmig(void)

post_mobility_update();

- say(DEBUG, "Refreshing RMC via refrsrc\n");
- rc = system("/usr/sbin/rsct/bin/refrsrc IBM.ManagementServer");
-
return 0;
}
--
2.37.2

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Aug 24, 2022, 4:57:14 AM8/24/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
This man page describe the various drmgr's hooks.

Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
Makefile.am | 1 +
man/drmgr-hooks.8 | 82 +++++++++++++++++++++++++++++++++++++++++++++++
man/drmgr.8 | 1 +
3 files changed, 84 insertions(+)
create mode 100644 man/drmgr-hooks.8

diff --git a/Makefile.am b/Makefile.am
index ba7a3c59e39c..5c0ca3c01926 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -48,6 +48,7 @@ man_MANS = \
man/vcpustat.8 \
man/rtas_dbg.8 \
man/drmgr.8 \
+ man/drmgr-hooks.8 \
man/lparnumascore.8

EXTRA_DIST += $(bin_SCRIPTS) $(sbin_SCRIPTS) $(man_MANS)
diff --git a/man/drmgr-hooks.8 b/man/drmgr-hooks.8
new file mode 100644
index 000000000000..621e4f0cb54e
--- /dev/null
+++ b/man/drmgr-hooks.8
@@ -0,0 +1,82 @@
+.\"
+.\" Copyright (C) 2022 International Business Machines
+.\"
+.TH DRMGR-HOOKS 8 "May 24, 2022" Linux "Linux on Power Service Tools"
+.SH NAME
+drmgr\-hooks \- Hooks run by drmgr
+.SH DESCRIPTION
+When
+.B drmgr
+is run to perform PowerVM Dynamic Logical Partitioning (DLPAR) operations,
+a set of hooks may be triggered to validate, and, or be aware of the incoming operation.
+.P
+Not all the DLPAR operations are providing hook calls.
+Currently only the LPAR Migration operation (LPM) is concerned.
+.P
+The hooks are executable files stored in a directory named "DRC TYPE" in
+.IR /etc/drmgr.d/ .
+For instance, hooks run when a LPAR migration is done are stored in
+.IR /etc/drmgr.d/pmig .
+.P
+Hook files can be symbolic links to executable files. All the hooks can be stored in
+.IR /etc/drmgr.d
+and linked into multiple directories to provide multiple DRC type's hooks.
+.SH ARGUMENTS
+.P
+Hooks are called without any arguments but with at least these 2 environment variable set:
+.TP
+.BI "DRC_TYPE"
+The Dynamic reconfiguration connector type to act upon from the following list:
+.BR pmig ", " pci ", " cpu ", " mem ", " port ", " slot ", " phb "."
+.TP
+.BI "PHASE"
+The phase of the operation from the following list:
+.BR check ", " undocheck ", " pre ", " post "."
+.SH LPAR MIGRATION
+.P
+When a LPAR migration is initiated the
+.B check
+phase is first triggered. Hooks called at check phase may returned a non zero value to prevent the migration operation to happen.
+The error messages displayed in
+.BR STDOUT " or " STDERR
+would be reported to the end user through the HMC.
+.P
+If the
+.B check
+phase has failed, because at least one hook has returned a non null value, the
+.B undocheck
+phase is launched. Return value for the
+.B
+undocheck
+phase is ignored.
+.P
+If the
+.B check
+phase succeeded, the
+.BR pre " and later " post
+phases are triggered. Returned values for these 2 phases are ignored, and the
+.B post
+phase is triggered even if the LPM operation has failed.
+.SH ENVIRONMENT
+.P
+The drmgr's hooks are called while holding the DLPAR lock, so any other
+DLPAR operation initiated from a hook is expected to fail.
+.P
+The hooks standard input
+.B STDIN
+is redirected to
+.I /dev/null
+while STDOUT and STDERR are redirected to pipes.
+The outputs done in these pipes are reported to the end user when a hook has returned an error value and that error value is not ignored (e.g in the LPM, the
+.B check
+phase but not the
+.BR pre "or " post
+phase)
+.P
+Except the variables specified in the ARGUMENTS section, all the environment variables are unset before calling the hook.
+.SH FILES
+.IR /etc/drmgr.d/pmig/
+.SH AUTHOR
+Laurent Dufour <ldu...@linux.ibm.com>
+.SH SEE ALSO
+.BR drmgr (8)
diff --git a/man/drmgr.8 b/man/drmgr.8
index 09944bda2d41..f40136ba366d 100644
--- a/man/drmgr.8
+++ b/man/drmgr.8
@@ -158,3 +158,4 @@ was written by IBM Corporation

.SH SEE ALSO
.BR lsslot "(8)"
+.BR drmgr-hooks "(8)"
--
2.37.2

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Sep 8, 2022, 3:59:24 AM9/8/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
Hi Tyrel,

Could please consider taking this series soon?

Thanks,
Laurent.

Tyrel Datwyler

<tyreld@linux.ibm.com>
unread,
Sep 14, 2022, 4:49:58 PM9/14/22
to Laurent Dufour, powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com
This fails to handle the EINTR case where the call may be interrupted by an
unhandled signal. Need something like the following

-Tyrel

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Sep 16, 2022, 11:58:49 AM9/16/22
to Tyrel Datwyler, powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com
Fair enough, I'll add a loop over the waitpid call like this:

-- a/src/drmgr/common.c
+++ b/src/drmgr/common.c
@@ -1570,7 +1570,9 @@ static int run_one_hook(enum drc_type drc_type, enum
hook_phase phase,

if (child) {
/* Father side */
- if (waitpid(child, &rc, 0) == -1) {
+ while (waitpid(child, &rc, 0) == -1) {
+ if (errno == EINTR)
+ continue;
say(ERROR, "waitpid error: %s\n", strerror(errno));
return -1;

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Sep 16, 2022, 12:39:27 PM9/16/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
This series introduce a hooks called by drmgr when it is performing
LPM. The hook mechanism is generic and should apply later for any other
operation.

This series is only providing hooks for LPM.

There are 4 phases defined:
- check: called before the operation is initiated
- undocheck: called in the case the check phase has failed
- pre: called just before the operation is really started
- post: called once the operation is done (with or without success)

The return value of the hook is ignored except for the check phase.

The hooks can report error up to the end user through STDOUT and STDERR.

There is no timeout, so a hook may block the overall operation. But each
hook run is logged in the drmgr's log so it should be easy to identify the
culprit.

Hooks are stored in the directory /etc/drmgr.d<DRC TYPE>, and run with at
least the following environment variables set: $DRC_TYPE and $PHASE

This allows a same hook to be registered for multiple DRC types if needed.

The 3 first patches of this series are doing some cleanup.

Changes in v6:
- handle waitpid's EINTR returned error (patch 5/7)
Changes in v5:
- Remove deprecated call to 'refrsrc IBM.ManagementServer' (patch 2/7)
- Fix typo in the DRC name table (patch 4/7)
Changes in v4:
- Use environment variables to pass arguments to the hook.
- Use fork & exec instead of system to control the environment.
- Clear all environment inherited by the hook process
- Addresses Nathan and Tyrel's comments
Changes in v3:
- Manage the phase parameter in the hook framework (patches 5-6/7)
- Merge the newly "acc" DRC type introduce (patches 4-5/7)
Changes in v2:
- Don't return error when /etc/drmgr.d is not present

Laurent Dufour (7):
drmgr/pmig: remove unused code
drmgr/pmig: remove deprecated call to refrsrc IBM.ManagementServer
drmgr: prevent file descriptor to be inherited when execing a child
drmgr: introduce a DRC type name table
drmgr: introducing the hook framework
drmgr: introducing the LPM hooks
drmgr: add the drmgr-hooks man file.

Makefile.am | 1 +
man/drmgr-hooks.8 | 82 ++++++++++++++++
man/drmgr.8 | 1 +
src/drmgr/common.c | 186 +++++++++++++++++++++++++++++++-----
src/drmgr/dr.h | 4 +
src/drmgr/drmig_chrp_pmig.c | 40 ++++----
6 files changed, 268 insertions(+), 46 deletions(-)
create mode 100644 man/drmgr-hooks.8

--
2.37.3

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Sep 16, 2022, 12:39:27 PM9/16/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
As Nathan reported while reviewing this series:

This RSCT doc indicates that the IBM.ManagementServer resource class has
been superseded:

https://www.ibm.com/docs/en/rsct/3.2?topic=security-management-domain-configuration

Removing that deprecated call in this series because it is conflicting with
a following patch. The system returned value is stored in rc to prevent
compilation error (due to warning if system returned value is not read).

Suggested-by: Nathan Lynch <nat...@linux.ibm.com>
Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
src/drmgr/drmig_chrp_pmig.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/src/drmgr/drmig_chrp_pmig.c b/src/drmgr/drmig_chrp_pmig.c
index 94cb49afbdab..f78ff15adc32 100644
--- a/src/drmgr/drmig_chrp_pmig.c
+++ b/src/drmgr/drmig_chrp_pmig.c
@@ -753,8 +753,5 @@ int drmig_chrp_pmig(void)

post_mobility_update();

- say(DEBUG, "Refreshing RMC via refrsrc\n");
- rc = system("/usr/sbin/rsct/bin/refrsrc IBM.ManagementServer");
-
return 0;
}
--
2.37.3

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Sep 16, 2022, 12:39:28 PM9/16/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
The hook framework run in a sequence any executable file found in the
relevant directory (/etc/drmgr.d/<DRC TYPE NAME>/)

The hook are run according to the versionsort()'s output order.

The hook inherits from drmgr its standard I/O streams. All others file
descriptor should have the close on exec flag set to ensure they will be
closed when executing an hook.

The hooks are run with no arguments, arguments are passed through
environment variable.

The inherited environment is cleaned and 2 environment variables
are set:
- DRC_TYPE containing the DRC type string
- PHASE containing the current phase

There are 4 known phases: check, undocheck, pre and post.

The hook's run is recorded in the drmgr's log, so blocking hook could be
identified.

Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
src/drmgr/common.c | 144 +++++++++++++++++++++++++++++++++++++++++++++
src/drmgr/dr.h | 4 ++
2 files changed, 148 insertions(+)

diff --git a/src/drmgr/common.c b/src/drmgr/common.c
index 12af7566a048..9cd91d1d627a 100644
--- a/src/drmgr/common.c
+++ b/src/drmgr/common.c
@@ -49,6 +49,8 @@ char *remove_slot_fname = REMOVE_SLOT_FNAME;

#define SYSFS_DLPAR_FILE "/sys/kernel/dlpar"

+#define DR_SCRIPT_DIR "/etc/drmgr.d"
+
static int dr_lock_fd = 0;
static long dr_timeout;

@@ -67,6 +69,13 @@ static char *drc_type_str[] = {
[DRC_TYPE_ACC] = "acc",
};

+static char *hook_phase_name[] = {
+ [HOOK_CHECK] = "check",
+ [HOOK_UNDOCHECK] = "undocheck",
+ [HOOK_PRE] = "pre",
+ [HOOK_POST] = "post",
+};
+
/**
* set_output level
* @brief Common routine to set the output level
@@ -1546,3 +1555,138 @@ enum drc_type to_drc_type(const char *arg)
return DRC_TYPE_NONE;
}

+static int run_one_hook(enum drc_type drc_type, enum hook_phase phase,
+ const char *name)
+{
+ int rc;
+ pid_t child;
+
+ fflush(NULL);
+ child = fork();
+ if (child == -1) {
+ say(ERROR, "Can't fork to run a hook: %s\n", strerror(errno));
+ return -1;
+ }
+
+ if (child) {
+ /* Father side */
+ while (waitpid(child, &rc, 0) == -1) {
+ if (errno == EINTR)
+ continue;
+ say(ERROR, "waitpid error: %s\n", strerror(errno));
+ return -1;
+ }
--
2.37.3

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Sep 16, 2022, 12:39:28 PM9/16/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
There are 3 hooks run when an LPM is performed:
1. check before the LPM is really initiated
1 bis. undocheck if check failed.
2. pre just before entering the switch over
3. post at the end of the LPM operation

Only the check hook's return status is taken in account. If the check hook
return value is different from 0, the LPM is aborted and the outputs of the
check hook are reported to the end user through the HMC.

In the case at least one check hook returned a non zero status, the
undocheck event is run (for all the hooks), and the pre and post events are
not triggered.

The post event is triggered even if the LPM operation has failed.

Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
src/drmgr/drmig_chrp_pmig.c | 32 +++++++++++++++++++-------------
1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/src/drmgr/drmig_chrp_pmig.c b/src/drmgr/drmig_chrp_pmig.c
index f78ff15adc32..f169fa500316 100644
--- a/src/drmgr/drmig_chrp_pmig.c
+++ b/src/drmgr/drmig_chrp_pmig.c
2.37.3

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Sep 16, 2022, 12:39:29 PM9/16/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
This man page describe the various drmgr's hooks.

Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
Makefile.am | 1 +
2.37.3

Laurent Dufour

<ldufour@linux.ibm.com>
unread,
Sep 16, 2022, 12:39:29 PM9/16/22
to powerpc-utils-devel@googlegroups.com, nathanl@linux.ibm.com, tyreld@linux.ibm.com
So that there is only one place to convert the name of the drc to the type
of drc and vice versa.

Signed-off-by: Laurent Dufour <ldu...@linux.ibm.com>
---
src/drmgr/common.c | 48 +++++++++++++++++++---------------------------
1 file changed, 20 insertions(+), 28 deletions(-)

diff --git a/src/drmgr/common.c b/src/drmgr/common.c
index 622cfaf597f8..12af7566a048 100644
--- a/src/drmgr/common.c
+++ b/src/drmgr/common.c
@@ -52,6 +52,21 @@ char *remove_slot_fname = REMOVE_SLOT_FNAME;
static int dr_lock_fd = 0;
static long dr_timeout;

+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
+static char *drc_type_str[] = {
+ [DRC_TYPE_NONE] = "unknwon",
+ [DRC_TYPE_PCI] = "pci",
+ [DRC_TYPE_SLOT] = "slot",
+ [DRC_TYPE_PHB] = "phb",
+ [DRC_TYPE_CPU] = "cpu",
+ [DRC_TYPE_MEM] = "mem",
+ [DRC_TYPE_PORT] = "port",
+ [DRC_TYPE_HIBERNATE] = "phib",
+ [DRC_TYPE_MIGRATION] = "pmig",
+ [DRC_TYPE_ACC] = "acc",
+};
+
/**
* set_output level
* @brief Common routine to set the output level
@@ -1521,35 +1536,12 @@ int do_kernel_dlpar_common(const char *cmd, int cmdlen, int silent_error)

enum drc_type to_drc_type(const char *arg)
2.37.3

Reply all
Reply to author
Forward
0 new messages