PMIx v6.1.0 and PRRTE v4.1.0 release candidates posted for test

Ralph Castain

Feb 5, 2026, 1:35:01 PM
to pm...@googlegroups.com
Hello folks

I have posted the first release candidates for PMIx v6.1.0 and PRRTE v4.1.0 in the usual places - please give them a try if you can.

Ralph

-----------------

This is the first release candidate for the second release in the v6 family.

IMPORTANT
The release is based on a refork of the PMIx master branch as the changes since v6.0.0 were extensive. Many of the changes were bug fixes, but the following significant changes are included:

  • build requirements for Git clones include a minimum
    Python version of 3.6; the requirement does not apply
    to builds from a tarball
  • server upcalls no longer require that the upcall return
    prior to the server executing the provided callback
    function
  • all APIs are now threadshifted prior to execution for
    thread safety. Hosts that provide their own progress
    engine (in lieu of using the PMIx internal progress
    thread) must ensure that sufficient progress is
    provided to avoid threadlock when calling PMIx APIs.
  • listener thread ports can now be specified as a comma-delimited
    list of ranges instead of only a single port
  • connection authentication has been tightened and greater
    controls provided via attributes
  • support for process and node statistics gathering has
    been added as per the Standard
  • a new API (PMIx_Progress_thread_stop) has been added to
    direct that the internal progress thread be stopped. This
    allows the host to stop the progress thread independent
    from calling a PMIx "finalize" routine.
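
For readers unfamiliar with the term, the threadshifting item above can be pictured with a toy model. This is only a sketch in Python, not PMIx code: the ProgressEngine class and its methods are hypothetical names invented for illustration. It shows the general pattern (a caller queues its request and waits while a single progress thread executes it) and why a caller hangs if nothing is driving progress.

```python
import queue
import threading


class ProgressEngine:
    """Toy stand-in for a progress thread (hypothetical, not a PMIx type)."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Requests are executed one at a time on this thread only, so the
        # functions never race with each other.
        while True:
            item = self._requests.get()
            if item is None:  # sentinel: leave the loop
                break
            func, args, done, result = item
            result.append(func(*args))
            done.set()

    def call(self, func, *args, timeout=None):
        """Threadshift a call: queue it, then wait until it has been run."""
        done = threading.Event()
        result = []
        self._requests.put((func, args, done, result))
        if not done.wait(timeout):
            raise TimeoutError("no progress: request was never executed")
        return result[0]

    def stop(self):
        """Stop the progress thread without tearing anything else down."""
        self._requests.put(None)
        self._thread.join()


engine = ProgressEngine()
store = {}


def put(key, value):
    store[key] = value
    return len(store)


print(engine.call(put, "rank", 0))
engine.stop()

# Once the thread is stopped nothing drains the queue, so a blocking call
# could wait forever; the timeout turns that into a visible error here.
try:
    engine.call(put, "rank", 1, timeout=0.1)
except TimeoutError as err:
    print(err)
```

In this model a host that supplies its own progress engine plays the role of the loop in _run: if that loop is not being serviced while an API call is waiting, the call cannot complete.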

Detailed changes since refork include:

  • PR #3800: Multiple commits
    • Make thread start/stop marker consistent
    • Final NEWS update
  • PR #3798: Update NEWS and VERSION for rc1
  • PR #3797: Multiple commits
    • fix a problem after second pmix init
    • Add missing attribute
  • PR #3794: Do not double-process IOF formats
  • PR #3792: Multiple commits
    • Cleanup init/finalize cycle
    • Do not shutdown libevent during finalize
  • PR #3788: Multiple commits
    • Update PMIx_Fence to fully conform to Standard
    • Threadshift IOF API calls
    • Update log support to conform to Standard
    • update-my-copyright.py: properly support git workspaces
    • Seal memory leak
    • Update the monitor_multi example
    • Stop the progress thread right away in server_finalize
    • Silence valgrind issues
    • Implement new API to stop the progress thread
    • Allow passing of progress thread to stop - default NULL to all
    • Protect callbacks from threadshift when progress thread is stopped
    • Add capability: get number function available
    • Ensure to store group info in PMIx server
    • Revamp the pmix_info support
    • Fine-tune the show-version option
    • Improve description of PMIx_Compute_distances API
    • Fully support return of static values
    • Cleanup and abstract pmix_info support
    • Correctly threadshift PMIx_IOF_push directives
    • Correct cflags used for check_compiler_version.m4

Detailed changes since v6.0.0 included from master:

  • PR #3759: Silence some Coverity warnings
  • PR #3751: Potential double free and use after free (alerts 13,14)
  • PR #3750: Potentially overflowing call to snprintf (alerts 11, 12)
  • PR #3749: Workflow does not contain permissions
  • PR #3748: printf.c: fix off-by-one + underflow errors
  • PR #3747: Use floating numbers for float/double comparisons
  • PR #3743: Extend debugger CI tests
  • PR #3741: Fix indirect debugger launch
  • PR #3739: Add refresh test
  • PR #3735: Cleanup ready-for-debug announcement
  • PR #3734: Fix cmd line option checker
  • PR #3732: Fix bitmap mask literal size
  • PR #3729: Check if we fork'd the tool ourselves
  • PR #3724: Protect against equal signs in option check
  • PR #3722: Implement support for resource usage monitoring
  • PR #3718: Enable use of loopback interface
  • PR #3716: Add new PMIX_GROUP_FINAL_MEMBERSHIP_ORDER attribute
  • PR #3714: Replace sprintf with snprintf
  • PR #3713: Replace int taint limits with defined names
  • PR #3712: Silence latest Coverity warning report
  • PR #3711: Flush namespace sinks' residuals before destroying namespace
  • PR #3710: Port bug fixes to zlibng component
  • PR #3709: bitmap num_set boundary condition bugfix
  • PR #3708: preg/compress parsing bugfix
  • PR #3707: Silence Coverity warning
  • PR #3706: Update the plog framework
  • PR #3705: Check only for existence of PMIx capability flag
  • PR #3702: Remove unnecessary locks from munge psec module
  • PR #3701: Avoid use of API in PMIx_Init
  • PR #3700: Silence Coverity warnings
  • PR #3699: Silence Coverity warnings
  • PR #3697: Silence Coverity warnings
  • PR #3696: Switch to atomics for tracking initialization
  • PR #3695: Change to using atomic for show_help_enabled
  • PR #3694: Silence Coverity warning
  • PR #3693: Remove stale/unused tests
  • PR #3692: Silence more Coverity warnings
  • PR #3691: Silence Coverity warnings
  • PR #3690: Use the correct value for the number of info to unpack
  • PR #3689: Silence more Coverity warnings
  • PR #3688: Silence Coverity warnings
  • PR #3686: Extend listener thread port specification to support ranges
  • PR #3684: Fix compression components
  • PR #3682: Add attribute to request reports be in physical CPU IDs
  • PR #3681: Add set-env cmd line option definition
  • PR #3680: Minor change to thread construct/ops
  • PR #3678: Fix error code on blocking PMIx_Notify_event calls
  • PR #3676: Silence a few Coverity complaints
  • PR #3675: Improve selection of interfaces
  • PR #3674: Multiple commits
    • Do not remove nspace from global list on rejected connection
    • Allow foreign tools by default
    • Cleanup a bit on connection handling
    • Avoid duplicate namespace entries
  • PR #3672: define default MAXPATHLEN if not defined by system
  • PR #3670: Bugfix in pmix_bitmap_num_set_bits
  • PR #3669: Fix the abort server upcall
  • PR #3667: Update listener thread setting of permissions on connection files
  • PR #3664: Extend authentication support
  • PR #3663: Provide more info on connections
  • PR #3662: Pass the client's pid as well
  • PR #3661: ci: add group_bootstrap to CI
  • PR #3659: Prevent memory overrun in regx calculation
  • PR #3658: Silence Coverity complaints
  • PR #3657: Revamp stats implementation to reflect Standard
  • PR #3656: Pass the uid/gid for client connections
  • PR #3654: Provide better FQDN support
  • PR #3651: Don't fail when PMIX_IOF_OUTPUT_TO_FILE directory exists
  • PR #3650: Parameterize client finalize timeout
  • PR #3649: Update termios right away
  • PR #3648: Continue work on pty support
  • PR #3647: Work on enabling "pty" behaviors
  • PR #3646: Always search help arrays if initialized
  • PR #3643: Check return code for notify ready-for-debug
  • PR #3642: Add debugger checks to CI
  • PR #3641: Correct client notify of ready for debugger
  • PR #3640: Only report bad prefix if verbose requested
  • PR #3638: Change default show-load-errors to "none"
  • PR #3637: Prevent show-help from using IOF too soon
  • PR #3635: Update to track changes in Standard
  • PR #3633: Cleanup some group docs
  • PR #3632: Update CI
  • PR #3631: Ensure cleanup of allocated pmix_info_t
  • PR #3630: Properly trigger the "keepalive failed" event
  • PR #3629: Provide better singleton support and support blocking event notify
  • PR #3628: python-bindings: add CI and avoid 'long' integer error
  • PR #3622: Update OAC submodule
  • PR #3620: Handle some corner cases for data ops
  • PR #3619: Update the Data pack/unpack functions
  • PR #3618: First set of API updates
  • PR #3615: Check for pthread_np.h header
  • PR #3614: Complete sweep of server upcall callback functions
  • PR #3613: Add a PMIX_FWD_ENVIRONMENT attribute
  • PR #3610: Threadshift the PMIx_Notify_event API
  • PR #3609: Delete built files on "make clean"
  • PR #3608: Decrease min Py version to 3.6
  • PR #3607: Continue work on threadshifting all upcall callbacks
  • PR #3606: Continue work on threadshifting all upcall callbacks
  • PR #3605: Ensure more upcall cbfuncs threadshift
  • PR #3602: Fix the wrapper compiler
  • PR #3601: Ensure to threadshift callback functions
  • PR #3599: Update news from release branches
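
PR #3686 above lets listener thread ports be given as a comma-delimited list of ranges. A rough Python sketch of how such a specification could be expanded follows. The "low-high" range syntax and the function name parse_port_ranges are assumptions made for illustration; the announcement does not spell out the exact format PMIx accepts, and this is not the PMIx parser.

```python
def parse_port_ranges(spec):
    """Expand a spec such as "5000,6000-6002" into a list of port numbers.

    Assumes each comma-separated entry is either a single port or an
    inclusive "low-high" range (an assumed syntax, see the note above).
    """
    ports = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        low, sep, high = part.partition("-")
        start = int(low)
        end = int(high) if sep else start
        if not (0 < start <= end <= 65535):
            raise ValueError(f"invalid port range: {part!r}")
        ports.extend(range(start, end + 1))
    return ports


print(parse_port_ranges("5000,6000-6002"))
```

A listener could then try each port in order until one binds, which is the practical benefit of accepting ranges rather than a single port.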

------------------------------

This is the first release candidate for the second release in the v4 family.


IMPORTANT
This series is based on a forking of the PRRTE master branch. The v4.1.0 release contains quite a few bug fixes, plus the following significant changes:

  • minimum PMIx required for build and execution is now v6.1.0
  • minimum Python required to build from Git clone is v3.6
  • MPMD jobs can now specify mapping directives for each app
  • Group construct now returns all required info as per Standard
  • MCA parameter support has been extended to include the display,
    runtime-options, and cpus-per-rank cmd line directives
  • fixed debugger daemon rendezvous on remote nodes and debugger
    connection to launcher spawned by the debugger itself
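
The two minimum-version items above (PMIx v6.1.0 and Python v3.6) amount to a simple comparison of dotted version numbers. The Python sketch below shows one way a script could check that before attempting a build. The helper names are hypothetical, and this is not how PRRTE's configure or runtime checks are implemented.

```python
import re


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple.

    A leading "v" and any trailing suffix such as "rc1" are ignored.
    """
    match = re.match(r"v?(\d+(?:\.\d+)*)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(piece) for piece in match.group(1).split("."))


def meets_minimum(found, minimum):
    """True if version `found` is at least `minimum`."""
    have, need = parse_version(found), parse_version(minimum)
    width = max(len(have), len(need))
    # Pad with zeros so that "6.1" and "6.1.0" compare as equal.
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need


print(meets_minimum("6.0.1", "6.1.0"))
print(meets_minimum("v6.1.0rc1", "6.1.0"))
```

Because the suffix is dropped, this sketch treats a release candidate as satisfying the minimum, which is convenient when testing rc tarballs but would need tightening if pre-releases must be excluded.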

Commits since branch was forked:

  • PR #2397: Update NEWS for rc1
  • PR #2396: Remove the "compat" macro definitions
  • PR #2394: Purge checks for earlier PMIx versions
  • PR #2392: Require cherry-picks on this release branch
  • PR #2391: Update NEWS and VERSION

Commits from master branch since v4.0.0 include:

  • PR #2389: Initialize variable
  • PR #2388: Correct cflags used for check_compiler_version.m4
  • PR #2387: Initialize variable before use
  • PR #2386: Cleanup prte_info support
  • PR #2383: Pass group info in PMIx server callback
  • PR #2382: Support backward compatibility with PMIx v6.0.0
  • PR #2381: Stop PMIx progress thread during abnormal shutdown
  • PR #2380: Treat a NULL name as indicating "stop all threads"
  • PR #2379: Use session_dir_finalize to preserve output files
  • PR #2378: Stop PMIx progress thread at beginning of finalize
  • PR #2376: Update the directive list in check_multi
  • PR #2374: Provide an MCA param to control hwloc shmem sharing
  • PR #2372: update-my-copyright.py: properly support git workspaces
  • PR #2371: Update handling of log requests
  • PR #2370: Fix typo and add MCA param for default output options
  • PR #2369: Reduce min PMIx version to v6.0.1
  • PR #2368: Silence some Coverity warnings
  • PR #2367: Add MCA param to set personality
  • PR #2365: Initialize ESS before prterun fallback to prun_common()
  • PR #2363: Create a custom signature if asymmetric topology found
  • PR #2361: Cleanup bind output when partial allocations exist
  • PR #2360: Revert back to explicitly setting hwloc support for xml import
  • PR #2357: Update prun to new cmd line option spellings
  • PR #2356: Enable MCA param support for display and runtime-options
  • PR #2353: Don't force map display for donotlaunch if bind display requested
  • PR #2351: Fix hetero node launch
  • PR #2350: Potential use after free (alerts 10-12)
  • PR #2349: Workflow does not contain permissions
  • PR #2348: Correct computation of nprocs for ppr
  • PR #2347: Support multi-app map-by specifications
  • PR #2346: Fix debugger daemon rendezvous on remote nodes
  • PR #2344: Multiple commits
    • Fix ppr mapper
    • Extend existing build check CI
  • PR #2340: Slight mod to the indirect debugger example
  • PR #2338: Correct the nprocs check to support moving to next node
  • PR #2337: Perform sanity check on cmd line
  • PR #2335: Allow setting default cpus/rank
  • PR #2333: Retrieve PMIX_PARENT_ID with wildcard rank
  • PR #2331: Remove stale references to LIKWID mapper
  • PR #2329: Centralize quickmatch to ensure consistency
  • PR #2327: Add the rmaps/lsf component
  • PR #2326: Update ras to allow rank/seq files to define allocation
  • PR #2325: Remove LSF references in rank_file mapper
  • PR #2324: Some cleanup on hostfile parsing
  • PR #2323: Do not override user-specified num procs
  • PR #2322: Correctly set the number of procs for rankfile apps
  • PR #2319: Fix prte_info to output correct version
  • PR #2317: Fix printing of binding ranges
  • PR #2316: Correct ordering of macro variables
  • PR #2315: Provide skeleton for passing allocation requests through RAS
  • PR #2313: Implement support for resource usage monitoring
  • PR #2311: PLM: mark uncached daemons as reported
  • PR #2308: Fix prun tool
  • PR #2307: Cleanup and improve autohandling of hetero nodes
  • PR #2306: Tweak the forwarding of signals
  • PR #2305: Improve hetero node detection a bit
  • PR #2304: Correct show-help message
  • PR #2303: Allow PMIx group construct caller to specify the order of the final membership
  • PR #2301: Fix map-by pe-list when using core CPUs
  • PR #2300: Add launching-apps section to docs
  • PR #2299: Extend timeout to child jobs
  • PR #2298: Replace sprintf with snprintf
  • PR #2297: Fix relative node processing
  • PR #2296: Let seq and rankfile mappers compute their own num-procs
  • PR #2295: Error out when asymmetric topologies cannot support ppr requests
  • PR #2294: Fix printout of binding cpus
  • PR #2293: Do not assign DVM's bookmark to the application job
  • PR #2291: Add some minor verbose debug output
  • PR #2290: Check only for existence of PMIx capability flag
  • PR #2289: Fix the definition checks for tool_connected2 and log2 integration
  • PR #2288: Update with new log2 and tool_connected2 upcalls
  • PR #2287: Clarify help messages
  • PR #2281: Bugfix: inconsistently setting PMIX_JOB_RECOVERABLE
  • PR #2280: Fix display of physical vs logical CPUs
  • PR #2279: Fix precedence ordering on envar operations
  • PR #2278: Fix the colocation algorithm
  • PR #2277: Silence Coverity warnings
  • PR #2276: Silence more Coverity warnings
  • PR #2275: Silence even more Coverity warnings
  • PR #2274: Silence more Coverity warnings
  • PR #2273: Extend testbuild launchers support
  • PR #2272: Silence yet more Coverity warnings
  • PR #2271: Silence more Coverity warnings
  • PR #2270: Silence more Coverity warnings
  • PR #2268: Extend inheritance to app level
  • PR #2266: Inherit env directives if requested
  • PR #2265: Move prte function into library
  • PR #2264: Silence more Coverity warnings
  • PR #2262: Silence Coverity warnings
  • PR #2261: Ensure we have HNP node aliases
  • PR #2260: Extend control over client connections
  • PR #2257: Resolve Coverity issues
  • PR #2252: Extend support for specifying tool connection parameters
  • PR #2251: Cleanup queries and completely register tools
  • PR #2250: Correct handling of tool-based spawn requests
  • PR #2249: Tool registration updates
  • PR #2248: Include node object when registering tool
  • PR #2246: Properly implement the "abort" operation
  • PR #2243: Adjust top session dir name
  • PR #2241: Add some finer-grained connection support
  • PR #2240: Customize the OMPI "allow-run-as-root" doc snippet
  • PR #2239: Runtime check both min/max PMIx versions
  • PR #2238: Runtime check that PMIx meets min requirement
  • PR #2237: Check for PMIx version too high
  • PR #2234: Add support for client_connected2 server module upcall
  • PR #2233: Declare the process set during registration
  • PR #2231: Provide error message when ssh fails
  • PR #2230: Add some missing command strings for debug output
  • PR #2228: Minor cleanups in tool connection
  • PR #2227: Process deprecated "stop" CLI
  • PR #2226: Update CI
  • PR #2225: Ensure to progress job launch for singleton
  • PR #2224: Properly handle sigterm when started by singleton
  • PR #2221: iof/hnp: correctly handle short write to stdin
  • PR #2217: Update OAC submodule
  • PR #2216: Update child job's fwd environment flag
  • PR #2213: Remove debug output
  • PR #2211: Check for pthread_np.h header
  • PR #2209: src/docs/prrte-rst-content: Add missing file to Makefile.am
  • PR #2208: Preserve formatting in show-help output
  • PR #2207: Support fwd-environment directives for spawned child jobs
  • PR #2206: Preserve source ID across API call
  • PR #2205: Reduce min Python version to 3.6