libmesh restart error

hoover.a...@gmail.com

Mar 27, 2026, 7:54:37 PM
to IBAMR Users
Hi all,

I'm running into an error after a restart. The code ran fine with a previous (0.16) install, but it fails with the following error on the current (0.19) install. I have included my header files as well as the restart commands.

111] src/mesh/exodusII_io_helper.C, line 551, compiled nodate at notime
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
[2026-03-27T19:44:16.671] error: *** STEP 46011582.0 ON p0213 CANCELLED AT 2026-03-27T19:44:16 DUE to SIGNAL Killed ***
[2026-03-27T19:44:16.672] error: *** STEP 46011582.0 ON p0213 CANCELLED AT 2026-03-27T19:44:16 DUE to SIGNAL Killed ***
[2026-03-27T19:44:18.286] error:  mpi/pmix_v4: _errhandler: p0214 [1]: pmixp_client_v2.c:230: Error handler invoked: status = -61, source = [slurm.pmix.46011582.0:44]

restart commands:
    const bool dump_restart_data = app_initializer->dumpRestartData();
    const int restart_dump_interval = app_initializer->getRestartDumpInterval();
    const string restart_dump_dirname = app_initializer->getRestartDumpDirectory();
    const string restart_read_dirname = app_initializer->getRestartReadDirectory();
    const int restart_restore_num = app_initializer->getRestartRestoreNumber();

// Config files
#include <ibamr/config.h>
#include <ibtk/config.h>
#include <SAMRAI_config.h>

// Headers for basic PETSc functions
#include <petscsys.h>

// Headers for basic SAMRAI objects
#include <BergerRigoutsos.h>
#include <CartesianGridGeometry.h>
#include <LoadBalancer.h>
#include <StandardTagAndInitialize.h>

// Headers for basic libMesh objects
#include <libmesh/boundary_info.h>
#include <libmesh/equation_systems.h>
#include <libmesh/exodusII_io.h>
#include <libmesh/mesh.h>
#include <libmesh/mesh_function.h>
#include <libmesh/mesh_generation.h>
#include <libmesh/mesh_triangle_interface.h>

// Headers for application-specific algorithm/data structure objects
#include <ibamr/IBExplicitHierarchyIntegrator.h>
#include <ibamr/IBFEMethod.h>
#include <ibamr/INSCollocatedHierarchyIntegrator.h>
#include <ibamr/INSStaggeredHierarchyIntegrator.h>

#include <ibtk/AppInitializer.h>
// #include <ibtk/IBTKInit.h>
#include <ibtk/IBTK_MPI.h>
#include <ibtk/libmesh_utilities.h>
#include <ibtk/muParserCartGridFunction.h>
#include <ibtk/muParserRobinBcCoefs.h>

#include <boost/multi_array.hpp>

// Set up application namespace declarations
#include <ibamr/app_namespaces.h>

Boyce Griffith

Mar 28, 2026, 9:29:35 AM
to ibamr...@googlegroups.com, Users IBAMR
Are you trying to restart a run from the older version with the newer code?

Boyce Griffith

Mar 28, 2026, 7:01:23 PM
to IBAMR Users
To me this looks like you are running out of memory or something like that.

David, do you know if we have an IBFE restart test?

On Mar 28, 2026, at 2:05 PM, hoover.a...@gmail.com <hoover.a...@gmail.com> wrote:

The run was started and restarted in the new version (0.19). The code was originally run without issue in 0.16, and, apart from a few header files, the code is essentially the same. Is there a different restart procedure?

hoover.a...@gmail.com

Mar 29, 2026, 2:17:36 PM
to IBAMR Users
Just to correct the first message, the code was ported from a 0.8 install to a 0.17 install. I have been getting more OOM errors since the new install (which was necessitated by a system-wide update from RHEL7 to RHEL9), and in general things have been running slower and hitting regridding issues. This run used 4 nodes * 32 cores on exclusive nodes with 192 GB (https://www.osc.edu/resources/technical_support/supercomputers/pitzer).
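
For reference, the job layout above implies a rough per-rank memory budget. The sketch below works through that arithmetic. It assumes that the 192 GB figure is per node, that one MPI rank runs on each of the 32 cores, and that memory is shared evenly between ranks; none of this is read from SLURM. The function names are hypothetical and are not part of IBAMR.

```cpp
#include <cassert>
#include <cstdio>

// Memory available to each MPI rank if a node's RAM is split evenly.
// Both arguments are assumptions about the job layout supplied by the caller.
double memory_per_rank_gb(double node_memory_gb, int ranks_per_node)
{
    assert(node_memory_gb > 0.0 && ranks_per_node > 0);
    return node_memory_gb / ranks_per_node;
}

// Print the total rank count and the even-split budget for a given layout.
void print_memory_budget(int nodes, int ranks_per_node, double node_memory_gb)
{
    std::printf("%d ranks total, %.2f GB per rank\n",
                nodes * ranks_per_node,
                memory_per_rank_gb(node_memory_gb, ranks_per_node));
}
```

Calling `print_memory_budget(4, 32, 192.0)` from a small driver gives the budget for the layout described here. In practice the usable amount per rank is lower, since the operating system and MPI runtime also need memory.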

Wells, David

Mar 30, 2026, 10:04:53 AM
to ibamr...@googlegroups.com
Hi Alex and Boyce,

We haven't made significant changes to how this part of IBAMR works in a long time (a quick look at the version history shows that we last did things here circa 2020 when we introduced the separate mechanics solver). I am a bit puzzled to see that assertion error in this context since we use XDR, not ExodusII, for restart files.

It would be really great if we could get a stack trace for that error. Can you run this on a workstation and get one? Longer-term: I am really curious about the regridding and memory usage problems since Boyce and I have done a lot of work to solve those issues for our own code but it appears we made some things worse. Would you be willing to share your app with us so we can do some memory profiling?
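
As a lightweight starting point for the memory question, short of full profiling, each process can report its own peak resident memory. The sketch below uses the POSIX `getrusage` call and assumes Linux, where `ru_maxrss` is reported in kilobytes. `peak_rss_kb` and `report_peak_rss` are hypothetical helper names, not IBAMR or libMesh functions, and the rank is passed in by the caller rather than queried from MPI.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/resource.h>

// Peak resident set size of the calling process so far.
// On Linux ru_maxrss is in kilobytes. Returns -1 if getrusage fails.
long peak_rss_kb()
{
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
    return usage.ru_maxrss;
}

// Write one line per call to stderr, tagged with a label and the caller's rank.
void report_peak_rss(const char* label, int rank)
{
    assert(label != nullptr);
    const long kb = peak_rss_kb();
    std::fprintf(stderr, "[rank %d] %s: peak RSS = %.1f MB\n",
                 rank, label, kb / 1024.0);
}
```

Calling `report_peak_rss` right after the restart data is read and again after each regrid would show where the peak grows, which could help narrow down whether the restart or the regridding is responsible.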

Best,
David Wells
