Late-timestep termination (Signal 9) after regridding in a 3D IBAMR simulation


liwei yao

Jan 14, 2026, 7:48:16 AM
to ibamr...@googlegroups.com

Dear IBAMR developers and users,

I am writing to seek advice regarding a runtime termination issue encountered in a 3D IBAMR simulation. The simulation consistently terminates at a late timestep, and the operating system reports the event as a Signal 9 (Killed).

Execution command

mpirun -n 1 ./main3d ../input3d

System and software environment

  • Operating system: Ubuntu 22.04

  • IBAMR version: 0.18

  • Hardware: local workstation with 32 GB RAM

  • Execution mode: serial run (single MPI rank)

Observed simulation behavior

  • The simulation runs stably for a long duration.

  • The termination always occurs at timestep 15,942.

  • Simulation time at termination: t = 3.1884.

Observed system behavior

  • Resident memory usage increases continuously during the simulation, with occasional small drops of less than 1 GB.

  • Physical memory usage approaches the system limit prior to termination.

  • The job exits with Signal 9 (Killed), as reported by mpirun.
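The memory growth described above can be recorded over time by sampling the resident set size of the running process. The following is a minimal sketch, assuming a Linux system where /proc/<pid>/status contains a VmRSS line; the function names and the 60-second default interval are illustrative choices and are not part of IBAMR.

```python
import time


def parse_vmrss_kb(status_text):
    """Return the VmRSS value in kB from the text of /proc/<pid>/status, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:   31295928 kB".
            return int(line.split()[1])
    return None


def sample_rss_kb(pid):
    """Read the current resident set size of a process (Linux only)."""
    try:
        with open(f"/proc/{pid}/status") as f:
            return parse_vmrss_kb(f.read())
    except FileNotFoundError:
        return None  # the process has exited


def monitor(pid, interval_s=60.0):
    """Print a timestamped RSS sample until the process disappears."""
    while True:
        rss_kb = sample_rss_kb(pid)
        if rss_kb is None:
            break
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp}  RSS = {rss_kb / 1024**2:.2f} GiB", flush=True)
        time.sleep(interval_s)
```

Calling monitor() with the PID of main3d in a second terminal gives a time series that can be compared against the regridding events in the IBAMR log.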

Terminal output at termination

At beginning of timestep #15942
Simulation time is 3.1884
--------------------------------------------------------------------------
mpirun noticed that process rank 0 exited on signal 9 (Killed).

System log (kernel OOM message)

Out of memory: Killed process 7762 (main3d) total-vm: 56191884 kB, anon-rss: 31295928 kB, file-rss: 564 kB
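To relate the kernel message to the 32 GB of installed RAM, the kB figures can be converted to GiB. This is a small sketch, assuming the kernel's "kB" means 1024 bytes; the helper names are made up for illustration.

```python
import re

OOM_LINE = (
    "Out of memory: Killed process 7762 (main3d) total-vm: 56191884 kB, "
    "anon-rss: 31295928 kB, file-rss: 564 kB"
)


def parse_oom_fields(line):
    """Return the '<name>: <value> kB' fields of a kernel OOM message as {name: kB}."""
    pairs = re.findall(r"([\w-]+):\s*(\d+)\s*kB", line)
    return {name: int(value) for name, value in pairs}


def kb_to_gib(kb):
    """Convert kB (assumed to be 1024-byte units) to GiB."""
    return kb / 1024**2


fields = parse_oom_fields(OOM_LINE)
for name, kb in fields.items():
    print(f"{name}: {kb_to_gib(kb):.2f} GiB")
```

The anon-rss value works out to roughly 29.8 GiB, which is consistent with a process that had consumed nearly all of the physical memory on this workstation.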

Relevant excerpts from eel3d.log (last timesteps)

At the end of timestep 15941, the simulation completed a regular timestep without regridding:

At end of timestep #15941
Simulation time is 3.1884

At the beginning of timestep 15942, a regridding operation was performed:

At beginning of timestep #15942
Simulation time is 3.1884
IBHierarchyIntegrator::advanceHierarchy(): regridding prior to timestep 15942
IBHierarchyIntegrator::regridHierarchy(): starting Lagrangian data movement
IBHierarchyIntegrator::regridHierarchy(): regridding the patch hierarchy
IBHierarchyIntegrator::regridHierarchy(): finishing Lagrangian data movement
INSStaggeredHierarchyIntegrator::preprocessIntegrateHierarchy(): initializing convective operator
INSStaggeredHierarchyIntegrator::preprocessIntegrateHierarchy(): initializing velocity subdomain solver
INSStaggeredHierarchyIntegrator::preprocessIntegrateHierarchy(): initializing pressure subdomain solver
INSStaggeredHierarchyIntegrator::preprocessIntegrateHierarchy(): initializing incompressible Stokes solver

No further output is produced after this point.

Grid and AMR configuration

  • AMR enabled

  • MAX_LEVELS = 4

  • REF_RATIO = 4

Base grid resolution

  • NX = 200, NY = 48, NZ = 24

Physical domain

  • XMIN = -54, XMAX = 10

  • YMIN = -7.68, YMAX = 7.68

  • ZMIN = -3.84, ZMAX = 3.84
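For scale, the grid parameters above imply the following per-level spacing and the cell count each level would have if it covered the whole domain. This is a rough sketch, assuming MAX_LEVELS counts the coarsest level (levels 0 to 3) and REF_RATIO applies in every direction between successive levels; it describes hypothetical full coverage, not the actual patch layout.

```python
NX, NY, NZ = 200, 48, 24
MAX_LEVELS, REF_RATIO = 4, 4
X_EXTENT = 10 - (-54)  # XMAX - XMIN


def level_stats(level):
    """Return (grid spacing, cell count at full-domain coverage) for an AMR level."""
    factor = REF_RATIO**level
    dx = X_EXTENT / (NX * factor)
    cells_if_full = NX * NY * NZ * factor**3
    return dx, cells_if_full


for level in range(MAX_LEVELS):
    dx, cells = level_stats(level)
    print(f"level {level}: dx = {dx:g}, cells at full coverage = {cells:,}")
```

Under these assumptions the finest level would hold about 6.04e10 cells at full coverage, so covering just 0.01% of the domain at that level already corresponds to about 6 million cells. Small changes in the refined region therefore translate into large changes in memory use.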

Regridding parameters

  • REGRID_CFL_INTERVAL = 0.5

  • Previous test: REGRID_CFL_INTERVAL = 0.3 (similar late-timestep termination observed)

  • Vorticity-based refinement thresholds:

    vorticity_abs_thresh = 1.0, 2.0, 4.0, 8.0

For your reference, I have attached the input3d configuration file and the source code for the custom IBKinematics class. I have also included snapshots of the computational grid and the flow field taken shortly before the crash to illustrate the AMR status at that time.

Any guidance or suggestions you could provide would be deeply appreciated. Thank you very much for your time and assistance.

Best regards,

Dong Liwei

Attachments:
微信图片_20260114191047_318_30.png
微信图片_20260114191121_319_30.png
IBEELKinematics3d.cpp
input3d

Boyce Griffith

Jan 14, 2026, 8:01:32 AM
to IBAMR Users
Liwei —

I suspect that you are running out of memory. Does this crash happen right after regridding, or does it occur in the middle of a time step?

If you are using AMR and the fraction of the domain that is covered by the finest level of the AMR grid hierarchy is growing, then it may not be surprising that memory usage grows in time.

Another possibility is that, if you are using a “deep” AMR grid hierarchy, you may be encountering overlapping AMR grid patches after regridding. In this case, there are iterative Krylov solvers that will fail to converge, which in turn can generate more Krylov vectors than a typical time step. If you were already close to the limit, this could push you over. (Although, if the linear solvers don’t converge, you have more problems than just memory usage!!!)

There also were some genuine memory issues in IBAMR and SAMRAI. To address these, we made changes in the way that some memory handling works in IBAMR and IBAMR’s IBSAMRAI2 library in IBAMR 0.18. (This is why IBAMR 0.18 requires IBSAMRAI2.) Do you run into this problem with older versions of IBAMR?

Finally, you might see if the memory issue also occurs with a different MPI stack. What MPI library are you using on this system?

— Boyce


liwei yao

Jan 14, 2026, 11:21:49 PM
to ibamr...@googlegroups.com

Subject: Re: suspected memory issue in IBAMR

Hi Boyce,

Thank you for your detailed reply. Based on your suggestions, I checked the relevant information and summarize the observations below.

1. Crash timing and log location

The crash occurs after a regridding operation.
The IBAMR log terminates at the following line:

INSStaggeredHierarchyIntegrator::preprocessIntegrateHierarchy(): initializing incompressible Stokes solver

There is no further output beyond this point, and the process is terminated by the operating system (Signal 9).

2. Solver behavior prior to the crash

Prior to the crash, there are no indications of numerical instability or solver divergence.
In the time steps immediately preceding the regridding event, the incompressible Stokes solver converges in a single iteration with small residual norms.

Specifically, the log reports:

  • Timestep #15940
    Stokes solve number of iterations = 1
    Stokes solve residual norm = 1.13512 × 10⁻⁴

  • Timestep #15941
    Stokes solve number of iterations = 1
    Stokes solve residual norm = 1.01129 × 10⁻⁴

These values are taken directly from the IBAMR log output immediately before the regridding operation at timestep #15942.

3. System log (kernel OOM message)

From the system log, the following kernel message was recorded:

Out of memory: Killed process 7762 (main3d)
total-vm: 56191884 kB, anon-rss: 31295928 kB, file-rss: 564 kB

4. Grid hierarchy evolution 

A comparison of the grid files between the last two adjacent time steps shows an expansion of the refined area. 

The total number of grid cells increased from 9,097,152 to 9,318,336 in the single step immediately preceding the crash.  
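Combining these cell counts with the anon-rss figure from the kernel message gives a crude memory-per-cell estimate. This is only a back-of-the-envelope sketch: it assumes the resident memory at the moment of the kill can be attributed to the larger cell count, although that figure also includes Lagrangian data, solver workspace, and possibly transient copies made during regridding. The max_cells helper and the 24 GiB budget are hypothetical.

```python
RSS_KB_AT_KILL = 31295928  # anon-rss from the kernel OOM message
CELLS_BEFORE = 9_097_152
CELLS_AFTER = 9_318_336

growth = CELLS_AFTER - CELLS_BEFORE
growth_pct = 100.0 * growth / CELLS_BEFORE
bytes_per_cell = RSS_KB_AT_KILL * 1024 / CELLS_AFTER


def max_cells(budget_gib, per_cell_bytes):
    """Largest cell count that fits in a memory budget at a given per-cell cost."""
    return int(budget_gib * 1024**3 / per_cell_bytes)


print(f"cell growth in one regrid: {growth:,} ({growth_pct:.2f}%)")
print(f"approximate memory per cell: {bytes_per_cell:.0f} bytes")
print(f"cells that fit in 24 GiB at that rate: {max_cells(24, bytes_per_cell):,}")
```

An estimate like this can serve as a rough target when choosing how much of the wake to resolve.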


5. IBAMR and MPI versions

  • IBAMR version: 0.18.0
    (The same crash pattern was also observed previously with an older IBAMR version.)

  • MPI library: Open MPI 4.1.2

If you need additional information, I would be happy to provide it.

Best regards,
Liwei


Boyce Griffith

Jan 20, 2026, 11:07:49 PM
to IBAMR Users
I am pretty sure you are just running out of memory because the wake region is occupying more and more of the computational domain. You could try adjusting the tagging criteria so that you resolve less of the wake, or adjusting the job setup so that you have more memory available.
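The effect of the tagging criteria can be illustrated with a toy model: if cells are tagged for refinement wherever the vorticity magnitude is at or above a threshold, a higher threshold tags a shorter stretch of a decaying wake. The sketch below uses a synthetic, exponentially decaying 1D profile; the amplitude, decay length, and tagging rule are assumptions for illustration only and do not reproduce IBAMR's tagging code or this simulation's flow field.

```python
import math


def tagged_fraction(omega_abs, threshold):
    """Fraction of cells whose vorticity magnitude is at or above the threshold."""
    tagged = sum(1 for w in omega_abs if w >= threshold)
    return tagged / len(omega_abs)


# Synthetic wake: |omega| decays exponentially with downstream distance.
# The amplitude (16.0) and decay length (10.0) are made-up values.
xs = [0.32 * i for i in range(200)]  # 200 cells of width 0.32
omega = [16.0 * math.exp(-x / 10.0) for x in xs]

for thresh in (1.0, 2.0, 4.0, 8.0):
    frac = tagged_fraction(omega, thresh)
    print(f"threshold {thresh}: {100 * frac:.1f}% of cells tagged")
```

In this toy profile each doubling of the threshold shortens the tagged region by a fixed length, which suggests why raising the thresholds is an effective lever on the refined volume.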

— Boyce


liwei yao

Jan 21, 2026, 3:04:53 AM
to ibamr...@googlegroups.com

Dear Boyce,

  Thank you very much for your helpful suggestions. I will make the necessary adjustments based on them.

  Best regards,

Liwei
