dedalus3 new install problems macOS Sequoia Intel

Stefan Llewellyn Smith

Dec 9, 2024, 12:04:15 PM
to Dedalus Users
Hi all,

I've just done a fresh install of Dedalus 3 (via conda) on an Intel Mac running macOS Sequoia and am running into several problems:

1)

% python3 -m dedalus test
[mae-chair-imac.dynamic.ucsd.edu:18386] shmem: mmap: an error occurred while determining whether or not /var/folders/kq/_bp5lbq96c3gqdvyj_3dllrm0000gq/T//ompi.mae-chair-imac.503/jf.0/1255604224/sm_segment.mae-chair-imac.503.4ad70000.0 could be created.
/opt/anaconda3/envs/dedalus3/lib/python3.13/site-packages/pytest_parallel/__init__.py:221: PytestDeprecationWarning: The hookimpl ParallelRunner.pytest_sessionstart uses old-style configuration options (marks or attributes).
Please use the pytest.hookimpl(tryfirst=True) decorator instead
 to configure the hooks.
 See https://docs.pytest.org/en/latest/deprecations.html#configuring-hook-specs-impls-using-markers
  @pytest.mark.tryfirst

The second warning doesn't seem to matter, but the first (shmem) error becomes a problem later. The tests then run and pass: 6586 passed, 16 skipped, 143 xfailed, 18 xpassed in 200.93s (0:03:20)

I looked online for information on the shmem warning but didn't find anything helpful.

2)

% python3 -m dedalus get_examples
[mae-chair-imac.dynamic.ucsd.edu:18511] shmem: mmap: an error occurred while determining whether or not /var/folders/kq/_bp5lbq96c3gqdvyj_3dllrm0000gq/T//ompi.mae-chair-imac.503/jf.0/524288000/sm_segment.mae-chair-imac.503.1f400000.0 could be created.
/opt/anaconda3/envs/dedalus3/lib/python3.13/site-packages/dedalus/__main__.py:36: DeprecationWarning: Python 3.14 will, by default, filter extracted tar archives and reject files or modify their metadata. Use the filter argument to control this behavior.
  archive.extractall('dedalus_examples')

This is a problem with newer Pythons. I could reinstall Dedalus with successively older Pythons, but that seems inefficient, and I couldn't find a command-line option to fall back to the older extraction behavior. Instead I downloaded the raw files for the example problems one by one to test them, so the problem isn't insurmountable, but it is annoying.
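
In case it's useful, here is a rough, untested sketch of how I imagine the extraction could be done by hand with the new-style filter (the archive path is just a placeholder for a locally downloaded copy of the examples tarball):

# Hypothetical workaround sketch: extract a local copy of the examples
# archive with the explicit filter that Python 3.14 will use by default.
import tarfile

archive_path = "dedalus_examples.tar.gz"  # placeholder local path
with tarfile.open(archive_path) as archive:
    archive.extractall("dedalus_examples", filter="data")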

3) A number of the serial examples run fine (e.g. waves_on_a_string.py, lane_emden.py).

4) poisson.py fails:

% python3 poisson.py
[mae-chair-imac.dynamic.ucsd.edu:18801] shmem: mmap: an error occurred while determining whether or not /var/folders/kq/_bp5lbq96c3gqdvyj_3dllrm0000gq/T//ompi.mae-chair-imac.503/jf.0/1362362368/sm_segment.mae-chair-imac.503.51340000.0 could be created.
2024-12-09 08:36:40,827 subsystems 0/1 INFO :: Building subproblem matrices 1/128 (~1%) Elapsed: 0s, Remaining: 3s, Rate: 4.4e+01/s
[...]
Traceback (most recent call last):
  File "/Users/stefanllewellynsmith/Library/CloudStorage/GoogleDrive-s...@ucsd.edu/My Drive/Numerics/lbvp_2d_poisson/poisson.py", line 67, in <module>
    x = xbasis.global_grid()
TypeError: IntervalBasis.global_grid() missing 2 required positional arguments: 'dist' and 'scale'

This looks like a dedalus3 issue rather than a system problem? I think I have the most recent versions of both the example script and dedalus3.
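
Guessing from the error message alone, the call now seems to want something like the following, though I haven't confirmed this (dist and ybasis here are assumed to be the names defined earlier in the example script):

# Hypothetical sketch: pass the distributor and grid scale explicitly,
# since the traceback says they are now required positional arguments.
x = xbasis.global_grid(dist, scale=1)
y = ybasis.global_grid(dist, scale=1)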

5) 

%  mpiexec -n 4 python3 rayleigh_benard.py
[mae-chair-imac.dynamic.ucsd.edu:18854] shmem: mmap: an error occurred while determining whether or not /var/folders/kq/_bp5lbq96c3gqdvyj_3dllrm0000gq/T//ompi.mae-chair-imac.503/jf.0/846725121/sm_segment.mae-chair-imac.503.32780001.0 could be created.
[mae-chair-imac.dynamic.ucsd.edu:18855] shmem: mmap: an error occurred while determining whether or not /var/folders/kq/_bp5lbq96c3gqdvyj_3dllrm0000gq/T//ompi.mae-chair-imac.503/jf.0/846725121/sm_segment.mae-chair-imac.503.32780001.0 could be created.
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
  PML add procs failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
  PML add procs failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
  ompi_mpi_init: ompi_mpi_instance_init failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
  ompi_mpi_init: ompi_mpi_instance_init failed
  --> Returned "Not found" (-13) instead of "Success" (0)
--------------------------------------------------------------------------
[mae-chair-imac:00000] *** An error occurred in MPI_Init_thread
[mae-chair-imac:00000] *** reported by process [846725121,1]
[mae-chair-imac:00000] *** on a NULL communicator
[mae-chair-imac:00000] *** Unknown error
[mae-chair-imac:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[mae-chair-imac:00000] ***    and MPI will try to terminate your MPI job as well)
[mae-chair-imac:00000] *** An error occurred in MPI_Init_thread
[mae-chair-imac:00000] *** reported by process [846725121,2]
[mae-chair-imac:00000] *** on a NULL communicator
[mae-chair-imac:00000] *** Unknown error
[mae-chair-imac:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[mae-chair-imac:00000] ***    and MPI will try to terminate your MPI job as well)
--------------------------------------------------------------------------
prterun detected that one or more processes exited with non-zero status,
thus causing the job to be terminated. The first process to do so was:
   Process name: [prterun-mae-chair-imac-18852@1,2]
   Exit code:    14
--------------------------------------------------------------------------

This is presumably due to the shmem error, and it's a major problem as you can imagine.

Suggestions and fixes gratefully received. Thank you,

Stefan


Keaton Burns

Dec 9, 2024, 12:09:52 PM
to dedalu...@googlegroups.com
Hi Stefan,

The “shmem: mmap” error is an upstream issue with Open MPI v5. There is some discussion of it upstream, but a temporary fix should be to set the environment variable:

export OMPI_MCA_btl_sm_backing_directory=/tmp

Does that help? I’ll have to take a closer look to see if anything else broke with the examples recently.
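
If it does, one way to make the setting stick for the environment (assuming the conda-based install) should be to store it as a conda environment variable, e.g.

conda env config vars set OMPI_MCA_btl_sm_backing_directory=/tmp -n dedalus3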

Thanks,
-Keaton



Stefan Llewellyn Smith

Dec 9, 2024, 12:13:51 PM
to Dedalus Users
No, still the same error. I'll look at the discussion you reference.

Keaton Burns

Dec 9, 2024, 12:27:37 PM
to dedalu...@googlegroups.com
Hi Stefan, 

Another option should be to specify “mpich” along with “dedalus” when initially installing via conda-forge, to avoid pulling in Open MPI. I don’t have an Intel Mac available to test this on, though.
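
Something along these lines should work (the environment name is just an example):

conda create -n dedalus3 -c conda-forge dedalus mpich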

Best,
-Keaton


Stefan Llewellyn Smith

Dec 9, 2024, 12:49:36 PM
to Dedalus Users
That now runs MPI programs, although they don't terminate properly. For example, rayleigh_benard.py doesn't return after outputting

2024-12-09 09:43:07,489 solvers 0/16 INFO :: Simulation stop time reached.
2024-12-09 09:43:07,489 solvers 0/16 INFO :: Final iteration: 3577
2024-12-09 09:43:07,489 solvers 0/16 INFO :: Final sim time: 50.001137236725725
2024-12-09 09:43:07,489 solvers 0/16 INFO :: Setup time (init - iter 0): 1.595 sec
2024-12-09 09:43:07,489 solvers 0/16 INFO :: Warmup time (iter 0-10): 0.6587 sec
2024-12-09 09:43:07,489 solvers 0/16 INFO :: Run time (iter 10-end): 98.17 sec
2024-12-09 09:43:07,489 solvers 0/16 INFO :: CPU time (iter 10-end): 0.4363 cpu-hr
2024-12-09 09:43:07,489 solvers 0/16 INFO :: Speed: 3.035e+05 mode-stages/cpu-sec

and has to be stopped manually using Ctrl-C. On a workstation or laptop that's OK for test programs, etc.
