PyCUDA error while running simulations on GPU


Ankur Jyoti Kalita

Jun 11, 2024, 1:59:31 PM
to gprMax-users
Hi, I am getting the following error:

(gprMax) C:\Users\spacewalkie>python -m gprMax gprMax/user_models/cylinder_Ascan_2D.in -gpu

=== Electromagnetic modelling software based on the Finite-Difference Time-Domain (FDTD) method =======================

    www.gprmax.com   __  __
     __ _ _ __  _ __|  \/  | __ ___  __
    / _` | '_ \| '__| |\/| |/ _` \ \/ /
   | (_| | |_) | |  | |  | | (_| |>  <
    \__, | .__/|_|  |_|  |_|\__,_/_/\_\
    |___/|_|
                     v3.1.7 (Big Smoke)

 Copyright (C) 2015-2023: The University of Edinburgh
 Authors: Craig Warren and Antonis Giannopoulos

 gprMax is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as
  published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
 gprMax is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty
  of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
 You should have received a copy of the GNU General Public License along with gprMax.  If not, see
  www.gnu.org/licenses.

Host: DESKTOP-OPHOOS7 | Micro-Star International Co., Ltd. MS-7D75 | 0 x unknown (12 cores, 24 cores with Hyper-Threading) | 31.2GiB RAM | Windows 11 (64-bit)
GPU(s) detected: 0 - NVIDIA GeForce RTX 4070 SUPER, 12GiB

--- Model 1/1, input file: gprMax/user_models/cylinder_Ascan_2D.in ----------------------------------------------------

Constants/variables used/available for Python scripting: {c: 299792458.0, current_model_run: 1, e0: 8.8541878128e-12, inputfile: C:\Users\spacewalkie\gprMax\user_models\cylinder_Ascan_2D.in, m0: 1.25663706212e-06, number_model_runs: 1, z0: 376.73031366686166}

Model title: A-scan from a metal cylinder buried in a dielectric half-space
Number of CPU (OpenMP) threads: 12
GPU solving using: 0 - NVIDIA GeForce RTX 4070 SUPER
Spatial discretisation: 0.002 x 0.002 x 0.002m
Domain size: 0.24 x 0.21 x 0.002m (120 x 105 x 1 = 12600 cells)
Mode: 2D TMz
Time step (at CFL limit): 4.71731e-12 secs
Time window: 3e-09 secs (637 iterations)

Waveform my_ricker of type ricker with maximum amplitude scaling 1, frequency 1.5e+09Hz created.
Hertzian dipole with polarity z at 0.1m, 0.17m, 0m, using waveform my_ricker created.
Receiver at 0.14m, 0.17m, 0m with output component(s) Ex, Ey, Ez, Hx, Hy, Hz created.
Material half_space with eps_r=6, sigma=0 S/m; mu_r=1, sigma*=0 Ohm/m created.
Geometry view from 0m, 0m, 0m, to 0.24m, 0.21m, 0.002m, discretisation 0.002m, 0.002m, 0.002m, with filename base cylinder_half_space created.

Memory (RAM) required: ~51.5MB host + ~51.5MB GPU

Box from 0m, 0m, 0m, to 0.24m, 0.17m, 0.002m of material(s) half_space created, dielectric smoothing is on.
Cylinder with face centres 0.12m, 0.08m, 0m and 0.12m, 0.08m, 0.002m, with radius 0.01m, of material(s) pec created, dielectric smoothing is off.
Processing geometry related cmds: 100%|██████████████████████████████████████████████| 2/2 [00:00<00:00, 2003.49cmds/s]

PML: formulation: HORIPML, order: 1, thickness: x0: 10, y0: 10, z0: 0, xmax: 10, ymax: 10, zmax: 0 cells
Building PML boundaries: 100%|█████████████████████████████████████████████████████████| 4/4 [00:00<00:00, 1999.91it/s]

Building main grid: 100%|██████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 1998.24it/s]

Materials:
    |                                             |                     |       | sigma |      | sigma*  | Dielectric
 ID | Name                                        | Type                | eps_r | [S/m] | mu_r | [Ohm/m] | smoothable
----+---------------------------------------------+---------------------+-------+-------+------+---------+------------
  0 | pec                                         | builtin             | 1     | inf   | 1    | 0       | False
  1 | free_space                                  | builtin             | 1     | 0     | 1    | 0       | True
  2 | half_space                                  |                     | 6     | 0     | 1    | 0       | True
  3 | free_space+free_space+half_space+half_space | dielectric-smoothed | 3.5   | 0     | 1    | 0       | True

Numerical dispersion analysis: estimated largest physical phase-velocity error is -0.42% in material 'half_space' whose wavelength sampled by 14 cells. Maximum significant frequency estimated as 4.32623e+09Hz

Writing geometry view file 1/1, cylinder_half_space.vti: 100%|█████████████████████████| 75.6k/75.6k [00:00<?, ?byte/s]

Output file: C:\Users\spacewalkie\gprMax\user_models\cylinder_Ascan_2D.out

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\gprmax-3.1.7-py3.12-win-amd64.egg\gprMax\__main__.py", line 6, in <module>
    gprMax.gprMax.main()
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\gprmax-3.1.7-py3.12-win-amd64.egg\gprMax\gprMax.py", line 69, in main
    run_main(args)
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\gprmax-3.1.7-py3.12-win-amd64.egg\gprMax\gprMax.py", line 191, in run_main
    run_std_sim(args, inputfile, usernamespace)
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\gprmax-3.1.7-py3.12-win-amd64.egg\gprMax\gprMax.py", line 232, in run_std_sim
    run_model(args, currentmodelrun, modelend - 1, numbermodelruns, inputfile, modelusernamespace)
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\gprmax-3.1.7-py3.12-win-amd64.egg\gprMax\model_build_run.py", line 373, in run_model
    tsolve, memsolve = solve_gpu(currentmodelrun, modelend, G)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\gprmax-3.1.7-py3.12-win-amd64.egg\gprMax\model_build_run.py", line 508, in solve_gpu
    kernels_fields = SourceModule(kernels_template_fields.substitute(REAL=cudafloattype, COMPLEX=cudacomplextype, N_updatecoeffsE=G.updatecoeffsE.size, N_updatecoeffsH=G.updatecoeffsH.size, NY_MATCOEFFS=G.updatecoeffsE.shape[1], NY_MATDISPCOEFFS=1, NX_FIELDS=G.nx + 1, NY_FIELDS=G.ny + 1, NZ_FIELDS=G.nz + 1, NX_ID=G.ID.shape[1], NY_ID=G.ID.shape[2], NZ_ID=G.ID.shape[3], NX_T=1, NY_T=1, NZ_T=1), options=compiler_opts)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\pycuda\compiler.py", line 355, in __init__
    cubin = compile(
            ^^^^^^^^
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\pycuda\compiler.py", line 304, in compile
    return compile_plain(source, options, keep, nvcc, cache_dir, target)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\pycuda\compiler.py", line 90, in compile_plain
    checksum.update(preprocess_source(source, options, nvcc).encode("utf-8"))
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\pycuda\compiler.py", line 58, in preprocess_source
    raise CompileError(
pycuda.driver.CompileError: nvcc preprocessing of C:\Users\SPACEW~1\AppData\Local\Temp\tmpnz_ofuio.cu failed
[command: nvcc --preprocess -w -arch sm_89 -m64 -IC:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\pycuda\cuda C:\Users\SPACEW~1\AppData\Local\Temp\tmpnz_ofuio.cu --compiler-options -EP]
-------------------------------------------------------------------
PyCUDA ERROR: The context stack was not empty upon module cleanup.
-------------------------------------------------------------------
A context was still active when the context stack was being
cleaned up. At this point in our execution, CUDA may already
have been deinitialized, so there is no way we can finish
cleanly. The program will be aborted now.
Use Context.pop() to avoid this problem.
-------------------------------------------------------------------


I have tried three versions of CUDA (11.6, 12.0 and 12.5) and I get the same error with all of them.
I have an RTX 4070 SUPER and Visual Studio 2022.

Craig Warren

Jun 12, 2024, 4:50:31 PM
to gprMax-users
Can you successfully run any of the examples provided with the CUDA installation? That will help us determine whether it is a CUDA issue or a gprMax issue.

Kind regards,
Craig

Ankur Jyoti Kalita

Jun 16, 2024, 2:33:20 AM
to gprMax-users
Hey,
Thanks for the response.
I ran the examples as described in the guide below; they built and ran successfully using CUDA 12.4.
https://docs.nvidia.com/cuda/cuda-quick-start-guide/

Craig Warren

Jun 20, 2024, 9:29:44 AM
to gprMax-users
Odd. Since you can run the CUDA examples, gprMax in GPU mode should also run with the same version of CUDA. Could you also try one of the basic PyCUDA examples, e.g. https://github.com/minrk/PyCUDA/blob/master/examples/demo.py

Craig

Ankur Jyoti Kalita

Jun 30, 2024, 10:24:30 AM
to gprMax-users
Hey,

I ran the above example. This is the error I got:


Traceback (most recent call last):
  File "C:\Users\spacewalkie\Desktop\demo.py", line 16, in <module>
    mod = SourceModule("""

          ^^^^^^^^^^^^^^^^
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\pycuda\compiler.py", line 355, in __init__
    cubin = compile(
            ^^^^^^^^
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\pycuda\compiler.py", line 304, in compile
    return compile_plain(source, options, keep, nvcc, cache_dir, target)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\pycuda\compiler.py", line 154, in compile_plain
    raise CompileError(
pycuda.driver.CompileError: nvcc compilation of C:\Users\SPACEW~1\AppData\Local\Temp\tmp0qdtn_xu\kernel.cu failed
[command: nvcc --cubin -arch sm_89 -m64 -IC:\Users\spacewalkie\miniconda3\envs\gprMax\Lib\site-packages\pycuda\cuda kernel.cu]
[stdout:
nvcc fatal   : Value 'sm_89' is not defined for option 'gpu-architecture'
]

Craig Warren

Jul 2, 2024, 4:08:04 AM
to gprMax-users
Hmmm... looks like a mismatch between CUDA and PyCUDA. I'd suggest uninstalling both and re-installing CUDA using the latest version (certainly 11.8 or later, which I think is required for your GPU). Then re-install PyCUDA, making sure you pass pip's --no-cache-dir option so that pip downloads a fresh copy instead of reusing a cached one. Then try the example again.
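
Since several CUDA versions were installed side by side, it may be worth checking which nvcc PyCUDA actually picks up from PATH before re-installing anything. The following is a minimal sketch, not part of gprMax or PyCUDA. The helper names (parse_nvcc_release, supports_sm89, nvcc_candidates) are made up for illustration. It assumes that nvcc accepts sm_89 from CUDA 11.8 onwards and that `nvcc --version` prints a line such as "Cuda compilation tools, release 12.4, V12.4.131".

```python
import os
import re
import subprocess

# Assumption: nvcc accepts -arch sm_89 (RTX 40 series) from CUDA 11.8 onwards.
MIN_RELEASE_FOR_SM89 = (11, 8)


def parse_nvcc_release(version_text):
    """Return (major, minor) parsed from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", version_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def supports_sm89(release):
    """True if the given (major, minor) release is new enough for sm_89."""
    return release is not None and release >= MIN_RELEASE_FOR_SM89


def nvcc_candidates():
    """List every nvcc executable found in the PATH directories, in search order."""
    names = ("nvcc.exe", "nvcc") if os.name == "nt" else ("nvcc",)
    found = []
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        if not directory:
            continue
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate) and candidate not in found:
                found.append(candidate)
    return found


def main():
    candidates = nvcc_candidates()
    if not candidates:
        print("No nvcc found on PATH")
        return
    # The first entry is the one a plain `nvcc` command resolves to.
    for index, path in enumerate(candidates):
        result = subprocess.run([path, "--version"], capture_output=True, text=True)
        release = parse_nvcc_release(result.stdout)
        verdict = "OK for sm_89" if supports_sm89(release) else "too old for sm_89 (or unknown)"
        marker = "active" if index == 0 else "shadowed"
        print(f"[{marker}] {path}: release {release} -> {verdict}")


if __name__ == "__main__":
    main()
```

If the entry marked as active reports a release older than 11.8 (for example the 11.6 installation), that would be consistent with the "Value 'sm_89' is not defined" message above, even though a newer toolkit is also installed.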