Dear WRF-Chem-run community,
I am running WRF-Chem version 4.3 with chem_opt=202 (MOZART chemistry with MOSAIC aerosol). My simulation consistently crashes around the first hour, and I would greatly appreciate any insights.
I have already added ‘ulimit -s unlimited’ and ‘export I_MPI_STACKSIZE=unlimited’ to my job script, but the segmentation fault (forrtl severe 174) persists.
I also checked for ‘CFL warnings’ in the rsl files and found none.
I have set `debug_level = 1000`, but it gives no useful information.
From the rsl.error.* files, I observe two types of errors across different processes:
1. Segmentation fault (SIGSEGV) – e.g., in rsl.error.0000:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
2. Process killed (SIGTERM) – e.g., in rsl.error.0001:
forrtl: error (78): process killed (SIGTERM)
Both errors occur after about 1 hour of simulated time.
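For anyone who wants to repeat the check on the rsl files, here is a minimal Python sketch that scans every rsl.error.* file for CFL messages and for the two forrtl errors quoted above, and reports the first line where each appears per process. The function name `scan_rsl` and the pattern labels are illustrative choices only, and the sketch assumes the rsl files are in the run directory.

```python
import re
from collections import defaultdict
from pathlib import Path

# Patterns of interest. The labels are illustrative names, not WRF terminology.
PATTERNS = {
    "cfl": re.compile(r"cfl", re.IGNORECASE),
    "sigsegv": re.compile(r"severe \(174\): SIGSEGV"),
    "sigterm": re.compile(r"error \(78\): process killed \(SIGTERM\)"),
}


def scan_rsl(run_dir="."):
    """Return {label: {filename: first matching line number}} for rsl.error.* files."""
    hits = defaultdict(dict)
    for path in sorted(Path(run_dir).glob("rsl.error.*")):
        # errors="replace" keeps the scan going if a log contains odd bytes.
        with path.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                for label, pattern in PATTERNS.items():
                    # Record only the first occurrence per file and label.
                    if pattern.search(line) and path.name not in hits[label]:
                        hits[label][path.name] = lineno
    return dict(hits)


if __name__ == "__main__":
    for label, files in scan_rsl().items():
        for name, lineno in files.items():
            print(f"{label}: {name} (first at line {lineno})")
```

The process that reports SIGSEGV is usually the one worth reading closely, since the SIGTERM messages on the other processes are typically just the MPI job being torn down after the first failure.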
I have carefully checked the format and time stamp of my emission files, and they appear to be consistent with the User's Guide for WRF-Chem v4.4 and the MOZART_MOSAIC_V3.6.readme.
Environment & job configuration:
- Compiler: Intel 2017
- MPI: Intel MPI 2017 Update 4
- Slurm launch: srun --mpi=pmi2
- Nodes: 1 node, 32 cores (requested via #SBATCH -n 32)
- Memory per node: Login node shows 251 GB total, and the job did not exceed memory based on sacct output
- Model: WRF-Chem 4.3, compiled with MOZART and MOSAIC options, and with KPP enabled (-DWRF_KPP in configure.wrf)
I am attaching the following files for reference:
- namelist.input
- Job submission script (WRF_run2new.sh)
- A snippet of the rsl.error.* files showing the crash points
- edgarv5_MOZART_MOSAIC_drh (Anthro)
- MOZART_MOSAIC_4bins_drh (mozbc)
- finn_wrf_drh (fire_emis)
If anyone has encountered a similar issue or notices any misconfiguration in my setup, I would be very grateful for your guidance.
Thank you very much in advance for your time and help!
Best regards,
Ruohan Deng