calibration run: error in jobInit.py

Aarti Soni

Aug 1, 2025, 6:03:43 AM
to wrf-hydro_users
Hello everyone, 

During the calibration process, I get an error when running jobInit.py:

ERROR: Unable to assign values from config file.
ERROR: Failure to initialize calibration workflow job.

In the `setup.parm` file, acctKey = none/default. I also tried leaving it blank, but I still get the same error.
Would anyone be able to help me out?

Thanks

Arezoo RafieeiNasab

Aug 4, 2025, 11:28:27 AM
to wrf-hyd...@ucar.edu
Hi Aarti, 

Could you share the setup file you are using? From the message it is hard to guess what went wrong in the setup, but it seems an entry is not set up correctly. 
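
One way to narrow down "an entry is not set up correctly" is to list the options in the setup file that have no value. The sketch below is only a rough aid: it assumes the setup file uses INI syntax that Python's `configparser` can read, and it does not reproduce whatever checks jobInit.py actually performs. The section name `[logistics]` and the option `queName` in the sample are hypothetical placeholders; only `acctKey` comes from this thread.

```python
import configparser

def find_blank_options(text):
    """Return (section, option) pairs whose value is empty in INI-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive (e.g. acctKey)
    parser.read_string(text)
    blanks = []
    for section in parser.sections():
        for option, value in parser.items(section):
            if value is None or value.strip() == "":
                blanks.append((section, option))
    return blanks

# Hypothetical fragment; replace with the contents of your own setup file.
sample = """
[logistics]
acctKey =
queName = sspmres
"""
print(find_blank_options(sample))
```

A blank value is not necessarily an error, so the result is a list of entries to double-check rather than a list of faults.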

Thanks!
Arezoo

--
Arezoo Rafieei Nasab, Ph.D.
NCAR/RAL Project Scientist II

Aarti Soni

Aug 4, 2025, 11:49:42 AM
to wrf-hydro_users, Arezoo RafieeiNasab
Hi Arezoo,
Thanks for the reply.

Today I fixed that error in jobInit.py by installing PyYAML. However, I am now encountering an error in the next step: the .err files are empty, and the .out files only show core counts and group info.
When I run spinOrchestrator.py, no spinup outputs are created.
When I manually changed job_run_type from 4 to 2, it threw: Exception in statusMod.checkBasGroupJob
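
Since the first error turned out to be a missing PyYAML package, a quick check of importable modules before running the workflow may save time. This is a minimal sketch: `yaml` is the import name of PyYAML, while the other names in the list are only illustrative and are not taken from the workflow's actual requirements.

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# "yaml" is PyYAML's import name; the remaining entries are illustrative.
print(missing_modules(["yaml", "sqlite3", "json"]))
```

An empty list means every listed module can be located in the active Python environment.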

Arezoo RafieeiNasab

Aug 5, 2025, 1:59:25 AM
to Aarti Soni, wrf-hydro_users
Hi Aarti, 

Unfortunately, you cannot change things manually after you initialize the calibration. Most of the information is written to the database, so a manual change will have no effect. I recommend redoing the initialization steps if you change anything in the setup. I know this does not answer your question, but I thought I would let you know. The error is very general and vague right now, and I cannot guess what could be causing it.
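
To see what the initialization stored for a given job without modifying anything, the database can be inspected with Python's built-in `sqlite3` module. The sketch below assumes only that some tables carry a column named `jobID`; it discovers those tables itself rather than relying on a known schema.

```python
import os
import sqlite3

def rows_per_table_for_job(db_path, job_id, column="jobID"):
    """Count rows matching job_id in every table that has the given column."""
    if not os.path.exists(db_path):
        # sqlite3.connect would silently create an empty file here.
        raise FileNotFoundError(db_path)
    counts = {}
    con = sqlite3.connect(db_path)
    try:
        tables = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type='table'")]
        for table in tables:
            cols = [row[1] for row in con.execute(f'PRAGMA table_info("{table}")')]
            if column in cols:
                (n,) = con.execute(
                    f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" = ?',
                    (job_id,)).fetchone()
                counts[table] = n
    finally:
        con.close()
    return counts

# Usage (path is a placeholder):
# print(rows_per_table_for_job("/path/to/wrfHydroCalib.db", 1))
```

The function only runs SELECT and PRAGMA statements, so it leaves the database contents untouched.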

Thanks!
Arezoo

Aarti Soni

Aug 5, 2025, 2:24:53 AM
to wrf-hydro_users, Arezoo RafieeiNasab, wrf-hydro_users, Aarti Soni
Hi Arezoo,
I am trying to re-run jobInit.py and am getting this error:

WARNING: Zero length account key passed to program.
ERROR: Failure to delete entries for job ID: 1

Aarti Soni

Aug 5, 2025, 2:29:44 AM
to wrf-hydro_users, Aarti Soni, Arezoo RafieeiNasab, wrf-hydro_users
Should I delete the entries for job ID 1 from wherever they exist? For example:
sqlite3 /home/myfolder/wrfhydro_calibration/wrfHydroCalib.db \
"DELETE FROM Calib_Params WHERE jobID=1;"

and then re-run:
python /home/SSPMOPER/spred/AARTI/wrfhydro_calibration/jobInit.py \
/home/myfolder/wrfhydro_calibration/setup.parm \
--optExpID 1 \
--optDbPath /home/myfolder/wrfhydro_calibration/wrfHydroCalib.db

Aarti Soni

Aug 5, 2025, 3:38:05 AM
to wrf-hydro_users, Arezoo RafieeiNasab, wrf-hydro_users
Sorry, I forgot to attach the setup.parm file to the previous message.
Please find it attached.

setup.parm

Arezoo RafieeiNasab

Aug 6, 2025, 12:54:24 PM
to wrf-hyd...@ucar.edu, Arezoo RafieeiNasab
Hi Aarti, 

I do not see anything in the setup file that I could point to as wrong. So you usually do not need an account key to submit a job on your system? I cannot test your setup, since on our system I must have an account key to submit a job with `qsub`.

As for how to restart, I would recommend removing the database that you created, /home/myfolder/wrfhydro_calibration/wrfHydroCalib.db, and redoing all the steps from the beginning. Since the first few steps take only a few seconds to a few minutes, that is the safest way.
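
A slightly more cautious variation on removing the database is to rename it with a timestamp, so the old state stays available for inspection while the workflow starts from a clean file. This is just a sketch of that idea; the helper name `set_aside` and the `.bak` naming scheme are made up for illustration.

```python
from datetime import datetime
from pathlib import Path

def set_aside(db_path):
    """Rename db_path to '<name>.<timestamp>.bak'; return the new path or None."""
    path = Path(db_path)
    if not path.exists():
        return None  # nothing to move
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    backup = path.with_name(f"{path.name}.{stamp}.bak")
    path.rename(backup)
    return backup

# Usage (path is a placeholder):
# set_aside("/path/to/wrfHydroCalib.db")
```

After the rename, the initialization steps can be repeated from the start as described above.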

Thanks!
Arezoo

Aarti Soni

Aug 6, 2025, 2:14:18 PM
to wrf-hydro_users, Arezoo RafieeiNasab, Aarti Soni
Arezoo, thanks for the reply.

I have redone everything from scratch, and now I can run the .py files up to calibOrchestrator.py, if I am not making any mistakes (please see the attached screenshot of the folders created in the RUN.CALIB folder).

After running jobInit.py, it displayed neither an error nor the message: WORKFLOW HAS BEEN SETUP FOR OWNER: X JOB ID = JobID
So I checked the job ID with:
sqlite3 /home/SSPMOPER/spred/AARTI/wrfhydro_calibration/wrfHydroCalib.db "SELECT jobID, Job_Directory FROM Job_Meta;"
It displayed 1, which I used in spinOrchestrator.py and calibOrchestrator.py.

To submit the job, I use qsub run.pbs.
The run.pbs file contains the PBS details below:
#!/bin/bash
#PBS -N WRF_HYD
#PBS -l select=1:ncpus=36:vntype=cray_compute
#PBS -l walltime=600:00:00
#PBS -q sspmres
#PBS -l place=scatter

Just for clarification, is sspmres the account key? I was confused, so I left it empty.

Untitled.png

Aarti Soni

Aug 7, 2025, 9:25:17 AM
to wrf-hydro_users, Aarti Soni, Arezoo RafieeiNasab

Arezoo,

In continuation of my previous email, the calibScript.R and proj_data.Rdata files have not been generated in the RUN.CALIB folder. 
Additionally, the RUN.SPINUP folder only contains symbolic links to wrf_hydro.exe, GENPARM.TBL, MPTABLE.TBL, SOILPARM.TBL, and W16.

I also have a question regarding RESTART files. I am running the model over a long period, with the first 5 years as a warm start and the subsequent 5-year blocks as cold starts. In that case, how can the restart files be used in model calibration? Can calibration proceed without RESTART files?

Regarding the account key, we generally do not use one on our system.

Thank you.

Arezoo RafieeiNasab

Aug 7, 2025, 12:45:14 PM
to Aarti Soni, wrf-hydro_users
Aarti, 

You could ask your system admin for the project account key info. I have an -A option for the account key on our system and -q is for the queue name. This is system independent I guess. 

Let me ask you a few questions. 
1- Do you see the job getting submitted to the queue on your system? There should be a job called "WSG_0" or something like that. Do you see that in the queue? 
2- Do you see the run script? Inside the output directory, there should be a file named "run_group_0.sh". Take a look at this script and see whether the options are set properly for submitting a job to the queue. If not, change it as you see fit and re-submit it manually, meaning run "qsub run_group_0.sh" and see what kind of error you get. Basically, the orchestrator does the initial prep and submits this script, and this script does the rest. If the problem is with the system setup and submission, you could resolve it by manually tuning this script and figuring out what is wrong.
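
When checking whether the submission options are set properly, it can help to pull the `#PBS` directives out of the script and look at them side by side. The sketch below does this for the run.pbs header posted earlier in the thread; the helper name `pbs_directives` is made up for illustration, and the parsing is deliberately naive (one flag per `#PBS` line).

```python
def pbs_directives(script_text):
    """Collect '#PBS' directive lines as (flag, value) pairs, in file order."""
    directives = []
    for line in script_text.splitlines():
        line = line.strip()
        if not line.startswith("#PBS"):
            continue
        parts = line[len("#PBS"):].split(None, 1)
        if not parts:
            continue  # bare '#PBS' with nothing after it
        flag = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        directives.append((flag, value))
    return directives

# Header copied from the run.pbs shown earlier in this thread.
script = """#!/bin/bash
#PBS -N WRF_HYD
#PBS -l select=1:ncpus=36:vntype=cray_compute
#PBS -l walltime=600:00:00
#PBS -q sspmres
#PBS -l place=scatter
"""
flags = pbs_directives(script)
has_account = any(flag == "-A" for flag, _ in flags)
queue = [value for flag, value in flags if flag == "-q"]
print(has_account, queue)
```

For this header, the result shows that a queue is set with -q but no -A account line is present, which matches the discussion above about -A versus -q.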

About the restart files, I am not sure I understand your question. However, it made me look at your setup file, and I see that the start/end dates are not set up properly. The restart files are the only indication that the model simulation is complete or running fine. Right now, the dates are set up as:

bSpinDate = 2001-01-01
eSpinDate = 2002-12-31

But it should be the following: 

bSpinDate = 2001-01-01
eSpinDate = 2003-01-01

Please fix the above dates. All the dates should be the first day of a month, since you have the restart frequency set to -9999.
Anyhow, this is not the cause of your error since you do not get to this point at all. 
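
The first-of-the-month rule can be checked with a few lines of Python. The sketch below uses the eSpinDate value from this thread and assumes that the intended fix is always to move an end date forward to the start of the following month, which matches the single correction shown above but is not confirmed as a general rule.

```python
from datetime import date

def is_first_of_month(d):
    """True when the date falls on the first day of its month."""
    return d.day == 1

def next_month_start(d):
    """Return the first day of the month following d."""
    if d.month == 12:
        return date(d.year + 1, 1, 1)
    return date(d.year, d.month + 1, 1)

e_spin = date.fromisoformat("2002-12-31")  # eSpinDate as originally set
if not is_first_of_month(e_spin):
    print("suggested eSpinDate:", next_month_start(e_spin).isoformat())
```

For 2002-12-31 this suggests 2003-01-01, the corrected value given above.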
Thanks!
Arezoo

Aarti Soni

Aug 12, 2025, 11:15:26 AM
to wrf-hydro_users, Arezoo RafieeiNasab, wrf-hydro_users, Aarti Soni
Hi Arezoo,

I asked my HPC admin about the account key, and they informed me that there is no provision for a project account key on this system, so it can be ignored. However, before receiving their response, I ran the script with acctKey = sspmres, which is my group name, and the script ran without any errors. I then tried to re-run all the scripts and encountered the errors below. Please have a look.

1. After running the jobInit.py file, it displayed neither an error nor the message: WORKFLOW HAS BEEN SETUP FOR OWNER: X JOB ID = JobID

2. When I run the spinOrchestrator.py file, the following error occurs:
python /home/SSPMOPER/spred/AARTI/wrfhydro_calibration/spinOrchestrator.py 1 \
--optDbPath /home/SSPMOPER/spred/AARTI/wrfhydro_calibration/wrfHydroCalib.db
 
['1', '2', '3', '4', '5', '6'] 
NUM CORES PER NODE = 8 
NUM CORES AVAIL = 8 
NUM BASINS PER GROUP = 1 
NUM BASINS = 6 
NUM GROUPS = 6 
WORKING ON GROUP: 0 
Traceback (most recent call last):
  File "/home/SSPMOPER/spred/AARTI/wrfhydro_calibration/spinOrchestrator.py", line 290, in <module>
    main(sys.argv[1:])
  File "/home/SSPMOPER/spred/AARTI/wrfhydro_calibration/spinOrchestrator.py", line 258, in main
    groupStatus = statusMod.checkBasGroupJob(jobData,basinGroup,pbsJobId,'WSG')
  File "/home/SSPMOPER/spred/AARTI/wrfhydro_calibration/core/statusMod.py", line 2130, in checkBasGroupJob
    raise Exception()
Exception

3. As you suggested, I submitted the .sh file manually with qsub run_group_0.sh, but the job stopped after some time. I have attached all the generated files for your reference. The namelist is created in the RUN.SPINUP/OUTPUT folder, but it is empty.
After submitting with qsub run_group_0.sh, the job status shows:
Job id            Name      User       Time Use  S  Queue
----------------  --------  ---------  --------  -  -------
1232841.sdb       WSG_1_0   aartisoni  00:00:00  R  sspmres
WSG_1_0.out
run_group_0.sh
hydro.log
WSG_1_0.err

Arezoo RafieeiNasab

Aug 12, 2025, 12:12:39 PM
to Aarti Soni, wrf-hydro_users
Hi Aarti, 

I did not mean for you to modify the command in run_group_0.sh. Just submit it manually and, if it fails, fix the PBS items until you get it working. For example, this is what I get when running calibration:


(r-4.3) [arezoo@derecho2 wnse_str_3]$  cat run_group_0.sh
#!/bin/bash
#
# PBS Batch Script to Run WRF-Hydro Group Calibrations
#
#PBS -N WCG_5_0
#PBS -A NHAP0008
#PBS -l walltime=12:00:00
#PBS -q main
#PBS -o PATH_TO/WCG_5_0.out
#PBS -e PATH_TO/WCG_5_0.err
#PBS -l select=1:ncpus=128:mpiprocs=128

cd PATH_TO/PyWrfHydroCalib/
module purge
module load ncarenv/23.09
module load craype/2.7.23
module load intel/2023.2.1
module load ncarcompilers/1.0.0
module load cray-mpich/8.1.27
export PALS_CPU_BIND=none
module load hdf5/1.12.2
module load netcdf/4.9.2
module load nco/5.1.9
module load conda
conda activate vscode 
python calib.py 5 0 --optDbPath PATH_TO/db_str_wnse_3.db

The above example is from the calibration step; yours should look something like "python spinup.py". When you submit the script that calls spinup.py, it prepares the run directory, the namelists, and so on, and then runs WRF-Hydro. Your namelists are empty because you called the WRF-Hydro executable directly instead of running the Python script.
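
Before resubmitting, a small check can confirm that the run script still invokes the Python driver rather than the model executable. The sketch below scans the non-comment lines of a script for a `python ... <script name>` command; the function names are made up for illustration, and the sample lines are taken from the example script above.

```python
import os

def command_lines(script_text):
    """Return the non-blank, non-comment lines of a shell script."""
    stripped = (line.strip() for line in script_text.splitlines())
    return [line for line in stripped if line and not line.startswith("#")]

def calls_python_script(script_text, script_name):
    """True if some command runs python with script_name as an argument."""
    for line in command_lines(script_text):
        tokens = line.split()
        if tokens[0].startswith("python") and any(
                os.path.basename(tok) == script_name for tok in tokens[1:]):
            return True
    return False

# Lines adapted from the example run script shown above.
example = """#!/bin/bash
#PBS -N WCG_5_0
cd PATH_TO/PyWrfHydroCalib/
python calib.py 5 0 --optDbPath PATH_TO/db_str_wnse_3.db
"""
print(calls_python_script(example, "calib.py"))
print(calls_python_script(example, "spinup.py"))
```

For a spinup run script, the same check would be made with "spinup.py" as the script name.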

Thanks!
Arezoo