RuntimeError: Could not save _csm: maximum recursion depth exceeded


Clara Duverger

Sep 28, 2022, 1:15:11 PM
to OpenQuake Users
Hi there,

Does anyone know what this OpenQuake engine error means?

RuntimeError: Could not save _csm: maximum recursion depth exceeded while pickling an object in /home/oqdata/calc_XXX.hdf5

How can I resolve this issue?
I'm running oq version 3.15.0.

Cheers,

Peter Pažák

Sep 28, 2022, 2:22:56 PM
to OpenQuake Users
Hi, this is interesting. Could you share the source files needed to reproduce the error?
(Maybe not so important: Is it Windows, Mac or Linux?)

Peter

On Wednesday, September 28, 2022 at 19:15:11 UTC+2, Clara Duverger wrote:

Clara Duverger

Oct 4, 2022, 6:14:38 AM
to openqua...@googlegroups.com

Hi there,
I did some more tests, and the error seems to be related to the number of branch sets, or the total number of branches, in the logic tree.
Here are the source files to reproduce the error. They are not at all optimized (this is not a minimal example).
I am running the calculation on Ubuntu 18.
Cheers,


--
You received this message because you are subscribed to a topic in the Google Groups "OpenQuake Users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/openquake-users/aRrAgglesmo/unsubscribe.
To unsubscribe from this group and all its topics, send an email to openquake-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/openquake-users/dbcc7dab-16bb-4229-adef-48f576418fbdn%40googlegroups.com.
gmpe_logic_tree_test.xml
job.ini
source_model_1.xml
source_model_2.xml
source_model_3.xml
source_model_logic_tree_test.xml

Michele Simionato

Oct 4, 2022, 8:12:32 AM
to OpenQuake Users
Go into the source code, to the file openquake/baselib/parallel.py. There is a line:

sys.setrecursionlimit(1200)

Raise the limit a bit. I checked that

sys.setrecursionlimit(1500)

is enough to make your problem disappear. Otherwise upgrade to the current master.

             Michele Simionato
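
For readers who want to see the mechanism behind this advice, here is a minimal sketch, not the engine's code. It pickles a deliberately deep toy structure and shows that the same object fails under a low recursion limit and succeeds under a higher one. The helper names `nested_list` and `can_pickle` are hypothetical, and the sketch uses the pure-Python pickler (`pickle._Pickler`) on the assumption that it makes the recursion visible to `sys.setrecursionlimit` regardless of the Python version; the engine's own objects and the limit they need will differ.

```python
import io
import pickle
import sys


def nested_list(depth):
    """Build a list nested `depth` levels deep: [[[...[]...]]]."""
    obj = []
    for _ in range(depth):
        obj = [obj]
    return obj


def can_pickle(obj, limit):
    """Return True if `obj` pickles with the recursion limit set to `limit`.

    The previous limit is always restored afterwards.
    """
    old = sys.getrecursionlimit()
    try:
        sys.setrecursionlimit(limit)
        # Pure-Python pickler: every nesting level costs a few Python frames.
        pickle._Pickler(io.BytesIO()).dump(obj)
        return True
    except RecursionError:
        return False
    finally:
        sys.setrecursionlimit(old)


if __name__ == "__main__":
    deep = nested_list(400)  # toy stand-in for a deeply nested model
    print("limit  500:", can_pickle(deep, 500))
    print("limit 5000:", can_pickle(deep, 5000))
```

The point of the sketch is only that the required limit grows with the nesting depth of the pickled object, which is why a larger logic tree can push a calculation over a limit that was previously sufficient.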

Clara Duverger

Oct 4, 2022, 11:07:50 AM
to openqua...@googlegroups.com
OK, thank you Michele. I increased the limit to 1800 to get another calculation to work.
Could you explain whether this limit is correlated with the total number of branches in the logic tree? Or how best to set it (if it matters), and according to which parameter?
The default value on my system is 3000.
Cheers,


Michele Simionato

Oct 4, 2022, 11:26:54 AM
to OpenQuake Users
On Tuesday, October 4, 2022 at 5:07:50 PM UTC+2 Clara Duverger wrote:
OK, thank you Michele. I increased the limit to 1800 to get another calculation to work.
Could you explain whether this limit is correlated with the total number of branches in the logic tree? Or how best to set it (if it matters), and according to which parameter?

I have NO IDEA of what is happening here. The pickle algorithm is recursive, and for some reason the structures in the source model require a high degree of recursion for your kind of logic trees.
An investigation would be long and difficult, and might not yield a better answer than increasing the recursion limit, so I am reluctant to spend more time on it, given that you are the only ones with this issue, at least to my knowledge. The most that I can do is to raise the limit in the engine again.

        Michele
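
Since the question of how to choose the limit stays open in this thread, one empirical approach is to search for the smallest limit at which a given object pickles. The sketch below does this on a toy nested list by doubling the limit until pickling succeeds and then bisecting. It is an illustration under assumptions, not a recipe for the engine: `pickles_with_limit` and `smallest_working_limit` are hypothetical helpers, the pure-Python pickler is used so that the limit applies on every Python version, and the number found depends on the current call depth as well as on the object.

```python
import io
import pickle
import sys


def nested_list(depth):
    """Build a list nested `depth` levels deep (toy stand-in for a model)."""
    obj = []
    for _ in range(depth):
        obj = [obj]
    return obj


def pickles_with_limit(obj, limit):
    """Return True if `obj` pickles under the given recursion limit."""
    old = sys.getrecursionlimit()
    try:
        # A limit that is too low for the current stack also counts as failure.
        sys.setrecursionlimit(limit)
        pickle._Pickler(io.BytesIO()).dump(obj)
        return True
    except RecursionError:
        return False
    finally:
        sys.setrecursionlimit(old)


def smallest_working_limit(obj, start=200, ceiling=10_000):
    """Double the limit until pickling works, then bisect down to the minimum."""
    high = start
    while not pickles_with_limit(obj, high):
        high *= 2
        if high > ceiling:
            raise RuntimeError("no working limit found below the ceiling")
    low = high // 2
    while high - low > 1:
        mid = (low + high) // 2
        if pickles_with_limit(obj, mid):
            high = mid
        else:
            low = mid
    return high


if __name__ == "__main__":
    for depth in (100, 300):
        print(depth, smallest_working_limit(nested_list(depth)))
```

In practice one would add a safety margin on top of the value found, since the same object pickled from a deeper call stack needs a slightly higher limit.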

gcontre...@gmail.com

Oct 5, 2022, 7:27:39 AM
to OpenQuake Users
Hi, I am running the same test as Clara, and I applied Michele's advice about "sys.setrecursionlimit(1500)".
I used 1800, and the first run worked using "full" uncertainty propagation with 50 branches.
Then, when I tried another model, I got the following error:

INFO:root:Upgrade completed in 0.00022029876708984375 seconds
WARNING:root:DB server started with /home/h63357/venv/bin/python on tcp://127.0.0.1:1908, pid 13211
Waiting for jobs [112, 113, 114, 115, 116, 117, 118, 119]

The numbers (112, ...) correspond to the tests I launched, with a new run each time.

I shut down the computer and restarted it again and again, but nothing happens; it just says "Waiting for jobs".

I don't understand. Can someone help?

Thank you

Michele Simionato

Oct 6, 2022, 11:07:29 AM
to OpenQuake Users
On Wednesday, October 5, 2022 at 1:27:39 PM UTC+2 gcontre...@gmail.com wrote:
Hi, I am running the same test as Clara, and I applied Michele's advice about "sys.setrecursionlimit(1500)".
I used 1800, and the first run worked using "full" uncertainty propagation with 50 branches.
Then, when I tried another model, I got the following error:

INFO:root:Upgrade completed in 0.00022029876708984375 seconds
WARNING:root:DB server started with /home/h63357/venv/bin/python on tcp://127.0.0.1:1908, pid 13211
Waiting for jobs [112, 113, 114, 115, 116, 117, 118, 119]

The numbers (112, ...) correspond to the tests I launched, with a new run each time.

The jobs died badly and the engine thinks they are still running. Try running the command

$ oq abort 112 113 114 115 116 117 118 119 120 121 122
 

gcontre...@gmail.com

Oct 7, 2022, 5:09:13 AM
to OpenQuake Users
Hi Michele, thank you for your answer. Unfortunately, "oq abort ..." doesn't work; the error is below.
Maybe there is another way to abort OQ? I have already installed a new version and the result is the same.
---------
(venv) (base) h63357@dsp0962168:~$ oq abort 112 113 114 115 116 117 118 119 120 121 122
Traceback (most recent call last):
  File "/home/h63357/venv/bin/oq", line 33, in <module>
    sys.exit(load_entry_point('openquake.engine', 'console_scripts', 'oq')())
  File "/home/h63357/venv/src/oq-engine/openquake/commands/__main__.py", line 56, in oq
    sap.run(commands, prog='oq')
  File "/home/h63357/venv/src/oq-engine/openquake/baselib/sap.py", line 225, in run
    return _run(parser(funcdict, **parserkw), argv)
  File "/home/h63357/venv/src/oq-engine/openquake/baselib/sap.py", line 216, in _run
    return func(**dic)
  File "/home/h63357/venv/src/oq-engine/openquake/commands/abort.py", line 30, in main
    job = logs.dbcmd('get_job', job_id)  # job_id can be negative
  File "/home/h63357/venv/src/oq-engine/openquake/commonlib/logs.py", line 57, in dbcmd
    res = sock.send((action,) + args)
  File "/home/h63357/venv/src/oq-engine/openquake/baselib/zeromq.py", line 164, in send
    raise TimeoutError(
openquake.baselib.zeromq.TimeoutError: While sending ('get_job', 112); probably the DbServer is off

Gloria

Michele Simionato

Oct 7, 2022, 8:04:16 AM
to OpenQuake Users
It looks like the DbServer is off or your installation is in a bad state. You can always remove the engine database in oqdata/db.sqlite3 and then start the DbServer.

 Michele
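
As a cautious variant of the suggestion above, the database file can be renamed instead of deleted, so that the job history stays recoverable. The sketch below is an assumption-laden illustration: `set_aside_engine_db` is a hypothetical helper, the file name `db.sqlite3` inside the oqdata directory is taken from the post, and the DbServer is assumed to be stopped before the file is touched.

```python
import time
from pathlib import Path


def set_aside_engine_db(oqdata_dir):
    """Rename <oqdata_dir>/db.sqlite3 to a timestamped backup.

    Returns the backup path, or None if there is no database file.
    Assumes the DbServer is not running while the file is renamed.
    """
    db = Path(oqdata_dir) / "db.sqlite3"
    if not db.exists():
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = db.with_name(f"db.sqlite3.{stamp}.bak")
    db.rename(backup)
    return backup


if __name__ == "__main__":
    # Example location; adjust to wherever your oqdata directory lives.
    print(set_aside_engine_db(Path.home() / "oqdata"))
```

After the file has been set aside, start the DbServer again as described in the post.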

gcontre...@gmail.com

Oct 10, 2022, 5:03:53 AM
to OpenQuake Users
Thank you again, Michele, for your answer. Please look at the test results below; maybe they can help to find the reasons why OQ stops working when using uncertainty propagation.
Your last piece of advice (deleting db.sqlite3 in oqdata) worked. The test results follow:

Test 1: no uncertainty propagation. The run is OK.
Test 2: uncertainty propagation using the smart method (size of the model: 80 branches in total). The run is OK.
Test 3: the same model as Test 2 but using full uncertainty. The calculation hangs with the messages below. It stayed like that for one night; afterwards I had to delete db.sqlite3 in oqdata, because OQ would no longer run the other, simpler models.
Please find attached the file "source_model_logic_tree.xml" in case it is useful.
--------
s_ok/FrenchDomainsAnalysis/OpenQuake/inputs_prior_weights_1E0/ab-uncertainty-full/job.ini [--hc=None]
[2022-10-10 09:52:59 #3 INFO] Using engine version 3.16.0-gitcfe395090a
[2022-10-10 09:52:59 #3 WARNING] Using 12 cores on dsp0962168
[2022-10-10 09:52:59 #3 INFO] Checksum of the inputs: 2909767877 (total size 1.04 MB)
[2022-10-10 09:52:59 #3 INFO] Running PreClassicalCalculator with concurrent_tasks = 24
[2022-10-10 09:52:59 #3 INFO] Read N=1 hazard sites and L=330 hazard levels
[2022-10-10 09:52:59 #3 INFO] Validated source_model_logic_tree.xml in 0.02 seconds

It ran for one night stuck at the same point, so I closed the terminal.
-----------
Test 4: the same as Test 2, which ran fine before Test 3. The message is below:
--------
Waiting for jobs [3]
---------
Test 5: deleted db.sqlite3 in oqdata and ran the same model as in Test 4. It does not run; the error is below.
If I switch off my computer and come back later, it will run.
-------
chDomainsAnalysis/OpenQuake/inputs_prior_weights_1E0/ab-uncertainty-smart$ oq engine --run job.ini --exports csv

Traceback (most recent call last):
  File "/home/h63357/venv/bin/oq", line 33, in <module>
    sys.exit(load_entry_point('openquake.engine', 'console_scripts', 'oq')())
  File "/home/h63357/venv/src/oq-engine/openquake/commands/__main__.py", line 56, in oq
    sap.run(commands, prog='oq')
  File "/home/h63357/venv/src/oq-engine/openquake/baselib/sap.py", line 225, in run
    return _run(parser(funcdict, **parserkw), argv)
  File "/home/h63357/venv/src/oq-engine/openquake/baselib/sap.py", line 216, in _run
    return func(**dic)
  File "/home/h63357/venv/src/oq-engine/openquake/commands/engine.py", line 170, in main
    jobs = create_jobs(job_inis, log_level, log_file, user_name,
  File "/home/h63357/venv/src/oq-engine/openquake/engine/engine.py", line 337, in create_jobs
    logs.init('job', dic, log_level, log_file,
  File "/home/h63357/venv/src/oq-engine/openquake/commonlib/logs.py", line 274, in init
    return LogContext(job_ini, calc_id, log_level, log_file,
  File "/home/h63357/venv/src/oq-engine/openquake/commonlib/logs.py", line 181, in __init__
    self.calc_id = dbcmd(
  File "/home/h63357/venv/src/oq-engine/openquake/commonlib/logs.py", line 59, in dbcmd
    return res.get()
  File "/home/h63357/venv/src/oq-engine/openquake/baselib/parallel.py", line 404, in get
    raise etype(msg)
sqlite3.OperationalError:
  File "/home/h63357/venv/src/oq-engine/openquake/baselib/parallel.py", line 427, in new
    val = func(*args)
  File "/home/h63357/venv/src/oq-engine/openquake/server/db/actions.py", line 118, in create_job
    return db('INSERT INTO job (?S) VALUES (?X)',
  File "/home/h63357/venv/src/oq-engine/openquake/commonlib/dbapi.py", line 338, in __call__
    raise exc.__class__('%s: %s %s' % (exc, templ, args))
OperationalError: attempt to write a readonly database: INSERT INTO job (id, is_running, description, user_name, calculation_mode, hazard_calculation_id, ds_calc_dir, host) VALUES (?, ?, ?, ?, ?, ?, ?, ?) (5, 1, 'phebus_model, site de Grenoble', 'h63357', 'classical', None, '/home/h63357/oqdata/calc_5', 'dsp0962168')
source_model_logic_tree.xml
job.ini