Problem with the stochastic event-based calculator


Mohsen Kohrangi

Jul 5, 2018, 12:37:00 AM
to openqua...@googlegroups.com
Hi,

I am trying to run a risk model and I am facing a couple of problems over and over. Here is some relevant information from the job configuration file:

[hazard_calculation]
random_seed = 24
truncation_level = 3
maximum_distance = 200.0
investigation_time = 100000

[risk_calculation]
master_seed = 42
risk_investigation_time = 50
asset_correlation = 0.0
loss_curve_resolution = 10
---------------------------------------------

1. My exposure model is fairly large, with about 106,000 asset assignments. Note that the coordinates of each asset may be repeated several times, because different building classes are assigned to the same coordinates; in other words, I have about 14,800 building blocks, and with the repetitions the number of assignments reaches 106,000. The problem is that the run gets to the message "sent 1.12 GB of data in 11 tasks (s)" and then does not progress any more, even after two days of run time! I have tried increasing concurrent_tasks = 2000, reducing the number of GMPEs, and using only one source model in the logic tree. I have also tried playing with investigation_time and number_of_ground_motion_fields. I am using a server with 60 GB of available memory, so I do not think I should have memory problems.

2. I tried this model with only one asset in my exposure and the run completes successfully. However, there is another problem with the loss maps: for any PoE that I define, they give the same loss value, which corresponds to the last (i.e., the largest) value in the loss curves. I expect to see different values, because the loss curves do show different values for different return periods.
================================================================

Could you please help me resolve these issues?
Thanks

Mohsen

Michele Simionato

Jul 5, 2018, 3:45:25 AM
to OpenQuake Users
The only way we can help you is if you send us the input files you are using; then we can give you some recommendations.
The address to send the files to is engine....@openquake.org (just send a .zip; you can produce it with the command oq zip).
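
For example (the file names here are only placeholders):

oq zip job_hazard.ini input_files.zip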

            Michele

Michele Simionato

Jul 6, 2018, 5:06:54 AM
to OpenQuake Users
Hi Mohsen,
I am answering here since it may help others in the same situation.
The problem is that you are trying to include spatial correlation in a case with 13,959 distinct sites.
There is no hope of doing that. On my workstation with 48 GB of RAM I immediately get a memory error when building the
correlation matrix:

  File "/home/michele/oqcode/oq-engine/openquake/hazardlib/correlation.py", line 111, in _get_correlation_matrix
    return jbcorrelation(sites, imt, self.vs30_clustering)
  File "/home/michele/oqcode/oq-engine/openquake/hazardlib/correlation.py", line 149, in jbcorrelation
    return numpy.exp((- 3.0 / b) * distances)
MemoryError: 
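
For a rough sense of scale (a back-of-the-envelope sketch, not engine code): a single site-to-site matrix for n = 13,959 sites already takes about 1.5 GB, the correlation build needs several such intermediate arrays, and sampling correlated fields then requires factorizing the matrix, an O(n^3) operation:

n = 13959                      # distinct sites
bytes_per_matrix = n * n * 8   # one dense (n, n) float64 array
print(bytes_per_matrix / 1024 ** 3)  # ~1.45 GiB per matrix
print(n ** 3)                        # ~2.7e12 operations to factorize it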

Even if you had more memory, the performance would be terrible and you would never be able to complete
the calculation. Spatial correlation should be used with care: there are severe technical limits on what can be
done. The solution is to disable it; then your computation will become quite easy.
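
Concretely, disabling it means removing (or commenting out) the correlation settings in your job file; assuming you enabled the Jayaram-Baker model (as the traceback suggests), the lines to drop would look something like:

# ground_motion_correlation_model = JB2009
# ground_motion_correlation_params = {"vs30_clustering": True}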

              Michele


Michele Simionato

Jul 6, 2018, 5:25:56 AM
to OpenQuake Users


On Friday, July 6, 2018 at 11:06:54 AM UTC+2, Michele Simionato wrote:
 and then your computation will become quite easy

Correction: the hazard part will become easy, but the risk part will still be hard; you have more than 100,000 assets, after all.
You need a lot of memory for that, probably more than 60 GB. Increasing concurrent_tasks should help.
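
For reference, concurrent_tasks can be set directly in the job file (the engine reads it regardless of the section header; the value below is only an example, the right number depends on your machine):

[general]
concurrent_tasks = 2000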

    Michele

Danai Kazantzidou-Firtinidou

Aug 28, 2018, 7:10:19 AM
to OpenQuake Users
Hello,

I am writing here as my problem has a similar title.
I am trying to run a stochastic event-based analysis (based on SHARE) with 2,050 assets. When running with a 4,000- or 5,000-year investigation time, the run seems to stall at 98% of the rupture computation: it remains there forever, executing without giving any message. The very same file runs OK for 3,000 years. Could you please advise what the problem is?

One more issue: when creating the job.risk.ini with the IPT, I need to delete by hand the "Return periods for aggregate loss curve (years)" and to add by hand the values for conditional_loss_poes; although I select them in the IPT, a "None" value is written in the job.risk.ini.
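
In practice the hand-edited line looks something like this (the values are just the ones I am interested in; the format is a list of probabilities):

conditional_loss_poes = 0.02 0.10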

Thank you,

Michele Simionato

Aug 28, 2018, 8:31:17 AM
to OpenQuake Users



The SHARE model is really large, potentially with 3,200 realizations. My guess is that with 5,000 years your effective source model becomes larger and you no longer have the resources to complete the calculation. My advice would be to reduce the source model and therefore the number of realizations.
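
If trimming the model by hand is difficult, the number of realizations can also be capped by sampling the logic tree instead of enumerating it fully; that is a single line in the job file (the number of samples below is only an example):

number_of_logic_tree_samples = 10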


      Michele

Danai Kazantzidou-Firtinidou

Aug 28, 2018, 8:52:09 AM
to OpenQuake Users
Thanks, Michele, for answering.
I am afraid I'm not familiar enough with the hazard model to reduce it safely... I will try to reduce the realizations, though.

Could you please also tell me if this is the reason why, although I manage to run 3,000 years, the aggregate loss curves output only gives me results up to 2,000 years? In the end I am interested in running specific scenarios with 10% and 2% PoE in 50 years, so I need the 500- and 2,500-year aggregate losses to find the corresponding events, correct?
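
For reference, I am computing these return periods with the standard Poisson relation T = -t / ln(1 - p), which for 10% and 2% in 50 years gives about 475 and 2,475 years (commonly rounded to 500 and 2,500). A quick check:

import math

def return_period(poe, time_span):
    # Poisson assumption: poe = 1 - exp(-time_span / T)
    return -time_span / math.log(1.0 - poe)

print(return_period(0.10, 50))  # ~475 years
print(return_period(0.02, 50))  # ~2475 years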

Thanks again,