Hello,
I have installed OQ Engine version 2.1 on a cluster running Ubuntu 14.04. I upgraded from OQ Engine version 2.0.1 using the instructions in
https://github.com/gem/oq-engine/blob/engine-2.1/doc/upgrading/ubuntu.md. The machine has a head node with 3 compute nodes, each of which has 72 CPUs. We are running oq engine on each node separately, with no cross-node communication.
We did quite a few runs using oq engine, on all the different compute nodes as well as the head node. The results looked fine. One of the source files is attached, which shows each of 13 sources using a single-bin ArbitraryMFD. We ran using this file on one of the compute nodes, and found it failed with the error:
File "/usr/lib/python2.7/dist-packages/openquake/commonlib/sourceconverter.py", line 216, in split_fault_source_by_magnitude
min_mag=mag, bin_width=src.mfd.bin_width,
AttributeError: 'ArbitraryMFD' object has no attribute 'bin_width'
This seems odd, since the OQ manual says of ArbitraryMFDs: "There is no bin-width as the rates correspond exactly to the specific magnitude". No bin width is specified in our source model file.
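To make clear what I think the traceback is saying, here is a minimal sketch. The classes are hypothetical stand-ins that I wrote for illustration, not the real openquake classes, and the getattr fallback is only my guess at how calling code could guard against an MFD with no bin width; it is not what the engine actually does.

```python
# Hypothetical stand-ins for illustration only; NOT the real openquake classes.
class BinnedMFD(object):
    """An MFD type that carries a bin width."""
    def __init__(self, min_mag, bin_width, rates):
        self.min_mag = min_mag
        self.bin_width = bin_width
        self.rates = rates


class ArbitraryLikeMFD(object):
    """An MFD type defined by explicit magnitudes, so it has no bin width."""
    def __init__(self, magnitudes, rates):
        self.magnitudes = magnitudes
        self.rates = rates


def raises_attribute_error(mfd):
    """Return True if reading mfd.bin_width fails as in the traceback."""
    try:
        mfd.bin_width
    except AttributeError:
        return True
    return False


def bin_width_or_none(mfd):
    """Return the bin width if the MFD has one, otherwise None."""
    return getattr(mfd, "bin_width", None)
```

With these stand-ins, reading bin_width directly on the arbitrary-style object raises the same kind of AttributeError, while the getattr version simply returns None.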
At first, the OQ Engine on the head node did not produce this error, even though it had been installed in exactly the same way as on the compute nodes. After several runs, however, the head node started hitting the error as well, even though the source input file had not changed.
Fortunately, we also have a development version of OQ Engine 2.1.0 installed on a Red Hat HPC cluster at my university, and it seemed to run fine there... for a while. Now runs on that machine hit this error as well, with the same source file that ran successfully before. It seems pretty bizarre.
We're also noticing that output is written inconsistently. We use, e.g., "--exports xml,csv", but no csv gets written. We have "mean_hazard_curves = true" and "hazard_maps = true" in the job.ini file, but mean hazard maps don't get written. This seemed to start around the same time we began encountering the ArbitraryMFD error (before that, I think OQ Engine always produced the output we asked for), but I don't know whether the two are related.
Is anyone else experiencing these problems and does anyone have suggestions for work-arounds?
Cheers,
- Phil