ATTENTION GMPE CONTRIBUTORS: the API for implementing GMPEs has changed

302 views
Skip to first unread message

Michele Simionato

unread,
Jul 11, 2021, 1:48:25 AM7/11/21
to OpenQuake Users
Dear contributors,

hazardlib has kept the same API for implementing GMPEs for over 10 years but,
eventually, we had to change it.

The reason for the change is to better support single-site computations: we needed a numpy-friendly API that could be vectorized both by ruptures and intensity measure
types.

The major change is that while before you were supposed to implement a method
like this

    def get_mean_and_stddevs(self, sites, rup, dists, imt, stddev_types):
        C = self.COEFFS[imt]
        mean = (self._compute_magnitude_terms(rup, C) +
                self._compute_distance_terms(dists, C) +
                self._compute_site_response_term(C, imt, sites) + ...
        stddevs = self._get_stddevs(coeffs, stddev_types)
        return mean, stddevs

now you have to implement the same method like this

    def compute(self, ctx, imts, mean, sig, tau, phi):
        for m, imt in enumerate(imts):
            C = self.COEFFS[imt]
            mean = (_compute_magnitude_terms(ctx, C) +
                    _compute_distance_terms(ctx, C) +
                    _compute_site_response_term(C, imt, ctx) + ...)
            tau[m] = C['tau']
            phi[m] = C['phi']
            sig[m] = numpy.sqrt(C['tau']**2 + C['phi']**2)

Notice the changes in the signature.

1. There is a single context object (ctx) containing all the rupture, site and distance parameters.
2. We are passing a sequence of IMTs rather than a single IMT.
3. We are passing four arrays (mean, sig, tau, phi) of shape (M, C) being M the number of intensity measure types
(len(imts)) and C the size of the context (len(ctx)).

Such arrays are initially zeros and are filled inside the `compute`
procedure. Nothing is returned.

This is quite different than before but it allows for major speedups
in the single-site case if the context object can be vectorized (a 200x
speedup has been measured).

The second major change is that now helper methods are forbidden in
GMPE classes: the only accepted methods are the `compute` method and
the `__init__` method (plus a method `get_poes` that however you
should never need to override). Everything else must be implemented as
an helper function. This avoids nightmares with inheritance: it was
quite difficult to understand from which level in the hierarchy a
method was coming from, and which subclasses would be affected by a
change in it, but now that complexity has been completely removed.
Also, using only functions rather than methods make it possible (in principle)
to compile such functions with numba and possibly get even bigger speedups.

An example of how to implement a GMPE with the new API is
the following:


The method `get_mean_and_stddevs` is still accepted for compatibility
reasons but you should never use it when implementing new
GMPEs. Implement the `compute` method instead, and then the method
`get_mean_and_stddevs` inherited from the base class will do the right
thing.

*All the code invoking directly `gsim.get_mean_and_stddevs`
 will keep working and there is no plan to break it ever*.

However, calling directly `gsim.get_mean_and_stddevs` should be
considered a bad practice.  Instead, one should instantiate a
ContextMaker and then call `cmaker.get_mean_stds(ctxs, stddev_type)`
That is the recommended way and the fastest way to use hazardlib from
external code.

The best practices for working with hazardlib
are documented in detail in the Advanced Manual, see


If you want to know more about the rationale for the change and the plan for the future,
you should read this document:


All of hazardlib has been migrated to a form where the switch to the new API is easy,
so we expect to gradually migrate all of the GMPEs to the `compute` API.

While hazardlib has good test coverage it is not necessarily complete
and it is possible that the recent huge refactoring broke
something. If you have contributed a GMPE in the past you should check
how your original code has been refactored, and kindly report any
problem.

Thanks for your cooperation,

          Michele Simionato on behalf of the Hazard Team

Michele Simionato

unread,
Feb 15, 2022, 3:45:34 AM2/15/22
to OpenQuake Users
I am happy to announce that the vectorization program started 9 months ago is now complete: as of today all of the GMPEs in hazardlib are vectorized by rupture. That means that all rupture parameters like magnitude, rake, hypo_depth, etc.
are now arrays inside the `compute` method of GMPE classes, while before they were scalar parameters. Thanks to the vectorization, all classical calculations are faster than before, sometimes dramatically: for instance the Canada 2015 model is 5 times faster in current master than in engine 3.11.

To avoid performance regressions, since today non-vectorized GMPEs will be rejected from hazardlib and users are expected to contribute only vectorized GMPEs. Fortunately, vectorizing a GMPE is easy and you can follow the
many examples in hazardlib. Sometimes, just replacing the signature `compute(self, ctx, imts, mean, sig, tau, phi)` with ` compute(self, ctx: np.recarray, imts, mean, sig, tau, phi)` is enough. In other cases one needs to change
a few things since previously scalar variables are now arrays; as an example, you can refer to https://github.com/gem/oq-engine/blob/master/openquake/hazardlib/gsim/akkar_cagnan_2010.py.
If you have doubts about how to vectorize a GMPE,  please do not hesitate to ask for help. If you have many non-vectorized GMPEs and no time to vectorize all of them, please do not upgrade to current master.
The last version of the engine working with non-vectorized GMPEs is version 3.13.

Thanks for your cooperation,

          Michele Simionato on behalf of the Hazard Team

Reply all
Reply to author
Forward
0 new messages