Dear users and developers,
First of all, congratulations on the new release of OQ-engine 2.1. I've followed the huge effort (and the many commits) involved in improving performance and providing new features, in particular on disaggregation, where many optimisations have increased performance and efficiency while consuming fewer hardware resources.
These days I was running a simple disaggregation (on development version 2.0.1, after the disagg filtering corrections and improvements from 09/16/2016) for one site and some realisations of the logic-tree branches, for a single disagg_poe and some intensity measure levels of a certain IMT. The computation was done correctly and, as a result, I got the branch levels, the realisation weights, and the disaggregation matrices for the combination of disagg_poe and IMT. I found no problem with the amount of hardware resources, nor with the splitting or parallelisation of tasks across all branches. But something caught my attention.
According to section 2.4.2 (p. 21) of the OQ Hazard Book, it is well known that OQ disaggregation is implemented in a non-traditional way, but also that the two approaches are equivalent for a Poissonian source model. So, in my humble opinion, the only thing needed to present a correct, consolidated result seems to be to perform a weighted average over the individual disaggregation matrices. They can be converted from XML to CSV using the NRML_converters toolkit, as previously discussed in detail by D. Monelli on 6/5/14 (
https://groups.google.com/d/msg/openquake-users/VfBIx4kju3A/-10tppAAqjcJ).
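To make the idea concrete, here is a minimal sketch of the averaging I have in mind (the file names, number of realisations and weights below are just placeholders, and I am assuming the CSV matrices produced by NRML_converters can be loaded as numpy arrays of a common shape):

    import numpy as np

    # Hypothetical inputs: one disaggregation matrix per logic-tree
    # realisation (all with the same binning) plus the realisation
    # weights, which should sum to one.
    matrices = [np.loadtxt('disagg_rlz-%d.csv' % i, delimiter=',')
                for i in range(3)]
    weights = np.array([0.5, 0.3, 0.2])
    assert abs(weights.sum() - 1.0) < 1e-9

    # The consolidated result would then simply be the weighted
    # average of the individual matrices.
    mean_matrix = sum(w * m for w, m in zip(weights, matrices))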
Is this still correct, or does something else need to be done, checked or recomputed in order to provide a comprehensive mean disaggregation?
This question is also motivated by some topics (
https://groups.google.com/d/topic/openquake-users/-ptg1XJitNs/discussion) and recent issues (
https://github.com/gem/oq-engine/issues/2296) involving disaggregation performance and enhancements. More to the point, I have failed to convince myself, and some people I have been discussing this with, of the correctness of the OQ disaggregation as it is currently implemented. They often argue that using the target disagg_poe to interpolate an iml for each INDIVIDUAL hazard curve (
https://github.com/gem/oq-hazardlib/blob/master/openquake/hazardlib/calc/disagg.py#L85) is not the same as computing the probability of non-exceedance corresponding to the iml interpolated from the MEAN HAZARD CURVE at the target disagg_poe. In other words, to get a 'correct' consolidated answer one would need to compute these values 'by hand' from the individual hazard curves and then send many simpler jobs with them (disagg_poes computed from the individual curves at THE iml that the mean hazard curve gives at the configured disagg_poe). That is, to obtain a correct disaggregation I would need not one huge job (imagine I had a machine for that), but many individually parametrised 'single jobs', which sounds very strange to me.
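To show what I mean, here is a toy example contrasting the two readings (all curves and weights are made up, and the simple log-log interpolation below only approximates what disagg.py actually does):

    import numpy as np

    # Toy hazard curves: PoEs at common IMLs for two branches, plus
    # their logic-tree weights. All numbers are invented.
    imls = np.array([0.1, 0.2, 0.4, 0.8])
    poes = np.array([[0.30, 0.12, 0.04, 0.010],   # branch 1
                     [0.20, 0.08, 0.02, 0.005]])  # branch 2
    weights = np.array([0.6, 0.4])
    target_poe = 0.05

    def iml_at(curve, target):
        # Interpolate the IML at the target PoE in log-log space;
        # the curve decreases with IML, so reverse it for np.interp.
        return np.exp(np.interp(np.log(target), np.log(curve[::-1]),
                                np.log(imls[::-1])))

    # Engine behaviour, as I understand it: one iml per INDIVIDUAL
    # curve, each branch disaggregated at the same target_poe.
    imls_per_branch = [iml_at(c, target_poe) for c in poes]

    # What my colleagues advocate: one iml from the MEAN curve, then
    # per-branch poes recomputed 'by hand' at that fixed iml.
    mean_curve = weights @ poes
    iml_from_mean = iml_at(mean_curve, target_poe)
    poes_at_fixed_iml = [np.exp(np.interp(np.log(iml_from_mean),
                                          np.log(imls), np.log(c)))
                         for c in poes]

The two recipes give different (iml, poe) pairs per branch, which is exactly the discrepancy my colleagues point at.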
Have you observed any of these issues before?! Or where might I be going wrong?! I'm really surprised that, as some people have argued, to get a single-site disaggregation with a not very complex set of logic-tree branches we may need to handle each branch individually, sending many single jobs to find the 'correct' disaggregation and to provide a consolidated view. I would really appreciate any comments and shared experiences.
My best wishes to all of you; it has been a long time since my last communication.
marlon