Cerberus overhead


Darrell Schiebel

Jan 17, 2020, 10:12:18 AM
to Eve REST Framework
Hello,

As part of a recent major upgrade of CASA, I reworked the portion of our build system that generates Python plus parameter validation from XML (e.g. widebandpbcor.xml). For the parameter validation, I used Cerberus. For our "task" functions, which have long runtimes and are called moderately often, this works great. However, for our "tool" object methods, which are called repeatedly and have shorter runtimes, we have encountered performance issues. A summary of the situation from another developer is included at the end of this message.

Does anyone have any suggestions for how to improve this situation? I haven't looked into the Cerberus implementation very closely yet, but it seems like some amount of this work could be cached, or perhaps the schema could be "pre-compiled" and the generated version used for the type checking. Here is a typical example of our generated Cerberus type checking (generated from widebandpbcor.xml):

    def __call__( self, vis='', imagename='', nterms=int(2), threshold='', action='pbcor',
                  reffreq='', pbmin=float(0.2), field='', spwlist=[ int() ],
                  chanlist=[ int() ], weightlist=[ float() ] ):

        schema = {'vis': {'type': 'cReqPath', 'coerce': _coerce.expand_path},
                  'imagename': {'type': 'cStr'},
                  'nterms': {'type': 'cInt'},
                  'threshold': {'type': 'cStr'},
                  'action': {'type': 'cStr', 'allowed': [ 'pbcor', 'calcalpha' ]},
                  'reffreq': {'type': 'cStr'},
                  'pbmin': {'type': 'cFloat', 'coerce': _coerce.to_float},
                  'field': {'type': 'cStr'},
                  'spwlist': {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]},
                  'chanlist': {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]},
                  'weightlist': {'type': 'cFloatVec', 'coerce': [_coerce.to_list,_coerce.to_floatvec]}}

        doc = {'vis': vis, 'imagename': imagename, 'nterms': nterms, 'threshold': threshold,
               'action': action, 'reffreq': reffreq, 'pbmin': pbmin, 'field': field,
               'spwlist': spwlist, 'chanlist': chanlist, 'weightlist': weightlist}

        assert _pc.validate(doc,schema), str(_pc.errors)

        _logging_state_ = _start_log( 'widebandpbcor',
                                      [ 'vis=' + repr(_pc.document['vis']), 'imagename=' + repr(_pc.document['imagename']),
                                        'nterms=' + repr(_pc.document['nterms']), 'threshold=' + repr(_pc.document['threshold']),
                                        'action=' + repr(_pc.document['action']), 'reffreq=' + repr(_pc.document['reffreq']),
                                        'pbmin=' + repr(_pc.document['pbmin']), 'field=' + repr(_pc.document['field']),
                                        'spwlist=' + repr(_pc.document['spwlist']), 'chanlist=' + repr(_pc.document['chanlist']),
                                        'weightlist=' + repr(_pc.document['weightlist']) ] )

        return _end_log( _logging_state_, 'widebandpbcor',
                         _widebandpbcor_t( _pc.document['vis'], _pc.document['imagename'], _pc.document['nterms'],
                                           _pc.document['threshold'], _pc.document['action'], _pc.document['reffreq'],
                                           _pc.document['pbmin'], _pc.document['field'], _pc.document['spwlist'],
                                           _pc.document['chanlist'], _pc.document['weightlist'] ) )



The validator is loaded like this:

    from casatools.typecheck import CasaValidator as _val_ctor
    _pc = _val_ctor( )
    from casatools.coercetype import coerce as _coerce


casatools.typecheck is also generated, while casatools.coercetype is not. The logging wrappers allow parameters and return values to be sent to a log file.
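
To make the "cached"/"pre-compiled" idea concrete, here is a rough sketch of what I have in mind, written against stock Cerberus rather than our generated CasaValidator (the function and schema below are illustrative only, not our generated code):

    from cerberus import Validator

    # Build the validator once, at import time; Cerberus normalizes the
    # schema definition here instead of on every call.
    _pc = Validator( {'nterms': {'type': 'integer'},
                      'action': {'type': 'string', 'allowed': [ 'pbcor', 'calcalpha' ]}} )

    def widebandpbcor( nterms=2, action='pbcor' ):
        doc = {'nterms': nterms, 'action': action}
        # no schema argument, so validate() reuses the pre-built schema
        assert _pc.validate(doc), str(_pc.errors)
        return _pc.document

Whether this is safe for our generated code depends on whether a single module-level validator can be shared across calls (and threads), which I haven't verified.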

Thanks for any advice,
Darrell


To get an idea of the overall impact of the cerberus validator overhead on this dataset, I profiled a full calib+img pipeline with this MOUS: uid://A001/X133d/X2d99.

    • Total time (executeppr): 23397.772s
    • Cumulative time in casatools/__cerberus__/validator.py:797(validate): 2178.792s
    • That is 9.3% of the total pipeline execution time.

There are over 5 million calls to casatools/__cerberus__/validator.py:797(validate).
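
(For reference, numbers like these can be pulled out of a cProfile dump with something along the following lines; the dump file name is just a placeholder.)

    import pstats

    stats = pstats.Stats( 'executeppr.prof' )   # placeholder name for the profile dump
    stats.sort_stats( 'cumulative' )
    # restrict the report to the cerberus frames
    stats.print_stats( 'validator.py' )
    stats.print_stats( 'schema.py' )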

Within that cerberus time, the highlights are:

    • "normalize_mapping", casatools/_cerberus/validator.py:552(_normalize_mapping) is called over 5 M times and takes 1158.911s
    • Building schemas, casatools/_cerberus/schema.py:27(init_) is called over 10 M times and takes 938.300s
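
The schema-building cost is consistent with the schema dict being passed to validate() on every call. A standalone micro-benchmark with stock cerberus makes that per-call re-normalization visible (absolute numbers will of course differ from the pipeline):

    import timeit
    from cerberus import Validator

    schema = {'nterms': {'type': 'integer'}, 'action': {'type': 'string'}}
    doc = {'nterms': 2, 'action': 'pbcor'}

    v = Validator( )         # schema passed on every call, as in the generated code
    w = Validator( schema )  # schema bound once

    per_call = timeit.timeit( lambda: v.validate(doc, schema), number=100000 )
    bound = timeit.timeit( lambda: w.validate(doc), number=100000 )
    print( 'schema per call: %.2fs, schema bound once: %.2fs' % (per_call, bound) )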


This dataset seems to be particularly sensitive to the cerberus overheads, but I've found a few worse cases (worse in the sense of the impact of the cerberus validation on the total pipeline runtime). I don't yet have CASA 6 profiles for very large datasets, though.

Some examples:

    • Project 2018.1.00657.S, MOUS uid___A001_X133d_X21fd: cerberus takes 12.6% of the total runtime (4.4h)
    • Project 2015.1.00665.S, MOUS uid___A001_X2d8_X2c5: cerberus takes 14.5% of the total runtime (3.2h)
    • Project E2E7.1.00149.S, MOUS uid___A001_X1456_X53: cerberus takes 9.7% of the total runtime (9.7h)

It looks like the penalty can be as high as ~15% of total runtime for the smaller datasets.

