Seeking guidance on using multiple sessions to partition the processing of rules

Jb Osborne

May 14, 2020, 10:30:23 AM
to NRules Users
My scenario involves processing JSON payloads with a lot of rule logic, but the processing has a few clear phases (transformation, enrichment, and validation):
- Transformation: we support multiple input formats (JSON, XML, and others) that get transformed into a common, internal format.
- Enrichment: this phase calls out to multiple services to gather the data needed to fill out the document in its internal format.
- Validation: validates the internal format and enrichment data to ensure the document can be processed.

I've inherited a codebase using NRules and I'm confused about a few approaches I see in the code. The current version of the solution puts all the rules for every phase into a single session, which gets complex. The code already resorts to providing ordering hints for rule processing, which seems like something to avoid when possible.

There are lots of rules, so I was looking to reduce the complexity and partition the rules by their phase in separate sessions. Basically like a persistent pipeline processing model where each phase hands off to the next phase in processing. I've read other posts explaining that using multiple sessions could help with parallelism as well, which could potentially be an added benefit. My main concern is around complexity in an already large number of rules, which are certain to expand even more as we try to include more domain-specific logic.

My goal is to avoid mixing rules of different types (transformation, enrichment, and validation) in the same session, since there are quite a few rules and the business domain is fairly complex. I want to keep complexity down as much as possible, and having multiple single-purpose sessions, one for each well-defined phase of processing, seems reasonable to me. I just want to make sure my logic is sound.

Thanks,
Jeremy

Sergiy Nikolayev

May 18, 2020, 9:28:32 PM
to nrules...@googlegroups.com
Hi Jeremy,

I think your approach makes sense, and you have the right intuitions about it. In general, you only need to have the rules in the same session as long as they need to interact with each other via forward chaining. But practically, even if the rules are independent, you want the engine to do the heavy lifting and dynamically figure out the order of execution based on the facts present and the rules present, so that as you add new rules the overall structure of the application/service that hosts the logic would remain the same. On the two extremes you have one session with all the rules in it, and a session per rule (which likely doesn't make sense). So, I think a balance of breaking up one giant rule set into a workflow/pipeline/phases, with a rule session per step and orchestrating steps either statically in the code or via a workflow engine, state machine etc. makes sense. Depending on the workflow it may be sequential or may present parallelization opportunities, as you indicated.
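A minimal sketch of that statically orchestrated pipeline, assuming one session factory has already been compiled per phase (the pipeline class, fact types, and the convention of reading phase output back via `Query<T>` are illustrative assumptions, not the original codebase):

```csharp
// Hypothetical sketch: sequential pipeline with one compiled
// ISessionFactory per phase. Type and fact names are illustrative.
using System.Linq;
using NRules;

public class DocumentPipeline
{
    private readonly ISessionFactory _transformFactory;
    private readonly ISessionFactory _enrichFactory;
    private readonly ISessionFactory _validateFactory;

    public DocumentPipeline(
        ISessionFactory transformFactory,
        ISessionFactory enrichFactory,
        ISessionFactory validateFactory)
    {
        _transformFactory = transformFactory;
        _enrichFactory = enrichFactory;
        _validateFactory = validateFactory;
    }

    public InternalDocument Process(RawPayload payload)
    {
        // Phase 1: transformation rules turn the raw payload into the
        // internal format; the phase output is queried out of the session.
        var transform = _transformFactory.CreateSession();
        transform.Insert(payload);
        transform.Fire();
        var document = transform.Query<InternalDocument>().Single();

        // Phase 2: enrichment rules fill out the document. The session is
        // discarded once the phase completes; only the fact is handed off.
        var enrich = _enrichFactory.CreateSession();
        enrich.Insert(document);
        enrich.Fire();

        // Phase 3: validation rules check the enriched document.
        var validate = _validateFactory.CreateSession();
        validate.Insert(document);
        validate.Fire();
        return document;
    }
}
```

Because each phase only touches its own short-lived session, independent documents could also run through the pipeline concurrently, one session per document per phase.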
A similar spectrum exists around putting all facts/data into the same session vs breaking them up into units of work. This mostly depends on the amount of data the rules need to consider at once to make a decision. If your app is mostly processing transactions (e.g. customers, orders, requests, policies, etc.) then creating a session per transaction/unit of work (along with some related reference data) may make sense as well, and also allow scaling the logic out along the lines of those transactions or enable data partitioning and parallelization. If you have more of a batch case, where you do indeed need to reason across many facts in the same session, then bulk inserting facts into the same session (within a workflow step) is reasonable.
From the implementation perspective, a given set or a subset of rules is compiled to a session factory once (and this is where you can easily filter rules based on metadata, partition into workflow/pipeline steps and compile into different session factories). Then, you can create a new session from a given session factory, to process a given unit of work of facts/data (this can be done per request/transaction/batch). Throw the session out when done with it (or hold on to it forever in case of a long-running reasoning scenario where facts are constantly changing and rules need to re-evaluate, though this does not sound like your case). A session factory is thread safe; a session is not (when considering parallelization).
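The compile-once, session-per-unit-of-work shape described above might look roughly like this. The phase tag names and commented fact/rule types are assumptions for illustration; the `[Tag]` attribute, the `Where`/`IsTagged` load filter, and the thread-safety properties are standard NRules behavior:

```csharp
// Hypothetical sketch: partition rules by tag metadata into one session
// factory per pipeline step. Tag names and fact types are illustrative.
using System.Collections.Generic;
using System.Reflection;
using NRules;
using NRules.Fluent;

public static class PhaseFactories
{
    // Compile each phase's rules once at startup; ISessionFactory is thread safe.
    public static Dictionary<string, ISessionFactory> Build(Assembly rulesAssembly)
    {
        var factories = new Dictionary<string, ISessionFactory>();
        foreach (var phase in new[] { "Transformation", "Enrichment", "Validation" })
        {
            var repository = new RuleRepository();
            repository.Load(x => x
                .From(rulesAssembly)                   // scan the assembly with the rules
                .Where(rule => rule.IsTagged(phase))); // keep only this phase's rules
            factories[phase] = repository.Compile();
        }
        return factories;
    }
}

// Rules opt in to a phase via the [Tag] attribute, e.g.:
// [Tag("Validation")]
// public class DocumentIsCompleteRule : Rule { ... }

// Per unit of work: create a short-lived session (sessions are NOT thread
// safe), insert that transaction's facts, fire, then discard the session:
// var session = factories["Validation"].CreateSession();
// session.Insert(document);
// session.Fire();
```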

Hope this helps,
Sergiy

Jb Osborne

May 19, 2020, 12:43:15 PM
to NRules Users
Much appreciated, Sergiy. Thanks!