Large DMN Optimization


Michael Gallo

Sep 21, 2021, 4:45:27 PM
to Drools Usage
I am testing the performance of DMN models at scale and am wondering whether there is any form of hash map, or other performance tweak, that I am missing. I have a simple 50,000-row relation within a DMN file using Kogito. The relation is just the integers from 0 to 50,000, with a second column holding that integer divided by 2 (calculated ahead of time and hard-coded into the DMN). When I make a request via Postman to access a specific row in the relation, I see an average response time of 1,200 ms, which is far too slow for our current use case. If I make the same request against a 5,000-row version of the same relation, I see a response time of roughly 500 ms. The FEEL expression I am using to access the row I want is RELATION[PARAM=input].column.

I have done a very similar test with Decision Tables. A 5,000-row decision table gives me response times of roughly 387 ms and a 50,000-row decision table gives me roughly 500 ms for the same calculation, so I am still seeing a performance trade-off based purely on the size of the decision.

Matteo Mortari

Sep 21, 2021, 5:23:51 PM
to Drools Usage
Hi Michael,
thank you for your interest in Kogito and Drools DMN engine.

I believe it should be clarified that a DMN Relation boxed expression is not a Map, and it does not work like a HashMap.
So there is no "optimization" that applies specifically to DMN Relation boxed expressions.
I am saying this based on the use case in your example FEEL expression, which is semantically not a look-up, as you are suggesting, but a filter: the condition is tested against every row.
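
To make the distinction concrete, here is a minimal Java sketch, not taken from the engine, contrasting what a filter like RELATION[PARAM=input].column has to do with what a keyed look-up does. The class and method names are hypothetical, and the data simply mirrors the relation described in the question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; this is NOT how the DMN engine is implemented.
class FilterVsLookup {

    // One row of the relation: PARAM and its precomputed half.
    static final class Row {
        final int param;
        final double column;

        Row(int param, double column) {
            this.param = param;
            this.column = column;
        }
    }

    // Builds a relation with PARAM = 0..size-1 and column = PARAM / 2.
    static List<Row> buildRelation(int size) {
        List<Row> rows = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            rows.add(new Row(i, i / 2.0));
        }
        return rows;
    }

    // Filter semantics: every row is tested, and the result is a list of all matches.
    static List<Double> filter(List<Row> relation, int input) {
        List<Double> matches = new ArrayList<>();
        for (Row row : relation) {
            if (row.param == input) {
                matches.add(row.column);
            }
        }
        return matches;
    }

    // Keyed look-up: index the rows once, then each access is a single hash probe.
    static Map<Integer, Double> index(List<Row> relation) {
        Map<Integer, Double> byParam = new HashMap<>();
        for (Row row : relation) {
            byParam.put(row.param, row.column);
        }
        return byParam;
    }

    public static void main(String[] args) {
        List<Row> relation = buildRelation(50_000);
        Map<Integer, Double> byParam = index(relation);
        System.out.println(filter(relation, 42)); // prints [21.0]
        System.out.println(byParam.get(42));      // prints 21.0
    }
}
```

The filter touches every row and returns a list, which is why its cost grows with the size of the relation; the map pays the indexing cost once up front.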

When it comes to a DMN Decision Table boxed expression, it is a different story.
In that case, some optimizations are performed internally, automatically.
Further, some specific configuration flags might improve performance on large tables (such as this flag).
Further, and most importantly, we are continuously working on new performance improvements by leveraging Drools Core capabilities even better (ref DROOLS-4605).
A DMN Decision Table can also be used for look-up use cases.

So this is purely from the perspective of where you can actually expect to gain a performance benefit, both semantically and in the engine implementation.

That said, it is quite interesting to hear you have a case for a 50K lines look-up table in DMN!
I suspect this is not manually edited in the DMN Editor by some Business Analyst, but please correct me if I'm wrong here!

So if these 50K rows are originally stored in an RDBMS and you need to integrate that content into a DMN model, I would advise evaluating integration patterns too.
E.g., a BKM node performs the look-up on the DB based on the criteria you identified. Once the BKM has pulled the result into the DMN, you can use it in the remainder of the decision graph.
Just some initial food for thought.
Since the original use case you seem to be hinting at is a look-up on some massive relational data that likely already exists in some DB, I would use the right tool for the job: keep the relational data in the DB where it currently sits, and integrate it with DMN where DMN shines best, which is expressing decision logic.

It could also be the case that you identified some specific use-case which merits some further investigation on the DMN engine side.
In that case don't hesitate to share with us more details, and we'd be happy to look further into it.
But in any case, we will always need to respect the actual semantics of the DMN specification.

I hope this provided meaningful pointers!
MM


Michael Gallo

Sep 22, 2021, 4:34:23 PM
to Drools Usage

Hello Matteo,

Thank you for getting back to me so quickly.

You are correct: this DMN was not manually created, and we would probably be better off using a different tool for the job. Would you happen to have any public examples of a Business Knowledge Model node being used to call custom Java code that my team and I can reference?

Matteo Mortari

Sep 23, 2021, 5:02:15 AM
to Drools Usage
Hi Michael,

> Would you happen to have any public examples of a Business Knowledge Model node being used to call custom Java code that my team and I can reference? 
Example 100, invoking java.lang.Math#abs.
Please note that, per the DMN specification, the invoked Java method must be a static method, etc.
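
As a rough illustration of the kind of static method such a BKM could delegate to, here is a hypothetical sketch. The class HalfLookup, the method lookup, and the in-memory map standing in for the database are all assumptions made for this example. In FEEL, the BKM body would reference it with the externally defined function form, e.g. function(p) external { java: { class: "HalfLookup", method signature: "lookup(double)" } }, where the class name would need to be fully qualified in a real project.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the class name, method name, and data are illustrative only.
class HalfLookup {

    // Stand-in for the external store (a database or cache in a real integration).
    private static final Map<Integer, Double> TABLE = new HashMap<>();

    static {
        // Mirrors the data from the thread: keys 0..50,000, value = key / 2.
        for (int i = 0; i <= 50_000; i++) {
            TABLE.put(i, i / 2.0);
        }
    }

    // Static, because Java functions invoked from DMN must be static methods.
    // Returns null when there is no row for the given key.
    public static Double lookup(double param) {
        if (param != Math.rint(param)) {
            return null; // non-integer (or NaN) keys have no row
        }
        return TABLE.get((int) param);
    }

    public static void main(String[] args) {
        System.out.println(lookup(42));  // prints 21.0
        System.out.println(lookup(-1));  // prints null
    }
}
```

The parameter is declared as double here only as an assumption about a convenient signature; how the engine coerces FEEL numbers to the declared type is not covered by this sketch.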

As I stated in my earlier email, I hope you can reciprocate these references and guidance with some more details about your use case.

Specifically, we would like to be sure not only that you are applying the right tool for the job,
but also that, if you have a case on hand where we could evaluate some specific engine optimizations, we understand the extent to which these could be generalized for the benefit of every Drools user.

We understand the full business logic cannot be shared, but talking a bit more about the general business problem could help identify potential enhancements.
Unfortunately, so far we only have ~"a lookup over 50K rows not edited in DMN, indexed by the natural numbers", which does not give many indications of where DMN or the engine could offer advantages :)

Thank you in advance if you could share more insights.
In any case, I hope you found value in the pointers shared so far!
MM

Michael Gallo

Sep 23, 2021, 5:51:55 PM
to Drools Usage

Hello Matteo

Thank you so much for pointing me in the right direction. I have been able to get much better performance by moving the storage of massive amounts of relational data to external sources and using the DMN only for actual decision logic.

The general business problem we are working on is that we have an existing Excel file with a five-digit number of rules that we want to integrate with another project we are building. We are exploring what role DMN will play in that, and which rules should and shouldn’t be transferred to DMN models.

Matteo Mortari

Sep 24, 2021, 7:42:50 AM
to Drools Usage
Hi Michael,
that's great to hear.

If those are rules,
e.g. FEEL-like constraints with ranges, inequalities, etc., with partial overlaps between rules in different rows,
then imho you should really consider measuring the effect of the configuration flag I mentioned in my earlier email,
such as org.kie.dmn.compiler.execmodel.

If enabling the option gives you performance benefits, that may be an indication to look forward to DROOLS-4605 for additional improvements.

But to be clear, this (A) requires that these really are rules of a decision table, and (B) needs proper measurement on the specific use case.
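
One possible shape for such a measurement is sketched below: warm up the JVM, time repeated calls, and compare medians with the flag on and off. LatencyProbe and its workload are hypothetical placeholders, and the sketch assumes the flag is supplied as a JVM system property (for example -Dorg.kie.dmn.compiler.execmodel=true); check the Drools documentation for your version to confirm how it is configured.

```java
import java.util.Arrays;
import java.util.function.Supplier;

// Hypothetical measurement helper; it does not call the DMN engine itself.
class LatencyProbe {

    // Runs the call `warmup` times untimed, then `runs` times timed,
    // and returns the median duration in nanoseconds.
    static long medianNanos(Supplier<?> call, int warmup, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            call.get();
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            call.get();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) {
        // Placeholder workload; replace it with a call that evaluates your DMN model.
        Supplier<Integer> evaluate = () -> Integer.valueOf(1 + 1);
        long median = medianNanos(evaluate, 1_000, 10_000);
        // Reports whichever value the JVM was started with (null if the property is unset).
        System.out.println("execmodel=" + System.getProperty("org.kie.dmn.compiler.execmodel")
                + " median ns=" + median);
    }
}
```

Running the same harness in two separate JVMs, one with the property set and one without, keeps JIT state from one configuration from leaking into the other.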

On a different topic, since you have mentioned Excel, did you have a chance to look into:

Would similar utilities benefit your use case?

Let us know, hope this helps,
MM


Michael Gallo

Sep 24, 2021, 4:32:38 PM
to Drools Usage
Hello Matteo

Wow, thank you. I think this tool will be very useful for me and my team.