Kie jar and precompiled rules

Kyle Meadows

Aug 25, 2014, 1:29:31 PM
to drools...@googlegroups.com
Hey everyone,

I have a project set up that is using the kie-maven-plugin to build a kjar for my project. The build takes approx. 02:39 min to run, which is a big improvement over the old build process from Drools 5! The kbase.cache file is similar in size (37 MB) to the serialized KnowledgePackages file I was generating with Drools 5. However, I noticed a few things about the kjar that is generated:

1) All of the DRLs are present in the jar. The jar structure looks like this:

        com/ 
            (class files for all declared types are here)
        KBase1/ 
            (the directory containing all of my DRLs)
        META-INF/
            KBase1/
                kbase.cache
            maven/
                pom and properties
            kmodule.info
            kmodule.xml
            MANIFEST.MF

2) Getting a new stateless ksession and test firing the rules takes approx. 03:40 min. In Drools 5, firing all these rules required less than one second, so that is quite a jump. The code I am using to run the rules is as follows:

        KieServices ks = KieServices.Factory.get();
        KieContainer kContainer = ks.getKieClasspathContainer();
        StatelessKieSession kSession = kContainer.newStatelessKieSession("KBase1.session");
        kSession.execute(factData);

3) If I delete all of the DRL files out of the jar, Drools reports that it can't find any resources for "KBase1" and no rules fire.

These things combined lead me to believe that Drools is rebuilding the rule base at runtime. Is this expected behavior or is there a way to fully precompile them? Since the build is relatively slow and memory intensive (both heap and perm gen space) I'd like to keep it out of the runtime.

Thanks,
Kyle

Mark Proctor

Aug 25, 2014, 5:42:06 PM
to drools...@googlegroups.com
On 25 Aug 2014, at 18:29, Kyle Meadows <dark...@gmail.com> wrote:

> Hey everyone,
>
> I have a project set up that is using the kie-maven-plugin to build a kjar for my project. The build takes approx. 02:39 min to run which is a big improvement over the old build process from Drools 5! The kbase.cache file is similar in size (37 MB) to the serialized KnowledgePackages file I was generating with Drools 5. However I noticed a few things about the kjar that is generated:
>
> 1) All of the DRLs are present in the jar. The jar structure looks like this:
>
>         com/
>             (class files for all declared types are here)
>         KBase1/
>             (the directory containing all of my DRLs)
>         META-INF/
>             KBase1/
>                 kbase.cache
>             maven/
>                 pom and properties
>             kmodule.info
>             kmodule.xml
>             MANIFEST.MF

Not sure what the Q is here?


> 2) To get a new stateless ksession and test firing the rules takes approx. 03:40 min. In Drools 5 firing all these rules requires less than one second so that is quite a jump. The code I am using to run the rules is as follow:
>
>         KieServices ks = KieServices.Factory.get();
>         KieContainer kContainer = ks.getKieClasspathContainer();
>         StatelessKieSession kSession = kContainer.newStatelessKieSession("KBase1.session");
>         kSession.execute(factData);

If the Maven plugin is working, it should pre-cache all generated code. However, 6.x still does more work at load time than 5.x did, for portability reasons. In 5.x the package was just a serialised POJO, which made it impossible to have jars that were portable between Drools versions. In 6.x, while it will use the pre-cached generated code, it still parses the DRLs to rebuild all the data structures and wire in the pre-generated code. This should be a one-off cost, though, paid when the KJar is first accessed.

Could you break down your timings? I'd like to see the time for the first access to get the KieContainer and the time for subsequent accesses (to make sure it's not recompiling each time), then the time for session executions, to make sure that is still fast.
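
One way to collect the breakdown asked for here is a small timing helper around each step. The following is only a sketch: the StepTimer class and its method names are made up for illustration, and the steps in main are stand-ins so that the snippet runs without Drools on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical helper: records how long each named step takes, in call order. */
final class StepTimer {
    private final Map<String, Long> elapsedMillis = new LinkedHashMap<>();

    /** Runs the step, stores its wall-clock duration, and returns its result. */
    <T> T time(String label, Supplier<T> step) {
        long start = System.nanoTime();
        try {
            return step.get();
        } finally {
            elapsedMillis.put(label, (System.nanoTime() - start) / 1_000_000);
        }
    }

    /** Durations in milliseconds, keyed by label, in the order the steps ran. */
    Map<String, Long> results() {
        return elapsedMillis;
    }

    public static void main(String[] args) {
        StepTimer timer = new StepTimer();
        // Stand-ins only. In the real test runner these would be the container
        // lookup and the session creation/execution from the original post.
        String container = timer.time("get container", () -> "container");
        timer.time("first session + execute", () -> container.length());
        timer.time("second session + execute", () -> container.length());
        timer.results().forEach((label, ms) -> System.out.println(label + ": " + ms + " ms"));
    }
}
```

In the real runner, the first step would wrap ks.getKieClasspathContainer() and the later steps would wrap kContainer.newStatelessKieSession("KBase1.session") followed by kSession.execute(factData). Timing the first and second session separately is what shows whether the rebuild is really a one-off.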


> 3) If I delete all of the DRL files out of the jar, Drools reports that it can't find any resources for "KBase1" and no rules fire.

See answer above, to keep things portable we rebuild things from parsing, and use pre-cached generated classes. Remember Drools is a mixture of pojo data structures and code generation.

--
You received this message because you are subscribed to the Google Groups "Drools Setup" group.
To unsubscribe from this group and stop receiving emails from it, send an email to drools-setup...@googlegroups.com.
To post to this group, send email to drools...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/drools-setup/888e6d24-647d-485e-ba3f-d9e5f1e270ef%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Mark Proctor

Aug 25, 2014, 5:55:42 PM
to drools...@googlegroups.com
I’d also add that if the MVEL dialect is used, there is no generated code and thus no saving at all from the cache, which means load times would effectively be the same as when compiling for the first time.

Mark

Mario Fusco

Aug 26, 2014, 11:09:33 AM
to drools...@googlegroups.com
Hi Kyle,

As Mark wrote, in 6.x we rebuild things from parsing for portability reasons, but this should only have an impact the very first time you create a KieSession from the KieContainer.
This means there probably isn't much we can do about it. Nevertheless, I'd like to profile your use case to check whether there is some low-hanging fruit, performance-wise, that would let us at least improve the situation. Do you think you could share your project so I could use it in a profiling session?

Thanks,
Mario


Kyle Meadows

Aug 27, 2014, 5:13:33 PM
to drools...@googlegroups.com
Hi Mark and Mario,

Thank you for your responses. At the very least I now know we're not building or running incorrectly. Running the rules seems as fast as ever. Here are some simple timings from my test runner getting the KieServices, then the KieContainer, and then repeatedly getting new sessions and firing the rules:

09:30:18,482         INFO Drools6Test:34 - Start
09:30:18,490         INFO Drools6Test:36 - Obtained KieServices
09:30:18,503         INFO ClasspathKieProject:115 - Found kmodule: file:/development/work/drools6test/target/classes/META-INF/kmodule.xml
09:30:18,504        DEBUG ClasspathKieProject:370 - KieModule URL type=file url=/development/work/drools6test/target/classes
09:30:18,661         WARN ClasspathKieProject:259 - Unable to find pom.properties in /development/work/drools6test/target/classes
09:30:18,667         INFO ClasspathKieProject:301 - Recursed up folders,  found and used pom.xml /development/work/drools6test/pom.xml
09:30:18,668        DEBUG ClasspathKieProject:102 - Discovered classpath module com.test.rules:drools6test:0.0.1-SNAPSHOT
09:30:18,670         INFO KieRepositoryImpl:73 - KieModule was added:FileKieModule[ ReleaseId=com.test.rules:drools6test:0.0.1-SNAPSHOTfile=/development/work/drools6test/target/classes]
09:30:18,670         INFO ClasspathKieProject:115 - Found kmodule: jar:file:/home/kmeadows/.m2/repository/com/test/rules/drools6build/0.0.1-SNAPSHOT/drools6build-0.0.1-SNAPSHOT.jar!/META-INF/kmodule.xml
09:30:18,671        DEBUG ClasspathKieProject:370 - KieModule URL type=jar url=/home/kmeadows/.m2/repository/com/test/rules/drools6build/0.0.1-SNAPSHOT/drools6build-0.0.1-SNAPSHOT.jar
09:30:18,678        DEBUG ClasspathKieProject:240 - Found and used pom.properties META-INF/maven/com.test.rules/drools6build/pom.properties
09:31:37,669        DEBUG ClasspathKieProject:102 - Discovered classpath module com.test.rules:drools6build:0.0.1-SNAPSHOT
09:31:37,669         INFO KieRepositoryImpl:73 - KieModule was added:ZipKieModule[ ReleaseId=com.test.rules:drools6build:0.0.1-SNAPSHOTfile=/home/kmeadows/.m2/repository/com/test/rules/drools6build/0.0.1-SNAPSHOT/drools6build-0.0.1-SNAPSHOT.jar]
09:31:37,670         INFO Drools6Test:38 - Obtained KieContainer
09:31:37,670         INFO Drools6Test:58 - Starting session
09:33:33,576        DEBUG KnowledgeBaseImpl:186 - Starting Engine in PHREAK mode
09:33:57,052         INFO Drools6Test:72 - Closing session
09:33:57,052         INFO Drools6Test:58 - Starting session
09:33:57,175         INFO Drools6Test:72 - Closing session
09:33:57,175         INFO Drools6Test:58 - Starting session
09:33:57,280         INFO Drools6Test:72 - Closing session
09:33:57,280         INFO Drools6Test:58 - Starting session
09:33:57,381         INFO Drools6Test:72 - Closing session
09:33:57,381         INFO Drools6Test:58 - Starting session
09:33:57,444         INFO Drools6Test:72 - Closing session
09:33:57,444         INFO Drools6Test:58 - Starting session
09:33:57,517         INFO Drools6Test:72 - Closing session
09:33:57,517         INFO Drools6Test:58 - Starting session
09:33:57,575         INFO Drools6Test:72 - Closing session


So getting the KieContainer takes ~1 min 20 seconds, then getting the first session/executing takes ~2 min 20 seconds. Creating each subsequent session/execute takes less than one second each. 
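
The rounded figures above can be checked directly against the log timestamps. This is a small sketch using java.time; the LogGaps class name is invented for illustration, and the timestamps are copied from the log lines above.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for measuring gaps between log timestamps. */
final class LogGaps {
    // Matches the "HH:mm:ss,SSS" prefix used in the log output.
    private static final DateTimeFormatter LOG_TIME = DateTimeFormatter.ofPattern("HH:mm:ss,SSS");

    /** Milliseconds elapsed between two timestamps from the same day. */
    static long millisBetween(String from, String to) {
        return Duration.between(LocalTime.parse(from, LOG_TIME), LocalTime.parse(to, LOG_TIME)).toMillis();
    }

    public static void main(String[] args) {
        // "Start" to "Obtained KieContainer"
        System.out.println(millisBetween("09:30:18,482", "09:31:37,670"));
        // First "Starting session" to first "Closing session"
        System.out.println(millisBetween("09:31:37,670", "09:33:57,052"));
        // Second "Starting session" to second "Closing session"
        System.out.println(millisBetween("09:33:57,052", "09:33:57,175"));
    }
}
```

That works out to roughly 79 seconds for the container, 139 seconds for the first session and execute, and 123 milliseconds for the second.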

That said, I'd still really love to get the memory consumption of the rule compilation down, because it is problematic for the app server. I have been running with -Xmx4000m -XX:MaxPermSize=1024m as my memory settings to get around both heap and perm gen errors. I don't know what is considered "good" memory consumption in Drools, but it seems high to me. I have suspected for a while that our domain model may be constructed in a way that is suboptimal for the Drools and/or MVEL data structures. All of our rules reference the same root object, like this:

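rule "Calculate age"  // rule names here are illustrative only
when
    p : Participant(bd : birthDate != null)
then
    // .. calculate age ..
    p.setAge(age);
end

rule "Categorize old age"
when
    Age(value > 80)
    p : Participant()
then
    AgeCategory ac = AgeCategory.OLD;
    insert(ac);
    p.setAgeCategory(ac);
end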

Almost every single rule references this Participant object. In the 5.x code this resulted in a Rete tree with a very, very wide horizontal level below the root node. Does this look to you like it could be the cause of the high memory consumption? Unfortunately, Mario, I do not have permission to share the rules. When I let jProfiler profile my Drools6Test.java for a while, it looks like most of the CPU time is being spent here:

86.4% - 4,552 s - 10,330 inv. org.drools.compiler.rule.builder.PatternBuilder.buildRelationalExpression
45.6% - 2,409 s - 10,163 inv. org.drools.compiler.rule.builder.PatternBuilder.addConstraintToPattern
... more stack..
40.5% - 2,191 s - 10,322 inv. org.drools.compiler.rule.builder.dialect.mvel.MVELExprAnalyzer.analyzeExpression
33.8% - 1,830 s - 20,658 inv. org.mvel2.MVEL.analyze
6.7% - 361 s - 17,686 inv. org.mvel2.util.PropertyTools.getFieldOrAccessor
40.2% - 2,134 s - 20,820 inv. org.drools.compiler.rule.builder.PatternBuilder.getExprBindings
40.2% - 2,138 s - 20,862 inv. org.drools.compiler.rule.builder.PatternBuilder.setInputs
29.9% - 1,595 s - 20,877 inv. org.mvel2.MVEL.analysisCompile
10.2% - 545 s - 28,234 inv. org.mvel2.util.PropertyTools.getFieldOrAccessor

By far the biggest chunks of memory are being taken up by char[] data and java.util.HashMap$Entry data. The biggest memory allocation hotspot is:

60.5% - 11,543 kB - 144,297 alloc. java.lang.Class.getMethods
35.1% - 6,683 kB - 83,545 alloc. org.mvel2.util.PropertyTools.getGetter
35.1% - 6,683 kB - 83,545 alloc. org.mvel2.util.PropertyTools.getFieldOrAccessor
19.1% - 3,645 kB - 45,564 alloc. org.drools.compiler.rule.builder.PatternBuilder.setInputs
12.7% - 2,430 kB - 30,376 alloc. org.drools.compiler.rule.builder.dialect.mvel.MVELExprAnalyzer.analyzeExpression
25.5% - 4,860 kB - 60,752 alloc. org.mvel2.ParserContext.initializeTables
25.5% - 4,860 kB - 60,752 alloc. org.mvel2.compiler.ExpressionCompiler._compile
25.5% - 4,860 kB - 60,752 alloc. org.mvel2.compiler.ExpressionCompiler.compile
12.7% - 2,430 kB - 30,376 alloc. org.mvel2.MVEL.analysisCompile
12.7% - 2,430 kB - 30,376 alloc. org.mvel2.MVEL.analyze

In my experience, profiling data can be deceptive, so I may not be looking at the right things. This is also from only 1.5 hours of running, so I may not have let it run long enough to get an accurate picture. Let me know if this tells you anything or if there is some additional information I can provide.
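
For context on why java.lang.Class.getMethods shows up as an allocation hotspot: each call returns a newly allocated array, so repeated lookups on the same class allocate every time. The sketch below only illustrates that behaviour and a memoized alternative. It is not how MVEL or Drools handle these lookups, and the MethodLookup class is hypothetical.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical cache, shown only to illustrate the allocation pattern. */
final class MethodLookup {
    private static final Map<Class<?>, Method[]> CACHE = new ConcurrentHashMap<>();

    /** Returns one shared array per class; callers must treat it as read-only. */
    static Method[] publicMethods(Class<?> type) {
        return CACHE.computeIfAbsent(type, Class::getMethods);
    }

    public static void main(String[] args) {
        // Two direct calls hand back two distinct arrays.
        System.out.println(String.class.getMethods() == String.class.getMethods());
        // The memoized lookup hands back the same array both times.
        System.out.println(publicMethods(String.class) == publicMethods(String.class));
    }
}
```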

Thanks,
Kyle

Mario Fusco

Sep 5, 2014, 11:08:46 AM
to drools...@googlegroups.com
Hi again Kyle,

I did some benchmarking of this use case and found that the algorithm we used to read an InputStream into a byte array was very inefficient, especially for large files (like the cache file of a project as big as yours). I fixed this issue with this commit https://github.com/droolsjbpm/drools/commit/ef77b58dc743140ee4240892e7c161cb3d72d759 and also backported the fix to the 6.1.x and 6.0.x branches, so this improvement will be available in all the next patch releases.

I believe this fix will resolve most of the performance issues you found. Please let me know if you give it a try and whether it actually helps in your case.
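
For readers wondering what an efficient InputStream-to-byte-array read looks like in general, here is a minimal sketch. It is not the code from the commit above; the StreamBytes class and the buffer sizes are assumptions chosen for illustration. The idea is to read in fixed-size chunks into a buffer that grows geometrically, rather than reading byte by byte or re-copying the whole array on every read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper, not taken from the Drools code base. */
final class StreamBytes {
    /** Reads the whole stream in 8 KB chunks and returns its contents. */
    static byte[] readAll(InputStream in) throws IOException {
        // ByteArrayOutputStream grows its internal buffer as needed.
        ByteArrayOutputStream out = new ByteArrayOutputStream(64 * 1024);
        byte[] chunk = new byte[8 * 1024];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[1_000_000];
        byte[] copy = readAll(new ByteArrayInputStream(data));
        System.out.println(copy.length);
    }
}
```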

Regards and thanks again for having reported this,
Mario

