Rete or PHREAK - if not set for version 6.5, Stateful vs Stateless, Performance Benchmarking, UI to edit rules


Ajay Chowdary Kandula

May 27, 2021, 8:56:20 AM
to Drools Setup
kieBaseConfiguration.setOption(RuleEngineOption.PHREAK | RETEOO)

6.5 shows PHREAK as the default, so is there no need to call setOption()?

ODM has Sequential and Fastpath modes, which come to mind as options chosen according to the application's needs. Has anyone tried any other algorithms?

2) Are there any articles from Red Hat comparing the performance of stateful vs. stateless sessions?

3) Are there any tools that can be added as a Maven dependency to benchmark Drools performance, showing which rules take the most time and which rules override or affect other rules?

4) Is there a UI to display rules written in Drools Rule Language (DRL) files?

Toshiya Kobayashi

Jun 1, 2021, 11:47:05 PM
to Drools Setup
Hi,

> 6.5 shows PHREAK as the default, so is there no need to call setOption()?

PHREAK is the default, so you don't need to call setOption(). RETEOO was removed in 7.x.

> 2) Are there any articles from Red Hat comparing the performance of stateful vs. stateless sessions?

Stateful vs. stateless is about how you use KieSession. If your rules are the same, the performance will be the same.

If you enable sequential mode with StatelessKieSession (which requires that your rules do not use insert, modify, or update), performance may improve.


> 3) Are there any tools that can be added as a Maven dependency to benchmark Drools performance, showing which rules take the most time and which rules override or affect other rules?

drools-metric would likely meet the purpose. Please refer to the article and the kie-live video.


https://www.youtube.com/watch?v=fFvobeuFHvk&t=1748s  (This link starts the video where I talk about drools-metric, but it may be worth watching from the "Performance issues" part at 23:05.)

> 4) Is there a UI to display rules written in Drools Rule Language (DRL) files?

The Guided Rule Editor is probably what you want. It lets you write rules in an editor in the workbench.


In addition...

If you mean syntax highlighting, you can use the Drools VS Code Extension.


If you want to view the Rete diagram of your DRL, try drools-retediagram.


If you want to view the relationships between rules, try drools-impact-analysis (available since 7.54.0.Final).


Cheers,
Toshiya


Ajay Chowdary Kandula

Jun 2, 2021, 7:17:02 AM
to Drools Setup
@Toshiya Thank you very much for the detailed response.

This is very helpful. I was planning to post the same question on the Zulip chat but never got the time.
The link above had already caught my attention while I was browsing the Zulip chat for the benefits of multithreading.

I was also thinking about the time taken by individual threads during the execution of stateful sessions; hopefully drools-metric will serve that purpose, as you mentioned.

Only time will tell.

Ajay Chowdary Kandula

Jun 2, 2021, 1:18:24 PM
to Drools Setup
For Drools version 6.5, is there something like drools-metric?

@Kobayashi, your video talks about Drools version 7 and above, where drools-metric can be used.

Is it compatible with Drools version 6.5?

Toshiya Kobayashi

Jun 2, 2021, 9:23:11 PM
to Drools Setup
> For Drools version 6.5, is there something like drools-metric?
> @Kobayashi, your video talks about Drools version 7 and above, where drools-metric can be used.
> Is it compatible with Drools version 6.5?

This is the same question and answer as in the Zulip chat, but I am pasting it here for those who search in the future :)


~~~
drools-metric has been available since 7.41.0.Final. Unfortunately, there is no similar feature for Drools 6.5.

TBH, I strongly recommend that you update your Drools version. 6.5 is pretty old :(

If you cannot update the Drools version in production and still want to analyze your rules, you could simply test them with Drools 7.41.0.Final and drools-metric. DRL files are basically compatible, so they should run anyway. However, the Drools engine has been improved across versions, so the analysis results might not be accurate for version 6.5. It may still be worth trying.
~~~

Cheers,
Toshiya



Dimitri Gamkrelidze

Oct 22, 2021, 7:00:47 AM
to Drools Setup
Hello,

I have a performance issue with Drools 7.x. I tested 7.52.Final and above; I concentrated on 7.52.Final because of the Red Hat builds, but I see the same issue with the most recent versions too.
I ran some benchmarks and posted a question on Stack Overflow: "Drools 4.X vs Drools 7.x performance". I had great performance with Drools 4.x (we use version 4.0.3) on Java 1.6 (we use JRockit Real Time with deterministic GC, but standard Java 6 with Parallel GC performs better because it does not carry the GC tuning overhead). I shared the JMH benchmark code. With Drools 7.x, however, performance degrades: in my tests, 4.x on Java 6 (ParallelGC) is nearly 2 times faster than 7.x on Java 1.8, 11, or 17 (I compared builds from different vendors; JDK 11 was a "little" better). I ran throughput and latency benchmarks, and here are the results. These numbers are from my desktop PC:

Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz 12 Threads (hyperthreading enabled ), 64 GB Ram, "Ubuntu" VERSION="20.04.2 LTS (Focal Fossa)" Linux homepc 5.8.0-59-generic #66~20.04.1-Ubuntu SMP Thu Jun 17 11:14:10 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux 

Drools 4.0.3:
Benchmark             (shadowProxy)   Mode  Cnt       Score      Error  Units
DroolsBenchmark.send          false  thrpt   30  138888.339 ± 6603.057  ops/s
DroolsBenchmark.send           true  thrpt   30    1704.062 ±  178.104  ops/s
DroolsBenchmark.send                  avgt   30      75.845 ±    2.668  us/op

Drools 7.52.Final:
Benchmark                             Mode  Cnt       Score      Error  Units
DroolsBenchmark.send                 thrpt   30   67881.788 ±  941.384  ops/s
DroolsBenchmark.send                  avgt   30     147.362 ±    2.510  us/op

We started comparing performance because we want to migrate our real-time system from Java 6 to Java 11 (maybe Java 17) and also update the Drools version. The performance of Drools 7.x degrades further on a Xeon server: Intel(R) Xeon(R) Gold 6258R CPU @ 2.70GHz with 56 threads (hyperthreading disabled; the result is the same with it enabled and 112 CPU threads), 1 TB RAM, NAME="Oracle Linux Server" VERSION="8.4", Linux 5.4.17-2102.201.3.el8uek.x86_64 #2 SMP Fri Apr 23 09:05:57 PDT 2021 x86_64 x86_64 x86_64 GNU/Linux. There I get half the performance and could not get above 30K, even with more threads; adding threads degrades it more and more. Making the KieBase ThreadLocal does not help either. We use only stateless sessions.
On Stack Overflow, Roddy of the Frozen Peas suggested that "your Drools 7 version isn't actually threadsafe" but did not explain what he meant by that. Is there another library?

I do not think thread safety is the problem. I made flame graphs with async-profiler, and they show nearly 17% of the time in ContextImpl, which uses UUID.randomUUID(), for StatelessKnowledgeSessionImpl. There are also a lot of synchronized blocks in DefaultAgenda and a lot of Collections.synchronizedMap usage. I understand that this is very important for stateful sessions, but a stateless session executes once and is then released. As far as I can see, a stateless session is really a wrapper around a stateful session in a block of the form "try fireAllRules, finally release session". As a test, I built the drools-core source (tag 7.52.Final) with UUID.randomUUID() replaced by nameInc.incrementAndGet()+"_Cnt" and with the synchronized blocks in DefaultAgenda removed (for testing purposes only), and I used the KieBase as a ThreadLocal. Performance increased by ~20%, which is still not as good as with version 4.0.3.
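The ID change described above can be sketched in isolation as follows. This is a minimal illustration, not the drools-core code: the class name SessionNameSketch and the method names are hypothetical, and only nameInc and the "_Cnt" suffix come from the post. UUID.randomUUID() draws from a cryptographically strong random source, whereas the counter is a single atomic increment.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the place where a per-session name is generated.
final class SessionNameSketch {
    // Shared counter; the field name follows the post, everything else is illustrative.
    private static final AtomicLong nameInc = new AtomicLong();

    // Original approach: a random UUID per session.
    static String randomName() {
        return UUID.randomUUID().toString();
    }

    // Replacement tried in the post: an atomic counter plus a fixed suffix.
    static String counterName() {
        return nameInc.incrementAndGet() + "_Cnt";
    }

    public static void main(String[] args) {
        System.out.println(randomName());
        System.out.println(counterName());
        System.out.println(counterName());
    }
}
```

Note that counter-based names are unique only within one JVM, which is an assumption that has to hold wherever the name is used.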

I also managed to run these tests with Drools 4.0.3 on Java 11 and 17 (compiling DRL does not work on Java 17, but I could use precompiled KiePackages saved with Java 11), which was a really great surprise. I got better performance than with Drools 7.x, but not as good as with Java 6 and Drools 4.0.3, and it also degraded on the Xeon server. So I did some profiling and found that a lot of time is spent in the put method of PrimitiveLongMap.Page. With async-profiler and flame graphs I found a lot of native calls after OptoRuntime::multianewarray2_C, and I found a very good article: "Why does allocating a single 2D array take longer than a loop allocating multiple 1D arrays of the same total size and shape?". I replaced the code as suggested there, and I also replaced the synchronized block of the nextWorkingMemoryCounter method in AbstractRuleBase with an AtomicInteger. With that I got better results on Java 11/17 (>140K msg/sec) than on Java 6 (<140K msg/sec). I tested this on the server machine and the throughput was excellent (~300K msg/sec with 10 threads); increasing the number of threads also increased throughput (not linearly, of course; for example, with 50 threads I got ~560K msg/sec).
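The two patches described above follow well-known patterns, sketched here under assumptions: the class and method names are hypothetical, the array element type and dimensions are placeholders, and the real PrimitiveLongMap.Page and AbstractRuleBase code in Drools 4.0.3 is not reproduced.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class PatchSketch {
    // A two-dimensional "new" compiles to one multianewarray bytecode.
    static Object[][] allocateAtOnce(int rows, int cols) {
        return new Object[rows][cols];
    }

    // Same shape, but the outer array and each row are allocated separately.
    static Object[][] allocateRowByRow(int rows, int cols) {
        Object[][] table = new Object[rows][];
        for (int i = 0; i < rows; i++) {
            table[i] = new Object[cols];
        }
        return table;
    }

    // Counter guarded by a monitor: every caller contends for the same lock.
    static final class SynchronizedCounter {
        private int value;
        synchronized int next() {
            return value++;
        }
    }

    // Lock-free counter with the same results (assumed post-increment semantics).
    static final class AtomicCounter {
        private final AtomicInteger value = new AtomicInteger();
        int next() {
            return value.getAndIncrement();
        }
    }

    public static void main(String[] args) {
        Object[][] a = allocateAtOnce(4, 8);
        Object[][] b = allocateRowByRow(4, 8);
        System.out.println(a.length == b.length && a[0].length == b[0].length);

        AtomicCounter counter = new AtomicCounter();
        System.out.println(counter.next() + ", " + counter.next());
    }
}
```

Both variants of each pair produce the same results; whether the replacements are actually faster depends on the JVM and workload, so they should be measured as was done in the post.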

Here is the result of running the benchmark with JDK 17 and Drools 4.0.3 (patched) on an Intel(R) Xeon(R) Gold 6258R CPU @ 2.70GHz:
# JMH version: 1.32
# VM version: JDK 17, OpenJDK 64-Bit Server VM, 17+35
# VM invoker: /opt/jdk/microsoft/jdk-17+35/bin/java
# VM options: -server -Xms10G -Xmx10G -XX:+UseShenandoahGC -XX:+UseNUMA -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:gc*,gc+ref*,gc+ergo*,gc+heap*,gc+stats*,gc+compaction*,gc+age*:logs/gc.log:time,tags:filecount=25,filesize=30m
# Blackhole mode: full + dont-inline hint
# Warmup: 50 iterations, 10 s each
# Measurement: 100 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 10 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: com.drools.perf.test.DroolsBenchmarkTest.send

DroolsBenchmarkTest.send         thrpt  100  272833.101 ± 435.835  ops/s

As I said, on my desktop PC the same test (JDK 17 and Drools 4.0.3 (patched)) gives about 150K msg/sec.

Then I ran benchmarks on real code (our real-time system), and I am very happy with these results (~55K msg/sec).
# JMH version: 1.32
# VM version: JDK 17, OpenJDK 64-Bit Server VM, 17+35
# VM invoker: /opt/jdk/microsoft/jdk-17+35/bin/java
# VM options: -server -Xms50G -Xmx50G -XX:+UseNUMA -XX:+UseShenandoahGC -XX:+UseLargePages -XX:+UseTransparentHugePages -XX:MaxMetaspaceSize=1G -XX:MetaspaceSize=256M --add-opens=java.base/java.lang=ALL-UNNAMED -Xlog:gc*,gc+ref*,gc+ergo*,gc+heap*,gc+stats*,gc+compaction*,gc+age*:logs/gc.log:time,pid,tags:filecount=25,filesize=30m
# Blackhole mode: full + dont-inline hint
# Warmup: 50 iterations, 10 s each
# Measurement: 250 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 45 threads, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: ge.magticom.ocs.rulecompiler.benchmark.DroolsBenchmarkTest.send
# Parameters: (threadLocal = true, useNewRules = false)

Benchmark                 (threadLocal)  (useNewRules)  Mode  Cnt    Score   Error  Units
DroolsBenchmarkTest.send           true          false  avgt  250  829.179 ± 1.398  us/op

I ran 3 forks with the same configuration to measure throughput (msg/sec):
# Benchmark mode: Throughput, ops/time
DroolsBenchmarkTest.send           true          false  thrpt  750  55015.309 ± 51.340  ops/s
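As a rough cross-check of the two real-code runs above: if each of N threads completes one operation per average latency, the expected throughput is about N divided by that latency. The sketch below applies this to the reported figures (45 threads, 829.179 us/op); the class and method names are hypothetical, and the formula is a simplification that ignores measurement noise.

```java
final class ThroughputCheck {
    // Expected operations per second when `threads` workers each need
    // `avgMicrosPerOp` microseconds per operation on average.
    static double expectedOpsPerSecond(int threads, double avgMicrosPerOp) {
        return threads * 1_000_000.0 / avgMicrosPerOp;
    }

    public static void main(String[] args) {
        double estimate = expectedOpsPerSecond(45, 829.179); // figures from the avgt run
        double measured = 55015.309;                         // figure from the thrpt run
        System.out.printf("estimated %.1f ops/s, measured %.1f ops/s, ratio %.3f%n",
                estimate, measured, measured / estimate);
    }
}
```

The estimate comes out at roughly 54,270 ops/s, within about 1.4% of the measured throughput, so the latency and throughput runs are consistent with each other.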

ShenandoahGC on JDK 17 also worked perfectly: average latency is 0.3 ms, max pause time is 2 ms, and the allocation rate is 7-7.5 GB/sec.

But I worry that I cannot achieve this with Drools 7.x (on real code I get ~10-11K msg/sec on my desktop PC and <5K msg/sec on the server; same JVM, different kernel, different glibc).

What can you suggest?

Toshiya Kobayashi

Oct 25, 2021, 5:13:30 AM
to Drools Setup
Thank you very much for the detailed analysis. I will look into it and will get back to you. Please kindly allow some time.

Regards,
Toshiya



Mark Proctor

Oct 25, 2021, 8:22:55 PM
to drools...@googlegroups.com
These are some good insights; please be patient while we look into them.

One thing we are aware of is that ksession creation has become heavier and more convoluted over time. Some fundamental design issues, such as mutable listeners, limit our ability to change this. We are working on some new code that aims to ensure we can always have lightweight and fast session creation.

In the meantime, we'll look over your work and see whether there are less drastic changes we can make, such as tweaks that address specific issues.

Mark


Toshiya Kobayashi

Oct 27, 2021, 6:46:54 AM
to Drools Setup
Thanks again for your detailed analysis.

1) ksession creation time

It accounts for 2.57% (StatelessKnowledgeSessionImpl.newWorkingMemory()) in the flame graph (droolsbenchmark/Drools7/com.drools.perf.test.DroolsBenchmarkTest.send-AverageTime/flame-cpu-forward.html), so it may not be important for your use case. In any case, we are working on a lightweight session, so it should be available soon.

2) segment memory creation time

Mark is working on it, and hopefully it will be available soon.

3) UUID.randomUUID()

I filed https://issues.redhat.com/browse/DROOLS-6683 . Thank you for reporting!

# By the way, as you may already know, calling StatelessKnowledgeSessionImpl.execute(Iterable objects) instead of executing a Command skips the ContextImpl() bottleneck.

Regards,
Toshiya



Dimitri Gamkrelidze

Oct 27, 2021, 7:57:46 PM
to Drools Setup
Hi Toshiya, Mark,

Thank you so much for your attention as well; it is a great pleasure for me.

I experimented with some changes that may be interesting to you; they also improved performance a little for my use case.
Here is a Gist with my modifications.
For example, if I create the KnowledgeBase in a ThreadLocal, the session configuration is set to non-thread-safe:
properties.put(ThreadSafeOption.PROPERTY_NAME,(String.valueOf(!threadLocal)));
conf.getChainedProperties().addProperties(properties);

In KnowledgeBaseImpl:
lockEnabled = sessionConfiguration.isThreadSafe() || mutable;
If the lock is not enabled, non-concurrent collections are created.
The lock, unlock, readLock, readUnlock, and tryLockAndDeactivate methods also become lock-free.
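The idea above can be sketched with plain JDK classes. This is only an illustration under assumptions: LockAwareBaseSketch and its members are hypothetical names, not the KnowledgeBaseImpl code from the Gist, and the sketch is only safe when each instance is confined to one thread (for example, held in a ThreadLocal) whenever lockEnabled is false.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical class: picks collections and locking based on one flag.
final class LockAwareBaseSketch {
    private final boolean lockEnabled;
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Object> registry;

    LockAwareBaseSketch(boolean threadSafe, boolean mutable) {
        // Same decision as in the post: lock only if thread-safe or mutable.
        this.lockEnabled = threadSafe || mutable;
        if (lockEnabled) {
            this.registry = new ConcurrentHashMap<String, Object>();
        } else {
            this.registry = new HashMap<String, Object>();
        }
    }

    // The lock methods become no-ops when locking is disabled.
    void readLock()    { if (lockEnabled) lock.readLock().lock(); }
    void readUnlock()  { if (lockEnabled) lock.readLock().unlock(); }
    void writeLock()   { if (lockEnabled) lock.writeLock().lock(); }
    void writeUnlock() { if (lockEnabled) lock.writeLock().unlock(); }

    void register(String key, Object value) {
        writeLock();
        try {
            registry.put(key, value);
        } finally {
            writeUnlock();
        }
    }

    Object lookup(String key) {
        readLock();
        try {
            return registry.get(key);
        } finally {
            readUnlock();
        }
    }

    Class<?> registryType() {
        return registry.getClass();
    }

    public static void main(String[] args) {
        LockAwareBaseSketch perThread = new LockAwareBaseSketch(false, false);
        perThread.register("rule", "payload");
        System.out.println(perThread.lookup("rule") + " via " + perThread.registryType().getSimpleName());
    }
}
```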

I also experimented with ConcurrentNodeMemories, replacing AtomicReferenceArray with a plain array when a non-thread-safe session configuration is used.
DefaultAgenda also has synchronized blocks, which I think are very hard to avoid.

Thank you for pointing out execute(Iterable objects), but sadly I think it is not suited to our case. We use filters to ignore some rules and, as far as I saw (I have not dived very deeply), that is not possible with this method.

BTW, I also tested the new version on our real rules with unit and integration tests (~85K iterations with ~500 sample classes), and logically the results were 100% the same as with Drools 4.0.3. I only needed to change a little syntax in the LHS, like this:

Udr(usageTimePart >= 000000 && < 090000) // time between 00 AM to 09 AM , better for reading visibility
to 
Udr(usageTimePart >= 0 && < 90000)

and the shorthand if-else in
exists SubscriberAccount(balance > (($consumed > 0 ? ceil(40) : ceil(1040)) + $atv))
which I replaced with a custom function.
I think it is logically identical, and there is no need to worry much about the algorithm implementation.

Once again, thank you very much
Dito

Toshiya Kobayashi

Oct 29, 2021, 4:39:24 AM
to Drools Setup
Hi Dito,

Thank you again for the various insights and suggestions.

I am also glad to hear that you greatly improved the performance of your rules.

Regards,
Toshiya


Dimitri Gamkrelidze

Oct 29, 2021, 9:38:42 PM
to Drools Setup
Hello, 

Thank you for your answers, and sorry for disturbing you with so many posts and my impatience. I just want to do my job better and improve my working experience :)

I made a few more modifications and want to share my experience. After these changes, my tests improved by ~20% over the original code at the 7.52.0.Final tag. For the comparison, I used execute(Iterable objects) to avoid the UUID overhead.

I already described some changes in ConcurrentNodeMemories and KnowledgeBaseImpl.
I posted additional changes in the same Gist.
In KnowledgeBaseImpl.java, in registerSegmentPrototype, invalidateSegmentPrototype, createSegmentFromPrototype, and getSegmentPrototype, I replaced node.getId() with node.getIdObj().

I declared Integer getIdObj(); in the NetworkNode interface and implemented it in BaseNode.java (and in some other implementations, for example in mocks; as far as I can see the main implementation is BaseNode, and all other implementations of getId return 0). As you can see, idObj is kept in sync with id, i.e. it always has the same value as id, and any change to id is propagated to idObj. This gave better performance for the Map operations in KnowledgeBaseImpl by avoiding the creation of a new Integer every time.
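A minimal sketch of the cached boxed id described above, using hypothetical classes rather than the real NetworkNode and BaseNode: the Integer is created once when the id is set, so map lookups keyed by it do not box the int on each call. For values outside the default Integer cache range (-128 to 127), boxing an int can allocate a new object, although the JIT may remove that allocation in some cases.

```java
import java.util.HashMap;
import java.util.Map;

final class NodeIdSketch {
    // Hypothetical node holding both the primitive id and its boxed twin.
    static final class Node {
        private int id;
        private Integer idObj;

        Node(int id) {
            setId(id);
        }

        // Every change to id is propagated to idObj.
        void setId(int id) {
            this.id = id;
            this.idObj = Integer.valueOf(id);
        }

        int getId() {
            return id;
        }

        Integer getIdObj() {
            return idObj;
        }
    }

    public static void main(String[] args) {
        Map<Integer, String> prototypes = new HashMap<>();
        Node node = new Node(100_000);
        prototypes.put(node.getIdObj(), "segment prototype");

        // Boxes the int on every call before the lookup.
        String viaPrimitive = prototypes.get(node.getId());
        // Reuses the Integer created in setId.
        String viaCached = prototypes.get(node.getIdObj());

        System.out.println(viaPrimitive.equals(viaCached));
    }
}
```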

In SegmentMemory.java, in the getNodesInSegment method, I replaced new java.util.LinkedList<>(); with new java.util.ArrayList<>();. As far as I can see, the list is only used in getNodesInSegment and then iterated in newSegmentMemory, so I think a LinkedList is not necessary here; it is just iteration, and this avoids LinkedList's overhead.
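The collection swap described above amounts to the following pattern, shown with a hypothetical node chain rather than the real SegmentMemory code: the list is filled once by appending and then only iterated, which an ArrayList serves without allocating a link object per element.

```java
import java.util.ArrayList;
import java.util.List;

final class SegmentNodesSketch {
    // Hypothetical singly linked chain of nodes within one segment.
    static final class Node {
        final String name;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    // Collects the chain into a list that callers only iterate over.
    static List<Node> nodesInSegment(Node first) {
        List<Node> nodes = new ArrayList<>(); // was a LinkedList in the original code
        for (Node n = first; n != null; n = n.next) {
            nodes.add(n);
        }
        return nodes;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.next = b;
        for (Node n : nodesInSegment(a)) {
            System.out.println(n.name);
        }
    }
}
```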

In PathMemory.java, I made isRuleDataDriven simply return false. As far as I can see, ForceEagerActivationFilter has only one implementation, and it returns false. Also, rule.isDataDriven() had quite some overhead, always recalculating the value of dataDriven (is it possible to make it a field and precalculate its value? Is RuleImpl mutable after compilation?).

In DefaultAgenda, I removed the synchronized blocks in two classes, FireAllRulesRestHandler and FireUntilHaltRestHandler.

Best
Dito

Toshiya Kobayashi

Nov 2, 2021, 5:15:21 AM
to Drools Setup
Thank you again for your suggestions.

Currently, Mark and Mario are working on a large refactoring (including the lightweight session) in the main branch, which targets version 8. We will revisit your suggestions after that work is done. We will also evaluate whether we can apply them to version 7.x as well. So as not to forget, I filed a JIRA: https://issues.redhat.com/browse/DROOLS-6690

Thanks!
Toshiya
