Drools 7.4.1.Final: performance issue when facts are inserted continuously into Drools memory


Srinivas ev

Apr 11, 2018, 10:28:05 AM
to Drools Usage
We are currently using Drools 7.4.1.Final.

Issue - 
1. We need to insert up to 55 facts per second into Drools memory (each fact may be up to 420 KB as a Java object).
2. Currently, we use a StatefulKnowledgeSession to insert these facts.
3. The first 10,000 to 20,000 facts are processed smoothly.
4. We face an issue as more facts keep being inserted, as explained below.
5. A fact is not retracted until 48 hours have passed or a clear/drop fact arrives for it (business case).

A heap snapshot is attached.

Issue - 
1. After about 4 to 5 lakh (400,000 to 500,000) facts have been inserted into Drools, rules no longer trigger properly.
2. Facts continue to be inserted, but the corresponding rules fire only at irregular intervals.


Questions - 
1. How can we determine the capacity limit of the Drools engine?
2. How can we see the current state of the Drools engine and where it is stuck?
3. Does it enter a loop, or does a thread hang, when this many facts are present in its memory?

As the engine is single-threaded, can it handle this load and execute rules in parallel?


for google group.PNG

Tibor Zimányi

Apr 12, 2018, 4:49:00 AM
to Drools Usage
Hi,

do you use fireUntilHalt to fire the rules? If so, inserting huge numbers of facts can keep the fire thread from being scheduled by the OS, because the insert threads take all the CPU time. In that case I would recommend doing inserts in batches. Also, from your description your facts are very large, so you might be hitting system memory very hard. I would check whether the system and the JVM have enough memory to process such a large number of facts (taking their size into account).
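
To put a rough number on the memory concern, here is a back-of-the-envelope sketch in Java. It takes the figures from the original post (up to 55 facts per second, up to 420 KB per fact, facts retained for up to 48 hours) and treats them as a worst case. The class and method names are made up for illustration, and a real workload may sit well below this bound.

```java
/** Worst-case estimate only: assumes every fact has the maximum size and stays for the full retention time. */
final class RetentionEstimate {

    /** Number of facts held at once if none is retracted before the retention period ends. */
    static long retainedFacts(long factsPerSecond, long retentionHours) {
        return factsPerSecond * retentionHours * 3600L;
    }

    /** Total size of those facts in kilobytes, ignoring any engine overhead such as tuples. */
    static long retainedKilobytes(long factsPerSecond, long retentionHours, long kbPerFact) {
        return retainedFacts(factsPerSecond, retentionHours) * kbPerFact;
    }
}
```

With the inputs above, `RetentionEstimate.retainedFacts(55, 48)` gives 9,504,000 facts and `RetentionEstimate.retainedKilobytes(55, 48, 420)` gives 3,991,680,000 KB, which is several terabytes. That is far beyond a typical heap, so presumably the average fact is much smaller than the maximum, or facts need to leave working memory sooner.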

Regards,

Tibor 

On Wednesday, April 11, 2018 at 16:28:05 UTC+2, Srinivas ev wrote:

Srinivas ev

Apr 12, 2018, 5:43:02 AM
to Drools Usage
Hi Tibor,

We use the following process to fire the rules continuously.

After creating each KieSession from the KieBase, we construct a FireUntilHalt object and call its start() method, which calls kieSession.fireUntilHalt() inside a while loop.

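import org.kie.api.runtime.KieSession;
class FireUntilHalt extends Thread {
    private final KieSession kieSession;
    private volatile boolean run = true; // loop flag; its declaration was not shown in the post
    FireUntilHalt(KieSession kieSession) { this.kieSession = kieSession; }
    @Override
    public void run() {
        while (run) {
            try {
                kieSession.fireUntilHalt(); // blocks until kieSession.halt() is called
            } catch (RuntimeException e) {
                e.printStackTrace(); // the catch block was omitted in the post
            }
        }
    }
}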


Questions - 

1. In our business case we don't know when new facts will arrive, so we cannot plan batch insertion. Is there any alternative?
2. Does the size of the nodes created depend on the size of the facts we insert into the Drools engine? I ask because the tuple nodes (NotNodeLeftTuple, JoinNodeLeftTuple, RuleTerminalNodeLeftTuple) take up considerable space in the heap.
3. How can we check the exact size of the facts in the heap dump? I cannot find them by name; is there another way to check this? (Suppose I create 5 lakh instances of a POJO in Java code with new POJO() and insert them into the Drools engine.)

Tibor Zimányi

Apr 12, 2018, 7:05:55 AM
to Drools Usage
Hi,

1. If that is the case, I don't see any alternative right now except having a queue for incoming facts. I am not sure how feasible that is for you. 
2. For the nodes the answer is no. In your heap dump you see "node tuples", not nodes. These represent your facts in the Rete network, but their memory size depends only on the fact count, not on the fact size. That means more facts -> more tuples. From your heap dump I consider the amount of memory they take reasonable: 13.4% of memory. I would recommend investigating the origin of the top memory consumers in the dump. 
3. You can find facts in the heap dump by the names of your model classes. E.g. I expect OBJETO_ERICSSON_2G_alarma to be one of your facts. 

As I wrote before, I would recommend checking whether you are hitting the memory limits of your system and whether the JVM process in which your application runs has enough memory. If you want, you can create a small Maven project as a reproducer and I can take a look at the fireUntilHalt behaviour. 

Regards,

Tibor

On Thursday, April 12, 2018 at 11:43:02 UTC+2, Srinivas ev wrote:

Srinivas ev

Apr 12, 2018, 8:46:32 AM
to Drools Usage
Hi Tibor,

1. Does the increase in node tuples slow down performance?
2. Regarding OBJETO_ERICSSON_2G_alarma: these model class objects are created in Java code as shown below, so that their properties can be set before they are inserted into the kieSession. I suspect these are the objects showing up in the heap dump, rather than the facts inserted into the Drools engine.
            new OBJETO_ERICSSON_2G_alarma()
3. Regarding memory limits: we have 51 GB of RAM and observed a peak of 40 GB when 35 lakh facts were inserted over a day at 40 facts per second. During this time Drools stopped responding (it did not invoke any rules) for some interval and then slowly started firing the pending rules.
4. If we retract facts after processing, does the count of "node tuples" also go down, and does performance improve?
5. Is the Drools engine single-threaded? If I insert facts continuously (40 facts per second), how does it decide priority? For which facts will it fire rules first: new facts or existing facts?
6. Are there any listeners to check the current state of the Drools engine? I tried DebugAgendaEventListener/DebugRuleRuntimeEventListener, which only show which rules Drools fired and not the state of the engine (for example, whether it has hung).
7. I observed drools-worker threads in VisualVM that never ran. What is the purpose of these threads?
for google group2.PNG
memory graph 2.PNG

Tibor Zimányi

Apr 13, 2018, 2:49:01 AM
to Drools Usage
Hi,

1. Not sure.
4. Yes, the tuples are deleted when the facts are deleted. It is logical that with fewer facts present, performance is better. 
5. Facts are processed in the order in which they arrive. Yes, Drools is single-threaded. There is an experimental multithreaded execution mode, but its functionality is currently very limited. See here for more information [1]. 

As I said, if you want, please provide a small Maven project as a reproducer and I can take a look at the fireUntilHalt behaviour. 

Regards,

Tibor


On Thursday, April 12, 2018 at 14:46:32 UTC+2, Srinivas ev wrote:

Srinivas ev

Apr 13, 2018, 12:43:06 PM
to Drools Usage
Hi Tibor,

>> As I said. If you want, please provide a small Maven project as a reproducer and I can take a look at the fireUntilHalt behavior. 
I will create a Maven project and post it here tomorrow. Meanwhile, I observed the following thread behavior in the profiler. The snapshots are attached; please let me know if you see anything in them.

1. I changed the code to retract each fact once it has completed its job in memory. But I can still see the tuple nodes in the heap snapshot.
Suppose I inserted around 12 lakh facts and retracted almost 6 lakh of them. I still see a large number of tuple nodes, more than the number of facts inserted.

2. Even if fireUntilHalt consumes much of the CPU time, why does it fail to execute the rules? As you said earlier, does the OS not schedule the fire thread when the insert thread takes all the CPU time? What is Thread-140 actually showing? Is it the insert thread?

3. I can see CPU usage frequently hitting 100% when the number of facts grows beyond 5 lakh. 
Thread Sampler 140 thread consuming time.PNG
Thread 140 Visualization.PNG
Thread 140 profiler 1.PNG
Thread 140 profiler 2.PNG
Thread 140 profiler 3.PNG
Thread 140 profiler 4.PNG
CPU usage.PNG

Srinivas ev

Apr 14, 2018, 3:13:13 PM
to Drools Usage
Hi Tibor,

A Maven project reproducer is attached. Please let me know if it needs any updates or corrections.
KiesessionProject.zip

Tibor Zimányi

Apr 16, 2018, 7:59:30 AM
to Drools Usage
Hi,

thanks for the reproducer, I will take a look when I have a bit of time. Will let you know if I find anything. 

Tibor

On Saturday, April 14, 2018 at 21:13:13 UTC+2, Srinivas ev wrote:

Tibor Zimányi

Apr 16, 2018, 9:24:54 AM
to Drools Usage
Hi,

I took a quick look at your reproducer. I had hoped for something that could simulate your load a bit. Could you please update the reproducer? I also see that you do locking in the FireUntilHalt thread. Why do you do that? fireUntilHalt is a blocking operation, so it should work without that. 

Regards,

Tibor

On Monday, April 16, 2018 at 13:59:30 UTC+2, Tibor Zimányi wrote:

Srinivas ev

Apr 16, 2018, 3:03:10 PM
to Drools Usage
Hi Tibor, I attached an updated reproducer which inserts 30 events/second for 10 minutes. The durationInMinutes and factsPerInterval values can be changed to run it with a different load. Thanks
KiesessionProjectLoad.zip

Srinivas ev

Apr 17, 2018, 12:27:02 AM
to Drools Usage
Hi Tibor, just wanted to know: is it required to run fireUntilHalt() inside a while loop? Can it be called only once inside the thread created in the FireUntilHalt class? I ran a simple test with the while loop removed from the run() method, and it was still able to invoke the timer rules. Please let me know your thoughts on this.

Tibor Zimányi

Apr 17, 2018, 6:34:49 AM
to Drools Usage
Hi,

thanks for the reproducer, I will take a look when I have a bit of time. Regarding fireUntilHalt: yes, you don't need to have it in a loop. It just needs to be invoked from a separate thread. It is a blocking operation, so when you call it, it doesn't return until session.halt() is called. In your case that means the run method doesn't end until session.halt() is called. 
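
As a minimal sketch of that pattern (one dedicated thread, a single blocking call, and halt() from elsewhere to end it), the Java below uses a hypothetical HaltableSession interface and a toy FakeSession in place of a real KieSession, so that it runs without Drools on the classpath. Only the method names fireUntilHalt() and halt() come from the KieSession API; everything else is an assumption for illustration.

```java
import java.util.concurrent.CountDownLatch;

/** Hypothetical stand-in for the two KieSession methods used here. */
interface HaltableSession {
    void fireUntilHalt(); // blocks the caller, like KieSession.fireUntilHalt()
    void halt();          // releases the blocked caller, like KieSession.halt()
}

/** Toy implementation that only mimics the blocking behaviour; it fires no rules. */
final class FakeSession implements HaltableSession {
    private final CountDownLatch halted = new CountDownLatch(1);

    @Override
    public void fireUntilHalt() {
        try {
            halted.await(); // stay blocked until halt() is called
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    @Override
    public void halt() {
        halted.countDown();
    }
}

final class FireThread {
    /** Starts one dedicated thread that calls fireUntilHalt() exactly once, with no loop. */
    static Thread start(HaltableSession session) {
        Thread t = new Thread(session::fireUntilHalt, "fire-until-halt");
        t.start();
        return t;
    }

    /** Waits up to the given time for the thread to end; returns true if it has ended. */
    static boolean awaitEnd(Thread t, long millis) {
        try {
            t.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !t.isAlive();
    }
}
```

In this sketch the thread returned by `FireThread.start(session)` stays alive until some other thread calls `session.halt()`, which is the behaviour described above for the real session.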

T. 

On Tuesday, April 17, 2018 at 6:27:02 UTC+2, Srinivas ev wrote:

Srinivas ev

Apr 20, 2018, 10:44:21 AM
to Drools Usage
Hi Tibor, 

any update on the above request? I have some questions below.

1. Is there any chance that a timer (timer rule) won't expire while new facts are being inserted continuously? Every fact I insert gets a persistence object created with a 1-second timer. I can see that most inserted facts create a timer object, but for most events the timer does not expire after 1 second.

You already replied as below, but a more detailed explanation would help me.
>>do you use fireUntilHalt to fire the rules? When yes, inserting huge amounts of facts can cause the fire thread to not get scheduled by the OS, because the insert threads take all CPU time.

What do you mean by the insert threads here? Are the thread calling fireUntilHalt on the kieSession and the thread inserting into the kieSession different threads?
kieSession.insert(fact);
kieSession.fireUntilHalt();

Srinivas ev

Apr 22, 2018, 11:56:58 AM
to Drools Usage
Hi Tibor, 

I found the details below in JConsole when I used the Detect Deadlock option with Thread-133 selected in the Threads tab. I see large numbers for Total blocked / Total waited. Please share any input on this.

Name: Thread-133
State: WAITING on org.drools.core.phreak.SynchronizedPropagationList@4b40f489
Total blocked: 18,516,583  Total waited: 5,621,285

Stack trace: 
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:502)
org.drools.core.phreak.SynchronizedPropagationList.waitOnRest(SynchronizedPropagationList.java:128)
org.drools.core.common.DefaultAgenda$RestHandler$FireUntilHaltRestHandler.handleRest(DefaultAgenda.java:1138)
org.drools.core.common.DefaultAgenda.fireLoop(DefaultAgenda.java:1079)
org.drools.core.common.DefaultAgenda.internalFireUntilHalt(DefaultAgenda.java:996)
org.drools.core.common.DefaultAgenda.fireUntilHalt(DefaultAgenda.java:988)
org.drools.core.impl.StatefulKnowledgeSessionImpl.fireUntilHalt(StatefulKnowledgeSessionImpl.java:1359)
org.drools.core.impl.StatefulKnowledgeSessionImpl.fireUntilHalt(StatefulKnowledgeSessionImpl.java:1338)
com.*****************.KieSessionImpl$FireUntilHalt.run(KieSessionImpl.java:97)

Tibor Zimányi

Apr 23, 2018, 6:48:33 AM
to Drools Usage
Hi, 

sorry, I haven't had time to look at your reproducer yet. I want to do that this week. The wait stack trace you posted is OK. It is the fireUntilHalt thread waiting for new inserts or fact changes. If you see several such thread stacks, it means you are creating several fireUntilHalt threads, which should happen only when you have created several KieSessions. There should be just one fireUntilHalt thread per KieSession. That is worth checking on your side. 

I will let you know when I take a look at your reproducer. 

Tibor

On Sunday, April 22, 2018 at 17:56:58 UTC+2, Srinivas ev wrote:

Srinivas ev

Apr 23, 2018, 6:59:04 AM
to Drools Usage
Hi Tibor,

Thanks.

Yes, I am creating only one fireUntilHalt thread for each KieSession. We have 3 KieSessions, and 3 threads are created for them. I found that the same thread was also blocked in a few other places:

Name: Thread-133
State: BLOCKED on java.lang.Object@11c6e67 owned by: Thread-43 (ActiveMQ-client-global-threads-1096828975)
Total blocked: 1,502,068  Total waited: 1,179,709

Stack trace: 
org.jboss.logmanager.handlers.WriterHandler.doPublish(WriterHandler.java:56)
org.jboss.logmanager.ExtHandler.publish(ExtHandler.java:76)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:314)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:322)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:322)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:322)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:322)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:322)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:322)
org.jboss.logmanager.Logger.logRaw(Logger.java:850)
org.slf4j.impl.Slf4jLogger.log(Slf4jLogger.java:326)
org.slf4j.impl.Slf4jLogger.log(Slf4jLogger.java:70)
org.apache.commons.logging.impl.SLF4JLocationAwareLog.debug(SLF4JLocationAwareLog.java:133)
org.apache.commons.beanutils.converters.AbstractConverter.convert(AbstractConverter.java:140)
org.apache.commons.beanutils.converters.ConverterFacade.convert(ConverterFacade.java:61)
org.apache.commons.beanutils.BeanUtilsBean.convert(BeanUtilsBean.java:1072)
org.apache.commons.beanutils.BeanUtilsBean.setProperty(BeanUtilsBean.java:1005)
org.apache.commons.beanutils.BeanUtils.setProperty(BeanUtils.java:454)
*************************************************** Project specific code *************************************
org.drools.core.phreak.RuleExecutor.innerFireActivation(RuleExecutor.java:431)
org.drools.core.phreak.RuleExecutor.fireActivation(RuleExecutor.java:379)
org.drools.core.phreak.RuleExecutor.fire(RuleExecutor.java:135)
org.drools.core.phreak.RuleExecutor.evaluateNetworkAndFire(RuleExecutor.java:88)
org.drools.core.concurrent.AbstractRuleEvaluator.internalEvaluateAndFire(AbstractRuleEvaluator.java:34)
org.drools.core.concurrent.SequentialRuleEvaluator.evaluateAndFire(SequentialRuleEvaluator.java:43)
org.drools.core.common.DefaultAgenda.fireLoop(DefaultAgenda.java:1067)
org.drools.core.common.DefaultAgenda.internalFireUntilHalt(DefaultAgenda.java:996)
org.drools.core.common.DefaultAgenda.fireUntilHalt(DefaultAgenda.java:988)
org.drools.core.impl.StatefulKnowledgeSessionImpl.fireUntilHalt(StatefulKnowledgeSessionImpl.java:1359)
org.drools.core.impl.StatefulKnowledgeSessionImpl.fireUntilHalt(StatefulKnowledgeSessionImpl.java:1338)
com.****************************.KieSessionImpl$FireUntilHalt.run(KieSessionImpl.java:97)


Name: Thread-133
State: BLOCKED on java.lang.Object@11c6e67 owned by: ConnectionValidator
Total blocked: 1,041,787  Total waited: 1,050,192

Stack trace: 
org.jboss.logmanager.handlers.WriterHandler.doPublish(WriterHandler.java:56)
org.jboss.logmanager.ExtHandler.publish(ExtHandler.java:76)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:314)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:322)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:322)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:322)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:322)
org.jboss.logmanager.LoggerNode.publish(LoggerNode.java:322)
org.jboss.logmanager.Logger.logRaw(Logger.java:850)
org.slf4j.impl.Slf4jLogger.log(Slf4jLogger.java:326)
org.slf4j.impl.Slf4jLogger.log(Slf4jLogger.java:320)
org.slf4j.impl.Slf4jLogger.debug(Slf4jLogger.java:150)
org.drools.core.common.DefaultAgenda$ExecutionStateMachine.setCurrentState(DefaultAgenda.java:1389)
org.drools.core.common.DefaultAgenda$ExecutionStateMachine.waitAndEnterExecutionState(DefaultAgenda.java:1374)
org.drools.core.common.DefaultAgenda$ExecutionStateMachine.toFireUntilHalt(DefaultAgenda.java:1341)
   - locked java.lang.Object@11bc16ec
org.drools.core.common.DefaultAgenda$RestHandler$FireUntilHaltRestHandler.handleRest(DefaultAgenda.java:1144)
org.drools.core.common.DefaultAgenda.fireLoop(DefaultAgenda.java:1079)
org.drools.core.common.DefaultAgenda.internalFireUntilHalt(DefaultAgenda.java:996)
org.drools.core.common.DefaultAgenda.fireUntilHalt(DefaultAgenda.java:988)
org.drools.core.impl.StatefulKnowledgeSessionImpl.fireUntilHalt(StatefulKnowledgeSessionImpl.java:1359)
org.drools.core.impl.StatefulKnowledgeSessionImpl.fireUntilHalt(StatefulKnowledgeSessionImpl.java:1338)
com.******************************.KieSessionImpl$FireUntilHalt.run(KieSessionImpl.java:97)


Please let me know what you think about this.

Srinivas ev

Apr 24, 2018, 11:37:39 PM
to Drools Usage
Hi Tibor,

from the thread dump, it looks like Thread-133, which was created to run fireUntilHalt() for the kieSession, is also being used by my code to write to the database and to the message queues. While the thread is busy with these operations, it cannot fire the timer rules or other rules for new or existing facts in the Drools engine. We use a singleton class as a global handler for the kieSession for Java-based operations such as inserting into the DB or placing the result on ActiveMQ once an event has been processed completely.

Please share your thoughts

Tibor Zimányi

Apr 25, 2018, 3:17:18 AM
to Drools Usage
Hi,

the basic rule is that you shouldn't run expensive operations (writing to a DB etc.) in rule consequences, because you will block rule evaluation and firing until these operations are done. I am not sure if that is your case, though. If you need to do DB writes based on rule fires, the rules should just notify some async handler by sending an event or a similar action. That way the writes happen outside of the rule engine. 
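
One way to sketch such an async handler in plain Java is shown here. AsyncResultHandler and slowSink are hypothetical names, not Drools API. The idea is that the rule consequence calls submit(...) and returns immediately, while a background thread performs the slow write.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hands results to a background thread so the caller (a rule consequence) returns at once. */
final class AsyncResultHandler<T> {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final Consumer<T> slowSink; // for example a DB write or a message-queue send

    AsyncResultHandler(Consumer<T> slowSink) {
        this.slowSink = slowSink;
    }

    /** Meant to be called from the "then" part of a rule; it only enqueues the work. */
    void submit(T result) {
        worker.execute(() -> slowSink.accept(result));
    }

    /** Stops accepting work and waits for queued writes; returns true if all of them finished. */
    boolean shutdownAndWait(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The single worker thread keeps the writes in submission order. Its queue is unbounded in this sketch, so under sustained overload a bounded queue with a rejection policy may be a safer choice.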

Tibor

On Wednesday, April 25, 2018 at 5:37:39 UTC+2, Srinivas ev wrote:

Srinivas ev

Apr 29, 2018, 4:36:25 AM
to Drools Usage
Hi Tibor, thanks for the above answer.

I enabled DebugAgendaEventListener and DebugRuleRuntimeEventListener while creating the KieSession.

Some observations I made - 

For the first ~1 lakh facts, ObjectInsertedEventImpl and the related ActivationCreatedEvent, BeforeActivationFiredEvent, and AfterActivationFiredEvent (for creating new timer objects and firing the subsequent rules) appear very quickly. 

The reproducer I shared has 3 rules. For the first 80k to 1 lakh facts, the related rules are called very quickly for new and existing facts. As the number of facts keeps increasing, this slows down drastically and I see only ObjectInsertedEventImpl in my logs, with activation-related entries appearing very rarely.

As already discussed, you said the OS may not schedule the fire thread. What is the Drools engine doing during that time? Is it sitting idle, or is it trying to evaluate some rule? With an increased number of facts, does evaluation take longer because of the comparisons? For example, the rule below should fire only when a Fact object exists in memory with no related persistence object.

Suppose Drools memory currently holds 4 lakh facts. Does it work like a loop and check after each fact insertion? When a new fact is inserted, will it try to evaluate the same rule 4 lakh + 1 (the newly inserted fact) times? Is this what slows evaluation down and prevents other rules from being invoked? 

My end goal is to insert processed facts into the database. I attached a chart of the hourly decrease in processed facts inserted into the database.






rule "Create Persistence Object"
    when
        fact:Fact($id:getId(), "NO" == getFactHandled())
        not Persistance($id == getEventID())
    then
        System.out.println("Create Persistence Object Rule for Fact id "+$id);
        KieSessionMain.createPersistence($id);
end
DB insertion pattern.PNG

Srinivas ev

Apr 30, 2018, 2:34:15 PM
to Drools Usage
Hi Tibor,

Just thought of checking the Rete tree of one of my rule files. I attached it here. Do you suspect this is also one of the reasons the rules are not called quickly under higher load?
rete tree.PNG

Srinivas ev

Apr 30, 2018, 3:47:25 PM
to Drools Usage
Hi Tibor,

Just thought of checking the Rete tree for one of my rule files. I attached it here. Do you suspect this is also one of the reasons the rules are not called quickly under higher load? I attached a heap snapshot too. I don't understand why the tuple nodes consume such a huge amount of memory, almost 10 GB. What if I don't do the retraction and continuously insert the events?

    1. Why are tuple nodes created? As per the documentation, whenever a node is evaluated it creates a tuple and propagates it to the next node in the path. If my requirements don't allow deleting the fact references, I guess this may cause a performance issue?

Please share your thoughts. 
rete heap.PNG

Srinivas ev

May 10, 2018, 7:16:05 AM
to Drools Usage
Hi Tibor, just wanted to check whether you have had a chance to look into it.

Tibor Zimányi

May 11, 2018, 7:43:14 AM
to Drools Usage
Hi,

sorry for the late response. From what I see in your reproducer, your facts never expire from the working memory. That could cause the high memory load you see, at least if your real application is similar to your reproducer. In your rules you just set the fact attributes so that the rules don't match anymore, but the facts are still present in the session. You should use Drools event processing or delete the facts yourself when needed. For event processing see the docs here [1]. The other problem is that you could really be overwhelming the engine. When I run your reproducer (and work around the fact deletion in it), the fire thread is firing rules, but the fire count cannot keep up with the number of inserts. Firing the rules takes some time, so you need to give the fireUntilHalt thread enough (processor) time. 

I am not sure if I can give you any other advice. I would still try to do the inserts in batches if I were you. In that case I would not use fireUntilHalt; instead I would insert a batch of facts and call fireAllRules. And don't forget to fix the fact expiration. 
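
The batch approach could look roughly like the Java sketch below. It is an illustration under assumptions, not code from the reproducer: RuleSession is a simplified, hypothetical stand-in for the insert and fireAllRules methods of KieSession, and BatchingInserter, maxBatch, and maxWaitMillis are made-up names. Producers only put facts on a queue; a single consumer thread takes whatever has arrived (up to a cap), inserts it, and fires the rules once, so a batch never has to wait until it is full.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical, simplified stand-in for the two KieSession methods used here. */
interface RuleSession {
    void insert(Object fact);
    int fireAllRules();
}

final class BatchingInserter {
    private final BlockingQueue<Object> incoming = new LinkedBlockingQueue<>();
    private final RuleSession session;
    private final int maxBatch;

    BatchingInserter(RuleSession session, int maxBatch) {
        this.session = session;
        this.maxBatch = maxBatch;
    }

    /** Called by producer threads whenever a fact arrives; it never touches the session. */
    void offer(Object fact) {
        incoming.add(fact);
    }

    /**
     * Waits up to maxWaitMillis for a first fact, then takes whatever else is already
     * queued (at most maxBatch in total), inserts it, and fires the rules once.
     * Returns the number of facts inserted, or 0 if nothing arrived in time.
     */
    int processOneBatch(long maxWaitMillis) {
        Object first;
        try {
            first = incoming.poll(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return 0;
        }
        if (first == null) {
            return 0;
        }
        List<Object> batch = new ArrayList<>();
        batch.add(first);
        incoming.drainTo(batch, maxBatch - 1); // non-blocking: only what is already queued
        for (Object fact : batch) {
            session.insert(fact);
        }
        session.fireAllRules();
        return batch.size();
    }
}
```

A consumer thread would call `processOneBatch(...)` in a loop. Because the wait is bounded by time rather than by a fact count, this design does not require knowing in advance when facts will arrive.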

Tibor

Srinivas ev

May 15, 2018, 7:51:54 AM
to Drools Usage
Hi Tibor,

Thanks for the reply.

My application has declarations as below for each type of fact we insert into Drools memory.

1. declare Persistance
       @role(event)
       @expires(172800s)
   end

2. declare Fact
       @role(event)
       @expires(172800s)
   end

>> Necessary validations are handled before insertion, so that each fact matches one of the rules we have written. So I can rule out the situation where facts inserted into the Drools engine do not match any rule.

>> When my application receives Clear facts, the Active facts are removed along with these Clear facts, to handle different functionality. A Clear fact may arrive at any time within 48 hours, and the Active facts stay in the Drools engine until then.

>> Regarding "overwhelming the rule engine": do you think that at this insertion rate the Drools fire thread fails to keep up with the insert count, which delays rule invocation for newly inserted facts? What can be done to process facts in real time with Drools?

>> Can you please elaborate on how to design inserting facts at intervals and invoking fireAllRules? As I understand it, batch insertion means grouping the incoming facts, 50 or 100 at a time, in some data structure, inserting all of them into the Drools session, and then invoking fireAllRules. So the next batch waits for the threshold (50/100) and is inserted only once that threshold is crossed.

Please share your thoughts on this.

Srinivas ev

Jun 5, 2018, 6:38:26 AM
to Drools Usage
Hi Tibor, please consider this question too.

+ Above, you said the fire count cannot keep up with the inserts. How is it then able to perform at a good speed during the initial hours (1-3 hours), with the speed dropping rapidly only later?