Lazily loading data into working memory


Googleis Antiprivacy

Aug 23, 2016, 8:17:23 AM
to Drools Usage
What's the elegant pattern for lazily loading facts from external systems and asserting them into working memory in Drools?  For example, with Jess you could use backward chaining and declare a need-* rule to perform this action.  It would also be great to be able to retract facts when they're no longer needed.  I was hoping to load the data asynchronously and there doesn't seem to be a way to programmatically do a logical assertion through the Java API.  I'm building an event driven system that will reason over some TBs of data in the course of a day.  So, even with partitioning, efficient memory usage is a priority.

Thanks,

-Jess

 

Mark Proctor

Aug 23, 2016, 8:34:25 AM
to drools...@googlegroups.com
We’ve followed the Prolog style of backward chaining, rather than “needs”. You can instantiate new objects as part of the query, but they are scoped to that derivation tree. You would need to have a rule insert the results of the query. We also don’t have a cut operator, so it can be hard to do a get-or-create style pattern.

Mark
To view this discussion on the web visit https://groups.google.com/d/msgid/drools-usage/3612a104-4736-4007-ba1a-230bc3b4fd37%40googlegroups.com.

Googleis Antiprivacy

Aug 23, 2016, 9:57:31 AM
to Drools Usage
Thanks Mark,

I didn't realize this was possible.  I don't think I've seen an example in the manual of a query instantiating new objects or exercising imperative code e.g. dao.getSomeData().  I'd rather not have the engine stalled for precious millis synchronously waiting on 3rd party data services anyway.

So, is there a better / common approach to handling this sort of problem?  Maybe there's a post in your blog archives that I can read?  :)

Jess

Mark Proctor

Aug 23, 2016, 10:19:46 AM
to drools...@googlegroups.com
I’ve not written anything on this, but here is a rough outline.

We can use the ‘from’ keyword to bring in data from outside working memory; it can use any valid Java expression.

Because we don’t support ‘cut’, you will need to check for the absence of data before calling ‘from’.
query X
    not Person( name == "Mark" )
    Person() from helper.asyncInsert( new Person( "Mark" ) )
end

helper here is a global you set that has a reference to the session and will handle any async work that you need. Here it would just insert the passed object, but it could call out to a DB. It’s up to you to use the JDK to correctly do async operations.
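
A minimal Java sketch of what such a helper could look like, under stated assumptions. FactSink is a hypothetical stand-in for the session (in a real setup it would delegate to the session's insert method), AsyncInsertHelper and loadAsync are illustrative names, and returning an empty list from asyncInsert is one possible choice rather than anything the thread prescribes: it means ‘from’ yields no rows on the first call, and the fact only becomes visible once the background insert lands. Whether inserting from another thread is safe in your setup is something to verify.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.function.Supplier;

// Hypothetical stand-in for the one session operation the helper needs.
interface FactSink {
    void insert(Object fact);
}

// Hypothetical helper, intended to be registered as the 'helper' global.
class AsyncInsertHelper {
    private final FactSink sink;
    private final Executor executor;

    AsyncInsertHelper(FactSink sink, Executor executor) {
        this.sink = sink;
        this.executor = executor;
    }

    // Runs the loader (e.g. a DB call) on the executor and inserts its
    // result into the sink when it completes.
    CompletableFuture<Void> loadAsync(Supplier<Object> loader) {
        return CompletableFuture.supplyAsync(loader, executor)
                                .thenAccept(sink::insert);
    }

    // Shaped for use in a 'from' clause: schedules the insert and returns
    // an empty list immediately, so the caller never waits on the load.
    List<Object> asyncInsert(Object fact) {
        loadAsync(() -> fact);
        return Collections.emptyList();
    }
}
```

Passing the executor in keeps the threading policy (pool size, shutdown) outside the helper, which also makes it easy to swap in a same-thread executor when testing.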

What I don’t like about this is that the query may be called too often. It will short-circuit because of the ‘not’, but the lack of efficiency still annoys me. It may not matter in your case.

Mark

laphroaig15

Aug 23, 2016, 11:23:40 AM
to drools...@googlegroups.com
I guess the alternative is to pull all of the dependent data for a batch of events upstream of the rules engine and then just apply them against a stateless session. This is straightforward and more deterministic. You'll potentially have more hits to your data layer, but there's likely a cache there. You lose Drools CEP capabilities, but maybe the streaming engine can compensate with its own CEP operations. I don't like the concept of having to manage the data dependencies outside of the rule definitions. I guess the dependencies could be defined as a subset of rules and Drools executed in the upstream data collection stage as well, but that's building in more overhead and latency.
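
A rough Java sketch of the upstream prefetch step, assuming each event exposes a key that identifies its dependent data. BatchPrefetcher, factsFor, keyOf and loader are hypothetical names; the loader stands in for whatever data layer or cache sits behind it. The returned list is the kind of collection one would then hand to a stateless session in a single call.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Hypothetical upstream stage: gathers the facts for one batch of events.
class BatchPrefetcher {

    // Returns the events followed by their dependent data, loading each
    // distinct key only once per batch.
    static <E, K, D> List<Object> factsFor(List<E> events,
                                           Function<E, K> keyOf,
                                           Function<K, D> loader) {
        List<Object> facts = new ArrayList<>(events);
        Set<K> seen = new LinkedHashSet<>();
        for (E event : events) {
            K key = keyOf.apply(event);
            if (seen.add(key)) {           // first time this key appears
                D dependency = loader.apply(key);
                if (dependency != null) {  // skip keys with no data
                    facts.add(dependency);
                }
            }
        }
        return facts;
    }
}
```

De-duplicating keys within the batch limits the extra hits on the data layer, but the data dependencies are still managed outside the rule definitions.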

Premkumar Stephen

Aug 23, 2016, 4:34:48 PM
to Drools Usage
> What I don’t like about this is the query may be called too often, it’ll short cut out due to the ‘not’, but still the lack of efficiency annoys me - but may not matter in your case.

Mark, 

Could some sort of timer be added to your logic below so that it does this only every 30 seconds or so?

Mark Proctor

Aug 23, 2016, 4:36:41 PM
to drools...@googlegroups.com
You can’t put timers on queries, but you can on a rule that calls the query.

Mark

Googleis Antiprivacy

Aug 23, 2016, 9:37:14 PM
to Drools Usage
In this example, the helper is doing an asynchronous call to the db and a subsequent insert, but what does it immediately return to the invoking query?  I'm blindly assuming that the engine stalls until the query returns.  If there is some selector that polls for completion of the queries then I'm over engineering.

Mark Proctor

Aug 23, 2016, 9:48:00 PM
to drools...@googlegroups.com
On 24 Aug 2016, at 02:37, Googleis Antiprivacy <laphr...@gmail.com> wrote:

> In this example, the helper is doing an asynchronous call to the db and a subsequent insert, but what does it immediately return to the invoking query?  I'm blindly assuming that the engine stalls until the query returns.  If there is some selector that polls for completion of the queries then I'm over engineering.

You could use eval, maybe?
eval( helper.myFunc( var, var, var ) )

As long as myFunc returns true, this is basically a no-op: it does nothing and the rule/query continues to match. Meanwhile, your myFunc will be doing its async DB work in another thread.
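
A small Java sketch of a function with that shape, under assumptions: EvalLoadHelper is a hypothetical class, myFunc takes a single key here rather than three variables, and the loader stands in for the DB call plus the insert into the session. The in-flight set is an addition not discussed in the thread; it is one way to avoid starting the same load again if the condition is evaluated repeatedly before the first load finishes.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Consumer;

// Hypothetical helper for use as: eval( helper.myFunc( key ) )
class EvalLoadHelper {
    private final Executor executor;
    private final Consumer<String> loader;  // assumed: fetches data and inserts it
    private final Set<String> inFlight = ConcurrentHashMap.newKeySet();

    EvalLoadHelper(Executor executor, Consumer<String> loader) {
        this.executor = executor;
        this.loader = loader;
    }

    // Always returns true without waiting, so the eval never blocks the
    // engine. The load is scheduled only if no load for this key is
    // already pending.
    boolean myFunc(String key) {
        if (inFlight.add(key)) {
            executor.execute(() -> {
                try {
                    loader.accept(key);
                } finally {
                    inFlight.remove(key);  // allow a later reload
                }
            });
        }
        return true;
    }
}
```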

Mark


Googleis Antiprivacy

Aug 24, 2016, 2:32:03 PM
to Drools Usage
Thanks for all the help, guys.  I'll work these suggestions into my PoC and see how it goes.