Revisiting forward-chaining rules in Clojure


Ryan Brush

Aug 18, 2013, 2:16:14 PM8/18/13
to clo...@googlegroups.com
Perhaps the best aspect of Clojure is how it can adopt the best ideas from other domains to concisely solve problems, as we've seen with core.logic, core.async and other libraries. I recently came across a problem domain that is easily expressed in forward-chaining rules, and found Clojure to be a powerful way to solve it. 

While working through this problem space I started to suspect there is a more general need for forward-chaining rules in Clojure to complement core.logic and other libraries. So as a side project I implemented a raw but working engine to do so. I'm posting here to share a draft of this engine, its major design goals, and to ask for input on how we should approach forward-chaining rules in Clojure in general.

The rationale and some examples are on the github page for this engine, which I've tentatively named Clara. The state of the code isn't where it needs to be yet; I've been learning the intricacies of Rete and discovering better ways to solve problems along the way, and some of that is reflected in the code base. However, the major pieces of design and functionality are in place, so I'd like to get input on those. 

The idea is to draw a bit from Jess and Lisa, with the Java interop strength of Drools, but with the advantages and idioms available in Clojure. The major goals are:
  • Focus on problem spaces naturally expressed as forward-chaining rules. Data-driven needs like eventing, data validation, or application of arbitrary business rules fit here. 
  • Embrace immutability. The rule engine's working memory is a persistent Clojure data structure that can participate in transactions. All changes produce a new working memory that shares state with the previous.
  • Rule constraints and actions are simply Clojure s-expressions.
  • First-class Java interoperability and APIs. This should be an alternative to Jess or Drools from Java-land.
  • Usable either as a library to simplify Clojure code, or as a DSL to externalize business logic.
  • Working memory facts are typically (but not exclusively) Clojure records or Java objects following the Java Bean conventions.
  • Support the major advantages of existing rules systems, such as explainability of why a rule fired and automatic truth maintenance.
  • Collections of facts can be reasoned with using accumulators similar to Jess or Drools. These accumulators leverage the reducers API and are transparently parallelized.
  • The working memory is independent of the logic flow, and can be replaced with a distributed processing system. A prototype that uses Storm to apply rules to a stream of incoming events already exists.
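To make the flavor of those goals concrete, a rule in this style might look something like the sketch below. The record and rule names are illustrative, and the exact API in the draft on github may differ:

```clojure
(ns example.rules
  (:require [clara.rules :refer [defrule insert!]]))

;; Facts are ordinary Clojure records.
(defrecord Temperature [location value])
(defrecord HighTempAlert [location])

;; Constraints are plain s-expressions; the right-hand side runs
;; when a matching fact is present in working memory.
(defrule high-temperature
  [Temperature (> value 100) (= ?loc location)]
  =>
  (insert! (->HighTempAlert ?loc)))
```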
I'd love to hear thoughts on a couple questions:

What else would you want from a forward-chaining rules engine in Clojure?
What design changes would you make, given the above criteria and examples on github?

All thoughts are appreciated!

-Ryan


Shantanu Kumar

Aug 18, 2013, 3:54:32 PM8/18/13
to clo...@googlegroups.com
Thanks for posting. I will certainly explore this.

Did you look at Mimir? https://github.com/hraberg/mimir Could you outline how Clara's approach differs from Mimir's?

Shantanu

Ryan Brush

Aug 18, 2013, 4:41:56 PM8/18/13
to clo...@googlegroups.com
Shantanu,

I appreciate it. I did look at Mimir, but had some different objectives, and therefore tradeoffs, and didn't see a straightforward way to reconcile them. 

First, I wanted to use existing data models in the rules as is -- be it Clojure records, Java Beans, or other structures. Drools has a number of drawbacks, but has success in Java-land largely because it interoperates with existing models so well. A pure Clojure solution with strong interop could offer a number of advantages over existing engines. In fairness, I have yet to add first-class Java support, but the same structure that uses Clojure records right now will be extended to seamlessly use JavaBeans-style classes.

Second, I have a broader goal of executing rules against arbitrarily large data sets in a distributed system, and there are semantic and structural tradeoffs to make that happen. For instance, the underlying working memory is a separate subsystem in Clara, and all join operations are hash-based and structured in such a way that they need not be performed in the same process. The clara-storm project is a very raw proof-of-concept of distributing rules across a cluster, but we should be able to layer it on top of other infrastructures as well, such as Hadoop (although that's another, involved conversation in itself). At this point I'm more focused on getting the core system correct and useful on its own, doing enough to ensure this will be scalable in the future. 
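The hash-based join idea can be illustrated with a toy sketch (not Clara's actual implementation): one side is indexed by its join key, and matching for each distinct key is independent work, which is what allows it to be partitioned across processes:

```clojure
;; Toy illustration of a hash join: build an index of one side
;; keyed by the join key, then probe it with the other side.
;; Distinct keys are independent, so they could live on
;; different nodes.
(defn hash-join [left right key-fn]
  (let [index (group-by key-fn right)]
    (for [l left
          r (get index (key-fn l))]
      [l r])))

(hash-join [{:id 1}] [{:id 1 :x :a} {:id 2 :x :b}] :id)
;;=> ([{:id 1} {:id 1 :x :a}])
```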

There are a number of other distinctions that could probably be reconciled with Mimir's approach, such as the use of Jess- or Drools-like Accumulators to reason over collections of objects. To be honest I didn't look closely at how that could be done given some of the above differences.

Thanks!

-Ryan

Alan Moore

Aug 19, 2013, 12:51:46 AM8/19/13
to clo...@googlegroups.com
On Sunday, August 18, 2013 1:41:56 PM UTC-7, Ryan Brush wrote:
Shantanu,

I appreciate it. I did look at Mimir, but had some different objectives, and therefore tradeoffs, and didn't see a straightforward way to reconcile them. 

First, I wanted to use existing data models in the rules as is -- be it Clojure records, Java Beans, or other structures. Drools has a number of drawbacks, but has success in Java-land largely because it interoperates with existing models so well. A pure Clojure solution with strong interop could offer a number of advantages over existing engines. In fairness, I have yet to add first-class Java support, but the same structure that uses Clojure records right now will be extended to seamlessly use JavaBeans-style classes.

Second, I have a broader goal of executing rules against arbitrarily large data sets in a distributed system, and there are semantic and structural tradeoffs to make that happen. For instance, the underlying working memory is a separate subsystem in Clara, and all join operations are hash-based and structured in such a way that they need not be performed in the same process. The clara-storm project is a very raw proof-of-concept of distributing rules across a cluster, but we should be able to layer it on top of other infrastructures as well, such as Hadoop (although that's another, involved conversation in itself). At this point I'm more focused on getting the core system correct and useful on its own, doing enough to ensure this will be scalable in the future. 


I have been working on a similar approach, only using Datomic as the "working memory". At one point I briefly considered putting the Rete nodes into Datomic but quickly realized that wasn't a good idea.

The Datomic Datalog engine is very impressive. I wondered if using it in combination with Datomic's tx-report-queue API might be worth looking into: instead of running full queries, there could be incremental or "persistent" queries that yield tuples matching the Datalog expressions in a lazy fashion.
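A hedged sketch of that idea, assuming Datomic's tx-report-queue API; process-novelty! is a hypothetical handler, not part of any real library:

```clojure
(require '[datomic.api :as d])

;; Hypothetical sketch: rather than re-running full queries, consume
;; transaction novelty from the tx-report-queue and treat each batch
;; of datoms as incremental input to a "persistent query".
(defn watch-novelty
  [conn process-novelty!]  ; process-novelty! is a hypothetical handler
  (let [queue (d/tx-report-queue conn)]
    (future
      (loop []
        ;; Each report carries the datoms added by one transaction
        ;; plus the database value after it.
        (let [{:keys [tx-data db-after]} (.take queue)]
          (process-novelty! db-after tx-data)
          (recur))))))
```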

 

There are a number of other distinctions that could probably be reconciled with Mimir's approach, such as the use of Jess- or Drools-like Accumulators to reason over collections of objects. To be honest I didn't look closely on how that could be done given some of the above differences.


Let me know if you need help. It looks like you are farther along than I am. I agree that Clojure is a great medium for a rules engine... I was looking at leveraging the existing match/logic/unification libraries as much as possible, but it looks like your engine doesn't rely on them except for core.reducers.
 
You asked about what features I would like to see - does Clara work in the browser/cljs? I haven't tried it yet - do you know any reason why it wouldn't?

Alan

Ryan Brush

Aug 19, 2013, 1:57:35 AM8/19/13
to clo...@googlegroups.com
The idea of Datomic as an approach to scalable working memory is interesting. I haven't looked at the mechanics of doing this, but it seems possible since Clara aims to separate the working memory system from the rule logic and Rete network.  Also, the approach I've taken here aligns with Datomic's ideals of persistent data structures. I think having multiple working memory implementations makes sense -- the approaches for dealing with distributed event streams, local business logic, and incremental queries over large, long-lived memory likely call for different infrastructures. Datomic seems like a possible candidate to fill at least one of those roles. The problems I've been working on haven't (yet) required persisting the working memory, so my focus has been on in-memory models to this point.
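One way to picture "multiple working memory implementations" is a small protocol boundary between the rule logic and fact storage. This is purely illustrative and not Clara's actual interface:

```clojure
;; Illustrative boundary: rule logic talks to working memory only
;; through this protocol, so an in-memory, Datomic-backed, or
;; Storm-backed implementation could be swapped in.
(defprotocol WorkingMemory
  (insert-facts [memory facts] "Returns a new memory with facts added.")
  (matching-facts [memory pred] "Returns facts satisfying pred."))

;; A persistent in-memory implementation: every insert returns a
;; new value that shares structure with the previous one.
(defrecord LocalMemory [facts]
  WorkingMemory
  (insert-facts [_ new-facts] (->LocalMemory (into facts new-facts)))
  (matching-facts [_ pred] (filter pred facts)))
```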

As for help on this project, it would be welcome! Right now the code is rough and the documentation nearly absent, but I plan on addressing that in the coming weeks. I am starting to track bugs and enhancements on Github for this project in the interest of transparency and collaboration. If there's a particular item of interest, feel free to log or claim an issue on the project.

Regarding ClojureScript: I believe it should be straightforward to get this running in ClojureScript, but I haven't attempted it. There is a small amount of logic specific to the JVM -- the mechanism for identifying the fields that exist in a fact used in the working memory -- but that could be factored out and a ClojureScript alternative made available. I logged an issue to track that: https://github.com/rbrush/clara-rules/issues/4
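For context, that JVM-specific piece amounts to something like the following sketch (illustrative, not the actual code): a record's fields come from its keys, while a bean's come from its getter methods via reflection.

```clojure
;; Illustrative sketch of JVM-specific field discovery.
(defn fact-fields [fact]
  (if (map? fact)                        ; Clojure records are maps
    (keys fact)
    (->> (.getMethods (class fact))      ; JavaBean-style getters
         (map #(.getName %))
         (filter #(and (.startsWith % "get")
                       (not= % "getClass"))))))
```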

Maik Schünemann

Aug 19, 2013, 3:36:21 AM8/19/13
to clo...@googlegroups.com
Hi, the library looks very interesting!
I am also working on a rule-based translator as part of expresso [1], my GSoC project on algebraic expressions, which uses the rule-based translator to succinctly specify transformations of algebraic expressions. You can see some examples of this in my recent blog posts at [2].

It is, of course, most suited to algebraic expressions, but those are just s-exps in Clojure (it is also backed by protocols, so even Java interop should be possible).

Would Clara also be applicable to the domain of algebraic expressions?
If so, how does it compare?

The feature of the rule-based translator I like most is that it does semantic matching instead of purely syntactic matching. That means that, for example, '+ can specify its own matching algorithm (in this case, commutative matching).
So
(def r (rule (ex (+ 0 ?&*)) :=> (ex (+ ?&*))))
(apply-rule r (ex (+ 0 1 2))) :=> '(+ 1 2)
(apply-rule r (ex (+ 1 0 2))) :=> '(+ 1 2)

?&* here stands for: match zero or more expressions.
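The effect of that commutative matching can be approximated with a plain function, just to illustrate what "the position of the 0 doesn't matter" means (this is not how expresso implements it):

```clojure
;; Illustration only: commutative matching of (+ 0 ?&*) can be
;; thought of as matching against the argument multiset rather
;; than the literal argument order.
(defn match-zero-elim [expr]
  (when (and (seq? expr)
             (= '+ (first expr))
             (some #{0} (rest expr)))   ; a 0 anywhere in the args
    (let [args (remove #{0} (rest expr))]
      (if (= 1 (count args))
        (first args)
        (cons '+ args)))))

(match-zero-elim '(+ 0 1 2)) ;=> (+ 1 2)
(match-zero-elim '(+ 1 0 2)) ;=> (+ 1 2)
```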

[1] https://github.com/clojure-numerics/expresso
[2] http://kimavcrp.blogspot.de/

Ryan Brush

Aug 19, 2013, 7:50:44 AM8/19/13
to clo...@googlegroups.com
Hey Maik, I appreciate it!

I'm going to look more closely at expresso, but at first glance I think these projects have different objectives. Expresso appears to offer very rich semantics in its domain, where Clara intentionally offers more limited semantics in exchange for scalability and interoperability with existing data.  It targets the "lots of data and arbitrary business rules" use case, versus the algebraic focus of expresso.  So while it might be possible to force these sorts of expressions into Clara, it won't be straightforward. I suspect this is a case where these tools are complementary, and expresso is better suited for the problems you describe.

In any case, expresso looks like a cool project in its own right, so I'm glad you mentioned it here.

-Ryan

Alan Moore

Aug 19, 2013, 1:58:20 PM8/19/13
to clo...@googlegroups.com
inline


On Sunday, August 18, 2013 10:57:35 PM UTC-7, Ryan Brush wrote:
The idea of Datomic as an approach to scalable working memory is interesting. I haven't looked at the mechanics of doing this, but it seems possible since Clara aims to separate the working memory system from the rule logic and Rete network.  Also, the approach I've taken here aligns with Datomic's ideals of persistent data structures. I think having multiple working memory implementations makes sense -- the approaches for dealing with distributed event streams, local business logic, and incremental queries over large, long-lived memory likely call for different infrastructures. Datomic seems like a possible candidate to fill at least one of those roles. The problems I've been working on haven't (yet) required persisting the working memory, so my focus has been on in-memory models to this point.

I agree that it is an unorthodox approach, but that hasn't stopped me before. :-) I started by looking at graph APIs and databases to represent a distributed Rete graph, but then Datomic was announced and it seemed a much better fit for the problems I'm trying to solve.

IMHO the real benefit of a rule engine that accommodates a data model like Datomic's is that it automatically gains access to temporal reasoning, something many rule engines struggle with or do not implement as seamlessly. It also makes the shape of the data less relevant and simplifies things considerably.

Datomic already has rules (well, the LHS anyway)... if only they could work incrementally and trigger an RHS... I guess this is probably the wrong forum for this line of reasoning... I'll take it over there.
 

As for help on this project, it would be welcome! Right now the code is rough and the documentation nearly absent, but I plan on addressing that in the coming weeks. I am starting to track bugs and enhancements on Github for this project in the interest of transparency and collaboration. If there's a particular item of interest, feel free to log or claim an issue on the project.

Sure - I'll start by writing some docs so that I can more fully understand the implementation. See you over there.

 

Regarding ClojureScript: I believe it should be straightforward to get this running in ClojureScript, but I haven't attempted it. There is a small amount of logic specific to the JVM -- the mechanism for identifying the fields that exist in a fact used in the working memory -- but that could be factored out and a ClojureScript alternative made available. I logged an issue to track that: https://github.com/rbrush/clara-rules/issues/4

Nice... As you consider the Java interop support, you might want to avoid excluding ClojureScript, even if it isn't supported initially. I have seen many client-side use cases that could benefit from a rules engine; the equivalent JavaScript solutions get messy fast.

Take care.

Alan