Clojure Async/State Machine/Workflow Libraries?


Tim Visher

Apr 30, 2015, 7:35:25 AM
to clo...@googlegroups.com
Hey All,

Anyone have any tips on clojure 'workflow' libraries? https://github.com/relaynetwork/impresario is very close, but lacks some basic features like exception transitions, etc. 

Basically, I'm looking for a library that allows me to create a workflow that will happen asynchronously, recording its progress in a db. I think I could probably whip something together without _too_ much trouble using core.async, but this feels like something that's probably already been written.

Thanks in advance!

--

In Christ,

Timmy V.

http://five.sentenc.es/ -- Spend less time on mail

Vjeran Marcinko

Apr 30, 2015, 1:11:43 PM
to clo...@googlegroups.com
If you're looking for something similar to some BPM (BPMN, BPEL...) engines in Clojure land, I *think* there is nothing similar here. I'm actually researching that area occasionally, and thinking wishfully about implementing one in Clojure someday.

When core.async first appeared, since it also comes from the "process area" of IT (CSP, actors, process algebra...), I thought it would be sufficient for this case too. Unfortunately, it differs strongly from BPM engines, which are "session-based": each message received over a channel marked as 'session creator' spawns a new async process that is long, very long running (potentially years), and all subsequent messages that carry the correlation value for that process are routed to that session afterwards.

The biggest similarity is that both approaches (BPM engines and core.async) invert control of execution: you write easy-to-grasp sequential code which is executed asynchronously. But one would need the option to stop the execution at some point in a "go" block, persist it, and continue it later. In Java, Apache ODE, which is a BPEL engine, uses a pi-calculus engine underneath that works with a continuation queue and is able to persist (dehydrate) the session on demand and rehydrate it when needed, even if that moment comes a year later.

In other words, we need something like durable, restartable GO blocks, one for each individual long-running session, and there can be hundreds of thousands of them active in a system simultaneously (think about a hundred thousand active purchase orders...).

-Vjeran

Alan Moore

Apr 30, 2015, 10:59:43 PM
to clo...@googlegroups.com
Timmy,

Several BPM tools are derivatives of or are directly based upon business rule engines. They usually pile on a bunch of higher level abstractions, UIs and/or frameworks to make them business user friendly. I have not seen anything like this in Clojure.

However, you might want to take a look at Clara which is a rule engine written in Clojure. It would give you a lower level library upon which you could build the rest of the BPM feature sets.

If Clara doesn't give you all of what you need you could look into integrating with the JBoss/Drools tooling via Java interop.

Good luck!

Alan

Brett Morgan

May 1, 2015, 4:43:56 PM
to clo...@googlegroups.com
> but one would need option to stop the execution in some point of "go" block, persist it, and continue it later.
 
Why would you need to stop execution? You could just have a chan, put what you need to persist on it, then have a different go block persist it. Main processing continues on happily.
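
A minimal sketch of that suggestion, assuming core.async is on the classpath. The names `store`, `persist-ch`, `persister` and `worker` are hypothetical, and the atom is only a stand-in for a real database:

```clojure
(require '[clojure.core.async :refer [chan go go-loop <! >! <!! close!]])

;; Stand-in for a database table; a real system would write to a db here.
(def store (atom []))

;; Buffered channel carrying the records that need to be persisted.
(def persist-ch (chan 16))

;; A dedicated go block that drains the channel and persists each record.
;; It finishes when the channel is closed and fully drained.
(def persister
  (go-loop []
    (when-some [record (<! persist-ch)]
      (swap! store conj record)
      (recur))))

;; Main processing: report progress on the channel and carry on.
(def worker
  (go
    (doseq [step [:created :approved :processed]]
      (>! persist-ch {:order-id 1 :step step}))
    (close! persist-ch)))

;; Block the calling thread until everything has been persisted.
(<!! persister)
```

Note that this records progress but does not by itself let a process resume after a restart; the state of the `worker` go block still lives only in memory.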

Vjeran Marcinko

May 2, 2015, 8:25:27 AM
to clo...@googlegroups.com
I meant it would be good to have the engine itself be able to remember which workflow "activity" it executed last and, in case of a reboot, be able to start from there. I mean that for each active "session", because we want to code the workflow *for each individual session* in an easy-to-grasp manner, the same way core.async allows us to code, sequentially and easily, some async process with lots of interacting channels.

For example, here is some stateful pseudo-code showing what these orchestration languages look like. Take an overly simple Purchase Order: after it is created, it waits for an approval request, then it notifies an external service to process it, and it tries that last action 3 times (with a wait period between tries) until it gives up.

So we have session workflow DSL something like:

RECEIVE CreatePurchaseOrderRequest (createSession=yes)  // marker for creating new session "process"
var sessionId = generateOrderId();
var orderState = "PENDING_APPROVAL";
SEND CreatePurchaseOrderResponse;

RECEIVE ApprovePurchaseOrderRequest (sessionExpression = "$message.orderId == sessionId") // session process ID expression
orderState = "APPROVED";
SEND ApprovePurchaseOrderResponse;

var processed = false;
var numberOfAttempts = 0;
WHILE (!processed) {  // keep trying until processing succeeds or we give up
   numberOfAttempts++;
   try {
     SEND ProcessPurchaseOrderRequest;
     RECEIVE ProcessPurchaseOrderResponse(sessionExpression = "$message.orderId == sessionId");
     processed = true;
     orderState = "PROCESSED";
   } catch (error) {
     if (numberOfAttempts == 3) {
         orderState = "FAILED_PROCESSING";
         break;
     } else {
         WAIT (period = 1 day);
     }
   }
}

I know the above is an awful stateful language without any functional characteristics, but I just wanted to point out that it represents orchestration logic that is potentially very long running (you can see we wait for 1 day between processing attempts) and very easy to read, although lots of async stuff is happening there (no callback hell). And if the engine gets restarted during that period, it would be great if it remembered where it left off, along with the whole session state (e.g. the "orderState" and "numberOfAttempts" variables), to be able to resume properly. Each session is a kind of "go" block, but it is session specific (each RECEIVE activity has a correlation expression required for routing to that particular session), and there can be maybe even millions of them. It doesn't make sense to keep them all in memory, so the engine takes care of hydrating/dehydrating them from disk.
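
One way to approximate this in plain Clojure, without pausing a "go" block at all, is to keep each session as data and advance it with a pure transition function, persisting the result after every message. The sketch below is only an illustration under stated assumptions: `db`, `dehydrate!`, `hydrate`, `step`, `create-session!` and `deliver!` are hypothetical names, the atom of EDN strings stands in for a real database, and the 1 day WAIT timer is left out.

```clojure
(require '[clojure.edn :as edn])

;; Hypothetical durable store: session id -> EDN string (stand-in for a db table).
(def db (atom {}))

(defn dehydrate!
  "Serialize the session and write it to the store. Returns the session."
  [session]
  (swap! db assoc (:id session) (pr-str session))
  session)

(defn hydrate
  "Read a session back from the store, or nil if there is none."
  [session-id]
  (some-> (get @db session-id) edn/read-string))

(def max-attempts 3)

(defn step
  "Pure transition: given the current session and a message, return the next
  session. Unknown state/message combinations leave the session unchanged."
  [session {:keys [type]}]
  (case [(:state session) type]
    [:pending-approval :approve] (assoc session :state :processing :attempts 1)
    [:processing :process-ok]    (assoc session :state :processed)
    [:processing :process-failed]
    (if (>= (:attempts session) max-attempts)
      (assoc session :state :failed-processing)
      (update session :attempts inc))
    session))

(defn create-session!
  "Counterpart of the createSession marker: start a new persisted session."
  [order-id]
  (dehydrate! {:id order-id :state :pending-approval :attempts 0}))

(defn deliver!
  "Route a message to its session by the correlation value :order-id,
  advance the session one step, and persist the result."
  [{:keys [order-id] :as message}]
  (when-let [session (hydrate order-id)]
    (dehydrate! (step session message))))

;; Usage: one order that is approved and then fails processing three times.
(create-session! 42)
(deliver! {:order-id 42 :type :approve})
(dotimes [_ 3]
  (deliver! {:order-id 42 :type :process-failed}))
(:state (hydrate 42))
```

Because nothing but the stored map is needed to continue, a restart loses no progress. The cost is that the workflow is written as an explicit state machine rather than as the sequential code shown above.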

Regards,
Vjeran

Tim Visher

May 4, 2015, 11:56:00 AM
to clo...@googlegroups.com
I think I may have summoned the wrong demons when invoking with the `Workflow` keyword. :)

I've found some resources on Event-Driven Architecture, mostly from Zach Tellman. Is his stuff the main source of that sort of thing?

I realized that Prismatic's Graph is basically what I'm looking for (especially with the addition of async into the model), so long as my edge predicates are always based on data availability. Graph is essentially a 'workflow' description in which you describe the steps, and the transition predicates are implicitly defined by data availability, which I think models at least the problems that I currently have.
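
To make "transitions defined by data availability" concrete, here is a small sketch in plain Clojure. It is not Graph's API (Graph derives the dependencies from the arguments of its `fnk` functions and compiles the whole graph); the names `steps`, `run-ready` and `run-to-fixpoint` and the approval rule are assumptions made up for illustration.

```clojure
;; Each step declares the keys it needs and the key it produces.
(def steps
  [{:needs #{:order}
    :gives :approved?
    :run   (fn [{:keys [order]}] (< (:total order) 1000))}
   {:needs #{:order :approved?}
    :gives :status
    :run   (fn [{:keys [approved?]}] (if approved? :processed :rejected))}])

(defn run-ready
  "Run every step whose inputs are present and whose output is not yet there."
  [data]
  (reduce (fn [d {:keys [needs gives run]}]
            (if (and (every? #(contains? d %) needs)
                     (not (contains? d gives)))
              (assoc d gives (run d))
              d))
          data
          steps))

(defn run-to-fixpoint
  "Keep running ready steps until a pass adds nothing new."
  [data]
  (let [after (run-ready data)]
    (if (= after data)
      data
      (recur after))))

(run-to-fixpoint {:order {:total 250}})
```

No step names its successor; a step simply becomes runnable once the keys it needs exist in the map.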

Any further thoughts given that new information?