Graph modification during execution?

57 views
Skip to first unread message

Harald Wellmann

unread,
Aug 14, 2011, 12:27:16 PM8/14/11
to sarasvati-wf-users
Does Sarasvati support self-modifying graphs?

The use case for this would be a map-reduce scenario where a node
replaces itself by a number of children at runtime that all need to be
joined to complete the execution of the original node.

Of course you can't very well write this down in XML, but I wonder if
you can do something like this programmatically.

Best regards,
Harald

Paul Lorenz

unread,
Aug 14, 2011, 1:43:14 PM8/14/11
to sarasvati...@googlegroups.com
I think you could do that with the in-memory graph. I wouldn't replace the node, but rather have a set up A -> B -> C, where A figures the number of children and then replaces B with those children and C is an AND join.

BUT, before you try it, Sarasvati already supports something like this natively, using TokenSets.

You have the same setup A ->B -> C.

A does a Engine#completeWithNewTokenSet, and specifies the number of tokens to generate as well as a TokenSetEnv, which can have custom parameters for each token. B then gets executed once for each token, and C would have a token set join, and wouldn't complete until all the tokens from that token set had arrived.

Cheers,
Paul


--
You received this message because you are subscribed to the Google Groups "sarasvati-wf-users" group.
To post to this group, send email to sarasvati...@googlegroups.com.
To unsubscribe from this group, send email to sarasvati-wf-us...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/sarasvati-wf-users?hl=en.


Paul Lorenz

unread,
Aug 14, 2011, 1:46:05 PM8/14/11
to sarasvati...@googlegroups.com
See common/test-wf/token-set-approvals.wf.xml for an example workflow which uses token sets.

Harald Wellmann

unread,
Aug 14, 2011, 2:48:36 PM8/14/11
to sarasvati...@googlegroups.com
Sounds good :-) Do TokenSets also allow for parallel execution?

Or maybe the question is, does Sarasvati support multi-threaded or even
multi-host execution at all, or would that require a new Engine
implementation?

Cheers,
Harald

Paul Lorenz

unread,
Aug 14, 2011, 3:38:02 PM8/14/11
to sarasvati...@googlegroups.com
Sarasvati is single-threaded for a given process (multiple processes can execute concurrently). My general advice is to have the nodes execute serially, having each kick off an asynchronous task, then completing back to the engine when they're complete.

A parallel engine implementation is possible, I just don't know that it would be much faster than what I describe above, and if that's worth the extra complexity.

Cheers,
Paul

--
You received this message because you are subscribed to the Google Groups "sarasvati-wf-users" group.
To post to this group, send email to sarasvati-wf-users@googlegroups.com.
To unsubscribe from this group, send email to sarasvati-wf-users+unsub...@googlegroups.com.

Harald Wellmann

unread,
Aug 15, 2011, 3:41:08 AM8/15/11
to sarasvati-wf-users
I believe I'm getting your point about kicking off an asynchronous
task, but somehow I'm confused by what I see in the Sarasvati API.

It's fine to have just a single thread working on the graph and
deciding which node to process next, but when you work with
asynchronous tasks, I'd expect to have a synchronized queue where the
tasks put their completion events, running under any number of worker
threads, whereas the graph main thread pulls the events off the queue
and blocks while the queue is empty.

The ExecutionEventQueue appears to be something rather different, it
does not store events at all but fires them. addListener() is
synchronized, but the events are not.

Cheers,
Harald

Paul Lorenz

unread,
Aug 15, 2011, 9:15:54 AM8/15/11
to sarasvati...@googlegroups.com
There isn't anything in the Savasvati API to do that for you. You want a Node subclass that does something like

package com.googlecode.sarasvati.test;

import java.util.concurrent.Executor;
import java.util.concurrent.Executors;

import com.googlecode.sarasvati.Engine;
import com.googlecode.sarasvati.NodeToken;
import com.googlecode.sarasvati.mem.MemNode;

public abstract class AsyncNode extends MemNode
{
  private static final Executor e = Executors.newFixedThreadPool(20);

  @Override
  public void execute(final Engine engine, final NodeToken token)
  {
    e.execute(new Runnable()
    {
      public void run()
      {
        executeAsync(engine, token);
      }
    });
  }

  protected abstract void executeAsync(Engine engine, NodeToken token);
}

Cheers,
Paul

--
You received this message because you are subscribed to the Google Groups "sarasvati-wf-users" group.
To post to this group, send email to sarasvati...@googlegroups.com.
To unsubscribe from this group, send email to sarasvati-wf-us...@googlegroups.com.

Harald Wellmann

unread,
Aug 15, 2011, 1:33:19 PM8/15/11
to sarasvati...@googlegroups.com
Ok. but that doesn't explain how the Engine synchronizes the completion
events from the worker threads.

I suppose the AsyncNode is expected to call
engine.completeAsynchronous() when it's done, but neither this method
nor BaseEngine.asyncQueue is synchronized, so this is bound to cause
concurrency issues, or am I missing anything...?

Cheers,
Harald

Paul Lorenz

unread,
Aug 15, 2011, 2:01:33 PM8/15/11
to sarasvati...@googlegroups.com
You are correct. I've generally handled this by synchronizing on the process. I've mostly used the hibernate backend, so that entailed locking the corresponding row in the process table. For the in memory version, it would just be a synchronized block using the graph object.

Cheers,
Paul

--
You received this message because you are subscribed to the Google Groups "sarasvati-wf-users" group.
To post to this group, send email to sarasvati-wf-users@googlegroups.com.
To unsubscribe from this group, send email to sarasvati-wf-users+unsub...@googlegroups.com.

Harald Wellmann

unread,
Aug 15, 2011, 3:00:09 PM8/15/11
to sarasvati...@googlegroups.com
Thanks, now I see more clearly :-)

The background of my question is, I once wrote a small graph-based
workflow engine of my own, with a focus on multi-threading and
map-reduce to distribute tasks on the local host or on a grid.

This was just a by-product to get a large data grinding job done, and I
can see many things in Sarasvati I always would have liked to implement
in my engine but never got round to, so I was just asking myself what it
would take to build a similar solution on top of Sarasvati.

Now I think I know where to start, and I'll give it a try. Thanks again
for providing the pointers!

Cheers,
Harald

Paul Lorenz

unread,
Aug 15, 2011, 3:10:37 PM8/15/11
to sarasvati...@googlegroups.com
You're welcome, and good luck!

Reply all
Reply to author
Forward
0 new messages