Easier Parallel Programming on Google App Engine Java


Vladimír Oraný

Feb 16, 2012, 2:37:07 PM
to google-a...@googlegroups.com
Hi all,
I've created a framework for easier parallel programming on Google App Engine Java. If you are interested, you can find a short introduction here


You can find more info on the project web site


The project is also hosted on Maven Central, so you can start experimenting right now using the "eu.appsatori:pipes:0.6.0" dependency. A minimal code sample is shown below.

// the class that starts looking for the needle in the haystacks in parallel
public class StartSearch implements Node<SerialPipe, Collection<Haystack>> {
   public NodeResult execute(SerialPipe pipe, Collection<Haystack> haystacks){
      return pipe.fork(FindNeedle.class, haystacks);
   }
}

// the class actually searching for the needle in a single haystack
public class FindNeedle implements Node<ParallelPipe, Haystack> {
   public NodeResult execute(ParallelPipe pipe, Haystack haystack){
      // let's find the needle
      boolean needleFound = haystack.containsNeedle();
      return pipe.join(HaystackSearched.class, needleFound);
   }
}
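For comparison, here is a rough sketch of the same fork/join idea in plain java.util.concurrent, independent of the framework. The Haystack record and its hasNeedle flag are made up for this illustration; the real framework runs each forked node as a separate App Engine task rather than a thread.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class NeedleSearch {
    // stand-in for the Haystack entity from the post (hypothetical)
    record Haystack(boolean hasNeedle) {}

    // fork: search every haystack in parallel; join: true if any held the needle
    static boolean findNeedle(List<Haystack> haystacks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Boolean>> results = pool.invokeAll(
                haystacks.stream()
                         .map(h -> (Callable<Boolean>) h::hasNeedle)
                         .toList());
            for (Future<Boolean> f : results) {
                if (f.get()) return true; // joined result
            }
            return false;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Haystack> stacks = List.of(
            new Haystack(false), new Haystack(true), new Haystack(false));
        System.out.println(findNeedle(stacks)); // prints true
    }
}
```

The framework's fork/join maps onto invokeAll plus collecting the futures; the difference is that pipes persists the intermediate results and survives instance restarts, which plain threads do not.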

Any feedback is highly appreciated!

Cheers,
Vladimir

Matija

Feb 17, 2012, 6:50:09 AM
to google-a...@googlegroups.com
With your current model, is it possible to achieve something like this?

I need a fork and join model, but at the start I don't know all the fork nodes. Only at the end, when I start my last fork node, will I know that those were all my nodes.

Let's say I need to group/count/summarize some data, and each of my fork nodes would analyze some amount of entities; but because I am traversing a query with a cursor, I don't have all my data until I reach the end of the index.

Matija.

Matija

Feb 17, 2012, 6:56:13 AM
to google-a...@googlegroups.com
I forgot to mention that I want every node to start executing as soon as possible, rather than starting all the nodes at the end.

Matija.

Vladimír Oraný

Feb 17, 2012, 9:57:21 AM
to google-a...@googlegroups.com
Do you mean something like the "inject" function in Groovy, which passes the result of one call on to the next, or more like the following "cumulate" method?

// some starter code
while (cursor.hasNext()) {
  Pipes.cumulate(NextNode.class, cursor.next());
}


// next node class
NodeResult execute(ParallelPipe pipe, Entity en) {
  long count = 0; // count something based on the entity
  return pipe.join(TheFinalNode.class, count);
}

// the final node class
NodeResult execute(SerialPipe pipe, Collection<Long> counts) {
  // do something with the counts
}


Matija

Feb 17, 2012, 10:14:37 AM
to google-a...@googlegroups.com
Hm. To be honest, I don't understand your "cumulate" model, but the idea is not like Groovy's inject function.

What I need:

One task (let's call it the 'reader' task) iterates over some query and creates a new analysis task (an 'analyzer' task) for each data set. A third, joining task (the 'finish' task) should be called when every 'analyzer' task has finished with its grouped data, but only once the 'reader' task has finished its reading role and iterated to the end of the query. And I want the 'analyzer' tasks to execute in parallel with the 'reader' task.
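Outside the framework, this coordination (analyzers starting while the reader is still iterating, finish firing only once the reader and every analyzer are done) can be sketched with a java.util.concurrent.Phaser, which fits because the number of analyzers is unknown up front. The pages-of-integers data and the summing "analysis" are placeholders for the real query and grouping logic:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.LongAdder;

public class ReaderAnalyzerFinish {
    // 'reader' iterates the pages, spawning one 'analyzer' per page;
    // the 'finish' step runs only after the reader and all analyzers are done
    public static long run(List<List<Integer>> pages) throws InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();
        Phaser phaser = new Phaser(1); // one party for the 'reader' itself
        LongAdder total = new LongAdder();

        for (List<Integer> page : pages) { // stands in for the cursor loop
            phaser.register();             // one party per 'analyzer'
            pool.submit(() -> {
                try {
                    // the "analysis": sum this page's entities
                    total.add(page.stream().mapToInt(Integer::intValue).sum());
                } finally {
                    phaser.arriveAndDeregister(); // this analyzer is done
                }
            });
        }
        // reader is done reading: wait until every analyzer has also arrived
        phaser.arriveAndAwaitAdvance();
        pool.shutdown();
        return total.sum(); // the 'finish' step sees all grouped counts
    }
}
```

Registering each analyzer's party from the reader thread, before the task is submitted, is what lets the join line wait for tasks the reader had not yet created when the first analyzers started running.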

Matija.

Vladimír Oraný

Feb 17, 2012, 12:50:47 PM
to google-a...@googlegroups.com
Sorry it wasn't clear. It looks much like the cumulate model: you just need the parallel tasks to be started not all at once, but each as soon as possible. If we follow the farmers example in the blog post (http://en.appsatori.eu/2012/02/easier-parallel-programming-on-google.html), what you need is for each farmer to start searching for the needle as soon as he arrives.

This is not supported yet, but it looks like a very good use case. I've opened an issue for it (https://github.com/musketyr/appsatori-pipes/issues/6) and I hope it will be supported soon.

Matija

Feb 17, 2012, 2:46:38 PM
to google-a...@googlegroups.com
As soon as it is supported, I will start using it. Good job.

Matija.