--
You received this message because you are subscribed to the Google Groups "Google App Engine Pipeline API" group.
To unsubscribe from this group and stop receiving emails from it, send an email to app-engine-pipeli...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
The MapReduce library is able to sort datasets much larger than your memory. This is a major part of its power. I routinely run TB-size jobs using the Java version.
The library will do something similar to what you describe above, but it has optimized batching on reads and writes and automatic error handling that is transparent to your program's logic, and it can operate in parallel on huge datasets.
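
To illustrate the general idea of sorting more data than fits in memory, here is a minimal external merge sort sketch in Python. It is not the MapReduce library's actual shuffle code, which this thread does not show. The names `external_sort`, `_write_run`, `_read_run` and the `chunk_size` parameter are made up for the example, and it assumes records are strings without embedded newlines.

```python
import heapq
import itertools
import os
import tempfile


def _write_run(sorted_lines):
    """Write one already-sorted chunk to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".run", text=True)
    with os.fdopen(fd, "w") as f:
        f.writelines(line + "\n" for line in sorted_lines)
    return path


def _read_run(path):
    """Lazily yield the lines of a run file, one at a time."""
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n")


def external_sort(records, chunk_size):
    """Yield records in sorted order, sorting at most chunk_size of them in memory at once."""
    records = iter(records)
    run_paths = []
    try:
        # Phase 1: cut the input into chunks that fit in memory,
        # sort each chunk, and spill it to disk as a sorted "run".
        while True:
            chunk = list(itertools.islice(records, chunk_size))
            if not chunk:
                break
            chunk.sort()
            run_paths.append(_write_run(chunk))
        # Phase 2: stream-merge the runs. heapq.merge holds only one
        # pending record per run, so memory use does not grow with input size.
        yield from heapq.merge(*(_read_run(p) for p in run_paths))
    finally:
        # Clean up the temporary run files.
        for p in run_paths:
            os.remove(p)
```

For example, `list(external_sort(["pear", "apple", "fig", "kiwi", "banana"], chunk_size=2))` sorts five records while never sorting more than two in memory at once. A distributed version would presumably spread the runs and merges across many tasks, but that part is an assumption on top of what is said here.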
On Friday, 3 October 2014 12:42:41 UTC+13, Tom Kaitchuck wrote:
> The MapReduce library is able to sort datasets much larger than your memory. This is a major part of its power. I routinely run TB size jobs using the Java version.

Really?!! I thought MapReduce would just use normal task queues, and that means each task is limited to the instance memory size.
How can a single task exceed that? I'd like to know more. Are you sure you're not looking at the combined memory size of many different tasks?