MapOnlyMapper Jobs Failing to Complete

31 views
Skip to first unread message

Vamsi Krishna Penumatsa

unread,
Apr 26, 2016, 7:34:29 AM4/26/16
to Google App Engine

Hello,


I have a very large dataset, and want to update certain entity kinds. I am exploring MapReduce library in GoogleAppEngine. I have followed the examples listed here.

https://github.com/GoogleCloudPlatform/appengine-mapreduce/tree/master/java/example/src/com/google/appengine/demos/mapreduce/entitycount

What I am basically doing is this, in my MapSpecification

MapSpecification<Entity, Entity, Void> spec = new MapSpecification.Builder<>(
                new DatastoreKeyInput(query,2),
                new UrlFlattenMapper(),
                new DatastoreOutput())
                .setJobName("Flatten URLs entities")
                .build();

and My Mapper basically performs the operations on the Entity and then Emits it, for the DatastoreOutput writer to write it back into the database.

My problem is, the Entities are getting updated fine. The endSlice is also being called in my MapperTask. But the Jobs is not completing. I keep getting these errors

[INFO] INFO: RetryHelper(28.07 ms, 1 attempts, java.util.concurrent.Executors$RunnableAdapter@7f0264e0): Attempt #1 failed [java.lang.RuntimeException: Can't serialize object: MapOnlyShardTask[context=IncrementalTaskContext[jobId=3c041e68-5041-458c-994b-290cd941f8bb, shardNumber=1, shardCount=2, lastWorkItem=Topics("jzdh"), workerCallCount=297, workerTimeMillis=42513], inputExhausted=true, isFirstSlice=false]], sleeping for 1028 ms
[INFO] Apr 26, 2016 4:39:37 PM com.google.appengine.tools.cloudstorage.RetryHelper doRetry
[INFO] INFO: RetryHelper(1.085 s, 2 attempts, java.util.concurrent.Executors$RunnableAdapter@7f0264e0): Attempt #2 failed [java.lang.RuntimeException: Can't serialize object: MapOnlyShardTask[context=IncrementalTaskContext[jobId=3c041e68-5041-458c-994b-290cd941f8bb, shardNumber=1, shardCount=2, lastWorkItem=Topics("jzdh"), workerCallCount=297, workerTimeMillis=42513], inputExhausted=true, isFirstSlice=false]], sleeping for 2435 ms
[INFO] Apr 26, 2016 4:39:37 PM com.google.appengine.tools.cloudstorage.RetryHelper doRetry
[INFO] INFO: RetryHelper(3.562 s, 3 attempts, java.util.concurrent.Executors$RunnableAdapter@6d7fcd47): Attempt #3 failed [java.lang.RuntimeException: Can't serialize object: MapOnlyShardTask[context=IncrementalTaskContext[jobId=3c041e68-5041-458c-994b-290cd941f8bb, shardNumber=0, shardCount=2, lastWorkItem=Topics("jz63"), workerCallCount=289, workerTimeMillis=41536], inputExhausted=true, isFirstSlice=false]], sleeping for 3421 ms
[INFO] Apr 26, 2016 4:39:39 PM com.google.appengine.tools.cloudstorage.RetryHelper doRetry
[INFO] INFO: RetryHelper(3.567 s, 3 attempts, java.util.concurrent.Executors$RunnableAdapter@7f0264e0): Attempt #3 failed [java.lang.RuntimeException: Can't serialize object: MapOnlyShardTask[context=IncrementalTaskContext[jobId=3c041e68-5041-458c-994b-290cd941f8bb, shardNumber=1, shardCount=2, lastWorkItem=Topics("jzdh"), workerCallCount=297, workerTimeMillis=42513], inputExhausted=true, isFirstSlice=false]], sleeping for 3340 ms
[INFO] Apr 26, 2016 4:39:41 PM com.google.appengine.tools.cloudstorage.RetryHelper doRetry
[INFO] INFO: RetryHelper(7.015 s, 4 attempts, java.util.concurrent.Executors$RunnableAdapter@6d7fcd47): Attempt #4 failed [java.lang.RuntimeException: Can't serialize object: MapOnlyShardTask[context=IncrementalTaskContext[jobId=3c041e68-5041-458c-994b-290cd941f8bb, shardNumber=0, shardCount=2, lastWorkItem=Topics("jz63"), workerCallCount=289, workerTimeMillis=41536], inputExhausted=true, isFirstSlice=false]], sleeping for 6941 ms
[INFO] Apr 26, 2016 4:39:42 PM com.google.appengine.tools.cloudstorage.RetryHelper doRetry

I havent been able to get around this issue, any help or pointers on what I could be doing would be greatly appreciated.


Thanks,

Vamsi


PS:I have raised this question in StackOverflow, but wanted to ask here for better exposure.

Reply all
Reply to author
Forward
0 new messages