From: Pushpender Garg
Sent: April 17, 2015 9:38:19am PDT
To: cascadi...@googlegroups.com
Subject: checkpoint question
I have had issues with checkpoints in every tool I have used, and I have learnt to avoid them as much as possible.
I think they can create even more issues on Hadoop. My question is this: suppose I do a GroupBy with 4 reducers and then a checkpoint. That means all of the sorted data goes to disk and is then read back for the next task, which could be an "every" operation. When Cascading reads the data back from disk and applies the groups for the "every" operation, isn't there a possibility of messing up the groups, since that operation will now run in a map task?