I've been using the mapreduce library for the last 18 months or so.
Beyond what's already been mentioned, a few more comments:
- The docs are kinda confusing because there are several different sets of them. The disorganization alone gives the impression that it's a low-priority project that's not well maintained.
Keeping one well-maintained set of docs would help give the sense that MapReduce is a first-class citizen.
- Using MapReduce for schema changes is probably a very common yet simple use case. I've heard more than one comment that the MapReduce pipeline seems too complicated to pick up just to do the simple task of updating a bunch of entities.
mapper_spec? reducer_spec? input_reader? output_writer? Do I have to learn all these things just to add an extra field to my entities? While the wordcount demo shows more of the pipeline, it would probably be easier for users to pick up if there were a simple demo of how to update your 'schema' with 5 lines of Python. (A DatastoreInputReader that supports filtering would be great too.)
If this sounds too negative, you can interpret these comments as saying that the rest of the GAE docs are great and easy to follow.
- Packaged versions would be great. I mean, it was great. I'm not sure why you guys got rid of it. Maybe I'm not hardcore enough to just sync with the repo (actually, I did). However, a packaged version suggests that it's tested and stable. If I see bugs, I can check online to see if anyone's seeing the same issue. When syncing with the repo, I have no idea how stable the latest check-ins are. Maybe something just broke and I'm the one person who synced right after the broken change. And I'm obviously not going to be constantly syncing the MR library, because I actually have other things to work on.
What about the version that's included in the SDK? I toyed with using that, but the docs indicate I should be downloading from the repo. So is the repo more recent, and the SDK version outdated? Would the SDK version be more stable? Again, confusion.
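
To make the schema-change point above concrete, here is roughly the size of demo I have in mind. This is only a sketch under assumptions, not the library's official example: `Account`, the `plan` field, `add_plan_field` and `run_locally` are all made-up names, and the entity class is a plain stand-in so the snippet runs without the App Engine runtime. As far as I remember, a real map handler would yield `op.db.Put(entity)` (from `mapreduce.operation`) instead of the bare entity, and a DatastoreInputReader would feed it the entities.

```python
class Account(object):
    """Stand-in for a datastore model (hypothetical; a real one would subclass db.Model)."""

    def __init__(self, name, plan=None):
        self.name = name
        self.plan = plan


def add_plan_field(entity, default="free"):
    """Map handler: backfill the new 'plan' field on entities that lack it."""
    if getattr(entity, "plan", None) is None:
        entity.plan = default
        # With the real library this line would be (assumption, from memory):
        #   from mapreduce import operation as op
        #   yield op.db.Put(entity)
        yield entity


def run_locally(entities):
    """Drive the handler over an iterable, standing in for the input reader."""
    updated = []
    for entity in entities:
        updated.extend(add_plan_field(entity))
    return updated
```

The handler itself is the five or so lines that matter; everything else is scaffolding a user shouldn't have to think about for a simple schema update. Skipping entities that already have the field is a poor man's substitute for the filtering input reader mentioned above.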