MapReduce import woes


Marijn Vriens

Nov 13, 2014, 12:07:49 PM
to google-a...@googlegroups.com

Hi all,

I'm relatively new to GAE, and certainly to the mapreduce part of it, so the following is probably screamingly obvious, but it has been eluding me for a good while now.

So, I'm trying to make use of mapreduce on GAE.

The first hurdle is that the Python MapReduce demo on https://github.com/GoogleCloudPlatform/appengine-mapreduce seems to have been removed from the repository with commit ac4483dfae8e77ead7d0844799ed75b95afcbb99 about two days ago while all the documentation still points to it. (Oops?)

So, looking at https://github.com/GoogleCloudPlatform/appengine-mapreduce/blob/1acb1e9165552dbff8cdfe32c8836b44013ab2d5/python/demo/app.yaml I see nothing special in the way of includes...

Just import it, like in python/demo/main.py, right? But "from mapreduce import base_handler" fails with "ImportError: No module named mapreduce". Hmmm.

Maybe it's my dev-server (SDK 1.9.15 on Windows 7, running 32-bit Python 2.7.8, by the way), but uploading a simple instance to GAE production gives me the same ImportErrors. So the dev-server is okay, then.

So, I guess there was a reason the demo was removed, and I try the next most obvious thing: maybe it's something you need to add explicitly, like a library. Looking around, I see mapreduce in "google_appengine\google\appengine\ext\builtins". So according to https://cloud.google.com/appengine/docs/python/config/appconfig?csw=1#Python_app_yaml_Builtin_handlers adding the following to app.yaml should work:

builtins:
- mapreduce: on

But this fails with: "google.appengine.ext.builtins.InvalidBuiltinName: mapreduce is not the name of a valid builtin.".

Okay, so it's not that. Maybe it's a straight import after all, just more explicit. So: "from google.appengine.ext.mapreduce2 import base_handler". This gives me a warning from the dev-server:

WARNING  2014-11-13 16:45:09,437 __init__.py:43] You should not use the mapreduce library that is bundled with the SDK. Use the one from https://pypi.python.org/pypi/GoogleAppEngineMapReduce instead.

And fails with: "ImportError: No module named simplejson"

No, it's not that either... Maybe add some magic handlers like in the app.yaml of the demo? I added the handlers, but I still get the ImportError on mapreduce.

I've also added the mapreduce code to the project itself, but that gives:

... \mapreduce\third_party\pipeline\__init__.py", line 26, in _fix_path

all_paths = os.environ.get('PYTHONPATH').split(os.pathsep)

AttributeError: 'NoneType' object has no attribute 'split'
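
The traceback above is consistent with PYTHONPATH not being set in the environment: os.environ.get('PYTHONPATH') then returns None, and None has no split method. A minimal sketch of a defensive version of that lookup follows. The function name python_path_entries is hypothetical, and this is only an illustration of the failure mode, not the library's actual fix.

```python
import os


def python_path_entries(environ=None):
    """Return the PYTHONPATH entries as a list, or [] when it is unset.

    `environ` defaults to os.environ; it is a parameter only so the
    behaviour can be checked without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    # .get() returns None for a missing variable, so fall back to ''.
    raw = environ.get('PYTHONPATH') or ''
    # Drop empty pieces so an unset or empty variable yields an empty list.
    return [entry for entry in raw.split(os.pathsep) if entry]
```

If this reading is right, setting PYTHONPATH to some value before starting the dev-server might sidestep the error without touching the library code, though that is an assumption and has not been verified here.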

I added the mapreduce.yaml file with these strange "Make messages lowercase" mapreduce functions that don't seem to be related to the rest of the demo. That didn't make a difference either.

So, I'm now a bit lost as to how to actually include mapreduce.

Anybody care to tell me how they did it?

Kind regards,
      Marijn Vriens.

PK

Nov 13, 2014, 12:19:12 PM
to Marijn Vriens, google-a...@googlegroups.com
Did you copy the mapreduce/python/src directory (from GitHub) to the top of your app, at the same level as app.yaml?
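
One rough way to check whether the copied directory ended up somewhere Python actually looks is to scan sys.path for a package directory with the expected name. The sketch below does that; the helper name find_package_dirs is hypothetical, and it only detects ordinary packages that contain an __init__.py file.

```python
import os
import sys


def find_package_dirs(name, search_path=None):
    """List the directories on the search path that hold a package `name`."""
    search_path = sys.path if search_path is None else search_path
    hits = []
    for entry in search_path:
        # An empty sys.path entry means the current directory.
        candidate = os.path.join(entry or os.curdir, name)
        if os.path.isfile(os.path.join(candidate, '__init__.py')):
            hits.append(candidate)
    return hits
```

Calling find_package_dirs('mapreduce') from a handler and logging the result should show whether the package is visible at all; an empty list would match the "No module named mapreduce" error.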


Tom Kaitchuck

Nov 13, 2014, 5:22:58 PM
to google-a...@googlegroups.com, Marijn Vriens
Take a look at the build.sh script: https://github.com/GoogleCloudPlatform/appengine-mapreduce/blob/master/python/build.sh
It compiles (and runs) the demo application using the checked out Mapreduce.
It currently has an issue that makes it kind of annoying:

However, if you just want to depend on MapReduce in your application, the easiest thing is to get it from PyPI.
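
A sketch of one way to wire that up, under stated assumptions: the package (GoogleAppEngineMapReduce, the name given in the dev-server warning quoted earlier in the thread) is installed into a directory next to app.yaml, for example with "pip install -t lib GoogleAppEngineMapReduce", and that directory is then put on sys.path when the app starts. The directory name lib and the helper add_vendor_dir are hypothetical choices for illustration, not something this thread prescribes.

```python
import os
import sys


def add_vendor_dir(app_root, dirname='lib'):
    """Put <app_root>/<dirname> at the front of sys.path, once.

    `dirname` is an assumed location for third-party packages that were
    installed alongside the application code.
    """
    path = os.path.join(app_root, dirname)
    if path not in sys.path:
        # Insert at the front so the bundled copy is found first.
        sys.path.insert(0, path)
    return path
```

A file such as appengine_config.py in the app root could then call add_vendor_dir(os.path.dirname(os.path.abspath(__file__))) before anything imports mapreduce.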