Creating a cluster in the eu-west-1 region.


Gareth Rogers

Dec 6, 2013, 7:17:41 AM
to lemur...@googlegroups.com
I would like my job, created by lemur, to run in the eu-west-1 region. Is that possible?

As far as I can tell, this is different from the no-longer-supported availability zone setting (http://stackoverflow.com/questions/17622071/indefinite-provisioning-of-emr-cluster-with-segue-in-r/17660786#17660786), which requests that the job run in a particular zone within a region (http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-regions-availability-zones.html). At the moment the jobs are being created in us-east-1.

Thanks

Marc Limotte

Dec 6, 2013, 10:27:31 AM
to lemur...@googlegroups.com
Hi Gareth,

You can try setting this key in your defcluster:

  :availability-zone "eu-west-1"

The value is passed along to the constructor of 
com.amazonaws.services.elasticmapreduce.model.PlacementType, 
which is, in turn, used to call
com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig#setPlacement

If that doesn't work, you can dig around in the AWS SDK for Java (http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/elasticmapreduce/model/JobFlowInstancesConfig.html). Assuming it's possible to do it with the Java SDK, we should be able to figure out a way to do it with lemur.

marc





Gareth Rogers

Dec 6, 2013, 12:26:51 PM
to lemur...@googlegroups.com
Hi Marc

Thanks for the pointers, I think I've now found the setting that I need. It's the setEndpoint function on the AmazonElasticMapReduceClient (docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/services/elasticmapreduce/AmazonElasticMapReduceClient.html#setEndpoint(java.lang.String, java.lang.String, java.lang.String)). There are two versions, but I think it is the one I've linked to. I'm guessing, based on this article: http://aws.amazon.com/articles/Amazon-S3/3604.

What is the best way to set this parameter?

I can try hard-coding it in emr.clj (a grep of the code base seems to point to the emr-client function) to see if it has the effect I want.

Thanks
Gareth

Gareth Rogers

Dec 9, 2013, 8:17:51 AM
to lemur...@googlegroups.com
Hi Marc

I've put in a quick hack to set the endpoint for my EMR client:

Line 34 emr.clj
(defn emr-client [aws-creds]
  (let [client (AmazonElasticMapReduceClient. aws-creds)]
    (.setEndpoint client "eu-west-1.elasticmapreduce.amazonaws.com")
    client))

which has had the effect that I want. The job ran in eu west :)

Fairly obviously, that is not the best way to achieve this!

Do you have any thoughts about how best to implement this?

Do you know if there are any other things that are affected by the change in region?

The only obvious thing that jumps out at me from the job definition printout is this jar: s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar

Thanks
Gareth

Marc Limotte

Dec 9, 2013, 9:44:16 AM
to lemur...@googlegroups.com
So, this is a bit convoluted, but I think you can do this with a dynamic binding instead of hacking the source code.

The fn emr-client is used like this:

(defn emr
  [creds]
  (aws emr-client creds))

The actual client that is used by the rest of the library is specified by a dynamic var:

(def ^{:dynamic true} *emr* nil)

This binding is first initialized by lemur.core/-main:

(ns (:require [com.climate.services.aws.emr :as emr] ...))
(binding [... 
             aws-creds (awscommon/aws-credential-discovery)
             emr/*emr* (emr/emr aws-creds)] 
  ... )
 
So in your jobdef, you can create the emr-client however you want, and then wrap the rest using with-bindings, kind of like this:

(require '[com.climate.services.aws.emr :as emr] '[com.climate.services.aws.common :as aws])
(with-bindings {#'emr/*emr* (aws/aws YOUR_EMR_CLIENT (aws/aws-credential-discovery))}
   ...  ; your defcluster, defsteps, etc
   )

Admittedly, this is a little goofy. Ideally Lemur would allow a :region key in the defcluster. Then lemur.core/execute-jobdef could grab the key and, if it is not nil, apply a similar with-bindings step around the load-file form. Patches are welcome.
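
To make the mechanism concrete, here is a self-contained sketch of rebinding a dynamic var. The names *client* and describe-client are hypothetical stand-ins for emr/*emr* and for the library code that reads it; nothing here comes from lemur itself. Note that binding takes a vector of symbols, while with-bindings takes a map keyed by Var objects (hence the #'):

```clojure
;; Hypothetical stand-in for a library's dynamic var.
(def ^:dynamic *client* nil)

;; Hypothetical stand-in for library code that reads the var at call time.
(defn describe-client []
  (str "using " *client*))

;; binding: a vector of symbol/value pairs.
(binding [*client* "eu-west-1"]
  (describe-client))

;; with-bindings: a map from Var objects to values.
(with-bindings {#'*client* "eu-west-1"}
  (describe-client))
```

Both forms make describe-client see the new value for the duration of the body only; outside the body the var goes back to its root value.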

IIRC, the path for the script-runner.jar is the only thing that is region specific.  I haven't tried it, but this path will probably work from any region; although a region neutral path would probably be more efficient.
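
If the region-specific path does turn out to matter, it could be derived from the region name. A minimal sketch, assuming the bucket follows the same <region>.elasticmapreduce pattern as the us-east-1 path in the job definition; script-runner-path is a hypothetical helper, not part of lemur:

```clojure
;; Hypothetical helper: builds the script-runner.jar location for a region.
;; Assumes every region uses the naming pattern seen for us-east-1.
(defn script-runner-path
  [region]
  (str "s3://" region ".elasticmapreduce/libs/script-runner/script-runner.jar"))

(script-runner-path "eu-west-1")
```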


marc

Gareth Rogers

Dec 9, 2013, 12:49:02 PM
to lemur...@googlegroups.com
I can't get that solution to work. Attached is my current best attempt. The defstep seems to be the problem (a guess based purely on removing bits of code), but I don't understand why. I'm trying to get some local Clojure expertise.
job.clj

Marc Limotte

Dec 9, 2013, 1:22:46 PM
to lemur...@googlegroups.com
That code looks more-or-less correct to me.  What sort of problem are you running into with defstep?

with-bindings creates a thread-local binding, which I think should be sufficient, but you can also try with-redefs (which is kind of the sledgehammer approach).
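
A small sketch of the difference, using a hypothetical *region* var rather than anything from lemur. A plain java.lang.Thread does not inherit thread-local bindings (futures and agents do convey them), so it only sees the new value when with-redefs changes the root:

```clojure
;; Hypothetical dynamic var with a root value.
(def ^:dynamic *region* "us-east-1")

;; Reads *region* from a freshly started plain thread and returns what it saw.
(defn region-in-new-thread []
  (let [result (promise)]
    (doto (Thread. ^Runnable (fn [] (deliver result *region*)))
      (.start)
      (.join))
    @result))

;; Thread-local binding: the new thread still sees the root value.
(with-bindings {#'*region* "eu-west-1"}
  (region-in-new-thread))   ; => "us-east-1"

;; with-redefs temporarily replaces the root value, visible to every thread.
(with-redefs [*region* "eu-west-1"]
  (region-in-new-thread))   ; => "eu-west-1"
```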

marc



Gareth Rogers

Dec 9, 2013, 1:29:05 PM
to lemur...@googlegroups.com
Sorry I forgot to post the error, a colleague distracted me and I came back and hit post. At least that's my excuse :)

If there is just defcluster and fire! then it complains at the fire! step as the step variable isn't defined. If there is just the defstep then I see this error.

vagrant@vagrant-ubuntu-raring-64:~/data/order-processing$ lemur dry-run src/metail/order_jobs/orders.clj
WARNING: You might want to set env variable LEMUR_EXTRA_CLASSPATH to include extra code on lemur's classpath. Generally you would want to do this, so that lemur can find the bases (or other functions) used in your jobdef.
/usr/lib/jvm/java-1.7.0-openjdk-amd64//bin/java -cp /home/vagrant/programs/lemur/lemur-1.3.1.jar:/home/vagrant/programs/lemur/lib/*: lemur.core dry-run src/metail/order_jobs/orders.clj
2013-12-09 17:16:49,908  INFO core:? - Loading jobdef src/metail/order_jobs/orders.clj
Exception in thread "main" java.lang.UnsupportedOperationException: Unknown Collection type, compiling:(/home/vagrant/data/order-processing/src/metail/order_jobs/orders.clj:12)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6416)
    at clojure.lang.Compiler.analyze(Compiler.java:6216)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6397)
    at clojure.lang.Compiler.analyze(Compiler.java:6216)
    at clojure.lang.Compiler.analyze(Compiler.java:6177)
    at clojure.lang.Compiler$InvokeExpr.parse(Compiler.java:3503)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6411)
    at clojure.lang.Compiler.analyze(Compiler.java:6216)
    at clojure.lang.Compiler.analyze(Compiler.java:6177)
    at clojure.lang.Compiler$BodyExpr$Parser.parse(Compiler.java:5572)
    at clojure.lang.Compiler$FnMethod.parse(Compiler.java:5008)
    at clojure.lang.Compiler$FnExpr.parse(Compiler.java:3629)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6407)
    at clojure.lang.Compiler.analyze(Compiler.java:6216)
    at clojure.lang.Compiler.eval(Compiler.java:6462)
    at clojure.lang.Compiler.load(Compiler.java:6902)
    at clojure.lang.Compiler.loadFile(Compiler.java:6863)
    at clojure.lang.RT$3.invoke(RT.java:305)
    at lemur.core$execute_jobdef.invoke(core.clj:760)
    at lemur.core$_main$fn__1288.invoke(core.clj:947)
    at lemur.core$_main.doInvoke(core.clj:942)
    at clojure.lang.RestFn.applyTo(RestFn.java:137)
    at lemur.core.main(Unknown Source)
Caused by: java.lang.UnsupportedOperationException: Unknown Collection type
    at clojure.lang.Compiler$EmptyExpr.emit(Compiler.java:2685)
    at clojure.lang.Compiler$MethodExpr.emitArgsAsArray(Compiler.java:1250)
    at clojure.lang.Compiler$MapExpr.emit(Compiler.java:2760)
    at clojure.lang.Compiler$InvokeExpr.emitArgsAndCall(Compiler.java:3419)
    at clojure.lang.Compiler$InvokeExpr.emit(Compiler.java:3359)
    at clojure.lang.Compiler$DefExpr.emit(Compiler.java:419)
    at clojure.lang.Compiler$BodyExpr.emit(Compiler.java:5616)
    at clojure.lang.Compiler$FnMethod.doEmit(Compiler.java:5169)
    at clojure.lang.Compiler$FnMethod.emit(Compiler.java:5023)
    at clojure.lang.Compiler$FnExpr.emitMethods(Compiler.java:3555)
    at clojure.lang.Compiler$ObjExpr.compile(Compiler.java:4188)
    at clojure.lang.Compiler$FnExpr.parse(Compiler.java:3687)
    at clojure.lang.Compiler.analyzeSeq(Compiler.java:6407)
    ... 22 more

Marc Limotte

Dec 9, 2013, 1:33:27 PM
to lemur...@googlegroups.com
What do you mean by "just the defstep"?  Does that mean no defcluster and fire! ?

What is the result when you run job.clj as it is in this email thread, with all of defcluster, defstep and fire!?

marc



Gareth Rogers

Dec 10, 2013, 5:08:36 AM
to lemur...@googlegroups.com
The result of running job.clj with all of defcluster, defstep and fire! is the error I posted.

Additionally, I tried running job.clj with only the defcluster within the with-bindings; I did not get the error message in that case (obviously nothing much happened, as there was no fire! command). When I ran job.clj with only the defstep within the with-bindings, I saw the error I posted.

lemur run job.clj is how I'm running this script.

Hopefully that makes a bit more sense than my two rushed posts yesterday.

Marc Limotte

Dec 10, 2013, 10:39:33 AM
to lemur...@googlegroups.com
So this worked for me:

(require '[com.climate.services.aws.emr :as emr]
         '[com.climate.services.aws.common :as aws])

(import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient)

(defn eu-west-1-emr-client
  [aws-creds]
  (doto (AmazonElasticMapReduceClient. aws-creds)

...


(with-bindings {#'emr/*emr* (aws/aws eu-west-1-emr-client (aws/aws-credential-discovery))}
  (fire! test-cluster-fn test-step-fn))

I.e. wrap just the fire! call in with-bindings.  I did get an error when I wrapped the whole thing in with-bindings, but it was a different error than yours and I didn't spend a lot of time trying to figure out why.

marc



Gareth Rogers

Dec 10, 2013, 11:36:45 AM
to lemur...@googlegroups.com
That worked and my job ran in the EU as expected.

Thanks, you've been very helpful :)

Marc Limotte

Dec 10, 2013, 11:50:36 AM
to lemur...@googlegroups.com
Great.  Glad it worked out.


rus...@gmail.com

Dec 15, 2015, 4:35:03 AM
to Lemur User
On Wednesday, December 11, 2013 at 1:50:36 AM UTC+9, Marc Limotte wrote:
The namespace was renamed, so this modified code now works for me:

(require '[com.climate.services.aws.emr :as emr]
         '[lemur.core])

(import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient)

(defn ap-northeast-1-emr-client
  [aws-creds]
  (doto (AmazonElasticMapReduceClient. aws-creds)
    (.setEndpoint "elasticmapreduce.ap-northeast-1.amazonaws.com")))

(with-bindings {#'emr/*emr* (ap-northeast-1-emr-client (lemur.core/aws-credentials))}
  (fire! my-cluster my-job-step))


Thank you Mr. Limotte.