Modification of the Java ECommerce Recommendation engine


bru no

Jun 9, 2015, 8:38:15 AM
to predicti...@googlegroups.com
Hello,

I cloned the Java ECommerce Recommendation engine from https://github.com/PredictionIO/template-java-parallel-ecommercerecommendation

Now I have it in Eclipse with errors, because none of the required libraries are included in the project. Where can I get those libraries? Should I create a pom.xml file manually, or is there a neater way to do it?

Thanks

Also, I would like to know whether the Java version of this engine is as reliable as the Scala one.

Pat Ferrel

Jun 9, 2015, 12:00:31 PM
to bru no, predicti...@googlegroups.com
There are some instructions for debugging templates using Scala and IntelliJ, but many of the steps apply to Eclipse too. The main things to know are the environment variables, the entry point, and the artifacts needed. https://docs.prediction.io/resources/intellij/

The workflow must be followed. Each of these CLI drivers performs part of it. Once you have gotten through #2, the rest can be run inside the IDE, but you are still performing these workflow ops:
1) pio-start-all
2) pio build # this must be done from the CLI at least once, then the template can be incrementally compiled in the IDE.
3) pio train
4) pio deploy

The last 3 steps can be run in the debugger once you have the environment set up so it is available in Eclipse. The entry points are in the PIO assembly, so this needs to be known by Eclipse. I have not used Eclipse for this so I can't offer more precise instructions. Even with IntelliJ there are lots of things to keep straight; for instance, a breakpoint in the wrong place will cause timeouts (ZooKeeper etc.), so be aware of this.

If you get this working let us know so we can document it better.


As to reliability, the executed code is the same: the Java template calls into Scala, and is called by Scala, both through simplified Java class and function signatures. So there is no substantive difference. This is essentially the same method Spark uses to support Java.
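To illustrate the pattern (a hypothetical sketch with made-up names, not the template's actual classes): a Java-facing base class exposes simplified signatures built on java.util types and delegates to the Scala implementation, much as Spark's JavaRDD wraps RDD.

import java.util.{List => JList}
import scala.collection.JavaConverters._

// Scala-side API using Scala collections (illustrative only)
trait ScalaScorer {
  def score(items: Seq[String]): Seq[Double]
}

// Java-facing wrapper: a Java template overrides scoreJava() with plain
// java.util types; the Scala engine still calls score(), which just
// converts and forwards. No logic is duplicated, so behavior is identical.
abstract class JavaScorer extends ScalaScorer {
  def scoreJava(items: JList[String]): JList[java.lang.Double]

  final override def score(items: Seq[String]): Seq[Double] =
    scoreJava(items.asJava).asScala.map(_.doubleValue).toSeq
}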

Pat Ferrel

Jun 17, 2015, 4:25:45 PM
to Eco, predicti...@googlegroups.com
No, I think this is because you haven't added the pio assembly to your project. All pio classes come from that assembly, since you are not using pio as a library. Unfortunately I don't know how to do this in Eclipse.

The debugger must have access to the Spark and PIO assemblies, plus the dependencies assembly created at /your-template/target/scala-2.10/your-template-…-deps.jar, which holds all deps external to your template code. The debugger should know how to find your own classes.

Note in the IntelliJ instructions which assemblies to add as local external libraries. Make sure to add /your-template/target/scala-2.10/your-template-…-deps.jar too, even though it is not mentioned there. I'll update the IntelliJ instructions soon.
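For example, in a typical 0.9.3 setup the local external libraries come down to something like this (your paths and versions will differ):

/your-pio-path/lib/pio-assembly-0.9.3.jar
$SPARK_HOME/lib/spark-assembly-1.3.1-hadoop2.6.0.jar
/your-template/target/scala-2.10/your-template-…-deps.jar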


On Jun 17, 2015, at 9:14 AM, Eco <b.l...@esillage.com> wrote:

Thanks Pat for your help.

When I try to launch pio train using Eclipse I get the following error:
Exception in thread "main" java.lang.NoSuchMethodError: scala.collection.immutable.$colon$colon.hd$1()Ljava/lang/Object;
    at org.json4s.MonadicJValue.$bslash(MonadicJValue.scala:18)
    at io.prediction.workflow.CreateWorkflow$.main(CreateWorkflow.scala:144)
    at io.prediction.workflow.CreateWorkflow.main(CreateWorkflow.scala)

I have PIO 0.9.3, Spark 1.3.1, and my Scala version is 2.11.6.

It seems, from what I have read, that there could be a compatibility problem, but I am not sure which version I should use. Would you have any clue?

Thank you.



-----------------

Espace entreprise ORCHIS
91 route nationale
35650 - Le Rheu
02 99 85 51 90
http://www.esillage.fr


bru no

Jun 18, 2015, 11:00:59 AM
to predicti...@googlegroups.com, b.l...@esillage.com
Hello

Thanks for your help. I had the jars in my build path.

However, I hadn't run sbt (compile and package) to produce the template…jar. So I did, and added that jar to my build path.

I still had my problem, which turned out to be that my Scala was version 2.11 instead of 2.10. Switching to 2.10 solved it.
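For reference, I believe the Scala version an sbt project compiles with can be pinned in build.sbt, something like:

// keep the project on the Scala 2.10 line that PIO 0.9.x targets
scalaVersion := "2.10.5"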

Now it seems that I have an issue related to Elasticsearch. Something like:
Exception in thread "main" org.elasticsearch.client.transport.NoNodeAvailableException: None of the configured nodes are available: []

Solved. The error was that I hadn't started Elasticsearch using pio-start-all.
I also changed the app name in the file engine.json from INVALID_APP_NAME to testMtk in two places (datasource and algorithms).

My next problem is:
[ERROR] [Common$] Invalid app name testMtk
Exception in thread "main" java.lang.IllegalArgumentException: Invalid app name testMtk
    at io.prediction.data.store.Common$$anonfun$appNameToId$2.apply(Common.scala:47)
    at io.prediction.data.store.Common$$anonfun$appNameToId$2.apply(Common.scala:45)
    at scala.Option.getOrElse(Option.scala:120)
    at io.prediction.data.store.Common$.appNameToId(Common.scala:45)
    at io.prediction.data.store.PEventStore$.aggregateProperties(PEventStore.scala:103)
    at org.template.ecommercerecommendation.DataSource.readTraining(DataSource.scala:31)
    at org.template.ecommercerecommendation.DataSource.readTraining(DataSource.scala:18)
    at io.prediction.controller.PDataSource.readTrainingBase(PDataSource.scala:37)
    at io.prediction.controller.Engine$.train(Engine.scala:640)
    at io.prediction.controller.Engine.train(Engine.scala:174)
    at io.prediction.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:65)
    at io.prediction.workflow.CreateWorkflow$.main(CreateWorkflow.scala:246)
    at io.prediction.workflow.CreateWorkflow.main(CreateWorkflow.scala)

So I thought it was because I don't have an app created. Since I am simulating "pio train", I should create my app from the command line:
pio app new testMtk (from the project "template-ecommerce-recommendation" in my eclipse workspace)

So I did. But I still have the same error.
I need data to train my engine, so I imported data:
"pio import --appid 1 --input data.json"
I still have the same error (I didn't expect my pio import to solve the problem :)

I can't figure out how it makes the link between my project in Eclipse and the app I created on the command line with 'pio app new testMtk'. Why does it tell me it can't find an app named testMtk?


Kenneth Chan

Jun 18, 2015, 3:17:47 PM
to predicti...@googlegroups.com, b.l...@esillage.com
" Invalid app name testMtk" mean this app cannot be found.

Do you see an app with the name "testMtk" when you run "pio app list"?

Pat Ferrel

Jun 18, 2015, 4:09:11 PM
to bru no, predicti...@googlegroups.com, b.l...@esillage.com
Keep in mind that you have to follow the "build, train, deploy" cycle and that "pio build" must be done at least once from the command line. The appName comes from your engine.json and is registered with pio at build time. Until it is registered it can't be run, since a check is made to find it in the metadata DB.

Also, you have to run pio-start-all to get the needed services running. I think all of this is covered on the IntelliJ page; the fact that you are using Eclipse doesn't change that part.

You can debug the CLI bits separately, but they must have the services environment set up and running correctly first, and I warn you that it is sometimes complicated to remember all the moving parts that have to be working. Often ZooKeeper dies, which makes HBase not work, so when in doubt I restart everything (Spark, HDFS, pio) and then make sure "pio status" is clean.
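For example, a minimal restart-and-verify sequence with the bundled scripts:

pio-stop-all    # stop the event server and the backing services
pio-start-all   # bring Elasticsearch, HBase, and the event server back up
pio status      # confirm everything is healthy before build/train/deploy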

bru no

Jun 19, 2015, 3:33:20 AM
to predicti...@googlegroups.com, bruno...@gmail.com, b.l...@esillage.com
Hello to both of you,

 
Yes, I have an app named testMtk that appears when I run pio app list. Somehow I had forgotten to run pio build...

Anyway, when I run it I get a new error message. Here is the result of "pio build --verbose":

[INFO] [Console$] sbt.ResolveException: unresolved dependency: io.prediction#pio-build;0.9.2: not found
[INFO] [Console$] at sbt.IvyActions$.sbt$IvyActions$$resolve(IvyActions.scala:278)
[INFO] [Console$] at sbt.IvyActions$$anonfun$updateEither$1.apply(IvyActions.scala:175)
[INFO] [Console$] at sbt.IvyActions$$anonfun$updateEither$1.apply(IvyActions.scala:157)
[INFO] [Console$] at sbt.IvySbt$Module$$anonfun$withModule$1.apply(Ivy.scala:151)
[INFO] [Console$] at sbt.IvySbt$Module$$anonfun$withModule$1.apply(Ivy.scala:151)
[INFO] [Console$] at sbt.IvySbt$$anonfun$withIvy$1.apply(Ivy.scala:128)
[INFO] [Console$] at sbt.IvySbt.sbt$IvySbt$$action$1(Ivy.scala:56)
[INFO] [Console$] at sbt.IvySbt$$anon$4.call(Ivy.scala:64)
[INFO] [Console$] at xsbt.boot.Locks$GlobalLock.withChannel$1(Locks.scala:93)
[INFO] [Console$] at xsbt.boot.Locks$GlobalLock.xsbt$boot$Locks$GlobalLock$$withChannelRetries$1(Locks.scala:78)
[INFO] [Console$] at xsbt.boot.Locks$GlobalLock$$anonfun$withFileLock$1.apply(Locks.scala:97)
[INFO] [Console$] at xsbt.boot.Using$.withResource(Using.scala:10)
[INFO] [Console$] at xsbt.boot.Using$.apply(Using.scala:9)
[INFO] [Console$] at xsbt.boot.Locks$GlobalLock.ignoringDeadlockAvoided(Locks.scala:58)
[INFO] [Console$] at xsbt.boot.Locks$GlobalLock.withLock(Locks.scala:48)
[INFO] [Console$] at xsbt.boot.Locks$.apply0(Locks.scala:31)
[INFO] [Console$] at xsbt.boot.Locks$.apply(Locks.scala:28)
[INFO] [Console$] at sbt.IvySbt.withDefaultLogger(Ivy.scala:64)
[INFO] [Console$] at sbt.IvySbt.withIvy(Ivy.scala:123)
[INFO] [Console$] at sbt.IvySbt.withIvy(Ivy.scala:120)
[INFO] [Console$] at sbt.IvySbt$Module.withModule(Ivy.scala:151)
[INFO] [Console$] at sbt.IvyActions$.updateEither(IvyActions.scala:157)
[INFO] [Console$] at sbt.Classpaths$$anonfun$sbt$Classpaths$$work$1$1.apply(Defaults.scala:1318)
[INFO] [Console$] at sbt.Classpaths$$anonfun$sbt$Classpaths$$work$1$1.apply(Defaults.scala:1315)
[INFO] [Console$] at sbt.Classpaths$$anonfun$doWork$1$1$$anonfun$85.apply(Defaults.scala:1345)
[INFO] [Console$] at sbt.Classpaths$$anonfun$doWork$1$1$$anonfun$85.apply(Defaults.scala:1343)
[INFO] [Console$] at sbt.Tracked$$anonfun$lastOutput$1.apply(Tracked.scala:35)
[INFO] [Console$] at sbt.Classpaths$$anonfun$doWork$1$1.apply(Defaults.scala:1348)
[INFO] [Console$] at sbt.Classpaths$$anonfun$doWork$1$1.apply(Defaults.scala:1342)
[INFO] [Console$] at sbt.Tracked$$anonfun$inputChanged$1.apply(Tracked.scala:45)
[INFO] [Console$] at sbt.Classpaths$.cachedUpdate(Defaults.scala:1360)
[INFO] [Console$] at sbt.Classpaths$$anonfun$updateTask$1.apply(Defaults.scala:1300)
[INFO] [Console$] at sbt.Classpaths$$anonfun$updateTask$1.apply(Defaults.scala:1275)
[INFO] [Console$] at scala.Function1$$anonfun$compose$1.apply(Function1.scala:47)
[INFO] [Console$] at sbt.$tilde$greater$$anonfun$$u2219$1.apply(TypeFunctions.scala:40)
[INFO] [Console$] at sbt.std.Transform$$anon$4.work(System.scala:63)
[INFO] [Console$] at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:226)
[INFO] [Console$] at sbt.Execute$$anonfun$submit$1$$anonfun$apply$1.apply(Execute.scala:226)
[INFO] [Console$] at sbt.ErrorHandling$.wideConvert(ErrorHandling.scala:17)
[INFO] [Console$] at sbt.Execute.work(Execute.scala:235)
[INFO] [Console$] at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:226)
[INFO] [Console$] at sbt.Execute$$anonfun$submit$1.apply(Execute.scala:226)
[INFO] [Console$] at sbt.ConcurrentRestrictions$$anon$4$$anonfun$1.apply(ConcurrentRestrictions.scala:159)
[INFO] [Console$] at sbt.CompletionService$$anon$2.call(CompletionService.scala:28)
[INFO] [Console$] at java.util.concurrent.FutureTask.run(FutureTask.java:262)
[INFO] [Console$] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
[INFO] [Console$] at java.util.concurrent.FutureTask.run(FutureTask.java:262)
[INFO] [Console$] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[INFO] [Console$] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[INFO] [Console$] at java.lang.Thread.run(Thread.java:745)
[INFO] [Console$] [error] (*:update) sbt.ResolveException: unresolved dependency: io.prediction#pio-build;0.9.2: not found
[INFO] [Console$] Project loading failed: (r)etry, (q)uit, (l)ast, or (i)gnore? 
[ERROR] [Console$] Return code of previous step is 1. Aborting.
 
I use PredictionIO 0.9.3.
The error message relates to the content of a file named pio-build.sbt, which contains:
addSbtPlugin("io.prediction" % "pio-build" % "0.9.2")
I tried changing 0.9.2 to 0.9.3 but that doesn't work either. What does this line mean? What is my mistake?
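From what I understand, addSbtPlugin declares a dependency on an sbt build plugin, which sbt resolves from its configured resolvers when it loads the project, so "not found" presumably means none of my resolvers can serve io.prediction#pio-build;0.9.2. I guess one could add an explicit resolver in project/pio-build.sbt, something like this (the URL is just a placeholder; I don't know the real repository):

resolvers += "pio-plugin-repo" at "http://example.com/sbt-plugins/"  // placeholder URL
addSbtPlugin("io.prediction" % "pio-build" % "0.9.2")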

Thank you for your help
 

bru no

Jun 25, 2015, 5:30:42 AM
to predicti...@googlegroups.com
Hello,

I am back, trying to get my engine running using Eclipse.

I re-did everything from the beginning, that is:
I got the recommendation engine using: pio template get PredictionIO/template-scala-parallel-ecommercerecommendation mytemplate
cd mytemplate

I generated my access key:
pio app new zemtk

I modified the engine.json to insert my app name at the two locations required.

I imported data:
pio import --appId 5 --input data

I built the engine:
pio build
then trained it:
pio train
then deployed it:
pio deploy

This part is fine; my engine is deployed and answers my requests.
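For example, I send it queries along the lines of the quickstart:

curl -H "Content-Type: application/json" -d '{ "user": "u1", "num": 4 }' http://localhost:8000/queries.json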

Now I got a plugin for sbt to transform a project into an Eclipse project.
I ran sbt, which fetched a lot of libraries,
then I ran sbt eclipse, which gave me an Eclipse project that I imported into Eclipse.
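For reference, the plugin I used is sbteclipse, enabled with a line like this in project/plugins.sbt (the version is just the one I picked):

addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "4.0.0")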

Next I added pio-assembly-0.9.3.jar and spark-assembly-1.3.0-hadoop2.6.0.jar (note that I added the hadoop2.6.0 one, and not the 2.4.0 as recommended, because it is the one I found in my installed Spark).

I added:
Main class: io.prediction.workflow.CreateWorkflow
VM options: -Dspark.master=local -Dlog4j.configuration=file:/**replace_with_your_PredictionIO_path**/conf/log4j.properties
Program arguments: --engine-id dummy --engine-version dummy --engine-variant engine.json
modified accordingly

and all the environment variables, also modified accordingly.

When I run this configuration I get an error:
Exception in thread "main" java.lang.IllegalArgumentException: Invalid app name zemtk
even though when I ran it from the command line with 'pio train' I didn't have any problem...

My engine.json is fine; I don't understand what is going wrong.

Would you have any suggestions, please?

Thanks





Pat Ferrel

Jun 25, 2015, 11:02:30 AM
to bru no, predicti...@googlegroups.com
You will end up with debug configs for the equivalent of 'pio train' and 'pio deploy', and a run config that launches a bash script to send a query. 'pio build' must be done from the command line at least once, because it creates the manifest.json and that doesn't happen from inside the debugger (we are making this much easier, but for now…)

Is the appName in your engine.json "zemtk"? Try running 'pio app list' and note your appName and access key; these won't change unless you register another app. Can you send your engine.json?

bru no

Jun 25, 2015, 11:48:29 AM
to Pat Ferrel, predicti...@googlegroups.com
Hello Pat,

Thanks for your help.

Yes, I am dreaming of being able to debug my pio Scala code :)

I launched the whole chain (build/train/deploy) from the command line.

Here is the content of my engine.json file:

{
  "id": "default",
  "description": "Default settings",
  "engineFactory": "com.esillage.ECommerceRecommendationEngine",
  "datasource": {
    "params" : {
      "appName": "zemtk"
    }
  },
  "algorithms": [
    {
      "name": "ecomm",
      "params": {
        "appName": "zemtk",
        "unseenOnly": true,
        "seenEvents": ["buy", "view"],
        "similarEvents": ["view"],
        "rank": 10,
        "numIterations" : 20,
        "lambda": 0.01,
        "seed": 3
      }
    }
  ]
}
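For reference, those algorithm params map onto the template's params class, which as far as I can tell looks roughly like this in the template source (the field names must match the JSON keys):

import io.prediction.controller.Params

case class ECommAlgorithmParams(
  appName: String,
  unseenOnly: Boolean,
  seenEvents: List[String],
  similarEvents: List[String],
  rank: Int,
  numIterations: Int,
  lambda: Double,
  seed: Option[Long]
) extends Params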

and here is the result of pio app list:
[INFO] [App$]                zemtk |    5 | BKKrHjtkrcVzd4zMWI9crS4wQb9GW36IsgyAajEa7miHjhLv8rjsyqB5iVsdMbaY | (all)


Pat Ferrel

Jun 25, 2015, 1:46:36 PM
to bru no, predicti...@googlegroups.com
The appName seems OK. If the advice below gives the same error, try creating a new appName, and send the entire stack trace if that errors out.

As I recall, when I first ran train from the debugger I used real params instead of "dummy" for the app params, like this:
Program arguments: --engine-id **get-from-manifest.json** --engine-version **get-from-manifest.json** --engine-variant engine.json

If there is no manifest.json we have some other issue.

If you get past train, the config for deploy will require the instance ID that is output from train, so make a note of it before deploy.


BTW: your VM options were modified, right?
 -Dspark.master=local -Dlog4j.configuration=file:/**replace_with_your_PredictionIO_path**/conf/log4j.properties

Pat Ferrel

Jun 26, 2015, 11:48:17 AM
to bru no, predicti...@googlegroups.com
I don't see anything obvious with your setup.

1) Everything in engine.json must be modified by you; it is where you configure parameters, so I am not sure what your question is. If you change the appName, you must also modify the engine.json.

2) Your Eclipse env looks correct for using a spark master of “local”. It is exactly like mine. 

3) Do you have HDFS and Spark running on your local machine all the time? One difference between running in IntelliJ (and I assume also Eclipse) and the command line is that IntelliJ launches Spark inside the debugger, and obviously the CLI does not. So Eclipse must have a way to find all of the PIO and Spark classes. The error in Eclipse looks like the metadata about your app is not being found correctly, which happens if, inside Eclipse, the code can't access HBase or Elasticsearch, or if the metadata created with pio build is somehow not available (see the quick check after this list).

4) PIO_FS_ENGINESDIR=/home/dev/.pio_store/engines and PIO_FS_TMPDIR=/home/dev/.pio_store/tmp will be put into HDFS if that is the filesystem you are using. Yes, I know those aren't good paths for HDFS, but anyway… Maybe you can pull the HDFS versions of those and also put them in the local FS, in case that is where Eclipse expects them?

5) Since I don't use Eclipse, I'm sorry I can't point you in a better direction. I know it's not your preference, but IntelliJ for Scala is free :-)
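As a quick check for #3: Elasticsearch's REST API normally answers on port 9200 (alongside the transport port 9300 that PIO uses), so from the machine Eclipse runs on you can try:

curl http://localhost:9200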


On Jun 26, 2015, at 1:22 AM, bru no <bruno...@gmail.com> wrote:

Hello Pat,

Thanks again.

So I generated a new appname:
~$ pio app new mtkappname
[INFO] [HBLEvents] The table pio_event:events_7 doesn't exist yet. Creating now...
[INFO] [App$] Initialized Event Store for this app ID: 7.
[INFO] [App$] Created new app:
[INFO] [App$]       Name: mtkappname
[INFO] [App$]         ID: 7
[INFO] [App$] Access Key: j8O1U0F13em6HLnIdSuhYt8XYfqEAlkjmaiCyoG2Gnn5p1JfF2Wzc4HknwNq1ePZ

Here is the result when I launch:
$ pio app list
[INFO] [App$]                 Name |   ID |                                                       Access Key | Allowed Event(s)
[INFO] [App$]           mtkappname |    7 | j8O1U0F13em6HLnIdSuhYt8XYfqEAlkjmaiCyoG2Gnn5p1JfF2Wzc4HknwNq1ePZ | (all)
[INFO] [App$] Finished listing 1 app(s).


Note that my engine.json contains INVALID_APP_NAME and that I have to modify the name manually in the file. Is that the same for everybody, or is there something I am not doing correctly?

I modified the engine.json accordingly:
{
  "id": "default",
  "description": "Default settings",
  "engineFactory": "com.esillage.ECommerceRecommendationEngine",
  "datasource": {
    "params" : {
      "appName": "mtkappname"
    }
  },
  "algorithms": [
    {
      "name": "ecomm",
      "params": {
        "appName": "mtkappname",
        "unseenOnly": true,
        "seenEvents": ["buy", "view"],
        "similarEvents": ["view"],
        "rank": 10,
        "numIterations" : 20,
        "lambda": 0.01,
        "seed": 3
      }
    }
  ]
}

My manifest.json is:
{"id":"mjaOHNTajMZJpHndxjg0AyIAhp1HtqK4","version":"2679cce2b66d26effa839b6d393459db1614bb3e","name":"testmtkecoreco","description":"pio-autogen-manifest","files":[],"engineFactory":""}

I modified my program arguments to match the manifest.json:
--engine-id mjaOHNTajMZJpHndxjg0AyIAhp1HtqK4 --engine-version 2679cce2b66d26effa839b6d393459db1614bb3e --engine-variant engine.json

My VM arguments were fine:
-Dspark.master=local -Dlog4j.configuration=file:/opt/PredictionIO/conf/log4j.properties

My environment variables are:

SPARK_HOME=/opt/PredictionIO/vendors/spark-1.3.1
PIO_FS_BASEDIR=/home/dev/.pio_store
PIO_FS_ENGINESDIR=/home/dev/.pio_store/engines
PIO_FS_TMPDIR=/home/dev/.pio_store/tmp
PIO_STORAGE_REPOSITORIES_METADATA_NAME=predictionio_metadata
PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH
PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_
PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS
PIO_STORAGE_REPOSITORIES_APPDATA_NAME=predictionio_appdata
PIO_STORAGE_REPOSITORIES_APPDATA_SOURCE=ELASTICSEARCH
PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=predictionio_eventdata
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOSTS=localhost
PIO_STORAGE_SOURCES_ELASTICSEARCH_PORTS=9300
PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
PIO_STORAGE_SOURCES_LOCALFS_HOSTS=/home/dev/.pio_store/models
PIO_STORAGE_SOURCES_LOCALFS_PORTS=0
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOSTS=0
PIO_STORAGE_SOURCES_HBASE_PORTS=0

Now, when I launch pio build --verbose:

$ pio build --verbose
[INFO] [Console$] Using existing engine manifest JSON at /home/dev/testmtkecoreco/manifest.json
[INFO] [Console$] Using command '/opt/PredictionIO/sbt/sbt' at the current working directory to build.
[INFO] [Console$] If the path above is incorrect, this process will fail.
[INFO] [Console$] Uber JAR disabled. Making sure lib/pio-assembly-0.9.3.jar is absent.
[INFO] [Console$] Going to run: /opt/PredictionIO/sbt/sbt  package assemblyPackageDependency
[INFO] [Console$] [info] Loading global plugins from /home/dev/.sbt/0.13/plugins
[INFO] [Console$] [info] Loading project definition from /home/dev/testmtkecoreco/project
[INFO] [Console$] [info] Set current project to template-scala-parallel-ecommercerecommendation (in build file:/home/dev/testmtkecoreco/)
[INFO] [Console$] [info] Compiling 1 Scala source to /home/dev/testmtkecoreco/target/scala-2.10/classes...
[INFO] [Console$] [warn] No main class detected
[INFO] [Console$] [info] Packaging /home/dev/testmtkecoreco/target/scala-2.10/template-scala-parallel-ecommercerecommendation_2.10-0.1-SNAPSHOT.jar ...
[INFO] [Console$] [info] Done packaging.
[INFO] [Console$] [success] Total time: 8 s, completed 26 juin 2015 09:16:15
[INFO] [Console$] [warn] No main class detected
[INFO] [Console$] [info] Including from cache: scala-library.jar
[ERROR] [Console$] Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
[ERROR] [Console$] 15/06/26 09:16:18 INFO SparkContext: Running Spark version 1.3.0
[ERROR] [Console$] 15/06/26 09:16:18 WARN Utils: Your hostname, dev-VirtualBox resolves to a loopback address: 127.0.1.1; using 192.168.1.107 instead (on interface eth0)
[ERROR] [Console$] 15/06/26 09:16:18 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
[ERROR] [Console$] 15/06/26 09:16:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[ERROR] [Console$] 15/06/26 09:16:19 INFO SecurityManager: Changing view acls to: dev
[ERROR] [Console$] 15/06/26 09:16:19 INFO SecurityManager: Changing modify acls to: dev
[ERROR] [Console$] 15/06/26 09:16:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(dev); users with modify permissions: Set(dev)
[ERROR] [Console$] 15/06/26 09:16:20 INFO Slf4jLogger: Slf4jLogger started
[ERROR] [Console$] 15/06/26 09:16:20 INFO Remoting: Starting remoting
[ERROR] [Console$] 15/06/26 09:16:21 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://spark...@192.168.1.107:49577]
[ERROR] [Console$] 15/06/26 09:16:21 INFO Utils: Successfully started service 'sparkDriver' on port 49577.
[ERROR] [Console$] 15/06/26 09:16:21 INFO SparkEnv: Registering MapOutputTracker
[ERROR] [Console$] 15/06/26 09:16:21 INFO SparkEnv: Registering BlockManagerMaster
[ERROR] [Console$] 15/06/26 09:16:21 INFO DiskBlockManager: Created local directory at /tmp/spark-7175dd74-fa3f-45da-9c3c-0ed2ab89fcd6/blockmgr-1c378cbc-f7c2-4df3-8044-137ad08e7140
[ERROR] [Console$] 15/06/26 09:16:21 INFO MemoryStore: MemoryStore started with capacity 530.3 MB
[ERROR] [Console$] 15/06/26 09:16:21 INFO HttpFileServer: HTTP File server directory is /tmp/spark-7fccb5c2-4200-4bad-984a-0db49f268167/httpd-22faa8f1-a451-4474-b3b8-6e61d4be3fa1
[ERROR] [Console$] 15/06/26 09:16:21 INFO HttpServer: Starting HTTP Server
[ERROR] [Console$] 15/06/26 09:16:22 INFO Server: jetty-8.y.z-SNAPSHOT
[ERROR] [Console$] 15/06/26 09:16:22 INFO AbstractConnector: Started SocketC...@0.0.0.0:47027
[ERROR] [Console$] 15/06/26 09:16:22 INFO Utils: Successfully started service 'HTTP file server' on port 47027.
[ERROR] [Console$] 15/06/26 09:16:22 INFO SparkEnv: Registering OutputCommitCoordinator
[ERROR] [Console$] 15/06/26 09:16:22 INFO Server: jetty-8.y.z-SNAPSHOT
[ERROR] [Console$] 15/06/26 09:16:22 INFO AbstractConnector: Started SelectChann...@0.0.0.0:4040
[ERROR] [Console$] 15/06/26 09:16:22 INFO Utils: Successfully started service 'SparkUI' on port 4040.
[ERROR] [Console$] 15/06/26 09:16:22 INFO SparkUI: Started SparkUI at http://192.168.1.107:4040
[ERROR] [Console$] 15/06/26 09:16:23 INFO Executor: Starting executor ID <driver> on host localhost
[ERROR] [Console$] 15/06/26 09:16:23 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://spark...@192.168.1.107:49577/user/HeartbeatReceiver
[ERROR] [Console$] 15/06/26 09:16:23 INFO NettyBlockTransferService: Server created on 49703
[ERROR] [Console$] 15/06/26 09:16:23 INFO BlockManagerMaster: Trying to register BlockManager
[ERROR] [Console$] 15/06/26 09:16:23 INFO BlockManagerMasterActor: Registering block manager localhost:49703 with 530.3 MB RAM, BlockManagerId(<driver>, localhost, 49703)
[ERROR] [Console$] 15/06/26 09:16:23 INFO BlockManagerMaster: Registered BlockManager
[ERROR] [Console$] 15/06/26 09:16:24 INFO SparkContext: Starting job: collect at ECommAlgorithmTest.scala:79
[ERROR] [Console$] 15/06/26 09:16:24 INFO DAGScheduler: Registering RDD 5 (filter at ECommAlgorithm.scala:169)
[ERROR] [Console$] 15/06/26 09:16:24 INFO DAGScheduler: Got job 0 (collect at ECommAlgorithmTest.scala:79) with 1 output partitions (allowLocal=false)
[ERROR] [Console$] 15/06/26 09:16:24 INFO DAGScheduler: Final stage: Stage 1(collect at ECommAlgorithmTest.scala:79)
[ERROR] [Console$] 15/06/26 09:16:24 INFO DAGScheduler: Parents of final stage: List(Stage 0)
[ERROR] [Console$] 15/06/26 09:16:24 INFO DAGScheduler: Missing parents: List(Stage 0)
[ERROR] [Console$] 15/06/26 09:16:24 INFO DAGScheduler: Submitting Stage 0 (MapPartitionsRDD[5] at filter at ECommAlgorithm.scala:169), which has no missing parents
[ERROR] [Console$] 15/06/26 09:16:24 INFO MemoryStore: ensureFreeSpace(4120) called with curMem=0, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:24 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 4.0 KB, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:24 INFO MemoryStore: ensureFreeSpace(2939) called with curMem=4120, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:24 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 2.9 KB, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:24 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:49703 (size: 2.9 KB, free: 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:24 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
[ERROR] [Console$] 15/06/26 09:16:24 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:839
[ERROR] [Console$] 15/06/26 09:16:24 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (MapPartitionsRDD[5] at filter at ECommAlgorithm.scala:169)
[ERROR] [Console$] 15/06/26 09:16:24 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
[ERROR] [Console$] 15/06/26 09:16:24 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1438 bytes)
[ERROR] [Console$] 15/06/26 09:16:24 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
[ERROR] [Console$] 15/06/26 09:16:25 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 881 bytes result sent to driver
[ERROR] [Console$] 15/06/26 09:16:25 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 323 ms on localhost (1/1)
[ERROR] [Console$] 15/06/26 09:16:25 INFO DAGScheduler: Stage 0 (filter at ECommAlgorithm.scala:169) finished in 0,356 s
[ERROR] [Console$] 15/06/26 09:16:25 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
[ERROR] [Console$] 15/06/26 09:16:25 INFO DAGScheduler: looking for newly runnable stages
[ERROR] [Console$] 15/06/26 09:16:25 INFO DAGScheduler: running: Set()
[ERROR] [Console$] 15/06/26 09:16:25 INFO DAGScheduler: waiting: Set(Stage 1)
[ERROR] [Console$] 15/06/26 09:16:25 INFO DAGScheduler: failed: Set()
[ERROR] [Console$] 15/06/26 09:16:25 INFO DAGScheduler: Missing parents for Stage 1: List()
[ERROR] [Console$] 15/06/26 09:16:25 INFO DAGScheduler: Submitting Stage 1 (MapPartitionsRDD[7] at map at ECommAlgorithm.scala:174), which is now runnable
[ERROR] [Console$] 15/06/26 09:16:25 INFO MemoryStore: ensureFreeSpace(2600) called with curMem=7059, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:25 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 2.5 KB, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:25 INFO MemoryStore: ensureFreeSpace(1909) called with curMem=9659, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:25 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1909.0 B, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:25 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:49703 (size: 1909.0 B, free: 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:25 INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
[ERROR] [Console$] 15/06/26 09:16:25 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:839
[ERROR] [Console$] 15/06/26 09:16:25 INFO DAGScheduler: Submitting 1 missing tasks from Stage 1 (MapPartitionsRDD[7] at map at ECommAlgorithm.scala:174)
[ERROR] [Console$] 15/06/26 09:16:25 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
[ERROR] [Console$] 15/06/26 09:16:25 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, PROCESS_LOCAL, 1056 bytes)
[ERROR] [Console$] 15/06/26 09:16:25 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
[ERROR] [Console$] 15/06/26 09:16:25 INFO CacheManager: Partition rdd_7_0 not found, computing it
[ERROR] [Console$] 15/06/26 09:16:25 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
[ERROR] [Console$] 15/06/26 09:16:25 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 34 ms
[ERROR] [Console$] 15/06/26 09:16:25 INFO MemoryStore: ensureFreeSpace(160) called with curMem=11568, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:25 INFO MemoryStore: Block rdd_7_0 stored as values in memory (estimated size 160.0 B, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:25 INFO BlockManagerInfo: Added rdd_7_0 in memory on localhost:49703 (size: 160.0 B, free: 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:25 INFO BlockManagerMaster: Updated info of block rdd_7_0
[ERROR] [Console$] 15/06/26 09:16:25 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 1652 bytes result sent to driver
[ERROR] [Console$] 15/06/26 09:16:25 INFO DAGScheduler: Stage 1 (collect at ECommAlgorithmTest.scala:79) finished in 0,461 s
[ERROR] [Console$] 15/06/26 09:16:25 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 449 ms on localhost (1/1)
[ERROR] [Console$] 15/06/26 09:16:25 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
[ERROR] [Console$] 15/06/26 09:16:25 INFO BlockManager: Removing broadcast 0
[ERROR] [Console$] 15/06/26 09:16:25 INFO DAGScheduler: Job 0 finished: collect at ECommAlgorithmTest.scala:79, took 1,532979 s
[ERROR] [Console$] 15/06/26 09:16:25 INFO BlockManager: Removing block broadcast_0
[ERROR] [Console$] 15/06/26 09:16:25 INFO MemoryStore: Block broadcast_0 of size 4120 dropped from memory (free 556031273)
[ERROR] [Console$] 15/06/26 09:16:25 INFO BlockManager: Removing block broadcast_0_piece0
[ERROR] [Console$] 15/06/26 09:16:25 INFO MemoryStore: Block broadcast_0_piece0 of size 2939 dropped from memory (free 556034212)
[ERROR] [Console$] 15/06/26 09:16:25 INFO BlockManagerInfo: Removed broadcast_0_piece0 on localhost:49703 in memory (size: 2.9 KB, free: 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:25 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
[ERROR] [Console$] 15/06/26 09:16:25 INFO ContextCleaner: Cleaned broadcast 0
[ERROR] [Console$] 15/06/26 09:16:26 INFO SparkContext: Starting job: collectAsMap at ECommAlgorithm.scala:216
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Registering RDD 14 (map at ECommAlgorithm.scala:213)
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Got job 1 (collectAsMap at ECommAlgorithm.scala:216) with 1 output partitions (allowLocal=false)
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Final stage: Stage 3(collectAsMap at ECommAlgorithm.scala:216)
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Parents of final stage: List(Stage 2)
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Missing parents: List(Stage 2)
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Submitting Stage 2 (MapPartitionsRDD[14] at map at ECommAlgorithm.scala:213), which has no missing parents
[ERROR] [Console$] 15/06/26 09:16:26 INFO MemoryStore: ensureFreeSpace(4288) called with curMem=4669, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:26 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 4.2 KB, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:26 INFO MemoryStore: ensureFreeSpace(3140) called with curMem=8957, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:26 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 3.1 KB, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:26 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:49703 (size: 3.1 KB, free: 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:26 INFO BlockManagerMaster: Updated info of block broadcast_2_piece0
[ERROR] [Console$] 15/06/26 09:16:26 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:839
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Submitting 1 missing tasks from Stage 2 (MapPartitionsRDD[14] at map at ECommAlgorithm.scala:213)
[ERROR] [Console$] 15/06/26 09:16:26 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
[ERROR] [Console$] 15/06/26 09:16:26 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, localhost, PROCESS_LOCAL, 1388 bytes)
[ERROR] [Console$] 15/06/26 09:16:26 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
[ERROR] [Console$] 15/06/26 09:16:26 INFO Executor: Finished task 0.0 in stage 2.0 (TID 2). 881 bytes result sent to driver
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Stage 2 (map at ECommAlgorithm.scala:213) finished in 0,007 s
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: looking for newly runnable stages
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: running: Set()
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: waiting: Set(Stage 3)
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: failed: Set()
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Missing parents for Stage 3: List()
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Submitting Stage 3 (ShuffledRDD[15] at reduceByKey at ECommAlgorithm.scala:214), which is now runnable
[ERROR] [Console$] 15/06/26 09:16:26 INFO MemoryStore: ensureFreeSpace(2104) called with curMem=12097, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:26 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 2.1 KB, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:26 INFO MemoryStore: ensureFreeSpace(1552) called with curMem=14201, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:26 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 1552.0 B, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:26 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 15 ms on localhost (1/1)
[ERROR] [Console$] 15/06/26 09:16:26 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool 
[ERROR] [Console$] 15/06/26 09:16:26 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on localhost:49703 (size: 1552.0 B, free: 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:26 INFO BlockManagerMaster: Updated info of block broadcast_3_piece0
[ERROR] [Console$] 15/06/26 09:16:26 INFO SparkContext: Created broadcast 3 from broadcast at DAGScheduler.scala:839
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Submitting 1 missing tasks from Stage 3 (ShuffledRDD[15] at reduceByKey at ECommAlgorithm.scala:214)
[ERROR] [Console$] 15/06/26 09:16:26 INFO TaskSchedulerImpl: Adding task set 3.0 with 1 tasks
[ERROR] [Console$] 15/06/26 09:16:26 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 3, localhost, PROCESS_LOCAL, 1056 bytes)
[ERROR] [Console$] 15/06/26 09:16:26 INFO Executor: Running task 0.0 in stage 3.0 (TID 3)
[ERROR] [Console$] 15/06/26 09:16:26 INFO ShuffleBlockFetcherIterator: Getting 1 non-empty blocks out of 1 blocks
[ERROR] [Console$] 15/06/26 09:16:26 INFO ShuffleBlockFetcherIterator: Started 0 remote fetches in 0 ms
[ERROR] [Console$] 15/06/26 09:16:26 INFO Executor: Finished task 0.0 in stage 3.0 (TID 3). 1055 bytes result sent to driver
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Stage 3 (collectAsMap at ECommAlgorithm.scala:216) finished in 0,006 s
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Job 1 finished: collectAsMap at ECommAlgorithm.scala:216, took 0,079332 s
[ERROR] [Console$] 15/06/26 09:16:26 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 3) in 13 ms on localhost (1/1)
[ERROR] [Console$] 15/06/26 09:16:26 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool 
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
[ERROR] [Console$] 15/06/26 09:16:26 INFO SparkUI: Stopped Spark web UI at http://192.168.1.107:4040
[ERROR] [Console$] 15/06/26 09:16:26 INFO DAGScheduler: Stopping DAGScheduler
[ERROR] [Console$] 15/06/26 09:16:26 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
[ERROR] [Console$] 15/06/26 09:16:26 INFO MemoryStore: MemoryStore cleared
[ERROR] [Console$] 15/06/26 09:16:26 INFO BlockManager: BlockManager stopped
[ERROR] [Console$] 15/06/26 09:16:26 INFO BlockManagerMaster: BlockManagerMaster stopped
[ERROR] [Console$] 15/06/26 09:16:26 INFO OutputCommitCoordinator$OutputCommitCoordinatorActor: OutputCommitCoordinator stopped!
[ERROR] [Console$] 15/06/26 09:16:26 INFO SparkContext: Successfully stopped SparkContext
[ERROR] [Console$] 15/06/26 09:16:26 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
[ERROR] [Console$] 15/06/26 09:16:26 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
[INFO] [Console$] [info] ECommAlgorithmTest:
[INFO] [Console$] [info] ECommAlgorithm.genMLlibRating()
[INFO] [Console$] [info] - should create RDD[MLlibRating] correctly
[INFO] [Console$] [info] ECommAlgorithm.trainDefault()
[INFO] [Console$] [info] - should return popular count for each item
[INFO] [Console$] [info] ECommAlgorithm.predictKnownuser()
[INFO] [Console$] [info] - should return top item
[INFO] [Console$] [info] ECommAlgorithm.predictDefault()
[INFO] [Console$] [info] - should return top item
[INFO] [Console$] [info] ECommAlgorithm.predictSimilar()
[INFO] [Console$] [info] - should return top item
[ERROR] [Console$] 15/06/26 09:16:26 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkContext: Running Spark version 1.3.0
[ERROR] [Console$] 15/06/26 09:16:27 INFO SecurityManager: Changing view acls to: dev
[ERROR] [Console$] 15/06/26 09:16:27 INFO SecurityManager: Changing modify acls to: dev
[ERROR] [Console$] 15/06/26 09:16:27 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(dev); users with modify permissions: Set(dev)
[ERROR] [Console$] 15/06/26 09:16:27 INFO Slf4jLogger: Slf4jLogger started
[ERROR] [Console$] 15/06/26 09:16:27 INFO Remoting: Starting remoting
[ERROR] [Console$] 15/06/26 09:16:27 INFO Utils: Successfully started service 'sparkDriver' on port 36319.
[ERROR] [Console$] 15/06/26 09:16:27 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@localhost:36319]
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkEnv: Registering MapOutputTracker
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkEnv: Registering BlockManagerMaster
[ERROR] [Console$] 15/06/26 09:16:27 INFO DiskBlockManager: Created local directory at /tmp/spark-3753c116-c92c-495d-af03-0d6af608ba98/blockmgr-dc722d5c-50e2-4d89-868b-7dfb58b5ee0f
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: MemoryStore started with capacity 530.3 MB
[ERROR] [Console$] 15/06/26 09:16:27 INFO HttpFileServer: HTTP File server directory is /tmp/spark-d6457086-562b-415d-b492-17a334fac3a4/httpd-e7de6291-e459-4074-855f-36df00aaab58
[ERROR] [Console$] 15/06/26 09:16:27 INFO HttpServer: Starting HTTP Server
[ERROR] [Console$] 15/06/26 09:16:27 INFO Server: jetty-8.y.z-SNAPSHOT
[ERROR] [Console$] 15/06/26 09:16:27 INFO AbstractConnector: Started SocketC...@0.0.0.0:44463
[ERROR] [Console$] 15/06/26 09:16:27 INFO Utils: Successfully started service 'HTTP file server' on port 44463.
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkEnv: Registering OutputCommitCoordinator
[ERROR] [Console$] 15/06/26 09:16:27 INFO Server: jetty-8.y.z-SNAPSHOT
[ERROR] [Console$] 15/06/26 09:16:27 INFO AbstractConnector: Started SelectChann...@0.0.0.0:4040
[ERROR] [Console$] 15/06/26 09:16:27 INFO Utils: Successfully started service 'SparkUI' on port 4040.
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkUI: Started SparkUI at http://localhost:4040
[ERROR] [Console$] 15/06/26 09:16:27 INFO Executor: Starting executor ID <driver> on host localhost
[ERROR] [Console$] 15/06/26 09:16:27 INFO AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@localhost:36319/user/HeartbeatReceiver
[ERROR] [Console$] 15/06/26 09:16:27 INFO NettyBlockTransferService: Server created on 48610
[ERROR] [Console$] 15/06/26 09:16:27 INFO BlockManagerMaster: Trying to register BlockManager
[ERROR] [Console$] 15/06/26 09:16:27 INFO BlockManagerMasterActor: Registering block manager localhost:48610 with 530.3 MB RAM, BlockManagerId(<driver>, localhost, 48610)
[ERROR] [Console$] 15/06/26 09:16:27 INFO BlockManagerMaster: Registered BlockManager
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkContext: Starting job: collect at PreparatorTest.scala:44
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Got job 0 (collect at PreparatorTest.scala:44) with 1 output partitions (allowLocal=false)
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Final stage: Stage 0(collect at PreparatorTest.scala:44)
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Parents of final stage: List()
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Missing parents: List()
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Submitting Stage 0 (ParallelCollectionRDD[0] at parallelize at PreparatorTest.scala:36), which has no missing parents
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: ensureFreeSpace(1064) called with curMem=0, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1064.0 B, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: ensureFreeSpace(786) called with curMem=1064, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 786.0 B, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:27 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:48610 (size: 786.0 B, free: 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:27 INFO BlockManagerMaster: Updated info of block broadcast_0_piece0
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:839
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Submitting 1 missing tasks from Stage 0 (ParallelCollectionRDD[0] at parallelize at PreparatorTest.scala:36)
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, PROCESS_LOCAL, 1379 bytes)
[ERROR] [Console$] 15/06/26 09:16:27 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
[ERROR] [Console$] 15/06/26 09:16:27 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 737 bytes result sent to driver
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Stage 0 (collect at PreparatorTest.scala:44) finished in 0,047 s
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Job 0 finished: collect at PreparatorTest.scala:44, took 0,112750 s
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkContext: Starting job: collect at PreparatorTest.scala:45
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 66 ms on localhost (1/1)
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Got job 1 (collect at PreparatorTest.scala:45) with 1 output partitions (allowLocal=false)
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Final stage: Stage 1(collect at PreparatorTest.scala:45)
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Parents of final stage: List()
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Missing parents: List()
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Submitting Stage 1 (ParallelCollectionRDD[1] at parallelize at PreparatorTest.scala:37), which has no missing parents
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: ensureFreeSpace(1064) called with curMem=1850, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 1064.0 B, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: ensureFreeSpace(786) called with curMem=2914, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 786.0 B, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
[ERROR] [Console$] 15/06/26 09:16:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:48610 (size: 786.0 B, free: 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:27 INFO BlockManagerMaster: Updated info of block broadcast_1_piece0
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:839
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Submitting 1 missing tasks from Stage 1 (ParallelCollectionRDD[1] at parallelize at PreparatorTest.scala:37)
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, localhost, PROCESS_LOCAL, 1662 bytes)
[ERROR] [Console$] 15/06/26 09:16:27 INFO Executor: Running task 0.0 in stage 1.0 (TID 1)
[ERROR] [Console$] 15/06/26 09:16:27 INFO Executor: Finished task 0.0 in stage 1.0 (TID 1). 1085 bytes result sent to driver
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Stage 1 (collect at PreparatorTest.scala:45) finished in 0,019 s
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Job 1 finished: collect at PreparatorTest.scala:45, took 0,059497 s
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkContext: Starting job: collect at PreparatorTest.scala:47
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Got job 2 (collect at PreparatorTest.scala:47) with 1 output partitions (allowLocal=false)
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Final stage: Stage 2(collect at PreparatorTest.scala:47)
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Parents of final stage: List()
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Missing parents: List()
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Submitting Stage 2 (ParallelCollectionRDD[3] at parallelize at PreparatorTest.scala:39), which has no missing parents
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: ensureFreeSpace(1144) called with curMem=3700, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 1144.0 B, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: ensureFreeSpace(891) called with curMem=4844, maxMem=556038881
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 22 ms on localhost (1/1)
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 891.0 B, free 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:27 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on localhost:48610 (size: 891.0 B, free: 530.3 MB)
[ERROR] [Console$] 15/06/26 09:16:27 INFO BlockManagerMaster: Updated info of block broadcast_2_piece0
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:839
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Submitting 1 missing tasks from Stage 2 (ParallelCollectionRDD[3] at parallelize at PreparatorTest.scala:39)
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, localhost, PROCESS_LOCAL, 1399 bytes)
[ERROR] [Console$] 15/06/26 09:16:27 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
[ERROR] [Console$] 15/06/26 09:16:27 INFO Executor: Finished task 0.0 in stage 2.0 (TID 2). 773 bytes result sent to driver
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Stage 2 (collect at PreparatorTest.scala:47) finished in 0,006 s
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Job 2 finished: collect at PreparatorTest.scala:47, took 0,057073 s
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 11 ms on localhost (1/1)
[ERROR] [Console$] 15/06/26 09:16:27 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool 
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/metrics/json,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/kill,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/static,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump/json,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/threadDump,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors/json,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/executors,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment/json,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/environment,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd/json,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/rdd,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage/json,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/storage,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool/json,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/pool,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage/json,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/stage,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages/json,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/stages,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job/json,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/job,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs/json,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO ContextHandler: stopped o.s.j.s.ServletContextHandler{/jobs,null}
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkUI: Stopped Spark web UI at http://localhost:4040
[ERROR] [Console$] 15/06/26 09:16:27 INFO DAGScheduler: Stopping DAGScheduler
[ERROR] [Console$] 15/06/26 09:16:27 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
[ERROR] [Console$] 15/06/26 09:16:27 INFO MemoryStore: MemoryStore cleared
[ERROR] [Console$] 15/06/26 09:16:27 INFO BlockManager: BlockManager stopped
[ERROR] [Console$] 15/06/26 09:16:27 INFO BlockManagerMaster: BlockManagerMaster stopped
[ERROR] [Console$] 15/06/26 09:16:27 INFO OutputCommitCoordinator$OutputCommitCoordinatorActor: OutputCommitCoordinator stopped!
[ERROR] [Console$] 15/06/26 09:16:27 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
[ERROR] [Console$] 15/06/26 09:16:27 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
[ERROR] [Console$] 15/06/26 09:16:27 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
[ERROR] [Console$] 15/06/26 09:16:27 INFO SparkContext: Successfully stopped SparkContext
[INFO] [Console$] [info] PreparatorTest:
[INFO] [Console$] [info] Preparator
[INFO] [Console$] [info] - should prepare PreparedData
[INFO] [Console$] [info] Run completed in 10 seconds, 915 milliseconds.
[INFO] [Console$] [info] Total number of tests run: 6
[INFO] [Console$] [info] Suites: completed 2, aborted 0
[INFO] [Console$] [info] Tests: succeeded 6, failed 0, canceled 0, ignored 0, pending 0
[INFO] [Console$] [info] All tests passed.
[INFO] [Console$] [info] Checking every *.class/*.jar file's SHA-1.
[INFO] [Console$] [info] Merging files...
[INFO] [Console$] [warn] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
[INFO] [Console$] [warn] Strategy 'discard' was applied to a file
[INFO] [Console$] [info] Assembly up to date: /home/dev/testmtkecoreco/target/scala-2.10/template-scala-parallel-ecommercerecommendation-assembly-0.1-SNAPSHOT-deps.jar
[INFO] [Console$] [success] Total time: 14 s, completed 26 June 2015 09:16:29
[INFO] [Console$] Build finished successfully.
[INFO] [Console$] Looking for an engine...
[INFO] [Console$] Found template-scala-parallel-ecommercerecommendation_2.10-0.1-SNAPSHOT.jar
[INFO] [Console$] Found template-scala-parallel-ecommercerecommendation-assembly-0.1-SNAPSHOT-deps.jar
[INFO] [plugins] [Tag] loaded [], sites []
[INFO] [RegisterEngine$] Registering engine mjaOHNTajMZJpHndxjg0AyIAhp1HtqK4 2679cce2b66d26effa839b6d393459db1614bb3e
[INFO] [Console$] Your engine is ready for training.


Now I import my data:
pio import --appid 7 --input data
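The input file is the usual one-JSON-event-per-line batch import format. A minimal sketch of one line (the user/item IDs here are made up, since my real data is not shown):

{"event": "view", "entityType": "user", "entityId": "u1", "targetEntityType": "item", "targetEntityId": "i1", "eventTime": "2015-06-26T09:00:00.000Z"}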

Now I launch pio train (still from the command line):
$ pio train
[INFO] [Console$] Using existing engine manifest JSON at /home/dev/testmtkecoreco/manifest.json
[INFO] [Runner$] Submission command: /opt/PredictionIO/vendors/spark-1.3.1/bin/spark-submit --class io.prediction.workflow.CreateWorkflow --jars file:/home/dev/testmtkecoreco/target/scala-2.10/template-scala-parallel-ecommercerecommendation_2.10-0.1-SNAPSHOT.jar,file:/home/dev/testmtkecoreco/target/scala-2.10/template-scala-parallel-ecommercerecommendation-assembly-0.1-SNAPSHOT-deps.jar --files file:/opt/PredictionIO/conf/log4j.properties,file:/opt/PredictionIO/vendors/hbase-1.0.0/conf/hbase-site.xml --driver-class-path /opt/PredictionIO/conf:/opt/PredictionIO/vendors/hbase-1.0.0/conf file:/opt/PredictionIO/lib/pio-assembly-0.9.3.jar --engine-id mjaOHNTajMZJpHndxjg0AyIAhp1HtqK4 --engine-version 2679cce2b66d26effa839b6d393459db1614bb3e --engine-variant file:/home/dev/testmtkecoreco/engine.json --verbosity 0 --json-extractor Both --env PIO_STORAGE_SOURCES_HBASE_TYPE=hbase,PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_FS_BASEDIR=/home/dev/.pio_store,PIO_STORAGE_SOURCES_HBASE_HOME=/opt/PredictionIO/vendors/hbase-1.0.0,PIO_HOME=/opt/PredictionIO,PIO_FS_ENGINESDIR=/home/dev/.pio_store/engines,PIO_STORAGE_SOURCES_LOCALFS_PATH=/home/dev/.pio_store/models,PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/opt/PredictionIO/vendors/elasticsearch-1.4.4,PIO_FS_TMPDIR=/home/dev/.pio_store/tmp,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE,PIO_CONF_DIR=/opt/PredictionIO/conf,PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(mtkappname))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[INFO] [Engine] Serving params: (,Empty)
[WARN] [Utils] Your hostname, dev-VirtualBox resolves to a loopback address: 127.0.1.1; using 192.168.1.107 instead (on interface eth0)
[WARN] [Utils] Set SPARK_LOCAL_IP if you need to bind to another address
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://spark...@192.168.1.107:52014]
[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: com.esillage.DataSource@47c767d8
[INFO] [Engine$] Preparator: com.esillage.Preparator@fe8780
[INFO] [Engine$] AlgorithmList: List(com.esillage.ECommAlgorithm@33749745)
[INFO] [Engine$] Data santiy check is on.
[WARN] [TableInputFormatBase] Cannot resolve the host name for dev-VirtualBox/127.0.1.1 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '1.1.0.127.in-addr.arpa'
[WARN] [TableInputFormatBase] Cannot resolve the host name for dev-VirtualBox/127.0.1.1 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '1.1.0.127.in-addr.arpa'
[WARN] [TableInputFormatBase] Cannot resolve the host name for dev-VirtualBox/127.0.1.1 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '1.1.0.127.in-addr.arpa'
[WARN] [TableInputFormatBase] Cannot resolve the host name for dev-VirtualBox/127.0.1.1 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '1.1.0.127.in-addr.arpa'
[INFO] [Engine$] com.esillage.TrainingData does not support data sanity check. Skipping check.
[INFO] [Engine$] com.esillage.PreparedData does not support data sanity check. Skipping check.
[WARN] [TableInputFormatBase] Cannot resolve the host name for dev-VirtualBox/127.0.1.1 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '1.1.0.127.in-addr.arpa'
[WARN] [TableInputFormatBase] Cannot resolve the host name for dev-VirtualBox/127.0.1.1 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '1.1.0.127.in-addr.arpa'
[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
[Stage 31:>                                                         (0 + 2) / 2][WARN] [LAPACK] Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
[WARN] [LAPACK] Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
[INFO] [Engine$] com.esillage.ECommModel does not support data sanity check. Skipping check.
[INFO] [Engine$] EngineWorkflow.train completed
[INFO] [Engine] engineInstanceId=AU4u5QaBs0waRdTTu7tL
[INFO] [CoreWorkflow$] Inserting persistent model
[INFO] [CoreWorkflow$] Updating engine instance
[INFO] [CoreWorkflow$] Training completed successfully.
[WARN] [QueuedThreadPool] 4 threads could not be stopped

Then I deploy:
$ pio deploy
[INFO] [Runner$] Submission command: /opt/PredictionIO/vendors/spark-1.3.1/bin/spark-submit --class io.prediction.workflow.CreateServer --jars file:/home/dev/testmtkecoreco/target/scala-2.10/template-scala-parallel-ecommercerecommendation_2.10-0.1-SNAPSHOT.jar,file:/home/dev/testmtkecoreco/target/scala-2.10/template-scala-parallel-ecommercerecommendation-assembly-0.1-SNAPSHOT-deps.jar --files file:/opt/PredictionIO/conf/log4j.properties,file:/opt/PredictionIO/vendors/hbase-1.0.0/conf/hbase-site.xml --driver-class-path /opt/PredictionIO/conf:/opt/PredictionIO/vendors/hbase-1.0.0/conf file:/opt/PredictionIO/lib/pio-assembly-0.9.3.jar --engineInstanceId AU4u5QaBs0waRdTTu7tL --engine-variant file:/home/dev/testmtkecoreco/engine.json --ip 0.0.0.0 --port 8000 --event-server-ip 0.0.0.0 --event-server-port 7070 --json-extractor Both --env PIO_STORAGE_SOURCES_HBASE_TYPE=hbase,PIO_ENV_LOADED=1,PIO_STORAGE_REPOSITORIES_METADATA_NAME=pio_meta,PIO_FS_BASEDIR=/home/dev/.pio_store,PIO_STORAGE_SOURCES_HBASE_HOME=/opt/PredictionIO/vendors/hbase-1.0.0,PIO_HOME=/opt/PredictionIO,PIO_FS_ENGINESDIR=/home/dev/.pio_store/engines,PIO_STORAGE_SOURCES_LOCALFS_PATH=/home/dev/.pio_store/models,PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch,PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH,PIO_STORAGE_REPOSITORIES_MODELDATA_SOURCE=LOCALFS,PIO_STORAGE_REPOSITORIES_EVENTDATA_NAME=pio_event,PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/opt/PredictionIO/vendors/elasticsearch-1.4.4,PIO_FS_TMPDIR=/home/dev/.pio_store/tmp,PIO_STORAGE_REPOSITORIES_MODELDATA_NAME=pio_model,PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE,PIO_CONF_DIR=/opt/PredictionIO/conf,PIO_STORAGE_SOURCES_LOCALFS_TYPE=localfs
[WARN] [WorkflowUtils$] Non-empty parameters supplied to com.esillage.Preparator, but its constructor does not accept any arguments. Stubbing with empty parameters.
[WARN] [WorkflowUtils$] Non-empty parameters supplied to com.esillage.Serving, but its constructor does not accept any arguments. Stubbing with empty parameters.
[WARN] [Utils] Your hostname, dev-VirtualBox resolves to a loopback address: 127.0.1.1; using 192.168.1.107 instead (on interface eth0)
[WARN] [Utils] Set SPARK_LOCAL_IP if you need to bind to another address
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://spark...@192.168.1.107:41739]
[INFO] [Engine] Using persisted model
[INFO] [Engine] Loaded model com.esillage.ECommModel for algorithm com.esillage.ECommAlgorithm
[INFO] [MasterActor] Undeploying any existing engine instance at http://0.0.0.0:8000
[WARN] [MasterActor] Nothing at http://0.0.0.0:8000
[INFO] [HttpListener] Bound to /0.0.0.0:8000
[INFO] [MasterActor] Engine is deployed and running. Engine API is live at http://0.0.0.0:8000.
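To sanity-check the deployment, a query like the following should return recommendations (a sketch using the template's default user/num query fields; "u1" is a made-up user ID):

$ curl -H "Content-Type: application/json" -d '{ "user": "u1", "num": 4 }' http://localhost:8000/queries.json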

I compared the contents of my pio-env file to the environment variables I set, and everything matches. One thing I noticed is that I don't have a folder called engines in
/home/dev/.pio_store/
even though my environment variables set:
PIO_FS_ENGINESDIR=/home/dev/.pio_store/engines

Nor do I have a tmp folder in /home/dev/.pio_store/, despite:
PIO_FS_TMPDIR=/home/dev/.pio_store/tmp

Could that be related?
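For reference, the variables I set in the Eclipse run configuration mirror the --env list in the spark-submit command above, e.g.:

PIO_STORAGE_REPOSITORIES_METADATA_SOURCE=ELASTICSEARCH
PIO_STORAGE_SOURCES_ELASTICSEARCH_TYPE=elasticsearch
PIO_STORAGE_SOURCES_ELASTICSEARCH_HOME=/opt/PredictionIO/vendors/elasticsearch-1.4.4
PIO_STORAGE_REPOSITORIES_EVENTDATA_SOURCE=HBASE
PIO_STORAGE_SOURCES_HBASE_TYPE=hbase
PIO_STORAGE_SOURCES_HBASE_HOME=/opt/PredictionIO/vendors/hbase-1.0.0
PIO_HOME=/opt/PredictionIO
PIO_CONF_DIR=/opt/PredictionIO/conf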

Anyway, when I launch pio train from Eclipse I still get the same error:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/dev/.ivy2/cache/org.slf4j/slf4j-log4j12/jars/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/dev/lib/pio-assembly-0.9.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/dev/lib/spark-assembly-1.3.1-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
[INFO] [Engine] Extracting datasource params...
[INFO] [WorkflowUtils$] No 'name' is found. Default empty String will be used.
[INFO] [Engine] Datasource params: (,DataSourceParams(mtkappname))
[INFO] [Engine] Extracting preparator params...
[INFO] [Engine] Preparator params: (,Empty)
[INFO] [Engine] Extracting serving params...
[INFO] [Engine] Serving params: (,Empty)
[WARN] [Utils] Your hostname, dev-VirtualBox resolves to a loopback address: 127.0.1.1; using 192.168.1.107 instead (on interface eth0)
[WARN] [Utils] Set SPARK_LOCAL_IP if you need to bind to another address
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://spark...@192.168.1.107:46970]
[INFO] [Engine$] EngineWorkflow.train
[INFO] [Engine$] DataSource: com.esillage.DataSource@3eb407cc
[INFO] [Engine$] Preparator: com.esillage.Preparator@57cc8cb1
[INFO] [Engine$] AlgorithmList: List(com.esillage.ECommAlgorithm@2fc16b7)
[INFO] [Engine$] Data santiy check is on.
[ERROR] [Common$] Invalid app name mtkappname
Exception in thread "main" java.lang.IllegalArgumentException: Invalid app name mtkappname
at io.prediction.data.store.Common$$anonfun$appNameToId$2.apply(Common.scala:47)
at io.prediction.data.store.Common$$anonfun$appNameToId$2.apply(Common.scala:45)
at scala.Option.getOrElse(Option.scala:120)
at io.prediction.data.store.Common$.appNameToId(Common.scala:45)
at io.prediction.data.store.PEventStore$.aggregateProperties(PEventStore.scala:103)
at com.esillage.DataSource.readTraining(DataSource.scala:31)
at com.esillage.DataSource.readTraining(DataSource.scala:18)
at io.prediction.controller.PDataSource.readTrainingBase(PDataSource.scala:37)
at io.prediction.controller.Engine$.train(Engine.scala:640)
at io.prediction.controller.Engine.train(Engine.scala:174)
at io.prediction.workflow.CoreWorkflow$.runTrain(CoreWorkflow.scala:65)
at io.prediction.workflow.CreateWorkflow$.main(CreateWorkflow.scala:246)
at io.prediction.workflow.CreateWorkflow.main(CreateWorkflow.scala)
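The call that fails is line 31 of my DataSource.scala, which looks roughly like this (simplified from the template; the val name is not exact):

    // dsp.appName comes from the "datasource" params in engine.json.
    // PEventStore.aggregateProperties looks the app name up in the metadata
    // store and throws IllegalArgumentException("Invalid app name ...") when
    // no app with that name is registered there.
    val usersRDD = PEventStore.aggregateProperties(
      appName = dsp.appName,
      entityType = "user"
    )(sc)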

bru no

unread,
Jun 30, 2015, 2:34:24 AM6/30/15
to Pat Ferrel, predicti...@googlegroups.com
Hello Pat,

Thanks again.

So I created a new app:
~$ pio app new mtkappname
[INFO] [HBLEvents] The table pio_event:events_7 doesn't exist yet. Creating now...
[INFO] [App$] Initialized Event Store for this app ID: 7.
[INFO] [App$] Created new app:
[INFO] [App$]       Name: mtkappname
[INFO] [App$]         ID: 7
[INFO] [App$] Access Key: j8O1U0F13em6HLnIdSuhYt8XYfqEAlkjmaiCyoG2Gnn5p1JfF2Wzc4HknwNq1ePZ

Here is the result when I launch:
$ pio app list
[INFO] [App$]                 Name |   ID |                                                       Access Key | Allowed Event(s)
[INFO] [App$]           mtkappname |    7 | j8O1U0F13em6HLnIdSuhYt8XYfqEAlkjmaiCyoG2Gnn5p1JfF2Wzc4HknwNq1ePZ | (all)
[INFO] [App$] Finished listing 1 app(s).
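With that access key, single events can also be posted to the event server directly (a sketch; the event server is on port 7070 per the deploy command above, and u1/i1 are made-up IDs):

$ curl -i -X POST "http://localhost:7070/events.json?accessKey=j8O1U0F13em6HLnIdSuhYt8XYfqEAlkjmaiCyoG2Gnn5p1JfF2Wzc4HknwNq1ePZ" -H "Content-Type: application/json" -d '{"event": "view", "entityType": "user", "entityId": "u1", "targetEntityType": "item", "targetEntityId": "i1"}'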


Note that my engine.json contains INVALID_APP_NAME and I have to change the name in the file manually. Is that the same for everybody, or is there something I'm not doing correctly?

I modified engine.json accordingly:
{
  "id": "default",
  "description": "Default settings",
  "engineFactory": "com.esillage.ECommerceRecommendationEngine",
  "datasource": {
    "params" : {
      "appName": "mtkappname"
    }
  },
  "algorithms": [
    {
      "name": "ecomm",
      "params": {
        "appName": "mtkappname",
        "unseenOnly": true,
        "seenEvents": ["buy", "view"],
        "similarEvents": ["view"],
        "rank": 10,
        "numIterations" : 20,
        "lambda": 0.01,
        "seed": 3
      }
    }
  ]
}
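After editing engine.json, I re-run the training and deployment steps so the new appName is picked up:

$ pio train
$ pio deploy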
