PIO + ES5 + Universal Recommender

Pat Ferrel

Nov 1, 2017, 6:44:30 PM
to us...@predictionio.apache.org, actionml-user
We have a version working here: https://github.com/actionml/universal-recommender.git
Check out the 0.7.0-SNAPSHOT branch once you have cloned the repo.

Known bug: exclusion rules are not working. This will be fixed before release in the next few days.

Issues: do not trust the integration test. Lucene and ES have changed their scoring method, so you cannot compare the old scores to the new ones. The test will be fixed before release.

You must build the Template with pio v0.12.0 using Scala 2.11, Spark 2.2.1, ES 5.

Pat Ferrel

Nov 1, 2017, 7:30:19 PM
to us...@predictionio.apache.org, actionml-user
Ack, I hate this &^%&%^&  touchbar!

What I meant to say was:


We have a version of the universal recommender working with PIO-0.12.0 that is ready for brave souls to test. This includes some speedups and quality of recommendation improvements, not yet documented. 

Known bugs: exclusion rules are not working. This will be fixed before release in the next few days.

Issues: do not trust the integration test. Lucene and ES have changed their scoring method, so you cannot compare the old scores to the new ones. The test will be fixed before release, but do trust it to populate PIO with some sample data you can play with.

You must build PredictionIO with the default parameters, so just run `./make-distribution.sh`. This requires you to install Scala 2.11, Spark 2.1.1 or greater, ES 5.5.2 or greater, and Hadoop 2.6 or greater. If you have issues getting pio to build and run, send questions to the PIO mailing list. Once PIO is running, test it with `pio status` and `pio app list`. You will need to create an app and import your data, or run the integration test to get some sample data installed in the “handmade” app.

*Back up your data*: moving from ES 1 to ES 5 will delete all data!!!! Actually, even worse, it is still in HBase but you can’t get at it. So to upgrade, do the following:
  • `pio export` with pio < 0.12.0 =====*Before upgrade!*=====
  • `pio app data-delete` all your old apps =====*Before upgrade!*=====
  • build and install pio 0.12.0 including all the services =====*The point of no return!*=====
  • `pio app new …` and `pio import…` any needed datasets
  • download and build Mahout for Scala 2.11 from this repo: https://github.com/actionml/mahout.git and follow the instructions in the README.md
  • download the UR from here: https://github.com/actionml/universal-recommender.git and check out branch 0.7.0-SNAPSHOT
  • in the UR's build.sbt, replace the path in the line `resolvers += "Local Repository" at "file:///Users/pat/.custom-scala-m2/repo"` with the path to your local Mahout build
  • build the UR with `pio build`, or run the integration test to get sample data put into PIO: `./examples/integration-test`
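
The ordering of the data steps above matters, so here is a rough sketch of the two halves as shell functions. It is not an official script: the app name, app ids and paths are hypothetical placeholders, the delete step assumes the `pio app data-delete` form of the command, and by default nothing is executed (each command is only printed) so you can review the sequence first.

```shell
#!/usr/bin/env bash
# Sketch of the upgrade order described above. Commands are only printed
# unless DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Run with the OLD pio (< 0.12.0), before upgrading anything.
# app_name, app_id and backup_dir are placeholders for your own values.
backup_before_upgrade() {
  local app_name="$1" app_id="$2" backup_dir="$3"
  run pio export --appid "$app_id" --output "$backup_dir/$app_name"
  run pio app data-delete "$app_name"
}

# Run after pio 0.12.0 and all its services are installed and running.
# new_app_id is the id that `pio app new` reports for the recreated app;
# events_path is wherever your exported events ended up.
restore_after_upgrade() {
  local app_name="$1" new_app_id="$2" events_path="$3"
  run pio app new "$app_name"
  run pio import --appid "$new_app_id" --input "$events_path"
}
```

For example, `backup_before_upgrade myapp 1 /tmp/pio-backup` prints the export and delete commands for an app called `myapp`; rerun with `DRY_RUN=0` only once you are happy with what it prints.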

This will use the released PIO and the alpha UR.

This will be much easier when it’s released.

Pat Ferrel

Nov 3, 2017, 5:27:52 PM
to us...@predictionio.apache.org, actionml-user
The exclusion rules are working now, along with the integration-test. We have some cleanup to do, but please feel free to try it.

Please note the upgrade issues mentioned earlier in this thread before you start. Fresh installs should have no such issues.


--
You received this message because you are subscribed to the Google Groups "actionml-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to actionml-use...@googlegroups.com.
To post to this group, send email to action...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/actionml-user/326BE669-574B-45A5-AAA5-6A285BA0B33E%40occamsmachete.com.
For more options, visit https://groups.google.com/d/optout.

bernhard...@maxodus.net

Nov 6, 2017, 6:52:47 AM
to actionml-user
Hi Pat, 

Cloning https://github.com/actionml/mahout.git yields

remote: Repository not found.
fatal: repository 'https://github.com/actionml/mahout.git/' not found

for me. Is this the right repo?

Cheers, 

Bernhard

Pat Ferrel

Nov 6, 2017, 12:04:19 PM
to bernhard...@maxodus.net, actionml-user
There is a script named `build-scala-2.11.sh` that you should run to build a local Maven repo (change the path that it uses). You will use that repo later with the UR.
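
The path you choose in the build script is the same one that has to appear in the resolver line of the UR's build.sbt. As a small sketch (the function name is made up for illustration), this prints the resolver line for a given local repo directory, so you can paste it over the one that points at `/Users/pat/...`:

```shell
#!/usr/bin/env bash
# Print the sbt resolver line for a local Maven repo directory.
# Pass the same directory you configured in build-scala-2.11.sh.
resolver_line() {
  local repo_dir="$1"
  # A file URL for an absolute path is "file://" followed by the path,
  # which gives three slashes in total, e.g. file:///home/me/repo.
  case "$repo_dir" in
    /*) ;;
    *) echo "repo path must be absolute: $repo_dir" >&2; return 1 ;;
  esac
  printf 'resolvers += "Local Repository" at "file://%s"\n' "$repo_dir"
}
```

Calling `resolver_line "$HOME/.custom-scala-m2/repo"` prints the line for a repo under your home directory; a relative path is rejected because sbt needs an absolute file URL here.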



Noelia Osés Fernández

Nov 7, 2017, 3:52:26 AM
to us...@predictionio.apache.org, actionml-user
Thank you, Pat!

I have a problem with the Mahout repo, though. I get the following error message:


remote: Repository not found.
fatal: repository 'https://github.com/actionml/mahout.git/' not found





--
Noelia Osés Fernández, PhD
Senior Researcher
no...@vicomtech.org
+[34] 943 30 92 30
Data Intelligence for Energy and Industrial Processes

Pat Ferrel

Nov 7, 2017, 12:01:24 PM
to Noelia Osés Fernández, us...@predictionio.apache.org, actionml-user
Very sorry, it was incorrectly set to private. Try it again.




Pat Ferrel

Nov 7, 2017, 2:58:20 PM
to actionml-user, us...@predictionio.apache.org
Very sorry, it was incorrectly set to private. Try it again.





Pat Ferrel

Nov 8, 2017, 11:52:17 AM
to us...@predictionio.apache.org, actionml-user
“mvn not found” means Maven is not installed. Install Maven (`mvn`), then rerun the build script.

This step will go away with the next Mahout release.
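
Until then, a quick pre-flight check saves a failed run. This is only a sketch with made-up function names: it checks that the tools the build script calls are on your PATH before starting it. On Debian or Ubuntu the package is usually installed with `sudo apt-get install maven`, but check what your distribution provides.

```shell
#!/usr/bin/env bash
# Fail early when a required build tool is not on PATH.
require_cmd() {
  local cmd="$1"
  if ! command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd not found on PATH; install it before running the build" >&2
    return 1
  fi
}

# Check the tools the Mahout build script calls, then run the script.
# Run this from the root of the cloned Mahout repo.
build_mahout() {
  require_cmd git && require_cmd mvn && ./build-scala-2.11.sh
}
```

With this, `build_mahout` stops with a clear message instead of printing “command not found” once per `mvn` line in the script.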


On Nov 8, 2017, at 2:41 AM, Noelia Osés Fernández <no...@vicomtech.org> wrote:

Thanks Pat!

I have followed the instructions on the README.md file of the mahout folder:


You will need to build this using Scala 2.11. Follow these instructions

 - install Scala 2.11 as your default version

I've done this with the following commands:

# scala install
wget www.scala-lang.org/files/archive/scala-2.11.7.deb
sudo dpkg -i scala-2.11.7.deb
# sbt installation
echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 642AC823
sudo apt-get update
sudo apt-get install sbt

 - download this repo: `git clone https://github.com/actionml/mahout.git`
 - checkout the speedup branch: `git checkout sparse-speedup-13.0`
 - edit the build script `build-scala-2.11.sh` to put the custom repo where you want it

This file is now:

#!/usr/bin/env bash

git checkout sparse-speedup-13.0

mvn clean package -DskipTests -Phadoop2 -Dspark.version=2.1.1 -Dscala.version=2.11.11 -Dscala.compat.version=2.11

echo "Make sure to put the custom repo in the right place for your machine!"
echo "This location will have to be put into the Universal Recommenders build.sbt"

mvn deploy:deploy-file -Durl=file:///home/ubuntu/PredictionIO/apache-predictionio-0.12.0-incubating/PredictionIO-0.12.0-incubating/vendors/.custom-scala-m2/repo/ -Dfile=//home/ubuntu/PredictionIO/apache-predictionio-0.12.0-incubating/PredictionIO-0.12.0-incubating/vendors/mahout/hdfs/target/mahout-hdfs-0.13.0.jar -DgroupId=org.apache.mahout -DartifactId=mahout-hdfs -Dversion=0.13.0
mvn deploy:deploy-file -Durl=file:///home/ubuntu/PredictionIO/apache-predictionio-0.12.0-incubating/PredictionIO-0.12.0-incubating/vendors/.custom-scala-m2/repo/ -Dfile=//home/ubuntu/PredictionIO/apache-predictionio-0.12.0-incubating/PredictionIO-0.12.0-incubating/vendors/mahout/math/target/mahout-math-0.13.0.jar -DgroupId=org.apache.mahout -DartifactId=mahout-math -Dversion=0.13.0
mvn deploy:deploy-file -Durl=file:///home/ubuntu/PredictionIO/apache-predictionio-0.12.0-incubating/PredictionIO-0.12.0-incubating/vendors/.custom-scala-m2/repo/ -Dfile=//home/ubuntu/PredictionIO/apache-predictionio-0.12.0-incubating/PredictionIO-0.12.0-incubating/vendors/mahout/math-scala/target/mahout-math-scala_2.11-0.13.0.jar -DgroupId=org.apache.mahout -DartifactId=mahout-math-scala_2.11 -Dversion=0.13.0
mvn deploy:deploy-file -Durl=file:///home/ubuntu/PredictionIO/apache-predictionio-0.12.0-incubating/PredictionIO-0.12.0-incubating/vendors/.custom-scala-m2/repo/ -Dfile=//home/ubuntu/PredictionIO/apache-predictionio-0.12.0-incubating/PredictionIO-0.12.0-incubating/vendors/mahout/spark/target/mahout-spark_2.11-0.13.0.jar -DgroupId=org.apache.mahout -DartifactId=mahout-spark_2.11 -Dversion=0.13.0

 - execute the build script `build-scala-2.11.sh`

This output the following:


$ ./build-scala-2.11.sh
M    build-scala-2.11.sh
Already on 'sparse-speedup-13.0'
Your branch is up-to-date with 'origin/sparse-speedup-13.0'.
./build-scala-2.11.sh: line 5: mvn: command not found
Make sure to put the custom repo in the right place for your machine!
This location will have to be put into the Universal Recommenders build.sbt
./build-scala-2.11.sh: line 10: mvn: command not found
./build-scala-2.11.sh: line 11: mvn: command not found
./build-scala-2.11.sh: line 12: mvn: command not found
./build-scala-2.11.sh: line 13: mvn: command not found


Do I need to install Maven? If so, this is not mentioned in either the PredictionIO installation instructions or the Mahout instructions.

I apologise if this is an obvious question for those familiar with the Apache projects, but for an outsider like me it helps when everything (even the silliest details) is spelled out. Thanks a lot for all your invaluable help!!
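
The four `mvn deploy:deploy-file` lines in the edited build script quoted above differ only in the module directory and artifact name, so the two paths have to be changed in eight places. One way to avoid that, as a sketch rather than a replacement for the script in the repo, is to generate the commands from two variables. The function name and arguments are hypothetical; the module and artifact names are copied from the script above. It only prints the commands so you can inspect them before running anything.

```shell
#!/usr/bin/env bash
# Print the four deploy-file commands for a given Mahout checkout
# (mahout_dir) and local Maven repo directory (repo_dir).
deploy_cmds() {
  local mahout_dir="$1" repo_dir="$2" version="0.13.0"
  local entry module artifact
  # Each entry is "<module directory>:<artifact id>".
  for entry in hdfs:mahout-hdfs math:mahout-math \
               math-scala:mahout-math-scala_2.11 spark:mahout-spark_2.11; do
    module="${entry%%:*}"
    artifact="${entry##*:}"
    echo mvn deploy:deploy-file \
      "-Durl=file://$repo_dir/" \
      "-Dfile=$mahout_dir/$module/target/$artifact-$version.jar" \
      -DgroupId=org.apache.mahout "-DartifactId=$artifact" "-Dversion=$version"
  done
}
```

For example, `deploy_cmds "$PWD" "$HOME/.custom-scala-m2/repo"` run from the Mahout checkout prints one command per artifact; pipe the output to `sh` once the paths look right.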
 

bernhard...@maxodus.net

Nov 8, 2017, 2:01:00 PM
to actionml-user
Hi Pat, 

Finally, success :) After failing the integration test (because port 8000 is in use for something else), the UR is up and running on my Ubuntu 16.04 test server.

Will test a bit in the upcoming days!

Cheers, 

Bernhard

Pat Ferrel

Nov 8, 2017, 4:20:38 PM
to bernhard...@maxodus.net, actionml-user
Awesome, thanks!

You can change the test script to use a different port with `pio deploy --port 1234`, but make sure to pass the same port to the query script. It defaults to 8000, but there is already an option for it. I'm not sure whether the Python script needs to be modified to use it with the SDK query client.
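
If you just want to check a deployment on a non-default port by hand, something like the following may be enough. It is a sketch, not part of the integration test: the function names are made up, it assumes the engine answers on the standard PIO `/queries.json` endpoint on localhost, and it sends the simplest user-based UR query.

```shell
#!/usr/bin/env bash
# Build the query URL for a deployed engine; the port defaults to 8000,
# the same default that `pio deploy` uses.
query_url() {
  local host="${1:-localhost}" port="${2:-8000}"
  printf 'http://%s:%s/queries.json\n' "$host" "$port"
}

# Send one user-based query to the engine on the given port.
# Assumes user_id contains no quotes or backslashes.
query_user() {
  local user_id="$1" port="${2:-8000}"
  curl -s -H 'Content-Type: application/json' \
    -d "{\"user\": \"$user_id\"}" \
    "$(query_url localhost "$port")"
}
```

After `pio deploy --port 1234`, calling `query_user some-user-id 1234` should return a JSON list of recommendations if the model was trained on data that includes that user.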


Noelia Osés Fernández

Nov 9, 2017, 4:52:14 AM
to us...@predictionio.apache.org, actionml-user
It worked! Thank you Pat!