Re: Docs Universal Recommender


Pat Ferrel

May 8, 2017, 1:02:51 PM
to us...@predictionio.incubator.apache.org, actionml-user
yes to all for UR v0.5.0

UR v0.6.0 is sitting in the `develop` branch waiting for one more minor fix before release. It uses the latest Mahout release, 0.13.0, so there is no need to build Mahout for the project. It has several new features too. I expect it to be out this week.


On May 8, 2017, at 3:07 AM, Dennis Honders <dennis...@gmail.com> wrote:

Hi, 

Are the following docs up-to-date?

Is version 0.11.0 suitable for UR?

Is 0.5.0 the latest version? 
Is Mahout still necessary?

Thanks,

Dennis

Pat Ferrel

May 10, 2017, 3:52:51 PM
to us...@predictionio.incubator.apache.org, actionml-user
That is how to make personalized content-based recommendations. You'd have to input content by attaching it to items and recording it separately as a usage event per content bit. The input, for instance, would be every term in the description of an item the user purchased. The input would be huge and the current UR + PIO is not optimized for that kind of input. It is not a recommended way to use the UR and is of dubious value without NLP techniques such as word2vec or NER in place of bag-of-words type content. It might be OK if you have rich metadata like categories or tags.

In general, content-based recommendations are often little better than some filtering of popular or rotating promoted items (with no purchase history). Both can be done fairly easily with the UR.

Content-based recommendations with NLP techniques for short-lived items like news can work well but require extra phases in front of the recommender to do the NLP.


On May 10, 2017, at 12:33 PM, Marius Rabenarivo <mariusra...@gmail.com> wrote:

Hello,

2017-05-10 21:10 GMT+04:00 Pat Ferrel <p...@occamsmachete.com>:
Content-based recommendations are based on, well, content. You can really only make recs if you have an example item, as with the recommendations you see at the bottom of a product page on Amazon.

For this, make sure to have lots of item properties. Even keywords from descriptions will work, but also categories, tags, brands, price ranges, etc. These must all be encoded as JSON arrays of strings, so a price might be one of ["$0-$1", "$1-$5", …], while other things like descriptions, categories, or tags can have several strings attached.
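A minimal sketch of that encoding, assuming the standard PredictionIO `$set` event is used to attach properties to an item. The property names (`priceRange`, `categories`, `tags`, `brand`), the item id, and every price bucket beyond the two given above are made up for illustration:

```python
import json

# Hypothetical buckets: only "$0-$1" and "$1-$5" come from the thread.
PRICE_BUCKETS = [(0, 1, "$0-$1"), (1, 5, "$1-$5"), (5, 20, "$5-$20")]


def price_range(price):
    """Map a numeric price to a bucket label so it can be stored as a string."""
    for low, high, label in PRICE_BUCKETS:
        if low <= price < high:
            return label
    return "$20+"  # hypothetical catch-all bucket


def item_properties_event(item_id, price, categories, tags, brand):
    """Build a $set event whose property values are all arrays of strings."""
    return {
        "event": "$set",
        "entityType": "item",
        "entityId": item_id,
        "properties": {
            "priceRange": [price_range(price)],  # single value, still an array
            "categories": list(categories),
            "tags": list(tags),
            "brand": [brand],
        },
    }


event = item_properties_event(
    "sku-123", 3.5, ["kitchen"], ["steel", "dishwasher-safe"], "Acme"
)
print(json.dumps(event, indent=2))
```

Single-valued properties such as the price range or brand are still wrapped in a one-element array so every property has the same shape.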

Then issue an item-based query with itemBias set higher (>1) so that usage information is used first, before content, since it performs better. Then add query fields for the various properties, filled in with the values of the item referenced in the "item" field.

You will get similar items based on usage data unless there is none, in which case content will take over to recommend things with similar content. Play with the itemBias: try values >1 by varying amounts, since you want usage-based similarity over content most of the time when you have usage data in the model. There is no hard rule for the bias.
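A rough sketch of such a query body, built as a Python dict. The `item`, `itemBias`, and `fields` keys follow the UR query format as I understand it; the property names, the bias values, and the `num` setting are placeholders to tune, not recommendations:

```python
import json


def item_query(item_id, item_properties, item_bias=2.0, field_bias=1.0, num=10):
    """Build an item-based query that also passes the item's own properties.

    item_properties: dict of property name -> list of strings, i.e. the same
    values that were stored for the item referenced in "item".
    """
    fields = [
        # field_bias is an assumption: a positive value is meant as a boost.
        {"name": name, "values": list(values), "bias": field_bias}
        for name, values in sorted(item_properties.items())
    ]
    return {
        "item": item_id,
        "itemBias": item_bias,  # >1 favors usage-based similarity
        "fields": fields,
        "num": num,
    }


query = item_query(
    "sku-123",
    {"categories": ["kitchen"], "brand": ["Acme"], "priceRange": ["$1-$5"]},
)
print(json.dumps(query, indent=2))
```

The resulting JSON would be posted to the deployed engine's query endpoint. Trying several `item_bias` values above 1 and comparing the results is the kind of experimentation suggested above.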

  
On May 10, 2017, at 6:36 AM, Dennis Honders <dennis...@gmail.com> wrote:

According to the docs, the UR is considered a hybrid of collaborative filtering and content-based filtering.
In my case I have a purchase history. Quite a lot of products are never bought so traditional techniques won't be able to make recommendations. For those products (never bought/sold), will recommendations be made with content-based filtering techniques?
If so, what techniques are used in UR?

Marius Rabenarivo

May 10, 2017, 4:05:50 PM
to us...@predictionio.incubator.apache.org, actionml-user
So in your opinion, should the NLP task be done in the Engine part using a library like mallet, or should it be implemented in an algorithm-focused library such as mahout?

Pat Ferrel

May 10, 2017, 4:12:37 PM
to Marius Rabenarivo, us...@predictionio.incubator.apache.org, actionml-user
What are your items? How much text? What other content? Unless you are recommending long-form blogs or news, NLP won't give you much, except maybe word2vec, which, if it has a good model, will give better results than bag-of-words.



Marius Rabenarivo

May 10, 2017, 4:19:59 PM
to Pat Ferrel, us...@predictionio.incubator.apache.org, actionml-user
My items are products with a name and description, and maybe a caption extracted from the image too.
