When a user expresses a preference for a tag, word, or term, whether in a search or in content such as descriptions, these can be treated as secondary events. In our experience the most useful are tags and search terms. Content can be used, but each term/token needs to be sent as a separate preference. Search phrases can also be used, though again turning them into tokens may be better.
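As a rough illustration of sending each term as a separate preference, the sketch below splits a search phrase into tokens and builds one event per distinct token. The event name `search-pref`, the helper name, and the exact field layout are assumptions modeled on PredictionIO-style events, not something confirmed in this thread; check them against the UR docs before use.

```python
import re

def phrase_to_preference_events(user_id, phrase, event_name="search-pref"):
    """Build one preference event per distinct token in a search phrase.

    `event_name` and the dict layout are hypothetical; adjust them to
    whatever your event server actually expects.
    """
    tokens = re.findall(r"[a-z0-9]+", phrase.lower())
    seen = set()
    events = []
    for token in tokens:
        if token in seen:
            continue  # one preference per distinct token
        seen.add(token)
        events.append({
            "event": event_name,
            "entityType": "user",
            "entityId": user_id,
            "targetEntityType": "item",  # assumed convention
            "targetEntityId": token,     # the token is the thing preferred
        })
    return events

events = phrase_to_preference_events("u1", "Red running shoes red")
```

Lowercasing and de-duplicating are simple choices for the sketch; stemming or stop-word removal may give better tokens.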
Please look through the docs here: http://actionml.com/docs/ur or the slide deck here: https://www.slideshare.net/pferrel/unified-recommender-39986309
The major innovation of CCO, the algorithm behind the UR, is the use of these cross-domain indicators. They are not guaranteed to predict conversions, but the CCO algo tests them and weights them low if they do not. So we tend to test the strength of prediction of the entire category of indicator and drop it if weak, or set a minLLR threshold and filter weak individual indicators out.
Technically these are not called latent; that term has another meaning in Machine Learning, having to do with Latent Factor Analysis.
On Jun 1, 2017, at 11:26 PM, Marius Rabenarivo <mariusra...@gmail.com> wrote:
Hello everyone!
Do you have an idea on how to use latent information associated with items, like tags or word vector embeddings, in Mahout's SimilarityAnalysis.cooccurrences?
Regards,
Marius
--
You received this message because you are subscribed to the Google Groups "actionml-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to actionml-user+unsubscribe@googlegroups.com.
By purchasing an item with a tag that you have given it, the user is displaying a preference for that tag.
On Jun 3, 2017, at 12:36 PM, Marius Rabenarivo <mariusra...@gmail.com> wrote:
So the tag here is assumed to be a tag given by the user to an item?
I was thinking that it was some kind of tag we give to the item by some means (classification, LDA, etc.)
A = history of all purchases (in the e-com case)
B = history of all tag preferences
r = [A’A]h_a + [A’B]h_b
The part in the slides about content-based recs is not needed here because you have captured them as user preferences.
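A small worked example of r = [A’A]h_a + [A’B]h_b may help. The sketch below uses raw cooccurrence counts on toy data; real CCO replaces the raw counts with LLR-filtered indicators, so this only illustrates the shape of the computation. The matrices, user histories, and helper names are all made up for illustration.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Rows are users; columns of A are items, columns of B are tags (toy data)
A = [[1, 1, 0],   # u0 bought i0, i1
     [1, 0, 1],   # u1 bought i0, i2
     [0, 1, 0]]   # u2 bought i1
B = [[1, 0],      # u0 prefers t0
     [0, 1],      # u1 prefers t1
     [1, 0]]      # u2 prefers t0

At = transpose(A)
AtA = matmul(At, A)   # item-item cooccurrence (raw counts, no LLR)
AtB = matmul(At, B)   # item-tag cross-occurrence (raw counts, no LLR)

h_a = [1, 0, 0]       # the current user's purchases: i0
h_b = [0, 1]          # the current user's tag preferences: t1

# r = [A'A]h_a + [A'B]h_b, one score per item
r = [x + y for x, y in zip(matvec(AtA, h_a), matvec(AtB, h_b))]
```

Here r works out to [3, 1, 2]. In practice the item the user already purchased (i0) would normally be excluded from the results.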
On Jun 2, 2017, at 7:22 PM, Marius Rabenarivo <mariusra...@gmail.com> wrote:
Please correct side to size in my previous e-mail
To post to this group, send email to actionml-user@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/actionml-user/CAC-ATVEO_YON-5E95iPJjBR-FUgEv8TQsOA0rtD-xg0u-tNA_g%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
TT’ does not solve cold start because you need user history for personalization. There are several other techniques that I’ve mentioned many times on the list that help with cold start, but TT’ is for a slightly different thing. Its use is when you have a user’s history of item preferences but the items are too old to recommend and you only want to recommend new ones with no history. If you think about news, it is close to being like this. Or patent applications, law opinions, or judgments too. To be helpful there needs to be a lot of content for each item and you only want new things recommended.
Which cold start do you need to “solve”: new anonymous users with no history, or items with no conversions? Search the PIO list and AML group for past posts on this.
Tag use is implemented as both CF and content similarity (not TT’). If you ask for item-based recommendations and the item has no conversions, you will get popular items by default. If you boost items with the same tags as the item the user is looking at, you get popular items, mostly with similar tags. If you disable the popularity part you get items with similar tags. This requires that you attach tags to the items with $set, and your query should contain the tags (or any other properties) of the example item. There are many ways of mixing this. You could also just get recs and mix in new inventory by some small random amount. You can use different placements for these so you aren’t ruining recs with too many randomized cold items.
Anyway, the best way to do this depends on your GUI and data.
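The two pieces mentioned above, attaching tags to an item with $set and passing the example item's tags in the query, might look roughly like the payloads built below. The field names ("properties", "fields", "name", "values", "bias", "num") are my understanding of the PredictionIO/UR JSON conventions and should be treated as assumptions to verify against the UR docs; the helper names, item id, and bias value are purely illustrative.

```python
import json

def set_item_tags_event(item_id, tags):
    # Assumed shape of a $set event that attaches tags to an item
    return {
        "event": "$set",
        "entityType": "item",
        "entityId": item_id,
        "properties": {"tags": list(tags)},
    }

def item_query_with_tag_boost(item_id, tags, bias=2.0, num=10):
    # Assumed shape of an item-based query that boosts items sharing tags;
    # a bias above 1 is taken here to mean "boost" (verify in the UR docs)
    return {
        "item": item_id,
        "num": num,
        "fields": [{"name": "tags", "values": list(tags), "bias": bias}],
    }

tags = ["wool", "winter"]                # hypothetical tags
event = set_item_tags_event("sweater-42", tags)
query = item_query_with_tag_boost("sweater-42", tags)
payload = json.dumps(query)              # body you would POST to the engine
```

The query repeats the example item's own tags because, as described above, the query should contain the tags of the item the user is looking at.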
On Jun 4, 2017, at 11:35 AM, Marius Rabenarivo <mariusra...@gmail.com> wrote:
I didn't mean to tell you what it means, but I just wanted to make it clear for my part.
As I understand, the T part is a personalization that we should make if we want to use content-based information when doing recommendation.
For my use case, I want to use it to overcome the cold start problem.
I was thinking that it was already implemented as you documented it in the slides, but I didn't find tag use in the code.
Is it SimilarityAnalysis.rowSimilarity() in Mahout that implements TT'? (just to confirm)
No offense Marius, but I wrote the slides and the equation so I do indeed know what they are saying. Whether a user writes a tag or you are detecting the user’s preference for a tag you wrote, they are user indicators of preference. The LLR filtering of these secondary indicators is what CCO is all about, and it leaves you with a model that can be compared to a user’s history and contains only indicators that correlate to some conversion behavior.
T in the "whole enchilada" is used to personalize content-based recommendations. Each row of T represents an item and its content as tokens. Tokens are stemmed, tokenized text terms, or can be entities in the item’s text (using some form of NLP), or tags, etc. TT’ then gives you items and the items that are most similar in terms of whatever content you were using in T. Now you take the user’s history of content item preferences (which articles did they read, for instance) and find the most similar items in TT’. These will be personalized content-based recommendations.
This is not implemented in the UR but is in the CCO tools in Mahout. The reason it is not implemented is that it still requires user history, and content-based recs are worse predictors than collaborative filtering with user history. In CF you treat the terms or tags as indicators of preference; you do not find items similar by content.
The personalized content-based recs may serve for edge conditions where recommending items with no usage behavior is the most common case, like news articles, where you have new items all the time with no usage events. In this case extracting something better than “bag-of-words” for content is quite important, so highly detailed user tagging or NLP techniques can greatly increase the quality of results.
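The TT’ idea described above can be sketched on toy data: build item-item content similarity from an item-by-token matrix T, multiply by the user's history of items read, and rank only the new items. This uses raw token overlap counts instead of the LLR-weighted similarity the Mahout tools would produce, and every name and number here is an illustrative assumption.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    yt = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in yt] for row in x]

def personalized_content_recs(T, history, candidates):
    """Score items by content similarity to the user's history.

    T: item-by-token matrix; history: 0/1 vector over items the user read;
    candidates: indices of items eligible to be recommended.
    """
    TTt = matmul(T, transpose(T))  # item-item similarity via shared tokens
    scores = [sum(s * h for s, h in zip(row, history)) for row in TTt]
    ranked = sorted(candidates, key=lambda i: -scores[i])
    return scores, ranked

# Rows are items, columns are tokens (toy data)
T = [[1, 1, 0, 0],   # item 0: old article the user read
     [1, 1, 1, 0],   # item 1: new article sharing two tokens with item 0
     [0, 0, 1, 1]]   # item 2: new article sharing none

history = [1, 0, 0]  # the user read item 0 only
scores, ranked = personalized_content_recs(T, history, candidates=[1, 2])
```

Item 1 ranks above item 2 because it shares content tokens with what the user read. Note that this still needs user history, which is why it does not address cold start for new users.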
On Jun 4, 2017, at 4:09 AM, Marius Rabenarivo <mariusra...@gmail.com> wrote:
IMHO, T represents tags in an Anonymous tag (or property) labeling task, and what you propose is Personalized tag (or property) labeling,
as described in https://arxiv.org/pdf/1203.4487.pdf (Section 1.4.5 Emerging new classification), p. 40