Empty results with bias -1

tom...@magicinternet.de

Jul 26, 2016, 10:51:24 AM
to actionml-user
Hello everyone,

I use PredictionIO 0.9.5 with the Universal Recommender template 0.2.3.

I run the following query:

{"user":"598430", "fields": [{"name": "duration", "values":["574"], "bias": -1}]}

which produces the following ES query:

{"size":20,"query":{"bool":{"should":[{"terms":{"watched":[]}},{"terms":{"skipped":[]}},{"terms":{"discard":[]}},{"terms":{"preferences-tags":[]}},{"terms":{"preferences-language":[]}},{"terms":{"preferences-type":[]}},{"terms":{"preferences-provider":[]}},{"terms":{"preferences-communityId":[]}},{"constant_score":{"filter":{"match_all":{}},"boost":0}}],"must":[{"terms":{"duration":["574"],"boost":0}}],"must_not":{"ids":{"values":[],"boost":0}},"minimum_should_match":1}},"sort":[{"_score":{"order":"desc"}},{"popRank":{"unmapped_type":"double","order":"desc"}}]}



Although there are items that match this filter, the result is empty.

Does anyone have a clue for me?


Thank you in advance,

Tom

tom...@magicinternet.de

Jul 27, 2016, 7:06:46 AM
to actionml-user, tom...@magicinternet.de
Anyone? I am quite lost here.

I have already cleaned all the data, loaded new data, and created a new app, both with the old data and with new data.
I always get an empty result ({"itemScores":[]}) when I set the bias to -1. When I set it to other values, it works.

The build, train, and deploy stages all complete with no errors.

Any hints, anyone?

Thanks, your help is appreciated.

Pat Ferrel

Jul 27, 2016, 8:13:31 PM
to tom...@magicinternet.de, actionml-user
Filters work on recommendations, not items, so if there is no recommendation you will get nothing returned. Since there is no other data in the Elasticsearch query (all the "terms" lists are empty), we can see that there is no user history for user 598430, hence no recommendations.
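
The check described here can be scripted. This is a minimal sketch, assuming the ES query has the shape shown in the first message (a "bool" query whose "should" clauses are "terms" lookups, one per indicator). The function names `history_terms` and `has_user_history` are hypothetical and not part of the Universal Recommender, and the sample query is a shortened copy of the one posted above.

```python
import json

def history_terms(es_query: dict) -> dict:
    """Map each indicator field in the bool/should clauses to its list of history terms."""
    should = es_query["query"]["bool"].get("should", [])
    fields = {}
    for clause in should:
        # Only `terms` clauses carry user history; constant_score etc. are skipped.
        for field, values in clause.get("terms", {}).items():
            if isinstance(values, list):
                fields[field] = values
    return fields

def has_user_history(es_query: dict) -> bool:
    """True if at least one indicator field has a non-empty list of terms."""
    return any(history_terms(es_query).values())

# Shortened copy of the query from the first message (assumption: same structure).
query = json.loads(
    '{"size":20,"query":{"bool":{"should":['
    '{"terms":{"watched":[]}},{"terms":{"skipped":[]}},'
    '{"constant_score":{"filter":{"match_all":{}},"boost":0}}],'
    '"must":[{"terms":{"duration":["574"],"boost":0}}],'
    '"minimum_should_match":1}}}'
)

print(has_user_history(query))  # False: every history list is empty
```

If this prints False for a user you expect to have history, the problem is upstream of the filter: no usage events for that user reached the model.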


tom...@magicinternet.de

Jul 28, 2016, 7:22:16 AM
to actionml-user, tom...@magicinternet.de
Thank you very much for your answer. I picked a user that does not exist in order to anonymize the data. Even for users for whom we have data and relevant recommendations that fit this filter, we got empty results.
An interesting observation: for some properties the filters work properly and for others they do not (all the metadata properties are saved as lists of strings).
My next step was to run the examples from the tests, and those didn't work either.

We thought that something had gone badly wrong with ES, and since we were considering an upgrade anyway, we upgraded to PredictionIO 0.9.6 with UR 0.3.0, and now it works well.

An additional question: when I run `pio train -- --driver-memory 8g --executor-memory 8g --master spark://<ip>:7077` the process gets stuck on 

When I run it locally, it finishes after a minute. I wonder what I am doing wrong.

Thanks!

Pat Ferrel

Jul 28, 2016, 4:29:59 PM
to tom...@magicinternet.de, actionml-user
If there is no data for the user, there can only be popularity-based recommendations, since collaborative filtering requires user data. The filters work on recommendations, not items. So if you have no recs there is nothing to filter. Do you have the popularity model enabled? This should fill in with popular items if there are no collaborative filtering recommendations.

How much memory is on your “local” machine and how much on the non-local machine? Do you have Spark running on a separate cluster? If you run the driver and a Spark executor on the same machine, they each need 8g, for a total of 16g, and this may be too much to allocate since the OS and other services need memory too.
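
The memory arithmetic can be written down as a quick sanity check. This is an illustration only: the helper name `fits_in_ram`, the 16 GB host, and the 2 GB reserve for the OS and other services are assumptions, not values taken from PredictionIO or Spark.

```python
def fits_in_ram(total_gb: float, driver_gb: float, executor_gb: float,
                reserve_gb: float = 2.0) -> bool:
    """Return True if driver + executor + an OS reserve fit on a single machine."""
    return driver_gb + executor_gb + reserve_gb <= total_gb

# Driver and one executor on the same host, 8g each, as in the `pio train` command.
# A hypothetical 16 GB host leaves nothing for the OS, so the check fails.
print(fits_in_ram(total_gb=16, driver_gb=8, executor_gb=8))  # False
```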


tom...@magicinternet.de

Jul 29, 2016, 11:30:21 AM
to actionml-user, tom...@magicinternet.de, p...@occamsmachete.com
Hello,

Thank you for your time and effort.

We still get empty results. First I get recommendations for a user:

curl http://<my_ip>:8000/queries.json -d '{"user":"Hs6UvSj5Kxvb00J79lmeVFkuCVw"}'

Then I take the first result (which has a score), look at its metadata, and try to filter the results by one of the metadata properties:

curl http://<my_ip>:8000/queries.json -d '{"user":"Hs6UvSj5Kxvb00J79lmeVFkuCVw", "fields":[{"name":"my_field","values":["my_val"],"bias":-1}], "num": 5}'

And then I get an empty result.

I suspect this has to do with Elasticsearch. When I run pio status, everything seems OK. I also tried restarting ES and HBase, and the status did not change.


In the ES log I see (several times) - 

[2016-07-29 15:01:12,533][DEBUG][action.index             ] [Libra] observer: timeout notification from cluster service. timeout setting [1m], time since start [1m]

[2016-07-29 15:01:28,530][INFO ][cluster.metadata         ] [Libra] [urindex_1469804488512] creating index, cause [api], shards [5]/[1], mappings [items]

[2016-07-29 15:01:34,452][INFO ][cluster.metadata         ] [Libra] [urindex_1469804017367] deleting index


In the HBASE log I see - 

2016-07-29 15:20:42,983 WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181] server.NIOServerCnxn: caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x1563722aa540033, likely client has closed socket
        at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
        at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
        at java.lang.Thread.run(Thread.java:745)


The machine has 8 GB of RAM; Spark is external (1.5.1), and we run HBase 1.0.0 and Elasticsearch 1.4.4.


Thanks!

Pat Ferrel

Jul 29, 2016, 2:58:31 PM
to tom...@magicinternet.de, actionml-user
Unless the property you are filtering on is actually stored in Elasticsearch, filters on it won’t work, so you must check there to see whether the properties look correct. If they do not, then you have an error in your code that sets the properties.

Get the Chrome Sense extension and point it at your Elasticsearch server like below. The highlighted line will fetch a single item, the one above will get several so you can browse the “model”. I suspect that what you think are property values do not get written because of some $set encoding issue during input.
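
The same comparison can be scripted instead of browsing by hand. This is a minimal sketch, assuming you have already fetched the item's Elasticsearch document and know which properties your $set events carry. The names `item_url` and `missing_properties`, the host, the index name, and the sample values are all hypothetical. The URL follows the Elasticsearch 1.x document GET form (index/type/id) with "items" as the type.

```python
from urllib.parse import quote

def item_url(base: str, index: str, item_id: str) -> str:
    """Build the URL of a single item document (Elasticsearch 1.x style: index/type/id)."""
    return f"{base}/{index}/items/{quote(item_id, safe='')}"

def missing_properties(es_doc: dict, expected: dict) -> list:
    """List property names present in the $set event but absent from the ES item's _source."""
    source = es_doc.get("_source", {})
    return sorted(name for name in expected if name not in source)

# Hypothetical data: an ES document that holds only the id, and the
# properties a $set event was supposed to write for the same item.
es_doc = {"_id": "item-1", "_source": {"id": "item-1"}}
expected = {"duration": ["574"], "tags": ["a", "b"]}

print(item_url("http://localhost:9200", "urindex_example", "item-1"))
print(missing_properties(es_doc, expected))  # ['duration', 'tags']
```

A non-empty list means the properties never reached the model, so a filter on any of them has nothing to match.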

tom...@magicinternet.de

Aug 1, 2016, 8:41:17 AM
to actionml-user, tom...@magicinternet.de, p...@occamsmachete.com
Thank you very much. I also read the ActionML docs, which were helpful.

tom...@magicinternet.de

Aug 2, 2016, 5:14:57 AM
to actionml-user, tom...@magicinternet.de, p...@occamsmachete.com
Hello,

So we have the same issue again. When I query ES for a specific item I get:


{
  "_index": "urindex_1470128185120",
  "_type": "items",
  "_id": "de76233fe0813247277deb9128fee413",
  "_score": 1,
  "_source": {
    "id": "de76233fe0813247277deb9128fee413"
  }
}


However, in the event server I see:



{
  "eventId": "QEecorCLJcdKes-d6zeRUQAAAVYwg7EMiyYkgzr_Lkk",
  "event": "$set",
  "entityType": "item",
  "entityId": "de76233fe0813247277deb9128fee413",
  "properties": {
    "locations": ["Los Angeles", "USA"],
    "duration": ["1782"],
    "releaseDate": ["2015"],
    "mood": ["cool"],
    "communityId": ["76c6c1170762878c1929fc578ef41ae2"],
    "provider": ["youtube"],
    "style": ["männlich"],
    "tags": ["skateboard", "hip hop", "musik", "kunst", "action", "lifestyle", "stunt", "stadt"],
    "fsk": ["0"],
    "language": ["Englisch"],
    "type": ["Film"],
    "subType": ["Dokumentation"]
  },
  "eventTime": "2016-07-28T07:59:12.140Z",
  "creationTime": "2016-07-28T09:57:20.150Z"
}


The train and deploy processes run correctly without any errors or warnings.
In the last two days we did not make any changes to the infrastructure, but we have cron jobs that read the metadata from another source, create $set events, add general events, and then train and deploy.

I would love to hear any ideas. Thanks!

Pat Ferrel

Aug 2, 2016, 10:25:40 AM
to tom...@magicinternet.de, actionml-user
A couple of things are odd here:
1) There seem to be no usage events on the item. We filter recommendations, and an item with no usage events will never be recommended. We can tell this because the Elasticsearch document has no fields for the eventNames listed in engine.json.
2) The properties are encoded correctly, but since there is no usage event there will be no object to attach them to.

Remember that filters and boosts work on _recommendations_, not items. Unless the item can be returned as a recommendation, it will not be subject to filters because it is not in the potential return results.

