OBP Package v0.4.1 released

Jul 2, 2021, 3:43:59 AM

to Open Bandit Project

Dear all,

We have released OBP v0.4.1. The changes are summarized below:

Add functionality for off-policy evaluation (OPE) in the slate contextual bandit setting [1]
- SlateSyntheticBanditFeedback (#82, #93, #95, #98, #100, #101, #102, #104, #105)
- Slate OPE Estimators (#88)
Make the `OffPolicyEvaluation` class more useful
- Add a method to visualize and compare the OPE results of several different policies (#103)
- Allow different `estimated_rewards_by_reg_model` values to be used (this makes MRDR [2] easier to use with obp, #92)
Fix some bugs and refactor
- Epsilon-greedy algorithm (#107)
- Type checks in OPE estimators (#106)
- Linear and logistic policies (#91)
Welcome new contributors (#94)
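
To illustrate why passing different `estimated_rewards_by_reg_model` values matters, here is a minimal NumPy sketch of a doubly robust (DR) estimate that takes the reward estimates as an explicit argument. This is not obp's implementation: the function name `doubly_robust_value`, its parameters, and the simplified 2D array shapes are assumptions made for illustration, and obp's own signatures and shapes may differ.

```python
import numpy as np


def doubly_robust_value(reward, action, pscore, action_dist, estimated_rewards):
    """Hypothetical helper: DR estimate of an evaluation policy's value.

    reward:            (n,)   observed rewards
    action:            (n,)   logged actions (integer indices)
    pscore:            (n,)   behavior-policy probability of each logged action
    action_dist:       (n, k) evaluation-policy probabilities for all actions
    estimated_rewards: (n, k) reward estimates from a regression model
    """
    rounds = np.arange(reward.shape[0])
    # Importance weight of the logged action under the evaluation policy.
    iw = action_dist[rounds, action] / pscore
    # Model-based baseline: expected estimated reward under the evaluation policy.
    baseline = (action_dist * estimated_rewards).sum(axis=1)
    # Importance-weighted correction using the residual on the logged action.
    residual = reward - estimated_rewards[rounds, action]
    return float(np.mean(baseline + iw * residual))


# Toy logged data: 4 rounds, 2 actions, uniform behavior and evaluation policies.
reward = np.array([1.0, 0.0, 1.0, 1.0])
action = np.array([0, 1, 0, 1])
pscore = np.full(4, 0.5)
action_dist = np.full((4, 2), 0.5)

# Two different sets of reward estimates, e.g. from two regression models.
q_hat_a = np.zeros((4, 2))
q_hat_b = np.full((4, 2), 0.3)

value_a = doubly_robust_value(reward, action, pscore, action_dist, q_hat_a)
value_b = doubly_robust_value(reward, action, pscore, action_dist, q_hat_b)
```

Because the reward estimates are an input rather than something fixed inside the estimator, an MRDR-style regression model can supply its own estimates while other estimators keep theirs.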

References

[1] James McInerney, Brian Brost, Praveen Chandar, Rishabh Mehrotra, and Benjamin Carterette. 2020. Counterfactual Evaluation of Slate Recommendations with Sequential Reward Interactions. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1779–1788.
[2] Mehrdad Farajtabar, Yinlam Chow, and Mohammad Ghavamzadeh. 2018. More robust doubly robust off-policy evaluation. In Proceedings of the 35th International Conference on Machine Learning, PMLR 80, 1447–1456.

Please update your package accordingly. We continue to improve and expand the software; stay tuned!