We have released version 0.5.2 of the OBP package: https://github.com/st-tech/zr-obp/releases/tag/0.5.2
The changes are summarized below:
Updates
References
- Haruka Kiyohara, Yuta Saito, Tatsuya Matsuhiro, Yusuke Narita, Nobuyuki Shimizu, and Yasuo Yamamoto. Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model. WSDM2022.
- Arjun Sondhi, David Arbour, Drew Dimmery. Balanced Off-Policy Evaluation in General Action Spaces. AISTATS2020.
- Yi Su, Pavithra Srinath, Akshay Krishnamurthy. Adaptive Estimator Selection for Off-Policy Evaluation. ICML2020.
- George Tucker and Jonathan Lee. Improved Estimator Selection for Off-Policy Evaluation. ICML2021 Workshop.
- Alberto Maria Metelli, Alessio Russo, Marcello Restelli. Subgaussian and Differentiable Importance Sampling for Off-Policy Evaluation and Learning. NeurIPS2021.
- Noveen Sachdeva, Yi Su, and Thorsten Joachims. Off-policy Bandits with Deficient Support. KDD2020.
- Aman Agarwal, Soumya Basu, Tobias Schnabel, Thorsten Joachims. Effective Evaluation using Logged Bandit Feedback from Multiple Loggers. KDD2018.
- Nathan Kallus, Yuta Saito, and Masatoshi Uehara. Optimal Off-Policy Evaluation from Multiple Logging Policies. ICML2021.
Please update your package accordingly. We will continue to improve and expand the software, so stay tuned!
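If you are unsure whether your environment already has this release, a quick check along the following lines may help. This is only a minimal sketch, not part of OBP itself: it assumes the package is installed under the distribution name `obp`, that version strings are plain `X.Y.Z` numbers, and the helper names `parse_version` and `obp_is_up_to_date` are made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

REQUIRED = (0, 5, 2)  # the release announced in this post


def parse_version(text):
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split(".")[:3])


def obp_is_up_to_date(installed=None):
    """Return True if the given (or currently installed) version is >= 0.5.2."""
    if installed is None:
        try:
            # Assumption: the distribution is installed under the name "obp".
            installed = version("obp")
        except PackageNotFoundError:
            return False
    return parse_version(installed) >= REQUIRED


if __name__ == "__main__":
    if obp_is_up_to_date():
        print("obp is at 0.5.2 or newer")
    else:
        print("obp is missing or older than 0.5.2; try: pip install --upgrade obp")
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "0.10.0" as older than "0.5.2".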
Best Regards,
Open Bandit Project Team