Dear all,
We have released the OBP package version 0.5.0: https://github.com/st-tech/zr-obp/releases/tag/0.5.0
The changes are summarized below:
Major updates
- Add OPE/OPL with Continuous Actions
  - SyntheticContinuousBanditDataset (#112)
  - ContinuousOPEEstimators [1] (#113)
  - ContinuousNNPolicyLearner [2] (#114)
- Add weight clipping to IPW and DR (#115)
- Add automatic hyperparameter tuning of OPE estimators [3] (#116, #131)
- Add some arguments to the SyntheticBanditDataset class to generate more flexible synthetic data (#123)
- Add a subsample option to the OpenBanditDataset class (#118)
- Change the input type of the off_policy_objective argument of NNPolicyLearner and add some hyperparameters to it (#132)
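For readers unfamiliar with weight clipping, here is a minimal sketch of the general idea behind clipped IPW. It is an illustration in plain Python, not the OBP implementation: the function name `clipped_ipw`, its arguments, and the toy data are all assumptions made for this example.

```python
def clipped_ipw(rewards, pi_e, pi_b, max_weight):
    """Average of reward * importance weight, with each weight capped at max_weight.

    rewards: observed rewards for the logged actions
    pi_e:    evaluation-policy probabilities of the logged actions
    pi_b:    behavior-policy probabilities of the logged actions (must be > 0)
    All names here are hypothetical and chosen only for this sketch.
    """
    if not (len(rewards) == len(pi_e) == len(pi_b)):
        raise ValueError("rewards, pi_e and pi_b must have the same length")
    if len(rewards) == 0:
        raise ValueError("at least one logged round is required")
    if max_weight <= 0:
        raise ValueError("max_weight must be positive")

    total = 0.0
    for reward, p_eval, p_behavior in zip(rewards, pi_e, pi_b):
        # Importance weight, capped to limit the variance from rare actions.
        weight = min(p_eval / p_behavior, max_weight)
        total += weight * reward
    return total / len(rewards)


# Toy logged data: the importance weights are 6.0, 0.5, 1.0 and 2.0.
rewards = [1.0, 0.0, 1.0, 1.0]
pi_e = [0.75, 0.25, 0.5, 0.5]
pi_b = [0.125, 0.5, 0.5, 0.25]

unclipped = clipped_ipw(rewards, pi_e, pi_b, max_weight=float("inf"))
clipped = clipped_ipw(rewards, pi_e, pi_b, max_weight=2.0)
print(unclipped)  # 2.25
print(clipped)    # 1.25
```

Capping the weights trades some bias for lower variance: the single large weight of 6.0 no longer dominates the estimate.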
Minor updates
- Fix README (#119)
- Fix scalar value checking (#122)
- Add ValueError to OffPolicyEvaluation class (#125)
- Fix error messages (#126)
- Add some errors (#125, #129)
- Update quickstart examples (#127)
Cautions
- The hyperparameter of obp.ope.SwitchDoublyRobust has been renamed from tau to lambda_.
- The type of the off_policy_objective argument of obp.policy.NNPolicyLearner has changed from callable to str.
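If you keep estimator or learner settings in plain dictionaries, a small pre-flight check along the following lines may help when updating. This is only a sketch: the helpers `rename_switch_dr_kwargs` and `check_off_policy_objective` are hypothetical, are not part of OBP, and do not import or call the library.

```python
def rename_switch_dr_kwargs(kwargs):
    """Return a copy of kwargs with the old key `tau` renamed to `lambda_`.

    Hypothetical helper for settings intended for obp.ope.SwitchDoublyRobust.
    """
    migrated = dict(kwargs)
    if "tau" in migrated:
        if "lambda_" in migrated:
            raise ValueError("both `tau` and `lambda_` are given; keep only `lambda_`")
        migrated["lambda_"] = migrated.pop("tau")
    return migrated


def check_off_policy_objective(off_policy_objective):
    """Reject the old callable style; the argument is now expected to be a str.

    Hypothetical helper; it does not check which string values are accepted.
    """
    if callable(off_policy_objective) or not isinstance(off_policy_objective, str):
        raise TypeError("off_policy_objective must now be a str, not a callable")
    return off_policy_objective


old_settings = {"tau": 10.0}
new_settings = rename_switch_dr_kwargs(old_settings)
print(new_settings)  # {'lambda_': 10.0}
```

The original dictionary is left untouched, so old and new settings can be compared side by side during the update.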
References
[1] Nathan Kallus and Angela Zhou. "Policy Evaluation and Optimization with Continuous Treatments", AISTATS 2018.
[2] Nathan Kallus and Masatoshi Uehara. "Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies", NeurIPS 2020.
[3] Yi Su, Maria Dimakopoulou, Akshay Krishnamurthy, and Miroslav Dudik. "Doubly Robust Off-Policy Evaluation with Shrinkage", ICML 2020.
Please update your package accordingly. We continue to improve and expand the software; stay tuned!
Best Regards,
Open Bandit Project Team