[Theory Lunch 02/26] Liyu Chen: Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition

2 views
Skip to first unread message

Haoming Li

unread,
Feb 22, 2021, 5:44:18 AM2/22/21
to usc-theo...@googlegroups.com, usc-t...@googlegroups.com
USC CS Theory Lunch:

Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition
Speaker: Liyu Chen (https://lchenat.github.io/)
Time: 02/26/21 11:45am PST
Location: https://usc.zoom.us/j/94386654763

Abstract:

We study the stochastic shortest path problem with adversarial costs and known transition, and show that the minimax regret is O(\sqrt{DTK}) and O(\sqrt{DTSAK}) for the full-information setting and the bandit feedback setting respectively, where D is the diameter, T is the expected hitting time of the optimal policy, S is the number of states, A is the number of actions, and K is the number of episodes. Our results significantly improve upon the recent work of (Rosenberg and Mansour, 2020) which only considers the full-information setting and achieves suboptimal regret. Our work is also the first to consider bandit feedback with adversarial costs. (https://arxiv.org/abs/2012.04053)
Reply all
Reply to author
Forward
0 new messages