Hi!
Olivier Pietquin from Google will be giving a seminar on learning from demonstrations in deep reinforcement learning, this Friday, March 23 at 11am in F107, Inria Montbonnot. Title and abstract below.
Title: Deep Q-learning from Demonstrations
Abstract: Deep reinforcement learning (RL) has achieved several high-profile successes in difficult decision-making problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance during learning can be extremely poor. This may be acceptable for a simulator, but it severely limits the applicability of deep RL to many real-world tasks, where the agent must learn in the real environment. In this talk we present a setting where the agent may access data from previous control of the system. We propose an algorithm, Deep Q-learning from Demonstrations (DQfD), that leverages even relatively small sets of demonstration data to massively accelerate the learning process. Thanks to a prioritized replay mechanism, it automatically assesses the necessary ratio of demonstration data while learning. DQfD works by combining temporal difference updates with supervised classification of the demonstrator's actions. Results on Atari games and robotics will be presented.
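For those who want a concrete picture before the talk, here is a minimal sketch of what "combining temporal difference updates with supervised classification of the demonstrator's actions" could look like for a single transition. It is an illustration under assumptions, not the speaker's implementation: the function name `dqfd_style_loss`, the margin value, the weighting parameter, and the choice of a large-margin classification term are all hypothetical, and it leaves out the prioritized replay mechanism and any other components of the full algorithm.

```python
import numpy as np


def dqfd_style_loss(q_values, next_q_values, action, reward, gamma=0.99,
                    is_demo=False, margin=0.8, supervised_weight=1.0):
    """Illustrative per-transition loss: 1-step TD term plus a margin term.

    q_values / next_q_values are 1-D float arrays of Q(s, .) and Q(s', .).
    All parameter names and default values are assumptions for this sketch.
    """
    # Temporal difference part: squared error against the 1-step target.
    td_target = reward + gamma * np.max(next_q_values)
    td_loss = 0.5 * (td_target - q_values[action]) ** 2

    # Supervised part, applied only to demonstration transitions:
    # max_a [Q(s, a) + margin * 1(a != a_demo)] - Q(s, a_demo).
    # It is zero once the demonstrated action beats all others by the margin.
    supervised_loss = 0.0
    if is_demo:
        penalties = np.full_like(q_values, margin, dtype=float)
        penalties[action] = 0.0
        supervised_loss = np.max(q_values + penalties) - q_values[action]

    return td_loss + supervised_weight * supervised_loss


q = np.array([1.0, 2.0, 0.5])       # Q(s, .) for three actions
next_q = np.array([0.0, 1.0, 0.0])  # Q(s', .)
loss_agent = dqfd_style_loss(q, next_q, action=0, reward=1.0, gamma=0.5)
loss_demo = dqfd_style_loss(q, next_q, action=0, reward=1.0, gamma=0.5,
                            is_demo=True)
print(loss_agent, loss_demo)
```

In this sketch the same transition costs more when it comes from the demonstrator and the network currently prefers a different action, which is what pushes the greedy policy toward the demonstrated behaviour while the TD term keeps the Q-values consistent.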