Accepted papers
===============
Title: Leveraging Demonstrations with Latent Space Priors
Authors: Jonas Gehring, Deepak Gopinath, Jungdam Won, Andreas Krause, Gabriel Synnaeve, Nicolas Usunier
Abstract: Demonstrations provide insight into relevant state or action space regions, bearing great potential to boost the efficiency and practicality of reinforcement learning agents. In this work, we propose to leverage demonstration datasets by combining skill learning and sequence modeling. Starting with a learned joint latent space, we separately train a generative model of demonstration sequences and an accompanying low-level policy. The sequence model forms a latent space prior over plausible demonstration behaviors to accelerate learning of high-level policies. We show how to acquire such priors from state-only motion capture demonstrations and explore several methods for integrating them into policy learning on transfer tasks. Our experimental results confirm that latent space priors provide significant gains in learning speed and final performance. We benchmark our approach on a set of challenging sparse-reward environments with a complex, simulated humanoid, and on offline RL benchmarks for navigation and object manipulation.
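The decomposition described in this abstract — a sequence model acting as a prior over latent skills, plus a low-level policy that decodes (state, latent) pairs into actions — can be sketched structurally. All names, the discrete latent space, and the uniform stand-in "prior" below are hypothetical placeholders for illustration, not the paper's learned models.

```python
import random

LATENTS = ["walk", "turn", "jump"]   # toy discrete latent skill space

def latent_prior(history, rng=random):
    """Stand-in for the learned sequence model: propose the next latent,
    conditioned (in the real method) on the latent history."""
    return rng.choice(LATENTS)

def low_level_policy(state, latent):
    """Stand-in for the learned decoder from (state, latent) to an action."""
    return f"{latent}-action@{state}"

def rollout(horizon=3, rng=random):
    """High-level control loop: sample a latent from the prior at each
    step, then let the low-level policy turn it into an action."""
    state, latents, actions = 0, [], []
    for _ in range(horizon):
        z = latent_prior(latents, rng)
        latents.append(z)
        actions.append(low_level_policy(state, z))
        state += 1
    return actions

print(len(rollout()))  # 3 actions, one per sampled latent
```

In the paper, the prior is trained on demonstration sequences, so sampling from it biases the high-level policy toward plausible demonstrated behaviors.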
URL: https://openreview.net/forum?id=OzGIu4T4Cz
---
Title: Solving Nonconvex-Nonconcave Min-Max Problems exhibiting Weak Minty Solutions
Authors: Axel Böhm
Abstract: We investigate a structured class of nonconvex-nonconcave min-max problems exhibiting so-called "weak Minty" solutions, a notion which was only recently introduced but is able to simultaneously capture different generalizations of monotonicity. We prove novel convergence results for a generalized version of the optimistic gradient method (OGDA) in this setting, matching the $1/k$ rate for the best iterate in terms of the squared operator norm recently shown for the extragradient method (EG). In addition, we propose an adaptive step-size version of EG, which does not require knowledge of the problem parameters.
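The standard optimistic gradient update underlying OGDA can be illustrated on a toy bilinear saddle-point problem. This is a minimal sketch of the classical method, not the paper's generalized variant; the bilinear objective f(x, y) = x*y, step size 0.3, and iteration count are arbitrary illustrative choices.

```python
def F(x, y):
    """Saddle-point operator of f(x, y) = x*y: (df/dx, -df/dy).
    The unique saddle point is (0, 0)."""
    return y, -x

def ogda(x, y, gamma=0.3, steps=200):
    """Optimistic gradient descent-ascent:
    z_{k+1} = z_k - gamma * (2 * F(z_k) - F(z_{k-1})),
    initialized with F(z_{-1}) = F(z_0)."""
    gx_prev, gy_prev = F(x, y)
    for _ in range(steps):
        gx, gy = F(x, y)
        x = x - gamma * (2 * gx - gx_prev)
        y = y - gamma * (2 * gy - gy_prev)
        gx_prev, gy_prev = gx, gy
    return x, y

x, y = ogda(1.0, 1.0)
print((x * x + y * y) ** 0.5)  # distance to the saddle point shrinks toward 0
```

Note that plain gradient descent-ascent diverges on this bilinear example; the "optimistic" correction term 2*F(z_k) - F(z_{k-1}) is what makes the iterates spiral inward.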
URL: https://openreview.net/forum?id=Gp0pHyUyrb
---
Title: Extreme Masking for Learning Instance and Distributed Visual Representations
Authors: Zhirong Wu, Zihang Lai, Xiao Sun, Stephen Lin
Abstract: The paper presents a scalable approach for learning spatially distributed visual representations over individual tokens and a holistic instance representation simultaneously. We use self-attention blocks to represent spatially distributed tokens, followed by cross-attention blocks to aggregate the holistic instance. The core of the approach is the use of extremely large token masking (75%-90%) as the data augmentation for supervision. Our model, named ExtreMA, follows the plain BYOL approach where the instance representation from the unmasked subset is trained to predict that from the intact input. Instead of encouraging invariance across inputs, learning requires the model to capture informative variations in an image.
The paper makes three contributions: 1) It presents random masking as a strong and computationally efficient data augmentation for siamese representation learning. 2) With multiple samplings per instance, extreme masking greatly speeds up learning and improves performance with more data. 3) ExtreMA obtains stronger linear probing performance than masked modeling methods, and better transfer performance than prior contrastive models.
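The extreme random-masking augmentation can be sketched as sampling the small subset of patch tokens that survives masking. The helper name and the 14x14 ViT-style token grid below are illustrative assumptions, not the paper's code; only the mask ratio (here 90%) comes from the abstract.

```python
import random

def random_mask(num_tokens, mask_ratio, rng=random):
    """Return sorted indices of the tokens that survive masking."""
    keep = max(1, round(num_tokens * (1 - mask_ratio)))
    return sorted(rng.sample(range(num_tokens), keep))

tokens = list(range(14 * 14))                    # 196 patch tokens
visible = random_mask(len(tokens), mask_ratio=0.9)
print(len(visible))                              # 20 of 196 tokens remain visible
```

Because the encoder only processes the visible subset, a 90% mask ratio also cuts the attention cost dramatically, which is what makes multiple samplings per instance cheap.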
URL: https://openreview.net/forum?id=3epEbhdgbv
---
New submissions
===============
Title: Contextualize Me – The Case for Context in Reinforcement Learning
Abstract: While Reinforcement Learning (RL) has made great strides towards solving increasingly complicated problems, many algorithms are still brittle to even slight environmental changes. Contextual Reinforcement Learning (cRL) provides a framework to model such changes in a principled manner, thereby enabling flexible, precise and interpretable task specification and generation. Therefore, cRL formalizes the study of generalization in RL. Our goal is to show how the framework of cRL can contribute to both our theoretical understanding and practical solutions of generalization. We show that theoretically optimal behavior in contextual Markov Decision Processes requires explicit context information. We empirically validate this result on various context-extended versions of common RL environments. These environments are part of CARL, the first benchmark library designed for generalization based on cRL extensions of popular benchmarks, which we propose as a testbed for further study of general agents.
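The core idea of a contextual MDP can be illustrated with a toy environment where a context variable (here, a goal position) changes the reward while the transition structure stays fixed: a policy without access to the context cannot be optimal for all contexts at once. This chain environment and all names below are illustrative assumptions, not part of the CARL benchmark itself.

```python
class ContextualChain:
    """1-D chain of positions; the agent must reach a context-dependent goal."""
    def __init__(self, goal, length=5):
        self.goal = goal               # the context variable
        self.length = length
        self.pos = length // 2         # start in the middle

    def step(self, action):            # action: -1 (left) or +1 (right)
        self.pos = min(max(self.pos + action, 0), self.length - 1)
        done = self.pos == self.goal
        return self.pos, (1.0 if done else 0.0), done

def context_aware_policy(pos, goal):
    """Optimal policy, which needs explicit access to the context."""
    return 1 if goal > pos else -1

env = ContextualChain(goal=0)
done, steps = False, 0
while not done:
    _, reward, done = env.step(context_aware_policy(env.pos, env.goal))
    steps += 1
print(steps)  # reaches the goal in 2 steps
```

A context-blind policy must commit to one direction for the starting state and therefore fails for roughly half of the possible goal contexts, mirroring the abstract's claim that optimal behavior requires explicit context information.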
URL: https://openreview.net/forum?id=Y42xVBQusn
---