Daily TMLR digest for Nov 16, 2022


TMLR

Nov 15, 2022, 7:00:07 PM
to tmlr-anno...@googlegroups.com


New certifications
==================

Survey Certification: Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance

Jakob Hollenstein, Sayantan Auddy, Matteo Saveriano, Erwan Renaudo, Justus Piater

https://openreview.net/forum?id=NljBlZ6hmG

---


Accepted papers
===============


Title: Action Noise in Off-Policy Deep Reinforcement Learning: Impact on Exploration and Performance

Authors: Jakob Hollenstein, Sayantan Auddy, Matteo Saveriano, Erwan Renaudo, Justus Piater

Abstract: Many Deep Reinforcement Learning (D-RL) algorithms rely on simple forms of exploration
such as the additive action noise often used in continuous control domains. Typically,
the scaling factor of this action noise is chosen as a hyper-parameter and is kept constant
during training. In this paper, we focus on action noise in off-policy deep reinforcement
learning for continuous control. We analyze how the learned policy is impacted by the noise
type, the noise scale, and the schedule used to reduce the noise scaling factor. We consider the two most
prominent types of action noise, Gaussian and Ornstein-Uhlenbeck noise, and perform a vast
experimental campaign by systematically varying the noise type and scale parameter, and
by measuring variables of interest like the expected return of the policy and the state-space
coverage during exploration. For the latter, we propose a novel state-space coverage measure
$\operatorname{X}_{\mathcal{U}\text{rel}}$ that is more robust to estimation artifacts caused by points close to the
state-space boundary than previously-proposed measures. Larger
noise scales generally increase state-space coverage. However, we found that increasing the
space coverage using a larger noise scale is often not beneficial. On the contrary, reducing
the noise scale over the training process reduces the variance and generally improves the
learning performance. We conclude that the best noise type and scale are environment
dependent, and based on our observations derive heuristic rules for guiding the choice of the
action noise as a starting point for further optimization.
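The two noise types and the scale-reduction schedule the abstract compares can be sketched in a few lines. This is not the authors' code, just a minimal NumPy illustration of i.i.d. Gaussian noise, temporally correlated Ornstein-Uhlenbeck noise, and a linear scaling-factor schedule; the function names and parameter defaults are conventional choices, not values from the paper.

```python
import numpy as np

def gaussian_noise(rng, action_dim, sigma):
    """i.i.d. Gaussian action noise: one N(0, sigma^2) draw per action dimension."""
    return rng.normal(0.0, sigma, size=action_dim)

class OrnsteinUhlenbeckNoise:
    """Temporally correlated OU noise:
    x_{t+1} = x_t + theta * (mu - x_t) * dt + sigma * sqrt(dt) * N(0, I)."""
    def __init__(self, action_dim, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, seed=0):
        self.mu = mu * np.ones(action_dim)
        self.theta, self.sigma, self.dt = theta, sigma, dt
        self.rng = np.random.default_rng(seed)
        self.x = np.copy(self.mu)

    def sample(self):
        self.x = (self.x
                  + self.theta * (self.mu - self.x) * self.dt
                  + self.sigma * np.sqrt(self.dt) * self.rng.normal(size=self.x.shape))
        return self.x

def noise_scale(step, total_steps, start=1.0, end=0.0):
    """Linearly reduce the noise scaling factor over training."""
    frac = min(step / total_steps, 1.0)
    return start + frac * (end - start)
```

In a training loop one would add `noise_scale(step, total_steps) * noise.sample()` (or the Gaussian draw) to the deterministic action before clipping to the action bounds.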


URL: https://openreview.net/forum?id=NljBlZ6hmG

---

Title: Unifying Approaches in Active Learning and Active Sampling via Fisher Information and Information-Theoretic Quantities

Authors: Andreas Kirsch, Yarin Gal

Abstract: Recently proposed methods in data subset selection, that is active learning and active sampling, use Fisher information, Hessians, similarity matrices based on gradients, and gradient lengths to estimate how informative data is for a model’s training. Are these different approaches connected, and if so, how? We revisit the fundamentals of Bayesian optimal experiment design and show that these recently proposed methods can be understood as approximations to information-theoretic quantities: among them, the mutual information between predictions and model parameters, known as expected information gain or BALD in machine learning, and the mutual information between predictions of acquisition candidates and test samples, known as expected predictive information gain. We develop a comprehensive set of approximations using Fisher information and observed information and derive a unified framework that connects seemingly disparate literature. Although Bayesian methods are often seen as separate from non-Bayesian ones, the sometimes fuzzy notion of “informativeness” expressed in various non-Bayesian objectives leads to the same couple of information quantities, which were, in principle, already known by Lindley (1956) and MacKay (1992).
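The expected information gain (BALD) quantity the abstract refers to, the mutual information between predictions and model parameters, decomposes as total predictive entropy minus expected per-sample entropy. A minimal sketch assuming Monte Carlo posterior samples of class probabilities (e.g. from MC dropout); `bald_score` is a hypothetical name, not from the paper:

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    """Shannon entropy of categorical distributions along `axis`."""
    return -np.sum(p * np.log(p + eps), axis=axis)

def bald_score(probs):
    """BALD / expected information gain per candidate point.

    probs: array of shape (n_mc, n_points, n_classes) holding predictive
    distributions from n_mc posterior samples.
    Returns I[y; theta] = H[E_theta p(y|x,theta)] - E_theta H[p(y|x,theta)].
    """
    mean_p = probs.mean(axis=0)            # marginal predictive, (n_points, n_classes)
    h_mean = entropy(mean_p)               # total predictive uncertainty
    mean_h = entropy(probs).mean(axis=0)   # expected aleatoric uncertainty
    return h_mean - mean_h                 # epistemic part: the BALD score
```

Points where the posterior samples disagree (high `h_mean`, low `mean_h`) score highest and would be acquired first.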

URL: https://openreview.net/forum?id=UVDAKQANOW

---

Title: An Efficient One-Class SVM for Novelty Detection in IoT

Authors: Kun Yang, Samory Kpotufe, Nicholas Feamster

Abstract: One-Class Support Vector Machines (OCSVM) are a common approach for novelty detection, due to their flexibility in fitting complex nonlinear boundaries between {normal} and {novel} data. Novelty detection is important in the Internet of Things (``IoT'') due to the threats these devices can present, and OCSVM often performs well in these environments due to the variety of devices, traffic patterns, and anomalies that IoT devices present. Unfortunately, conventional OCSVMs can introduce prohibitive memory and computational overhead at detection time. This work designs, implements and evaluates an efficient OCSVM for such practical settings. We extend Nystr\"om and (Gaussian) Sketching approaches to OCSVM, combining these methods with clustering and Gaussian mixture models to achieve 15-30x speedup in prediction time and 30-40x reduction in memory requirements without sacrificing detection accuracy. Here, the very nature of IoT devices is crucial: they tend to admit few modes of \emph{normal} operation, allowing for efficient pattern compression.

URL: https://openreview.net/forum?id=LFkRUCalFt

---


New submissions
===============


Title: ES-ENAS: Efficient Evolutionary Optimization for Large-Scale Hybrid Search Spaces

Abstract: In this paper, we approach the problem of optimizing blackbox functions over large hybrid search spaces consisting of both combinatorial and continuous parameters. We demonstrate that previous evolutionary algorithms which rely on mutation-based approaches, while flexible over combinatorial spaces, suffer from a curse of dimensionality in high dimensional continuous spaces both theoretically and empirically, which thus limits their scope over hybrid search spaces as well. In order to combat this curse, we propose ES-ENAS, a simple and modular joint optimization procedure combining the class of sample-efficient smoothed gradient techniques, commonly known as Evolutionary Strategies (ES), with combinatorial optimizers in a highly scalable and intuitive way, inspired by the one-shot or supernet paradigm introduced in Efficient Neural Architecture Search (ENAS). By doing so, we achieve significantly greater sample efficiency, which we empirically demonstrate over synthetic benchmarks, and are further able to apply ES-ENAS for architecture search over popular RL benchmarks.
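The smoothed-gradient (ES) estimator at the core of ES-ENAS can be written down compactly. A minimal sketch of the standard antithetic ES gradient estimate, not the authors' implementation; function and parameter names are illustrative:

```python
import numpy as np

def es_gradient(f, theta, rng, sigma=0.1, n_pairs=32):
    """Antithetic smoothed-gradient (ES) estimate of
    grad_theta E_eps[f(theta + sigma * eps)], eps ~ N(0, I):
        g ~= 1/(2 n sigma) * sum_i [f(theta + sigma e_i) - f(theta - sigma e_i)] e_i
    """
    eps = rng.normal(size=(n_pairs, theta.size))
    g = np.zeros_like(theta)
    for e in eps:
        g += (f(theta + sigma * e) - f(theta - sigma * e)) * e
    return g / (2 * n_pairs * sigma)

# Gradient ascent on a toy concave objective f(x) = -||x||^2,
# whose maximizer is the origin.
rng = np.random.default_rng(0)
theta = np.ones(5)
for _ in range(200):
    theta = theta + 0.05 * es_gradient(lambda x: -np.sum(x**2), theta, rng)
```

In ES-ENAS this continuous update would run jointly with a combinatorial optimizer proposing architectures, with both sharing the same blackbox evaluations.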

URL: https://openreview.net/forum?id=EKtlJWam6h

---

Title: Transport Score Climbing: Variational Inference Using Forward KL and Adaptive Neural Transport

Abstract: Variational inference often minimizes the ``reverse'' Kullback-Leibler (KL) $D_{KL}(q||p)$ from the approximate distribution $q$ to the posterior $p$. Recent work studies the ``forward'' KL $D_{KL}(p||q)$, which unlike reverse KL does not lead to variational approximations that underestimate uncertainty. Markov chain Monte Carlo (MCMC) methods have been used to evaluate the expectation in computing the forward KL. This paper introduces Transport Score Climbing (TSC), a method that optimizes $D_{KL}(p||q)$ by using Hamiltonian Monte Carlo (HMC) but running the HMC chain on a transformed, or warped, space. A function called the transport map performs the transformation by acting as a change-of-variable from the latent variable space. TSC uses HMC samples to dynamically train the transport map while optimizing $D_{KL}(p||q)$. TSC leverages synergies, where better transport maps lead to better HMC sampling, which then leads to better transport maps. We demonstrate TSC on synthetic and real data, including using TSC to train variational auto-encoders. We find that TSC achieves competitive performance on the experiments.
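The "score climbing" part, fitting $q$ by stochastic gradient on the forward KL using samples from $p$, reduces to maximum-likelihood steps on those samples, since $D_{KL}(p||q)$ equals a constant minus $\mathbb{E}_p[\log q(z)]$. A toy sketch with a diagonal Gaussian $q$ and i.i.d. stand-ins for the HMC draws; none of these names or values come from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for (H)MC samples z ~ p; here p is simply N(2, 0.5^2).
samples = rng.normal(loc=2.0, scale=0.5, size=10000)

mu, log_sigma = 0.0, 0.0          # parameters of q = N(mu, sigma^2)
lr = 0.05
for _ in range(2000):
    z = rng.choice(samples, size=64)          # minibatch of "posterior" samples
    sigma = np.exp(log_sigma)
    # Gradients of mean log q(z) w.r.t. mu and log_sigma:
    g_mu = np.mean((z - mu) / sigma**2)
    g_ls = np.mean(((z - mu) / sigma)**2 - 1.0)
    mu += lr * g_mu                            # ascend E_p[log q] = descend D_KL(p||q)
    log_sigma += lr * g_ls

# mu converges near 2.0 and exp(log_sigma) near 0.5
```

TSC's additional ingredient, not shown here, is that the same samples also train a transport map that warps the space the HMC chain runs on, so sampling and fitting improve each other.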

URL: https://openreview.net/forum?id=zfBW39xZ2E

---

Title: VRNN’s got a GAN: Generating Time Series using Variational Recurrent Neural Models with Adversarial Training

Abstract: Time-series data generation is a machine learning task growing in popularity, and
has been a focus of deep generative methods. The task is especially important
in fields where large amounts of training data are not available, and in
applications where privacy preservation using synthetic data is preferred. In the past,
generative adversarial networks (GANs) were combined with recurrent neural
networks (RNNs) to produce realistic time-series data. Moreover, RNNs with
time-step variational autoencoders were shown to have the ability to produce diverse
temporal realizations. In this paper, we propose a novel data generating model,
dubbed VRNN-GAN, that employs an adversarial framework with an RNN-based
Variational Autoencoder (VAE) serving as the generator and a bidirectional RNN
serving as the discriminator. The recurrent VAE captures temporal dynamics into
a learned time-varying latent space while the adversarial training encourages the
generation of realistic time-series data. We compared the performance of
VRNN-GAN to state-of-the-art deep generative methods on the task of generating
synthetic time-series data. We show that VRNN-GAN achieves the best predictive
score across all methods and yields competitive results in other well-established
performance measures compared to the state-of-the-art.
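The architecture described above, a per-time-step VAE on an RNN as generator and a bidirectional RNN as discriminator, can be sketched in PyTorch. This is a hypothetical miniature, not the authors' model; layer sizes, loss weighting, and class names are all illustrative:

```python
import torch
import torch.nn as nn

class RecurrentVAEGenerator(nn.Module):
    """Per-time-step VAE on top of a GRU: encodes x_t into z_t, decodes x_hat_t."""
    def __init__(self, x_dim=3, h_dim=16, z_dim=4):
        super().__init__()
        self.rnn = nn.GRU(x_dim, h_dim, batch_first=True)
        self.enc = nn.Linear(h_dim, 2 * z_dim)   # -> (mu, log_var) per step
        self.dec = nn.Linear(z_dim, x_dim)

    def forward(self, x):
        h, _ = self.rnn(x)                        # (B, T, h_dim)
        mu, log_var = self.enc(h).chunk(2, dim=-1)
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)  # reparameterize
        x_hat = self.dec(z)
        kl = 0.5 * (mu**2 + log_var.exp() - 1 - log_var).sum(-1).mean()
        return x_hat, kl

class BiRNNDiscriminator(nn.Module):
    """Bidirectional GRU that scores a whole sequence as real/fake."""
    def __init__(self, x_dim=3, h_dim=16):
        super().__init__()
        self.rnn = nn.GRU(x_dim, h_dim, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * h_dim, 1)

    def forward(self, x):
        h, _ = self.rnn(x)
        return self.out(h[:, -1])                 # one logit per sequence

x = torch.randn(8, 20, 3)                         # batch of 8 series, length 20
gen, disc = RecurrentVAEGenerator(), BiRNNDiscriminator()
x_hat, kl = gen(x)
logits = disc(x_hat)
# Generator objective: VAE terms (reconstruction + KL) plus a
# non-saturating GAN term that rewards fooling the discriminator.
g_loss = ((x_hat - x)**2).mean() + kl \
         + nn.functional.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
```

The discriminator would be trained in alternation with the usual real-vs-fake cross-entropy on real sequences and `x_hat`.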

URL: https://openreview.net/forum?id=JjNNIyKtiM

---
