Weekly TMLR digest for May 22, 2022

TMLR

May 21, 2022, 8:00:06 PM
to tmlr-annou...@googlegroups.com

New submissions
===============


Title: Causal Feature Selection via Orthogonal Search

Abstract: The problem of inferring the direct causal parents of a response variable among a large set of explanatory variables is of high practical importance in many disciplines. However, established approaches often scale at least exponentially with the number of explanatory variables and are difficult to extend to nonlinear relationships or to cyclic data. Inspired by debiased machine learning methods, we study a one-vs.-the-rest feature selection approach to discover the direct causal parents of the response. We propose an algorithm that works for purely observational data while also offering theoretical guarantees, including the case of partially nonlinear relationships, possibly in the presence of cycles. As it requires only one estimation per variable, our approach is applicable even to large graphs. We demonstrate significant improvements over established approaches.
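
For intuition, here is a minimal sketch of a one-vs.-the-rest residual test in the spirit of debiased machine learning. This is an illustrative reading of the abstract, not the authors' algorithm; the regressor choice and the omission of cross-fitting are simplifications.

    # Hypothetical sketch: for each candidate variable, residualize both the
    # response and the candidate on all remaining variables, then test whether
    # the residuals are correlated (cross-fitting omitted for brevity).
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor
    from scipy.stats import pearsonr

    def orthogonal_parent_scores(X, y):
        n, d = X.shape
        scores = []
        for j in range(d):
            rest = np.delete(X, j, axis=1)
            # Residualize the response on all other variables.
            r_y = y - RandomForestRegressor(n_estimators=100).fit(rest, y).predict(rest)
            # Residualize the candidate on all other variables.
            r_j = X[:, j] - RandomForestRegressor(n_estimators=100).fit(rest, X[:, j]).predict(rest)
            corr, pval = pearsonr(r_y, r_j)
            scores.append((j, corr, pval))  # small p-value -> candidate direct parent
        return scores

Note that the loop makes one estimation pass per variable, which is what keeps the approach linear in the number of candidates rather than exponential.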

URL: https://openreview.net/forum?id=Q54jBjc896

---

Title: Rethinking Multidimensional Discriminator Output for Generative Adversarial Networks

Abstract: The study of multidimensional discriminator (critic) output for Generative Adversarial Networks has been underexplored in the literature. In this paper, we generalize the Wasserstein GAN framework to take advantage of multidimensional critic output and explore its properties. We also introduce a square-root velocity transformation (SRVT) block which facilitates training in the multidimensional setting. Our proofs rest on the proposed maximal $p$-centrality discrepancy, which is bounded above by the $p$-Wasserstein distance and fits the Wasserstein GAN framework with $n$-dimensional critic output. In particular, when $n=1$ and $p=1$, the proposed discrepancy equals the $1$-Wasserstein distance. Theoretical analysis and empirical evidence show that a high-dimensional critic output has an advantage in distinguishing real and fake distributions, and promotes faster convergence and greater diversity of results.
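
One plausible reading of the multidimensional-critic objective is sketched below: the critic maps inputs to $\mathbb{R}^n$ and training uses the $p$-norm of the difference between mean critic outputs on real and fake batches. The discrepancy form is an assumption for illustration, not necessarily the paper's maximal $p$-centrality discrepancy, but with $n=1$ and $p=1$ it matches the familiar scalar WGAN training signal.

    # Hedged sketch of a WGAN-style loss with an n-dimensional critic.
    import torch
    import torch.nn as nn

    class Critic(nn.Module):
        """Critic with n-dimensional output instead of a scalar score."""
        def __init__(self, in_dim, n_out):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(in_dim, 256), nn.ReLU(),
                nn.Linear(256, n_out),
            )

        def forward(self, x):
            return self.net(x)

    def critic_discrepancy(critic, real, fake, p=1):
        # ||E[f(real)] - E[f(fake)]||_p: maximized over the critic,
        # minimized over the generator.
        diff = critic(real).mean(dim=0) - critic(fake).mean(dim=0)
        return torch.linalg.vector_norm(diff, ord=p)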

URL: https://openreview.net/forum?id=uY0yy2bXUc

---

Title: Adversarial Style Transfer for Robust Policy Optimization in Deep Reinforcement Learning

Abstract: This paper proposes an algorithm that aims to improve generalization for reinforcement learning agents by removing overfitting to confounding features. Our approach consists of a max-min game theoretic objective. A generator transfers the style of observation during reinforcement learning. An additional goal of the generator is to perturb the observation, which maximizes the agent's probability of taking a different action. In contrast, a policy network updates its parameters to minimize the effect of such perturbations, thus staying robust while maximizing the expected future reward. Based on this setup, we propose a practical deep reinforcement learning algorithm, Adversarial Robust Policy Optimization (ARPO), to find a robust policy that generalizes to unseen environments. We evaluate our approach on Procgen and Distracting Control Suite for generalization and sample efficiency. Empirically, ARPO shows improved performance compared to a few baseline algorithms, including data augmentation.
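
A hedged sketch of the max-min objective as described: the generator maximizes, and the policy minimizes, the divergence between the action distributions before and after the style perturbation. The `policy` and `generator` modules and the KL form are illustrative assumptions, not ARPO's exact losses.

    import torch
    import torch.nn.functional as F

    def generator_loss(policy, generator, obs):
        # The generator perturbs the observation's style so as to maximize
        # the divergence between the agent's action distributions before and
        # after the perturbation (negated here for gradient descent).
        logp_orig = F.log_softmax(policy(obs), dim=-1).detach()
        logp_pert = F.log_softmax(policy(generator(obs)), dim=-1)
        return -F.kl_div(logp_pert, logp_orig, reduction="batchmean", log_target=True)

    def policy_invariance_loss(policy, generator, obs):
        # The policy minimizes the same divergence, staying robust to the
        # style perturbations, while the usual RL loss (not shown) maximizes
        # expected return.
        logp_orig = F.log_softmax(policy(obs), dim=-1)
        logp_pert = F.log_softmax(policy(generator(obs).detach()), dim=-1)
        return F.kl_div(logp_pert, logp_orig.detach(), reduction="batchmean", log_target=True)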

URL: https://openreview.net/forum?id=ljXArjJILf

---

Title: Did I do that? Blame as a means to identify controlled effects in reinforcement learning

Abstract: Affordance learning is a crucial ability of intelligent agents. This ability relies on understanding the different ways the environment can be controlled. Approaches encouraging RL agents to model controllable aspects of their environment have repeatedly achieved state-of-the-art results. Despite their success, these approaches have only been studied using generic tasks as a proxy but have not yet been evaluated in isolation. In this work, we study the problem of identifying controlled effects from a causal perspective. Humans compare counterfactual outcomes to assign a degree of blame to their actions. Following this idea, we propose Controlled Effect Network (CEN), a self-supervised method based on the causal concept of blame. CEN is evaluated in a wide range of environments against two state-of-the-art models, showing that it precisely identifies controlled effects.
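
A toy illustration of blame via counterfactual comparison: an effect counts as controlled if it would not have occurred under a default (no-op) action. The world model and no-op baseline here are assumptions for exposition, not CEN itself.

    import numpy as np

    def controlled_effect_mask(world_model, state, action, noop_action):
        """Blame = what changed relative to the counterfactual no-op outcome."""
        actual = world_model(state, action)               # observed next state
        counterfactual = world_model(state, noop_action)  # would have happened anyway
        return actual != counterfactual                   # True where the agent is to blame

    # Tiny additive world: only the first cell is actually controlled.
    world = lambda s, a: s + a
    state = np.zeros(4, dtype=int)
    print(controlled_effect_mask(world, state, np.array([1, 0, 0, 0]), np.zeros(4, dtype=int)))
    # -> [ True False False False]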

URL: https://openreview.net/forum?id=NL2L3XjVFx

---

Title: Structure by Architecture: Disentangled Representations without Regularization

Abstract: We study the problem of self-supervised structured representation learning using autoencoders for downstream tasks such as generative modeling. Unlike most methods which rely on matching an arbitrary, relatively unstructured, prior distribution for sampling, we propose a sampling technique that relies solely on the independence of latent variables, thereby avoiding the trade-off between reconstruction quality and generative performance inherent to VAEs. We design a novel autoencoder architecture capable of learning a structured representation without the need for aggressive regularization. Our structural decoders learn a hierarchy of latent variables, akin to structural causal models, thereby ordering the information without any additional regularization. We demonstrate how these models learn a representation that improves results in a variety of downstream tasks including generation, disentanglement, and extrapolation using several challenging and natural image datasets.
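
One way to sample using only the independence of latent variables, consistent with the abstract's description (a sketch; the `encoder`/`decoder` interfaces are assumed): permute each latent dimension independently across a batch of real encodings and decode the recombined codes.

    import torch

    def independence_based_sample(encoder, decoder, x_batch):
        z = encoder(x_batch)                      # (batch, latent_dim)
        # Shuffle each latent dimension independently across the batch,
        # producing novel codes whose marginals match the data encodings,
        # without matching any fixed prior.
        z_new = torch.stack(
            [z[torch.randperm(z.size(0)), j] for j in range(z.size(1))],
            dim=1,
        )
        return decoder(z_new)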

URL: https://openreview.net/forum?id=oG8W3CfXal

---

Title: Subgraph Permutation Equivariant Networks

Abstract: In this work we develop a new method, named Subgraph Permutation Equivariant Networks (SPEN), which provides a framework for building graph neural networks that operate on sub-graphs, while using permutation equivariant update functions that are also equivariant to a novel choice of automorphism groups. Message passing neural networks have been shown to be limited in their expressive power, and recent approaches to overcome this either lack scalability or require structural information to be encoded into the feature space. The general framework presented here overcomes the scalability issues associated with global permutation equivariance by operating on sub-graphs. In addition, operating on sub-graphs improves on the expressive power of higher-dimensional global permutation equivariant networks, due to the fact that two non-distinguishable graphs often contain distinguishable sub-graphs. Furthermore, the proposed framework only requires a choice of $k$-hops for creating ego-network sub-graphs and a choice of representation space to be used for each layer, which makes the method easily applicable across a range of graph-based domains. We experimentally validate the method on a range of graph benchmark classification tasks, demonstrating state-of-the-art or highly competitive results on all benchmarks. Further, we demonstrate that the use of local update functions offers a significant improvement in GPU memory over global methods.
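
The sub-graph construction step can be pictured as follows (a sketch using networkx for the $k$-hop ego-networks; the equivariant update functions themselves are omitted):

    import networkx as nx

    def ego_network_bags(G, k):
        """One k-hop ego-network sub-graph per node of G."""
        return {v: nx.ego_graph(G, v, radius=k) for v in G.nodes}

    G = nx.karate_club_graph()
    bags = ego_network_bags(G, k=2)
    print(len(bags), "sub-graphs; node 0's 2-hop ego-network has",
          bags[0].number_of_nodes(), "nodes")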

URL: https://openreview.net/forum?id=X1XDtko96e

---

Title: Stable and Interpretable Unrolled Dictionary Learning

Abstract: The dictionary learning problem, representing data as a combination of a few atoms, has long stood as a popular method for learning representations in statistics and signal processing. The most popular dictionary learning algorithm alternates between sparse coding and dictionary update steps, and a rich literature has studied its theoretical convergence. The success of dictionary learning relies on access to a ``good'' initial estimate of the dictionary and the ability of the sparse coding step to provide an unbiased estimate of the code. The growing popularity of unrolled sparse coding networks has led to the empirical finding that backpropagation through such networks performs dictionary learning. We offer a theoretical analysis of these empirical results through PUDLE, a Provable Unrolled Dictionary LEarning method. We provide conditions on the network initialization and data distribution sufficient to recover and preserve the support of the latent code. Additionally, we address two challenges: first, vanilla unrolled sparse coding computes a biased code estimate, and second, gradients can become unstable during backpropagation. We show approaches to reduce the bias of the code estimate in the forward pass and that of the dictionary estimate in the backward pass. We propose strategies to resolve the learning instability by tuning network parameters and modifying the loss function. Overall, we highlight the impact of loss, unrolling, and backpropagation on convergence. We complement our findings through synthetic and image denoising experiments. Finally, we demonstrate PUDLE's interpretability, a driving factor in designing deep networks based on iterative optimization, by building a mathematical relation between the network weights, its output, and the training set.
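
The setting being analyzed, unrolled sparse coding trained by backpropagation, can be sketched in a few lines (an illustrative ISTA unrolling, not the authors' PUDLE implementation):

    import torch

    def unrolled_ista(D, x, lam=0.1, T=50):
        """T unrolled ISTA steps for min_z 0.5*||x - D z||^2 + lam*||z||_1."""
        L = torch.linalg.matrix_norm(D, ord=2) ** 2   # step size from spectral norm
        z = torch.zeros(D.shape[1])
        for _ in range(T):
            grad = D.T @ (D @ z - x)
            z = torch.nn.functional.softshrink(z - grad / L, lam / L.item())
        return z

    torch.manual_seed(0)
    D = torch.randn(20, 50, requires_grad=True)       # dictionary being learned
    x = torch.randn(20)
    z = unrolled_ista(D, x)
    loss = 0.5 * ((x - D @ z) ** 2).sum()
    loss.backward()  # backprop through the unrolled iterations: D.grad is
                     # the dictionary-learning update the paper analyzes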

URL: https://openreview.net/forum?id=e3S0Bl2RO8

---

Title: Representation Alignment in Neural Networks

Abstract: It is now standard practice for neural network representations to be trained on large, publicly available datasets and reused for new problems. The reasons why neural network representations have been so successful for transfer, however, are still not fully understood. In this paper, we demonstrate that representation alignment may play an important role. We show that, after training, neural network representations align their top singular vectors to the targets. We investigate this representation alignment phenomenon in a variety of neural network architectures and find that (a) alignment emerges across a variety of different architectures and optimizers, with more alignment arising from depth; (b) alignment increases for layers closer to the output; and (c) existing high-performance deep CNNs exhibit high levels of alignment. We then highlight why alignment between the top singular vectors and the targets promotes transfer, and show in a classic synthetic transfer problem that representation alignment is the determining factor for positive and negative transfer to similar and dissimilar tasks.
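
One simple way to quantify such alignment (an assumption for illustration; the paper's exact measure may differ) is the fraction of the target's energy captured by the representation's top singular directions:

    import numpy as np

    def alignment(Phi, y, k):
        """Fraction of y's energy in the span of Phi's top-k left singular vectors.

        Phi: (n_samples, n_features) representation matrix; y: (n_samples,) targets.
        """
        U, S, Vt = np.linalg.svd(Phi, full_matrices=False)
        proj = U[:, :k].T @ y          # components of y along the top-k directions
        return np.sum(proj ** 2) / np.sum(y ** 2)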

URL: https://openreview.net/forum?id=fLIWMnZ9ij

---

Title: On the link between conscious function and general intelligence in humans and machines

Abstract: In popular media, there is often a connection drawn between the advent of awareness in artificial agents and those same agents simultaneously achieving human or superhuman level intelligence. In this work, we explore the validity and potential application of this seemingly intuitive link between consciousness and intelligence. We do so by examining the cognitive abilities associated with three contemporary theories of conscious function: Global Workspace Theory (GWT), Information Generation Theory (IGT), and Attention Schema Theory (AST). We find that all three theories specifically relate conscious function to some aspect of domain-general intelligence in humans. With this insight, we turn to the field of Artificial Intelligence (AI) and find that, while still far from demonstrating general intelligence, many state-of-the-art deep learning methods have begun to incorporate key aspects of each of the three functional theories. Given this apparent trend, we use the motivating example of mental time travel in humans to propose ways in which insights from each of the three theories may be combined into a unified model. We believe that doing so can enable the development of artificial agents which are not only more generally intelligent but are also consistent with multiple current theories of conscious function.

URL: https://openreview.net/forum?id=LTyqvLEv5b

---

Title: Sample-Efficient Self-Supervised Imitation Learning

Abstract: Imitation learning allows an agent to acquire skills or mimic behaviors by observing an expert performing a given task. While imitation learning approaches successfully replicate the observed behavior, they are limited by the expert's trajectories, in terms of both quality and availability. In contrast, while reinforcement learning does not need a supervised signal to learn the task, it requires substantial computation, which can result in sub-optimal policies under resource constraints. To address these issues, we propose Reinforced Imitation Learning (RIL), a method that uses a very small sample of expert behavior to substantially speed up reinforcement learning. RIL leverages expert trajectories to learn how to mimic behavior while also learning from its own experiences in a typical reinforcement learning fashion. A thorough set of experiments shows that our method outperforms both imitation and reinforcement learning methods, providing a good compromise between sample efficiency and task performance.
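
A hedged sketch of the kind of hybrid objective the abstract describes: a standard RL loss augmented with a behavior-cloning term on the small expert sample. The discrete-action cross-entropy and the weighting are illustrative assumptions, not RIL's exact formulation.

    import torch
    import torch.nn.functional as F

    def ril_loss(policy, rl_loss, expert_obs, expert_actions, bc_weight=0.5):
        # Behavior cloning on the (small) expert sample ...
        bc_loss = F.cross_entropy(policy(expert_obs), expert_actions)
        # ... added to the agent's own reinforcement learning loss.
        return rl_loss + bc_weight * bc_loss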

URL: https://openreview.net/forum?id=DUGATQhjOa

---

Title: Auditor Fairness Evaluation via Learning Latent Assessment Models from Elicited Human Feedback

Abstract: The algorithmic fairness literature presents numerous mathematical notions and metrics, and points to the tradeoffs that arise when trying to satisfy several of them simultaneously. Furthermore, the contextual nature of fairness notions makes it difficult to automate bias evaluation in diverse algorithmic systems. Therefore, in this paper, we propose a novel model called the latent assessment model (LAM) to characterize binary feedback provided by human auditors, by assuming that the auditor compares the classifier's output to their own intrinsic judgment for each input. We prove that individual and/or group fairness notions are guaranteed as long as the auditor's intrinsic judgments inherently satisfy the fairness notion at hand and are relatively similar to the classifier's evaluations. We also demonstrate this relationship between LAM and traditional fairness notions on three well-known datasets, namely the COMPAS, German Credit, and Adult Census Income datasets. Furthermore, we derive the minimum number of feedback samples needed to obtain probably approximately correct (PAC) learning guarantees for estimating LAM for black-box classifiers. We also propose a novel multi-attribute reputation measure to evaluate an auditor's preference towards various fairness notions as well as sensitive groups. These guarantees are validated using standard machine learning algorithms trained on real binary feedback elicited from 400 human auditors regarding COMPAS.
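
As a toy illustration of what elicited binary feedback enables (not the paper's LAM estimator): per-group agreement rates between auditors and the classifier can be compared as a rough disparity check.

    import numpy as np

    def group_agreement(feedback, groups):
        """feedback[i] = 1 if the auditor agreed with the classifier's decision;
        groups[i] = sensitive-group label of the audited instance."""
        return {g: feedback[groups == g].mean() for g in np.unique(groups)}

    feedback = np.array([1, 1, 0, 1, 0, 1, 1, 0])
    groups = np.array(["A", "A", "A", "B", "B", "B", "B", "B"])
    print(group_agreement(feedback, groups))  # large gaps hint at disparity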

URL: https://openreview.net/forum?id=hNeMTfFBQJ

---
