Daily TMLR digest for Jun 06, 2024


TMLR

Jun 6, 2024, 12:00:06 AM
to tmlr-anno...@googlegroups.com

Accepted papers
===============


Title: Distributionally Robust Policy Evaluation under General Covariate Shift in Contextual Bandits

Authors: Yihong Guo, Hao Liu, Yisong Yue, Anqi Liu

Abstract: We introduce a distributionally robust approach that enhances the reliability of offline policy evaluation in contextual bandits under general covariate shift. Our method aims to deliver robust policy evaluation results in the presence of discrepancies in both the context and policy distributions between logging and target data. Central to our methodology is the application of robust regression, a distributionally robust technique tailored here to improve the estimation of the conditional reward distribution from logging data. Using the reward model obtained from robust regression, we develop a comprehensive suite of policy value estimators by integrating our reward model into established evaluation frameworks, namely the direct method and doubly robust methods. Through theoretical analysis, we further establish that the proposed policy value estimators admit a finite-sample upper bound on the bias, providing a clear advantage over traditional methods, especially when the shift is large. Finally, we design an extensive range of policy evaluation scenarios, covering diverse magnitudes of shift and a spectrum of logging and target policies. Our empirical results indicate that our approach significantly outperforms baseline methods, most notably in 90% of the cases under policy-shift-only settings and in 72% of the scenarios under general covariate shift settings.
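
Editor's note: for context, below is a minimal sketch of the generic doubly robust estimator that such reward models plug into. Here reward_hat stands in for any fitted model of E[r | x, a] (the paper obtains one via robust regression); all names are illustrative, not the authors' code.

import numpy as np

def doubly_robust_value(contexts, actions, rewards,
                        logging_probs, target_policy, reward_hat):
    """Generic doubly robust estimate of a target policy's value
    from logged bandit data.

    contexts:      (n, d) logged contexts
    actions:       (n,)   logged actions, ints in [0, K)
    rewards:       (n,)   observed rewards
    logging_probs: (n,)   logging policy's probability of the logged action
    target_policy: (n, K) target policy's action probabilities per context
    reward_hat:    callable (contexts, actions) -> (n,) predicted rewards
    """
    n, K = target_policy.shape
    # Direct-method term: expected model reward under the target policy.
    model_rewards = np.stack(
        [reward_hat(contexts, np.full(n, a)) for a in range(K)], axis=1)
    dm = (target_policy * model_rewards).sum(axis=1)
    # Importance-weighted correction on the logged actions.
    iw = target_policy[np.arange(n), actions] / logging_probs
    correction = iw * (rewards - model_rewards[np.arange(n), actions])
    return (dm + correction).mean()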

URL: https://openreview.net/forum?id=R7PReNELww

---

Title: Misspecification-robust Sequential Neural Likelihood for Simulation-based Inference

Authors: Ryan P. Kelly, David J Nott, David Tyler Frazier, David J Warne, Christopher Drovandi

Abstract: Simulation-based inference techniques are indispensable for parameter estimation in mechanistic and simulable models with intractable likelihoods. While traditional statistical approaches like approximate Bayesian computation and Bayesian synthetic likelihood have been studied under well-specified and misspecified settings, they often suffer from inefficiency due to wasted model simulations. Neural approaches, such as sequential neural likelihood (SNL), avoid this wastage by utilising all model simulations to train a neural surrogate for the likelihood function. However, the performance of SNL under model misspecification is unreliable and can result in overconfident posteriors centred around an inaccurate parameter estimate. In this paper, we propose a novel SNL method which, through the incorporation of additional adjustment parameters, is robust to model misspecification and capable of identifying features of the data that the model is not able to recover. We demonstrate the efficacy of our approach through several illustrative examples, where our method gives more accurate point estimates and uncertainty quantification than SNL.
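
Editor's note: a rough illustration of the adjustment-parameter idea (a sketch under assumptions, not the authors' implementation). Observed summary statistics are shifted by auxiliary adjustment parameters before the learned surrogate likelihood is evaluated, with a sparsity-inducing prior keeping the adjustments near zero unless the model cannot match a summary. A Gaussian stands in for the neural surrogate so the snippet runs end to end.

import numpy as np
from scipy import stats

def surrogate_loglik(x, theta):
    # Stand-in for the learned neural likelihood q(x | theta); in SNL
    # this would be a normalizing flow trained on model simulations.
    return stats.norm.logpdf(x, loc=theta, scale=1.0).sum()

def adjusted_log_posterior(theta, gamma, x_obs, prior_logpdf):
    # Shift the observed summaries by adjustment parameters gamma,
    # with a Laplace (sparsity-inducing) prior on gamma, so summaries
    # the model cannot recover are absorbed by gamma rather than
    # corrupting the estimate of theta.
    return (surrogate_loglik(x_obs - gamma, theta)
            + prior_logpdf(theta)
            + stats.laplace.logpdf(gamma, scale=0.1).sum())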

URL: https://openreview.net/forum?id=tbOYJwXhcY

---

Title: Rotate the ReLU to Sparsify Deep Networks Implicitly

Authors: Nancy Nayak, Sheetal Kalyani

Abstract: Compact and energy-efficient models have become essential in this era when deep learning-based solutions are widely used for various real-life tasks. In this paper, we propose rotating the ReLU activation to give an additional degree of freedom, in conjunction with an appropriate initialization of the rotation. This combination leads to implicit sparsification without the use of a regularizer. We show that this rotated ReLU (RReLU) activation improves the representation capability of the parameters/filters in the network and eliminates those parameters/filters that are not crucial for the task, giving rise to significant savings in memory and computation. While the state-of-the-art regularization-based Network-Slimming method achieves $32.33\%$ savings in memory and $26.38\%$ savings in computation with ResNet-$164$, RReLU achieves savings of $35.92\%$ in memory and $25.97\%$ in computation with better accuracy. The savings in memory and computation further increase by $64.67\%$ and $52.96\%$, respectively, with the introduction of $L_1$ regularization on the RReLU slopes. We note that the slopes of the rotated ReLU activations act as coarse feature extractors and can eliminate unnecessary features before retraining. Our studies indicate that features consistently choose to pass through fewer filters. We demonstrate the results on popular datasets such as MNIST, CIFAR-10, CIFAR-100, SVHN, and ImageNet with different architectures, including Vision Transformers and EfficientNet. We also briefly study the impact of adversarial attacks on RReLU-based ResNets and observe better adversarial accuracy for architectures with RReLU than with ReLU. We also demonstrate how this concept of rotation can be applied to the GELU and SiLU activation functions, commonly used in Transformer and EfficientNet architectures, respectively. The proposed method can be combined with other structural pruning methods, resulting in better sparsity. For the GELU-based multi-layer perceptron (MLP) part of the Transformer, we obtain a $2.6\%$ improvement in accuracy with $6.32\%$ savings in both memory and computation.
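
Editor's note: one plausible reading of the rotated activation, sketched below in PyTorch under the assumption that the rotation amounts to a learnable slope on each linear piece, per channel. The paper's exact parameterization and initialization may differ; the module is named RotatedReLU here to avoid a clash with PyTorch's unrelated nn.RReLU.

import torch
import torch.nn as nn

class RotatedReLU(nn.Module):
    """ReLU with learnable per-channel slopes on both linear pieces,
    so a channel can be switched off entirely by driving its slopes
    to zero (implicit sparsification)."""
    def __init__(self, num_channels):
        super().__init__()
        # Initialized so the module starts as a plain ReLU.
        self.pos = nn.Parameter(torch.ones(num_channels))
        self.neg = nn.Parameter(torch.zeros(num_channels))

    def forward(self, x):
        # x: (N, C, H, W); broadcast the per-channel slopes.
        pos = self.pos.view(1, -1, 1, 1)
        neg = self.neg.view(1, -1, 1, 1)
        return pos * torch.relu(x) - neg * torch.relu(-x)

Under this reading, an $L_1$ penalty on the two slope vectors pushes whole channels to zero, which would correspond to the L1-on-slopes variant mentioned in the abstract.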

URL: https://openreview.net/forum?id=Nzy0XmCPuZ

---

Title: Bayesian Quantification with Black-Box Estimators

Authors: Albert Ziegler, Paweł Czyż

Abstract: Understanding how different classes are distributed in an unlabeled data set is important for the calibration of probabilistic classifiers and for uncertainty quantification. Methods like adjusted classify and count, black-box shift estimators, and invariant ratio estimators use an auxiliary, and potentially biased, black-box classifier trained on a different data set to estimate the class distribution on the current data set, and they yield asymptotic guarantees under weak assumptions. We demonstrate that these algorithms are closely related to inference in a particular probabilistic graphical model approximating the assumed ground-truth generative process, and we propose a Bayesian estimator. We then discuss an efficient Markov chain Monte Carlo sampling scheme for the introduced model and show an asymptotic consistency guarantee in the large-data limit. We compare the introduced model against the established point estimators in a variety of scenarios and show that it is competitive with, and in some cases superior to, the non-Bayesian alternatives.
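
Editor's note: for orientation, the classical point estimate these methods share reduces to a linear solve; a minimal sketch with illustrative names is below. The Bayesian treatment in the paper instead performs posterior inference over the prevalence vector via MCMC.

import numpy as np

def shift_estimate(confusion, target_pred_hist):
    # confusion[j, k] = P(classifier predicts j | true class k),
    # estimated on labeled source data; target_pred_hist is the
    # normalized histogram of the black-box classifier's predictions
    # on the unlabeled target set.
    pi = np.linalg.solve(confusion, target_pred_hist)
    pi = np.clip(pi, 0.0, None)  # crude projection back
    return pi / pi.sum()         # onto the probability simplex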

URL: https://openreview.net/forum?id=Ft4kHrOawZ

---


New submissions
===============


Title: Contrastive Learning with Consistent Representations

Abstract: Contrastive learning demonstrates great promise for representation learning. Data augmentations play a critical role in contrastive learning by providing informative views of the data without requiring explicit labels. Nonetheless, the efficacy of current methods hinges heavily on the quality of the data augmentation (DA) functions employed, which are often chosen manually from a limited set of options. While exploiting diverse data augmentations is appealing, the complexities inherent in both DAs and representation learning can lead to performance deterioration. To address this challenge and to enable the systematic incorporation of diverse data augmentations, this paper proposes Contrastive Learning with Consistent Representations (CoCor). At the heart of CoCor is a novel consistency metric termed DA consistency, which governs the mapping of augmented input data to the representation space, ensuring that these instances are positioned in a manner consistent with the intensity of the applied DA. Moreover, we propose to learn the optimal mapping locations as a function of DA while preserving a desired monotonicity property with respect to DA intensity. Experimental results demonstrate that CoCor notably enhances the generalizability and transferability of learned representations in comparison to baseline methods.
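
Editor's note: the monotonicity requirement can be illustrated with a simple hinge penalty; this is an illustrative stand-in, not the paper's DA-consistency loss. The idea is that a view produced by a stronger augmentation should sit no closer to the anchor representation than a view produced by a weaker one.

import torch
import torch.nn.functional as F

def monotone_da_penalty(z_anchor, z_weak, z_strong, margin=0.0):
    # z_anchor: representations of the unaugmented inputs
    # z_weak / z_strong: representations of views from weaker and
    # stronger augmentations of the same inputs. Penalize any case
    # where the strongly augmented view is *closer* to the anchor.
    sim_weak = F.cosine_similarity(z_anchor, z_weak, dim=-1)
    sim_strong = F.cosine_similarity(z_anchor, z_strong, dim=-1)
    return F.relu(sim_strong - sim_weak + margin).mean()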

URL: https://openreview.net/forum?id=gKeSI8w63Z

---
