Daily TMLR digest for Jun 11, 2024


TMLR

Jun 11, 2024, 12:00:08 AM
to tmlr-anno...@googlegroups.com

Accepted papers
===============


Title: Smoothed Robustness Analysis: Bridging worst- and average-case robustness analyses via smoothed analysis

Authors: Thomas Rodrigues Crespo, Jun-nosuke Teramae

Abstract: The sensitivity of neural networks to adversarial attacks and noise is a significant drawback, and understanding and certifying their robustness has attracted much attention. Studies have attempted to bridge two extreme analyses of robustness: the worst-case analysis, which often gives overly pessimistic certification, and the average-case analysis, which often fails to give a tight guarantee of robustness. Among these efforts, \textit{Randomized Smoothing} became prominent by certifying a worst-case region of a classifier under input noise. However, the method still suffers from several limitations, arguably because it has lacked a larger underlying framework in which to situate it. Here, inspired by the \textit{Smoothed Analysis} of algorithmic complexity, which bridges the worst-case and average-case analyses of algorithms, we provide a theoretical framework for the robustness analysis of classifiers that contains \textit{Randomized Smoothing} as a special case. Using this framework, we also propose a novel robustness analysis that works even in the small-noise regime and thus provides a more confident robustness certification than \textit{Randomized Smoothing}. To validate the approach, we evaluate the robustness of fully connected and convolutional neural networks on the MNIST and CIFAR-10 datasets, respectively, and find that it indeed improves both adversarial and noise robustness.
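The \textit{Randomized Smoothing} baseline that the paper's framework contains as a special case can be sketched in a few lines: classify many noisy copies of an input, then convert the top-class vote fraction into a certified L2 radius via the Gaussian inverse CDF (the Cohen et al., 2019 certificate; this is the standard baseline, not the paper's new analysis, and the toy classifier below is purely illustrative).

```python
import random
from statistics import NormalDist

def smoothed_certify(f, x, sigma=0.25, n=2000, num_classes=2, seed=0):
    """Monte Carlo Randomized Smoothing certificate: estimate the smoothed
    classifier's top class under Gaussian input noise, then return a certified
    L2 radius sigma * Phi^{-1}(p_A) from the top-class probability p_A."""
    rng = random.Random(seed)
    votes = [0] * num_classes
    for _ in range(n):
        z = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        votes[f(z)] += 1
    top = max(range(num_classes), key=votes.__getitem__)
    p_a = min(votes[top] / n, 1 - 1e-3)  # clip: finite samples, finite radius
    if p_a <= 0.5:
        return top, 0.0                  # abstain: no certificate
    return top, sigma * NormalDist().inv_cdf(p_a)

# Toy base classifier: sign of the first coordinate.
label, radius = smoothed_certify(lambda z: int(z[0] > 0), [1.0, 0.0])
```

Note the clipping of p_A: with a finite Monte Carlo sample, an empirical vote fraction of 1.0 would otherwise yield an infinite radius.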

URL: https://openreview.net/forum?id=BogwFMz5tU

---

Title: [Re] CUDA: Curriculum of Data Augmentation for Long-tailed Recognition

Authors: Barath Chandran.C

Abstract: In this reproducibility study, we present our results and experience while replicating the paper CUDA: Curriculum of Data Augmentation for Long-Tailed Recognition (Ahn et al., 2023). Traditional datasets used in image recognition, such as ImageNet, are often synthetically balanced, meaning each class has an equal number of samples. In practical scenarios, datasets frequently exhibit significant class imbalances, with certain classes having a disproportionately larger number of samples than others. This discrepancy poses a challenge for traditional image recognition models, as they tend to favor classes with larger sample sizes, leading to poor performance on minority classes. CUDA proposes a class-wise data augmentation technique that can be applied on top of any existing model to improve accuracy for long-tailed recognition (LTR). We successfully replicated all of the results pertaining to the long-tailed CIFAR-100-LT dataset and extended our analysis to provide deeper insights into how CUDA efficiently tackles class imbalance. The code and the readings are available at https://anonymous.4open.science/r/CUDA-org--C2FD/README.md

URL: https://openreview.net/forum?id=Wm6d44I8St

---

Title: Hybrid Active Learning with Uncertainty-Weighted Embeddings

Authors: Yinan He, Lile Cai, Jingyi Liao, Chuan-Sheng Foo

Abstract: We introduce a hybrid active learning method that simultaneously considers uncertainty and diversity for sample selection. Our method consists of two key steps: computing a novel uncertainty-weighted embedding, then applying distance-based sampling for sample selection. Our proposed uncertainty-weighted embedding is computed by weighting a sample's feature representation by an uncertainty measure. We show how this embedding generalizes the gradient embedding of BADGE so it can be used with arbitrary loss functions and be computed more efficiently, especially for dense prediction tasks and network architectures with large numbers of parameters in the final layer. We extensively evaluate the proposed hybrid active learning method on image classification, semantic segmentation and object detection tasks, and demonstrate that it achieves state-of-the-art performance.

URL: https://openreview.net/forum?id=jD761b5OaE

---

Title: Nuisances via Negativa: Adjusting for Spurious Correlations via Data Augmentation

Authors: Aahlad Manas Puli, Nitish Joshi, Yoav Wald, He He, Rajesh Ranganath

Abstract: In prediction tasks, there exist features that are related to the label in the same way across different settings for that task; these are semantic features, or semantics. Features with varying relationships to the label are nuisances. For example, in detecting cows in natural images, the shape of the head is semantic, but because images of cows often, though not always, have grass backgrounds, the background is a nuisance. Models that exploit nuisance-label relationships face performance degradation when these relationships change. Building models robust to such changes requires additional knowledge beyond samples of the features and labels. For example, existing work uses annotations of nuisances or assumes that ERM-trained models depend on nuisances. Approaches that integrate new kinds of additional knowledge enlarge the settings where robust models can be built. We develop an approach to use knowledge about the semantics via data augmentations. These data augmentations corrupt semantic information to produce models that identify and adjust for where nuisances drive predictions. We study semantic corruptions in powering different spurious-correlation-avoiding methods on multiple out-of-distribution (OOD) tasks such as classifying waterbirds, natural language inference (NLI), and detecting cardiomegaly in chest X-rays.
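One plausible semantic corruption of the kind described is a patch shuffle: tiling an image and permuting the tiles destroys global shape (a semantic feature such as a cow's head outline) while preserving local textures and background statistics (typical nuisances). The sketch below is illustrative; the paper studies a family of such corruptions.

```python
import numpy as np

def patch_shuffle(img, patch=8, seed=0):
    """Split an HxW image into patch x patch tiles and permute them.
    Pixel content is preserved exactly; only global spatial layout
    (the semantic signal) is destroyed."""
    h, w = img.shape[:2]
    assert h % patch == 0 and w % patch == 0
    tiles = [img[i:i + patch, j:j + patch]
             for i in range(0, h, patch) for j in range(0, w, patch)]
    rng = np.random.default_rng(seed)
    tiles = [tiles[p] for p in rng.permutation(len(tiles))]
    per_row = w // patch
    rows = [np.concatenate(tiles[r * per_row:(r + 1) * per_row], axis=1)
            for r in range(h // patch)]
    return np.concatenate(rows, axis=0)

img = np.arange(32 * 32, dtype=float).reshape(32, 32)
out = patch_shuffle(img, patch=8)
```

A model trained on such corrupted inputs can only exploit nuisances, which is what lets the main model identify and adjust for nuisance-driven predictions.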

URL: https://openreview.net/forum?id=RIFJsSzwKY

---

Title: Making Translators Privacy-aware on the User's Side

Authors: Ryoma Sato

Abstract: We propose PRISM to enable users of machine translation systems to preserve the privacy of data on their own initiative. There is a growing demand to apply machine translation systems to data that require privacy protection. While several machine translation engines claim to prioritize privacy, the extent and specifics of such protection are largely ambiguous. First, there is often a lack of clarity on how and to what degree the data is protected. Even if service providers believe they have sufficient safeguards in place, sophisticated adversaries might still extract sensitive information. Second, vulnerabilities may exist outside of these protective measures, such as within communication channels, potentially leading to data leakage. As a result, users are hesitant to utilize machine translation engines for data demanding high levels of privacy protection, thereby missing out on their benefits. PRISM resolves this problem. Instead of relying on the translation service to keep data safe, PRISM provides the means to protect data on the user's side. This approach ensures that even machine translation engines with inadequate privacy measures can be used securely. For platforms already equipped with privacy safeguards, PRISM acts as an additional protection layer, further reinforcing their security. PRISM adds these privacy features without significantly compromising translation accuracy. We prove that PRISM enjoys a theoretical guarantee of word-level differential privacy. Our experiments demonstrate the effectiveness of PRISM using real-world translators, T5 and ChatGPT (GPT-3.5-turbo), and datasets in two languages. PRISM balances privacy protection with translation accuracy more effectively than other user-side privacy protection protocols and helps users grasp content written in a foreign language without leaking the original content.
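For intuition about what word-level differential privacy means on the user's side, here is a toy metric-DP word-replacement mechanism: each word is swapped for a vocabulary word sampled with probability proportional to exp(-eps * d / 2), where d is an embedding distance. This sketch, including the tiny embedding table, is entirely hypothetical and is not PRISM's actual protocol, which is more involved.

```python
import math
import random

def dp_word_replace(word, vocab, embed, eps=10.0, seed=0):
    """Exponential-mechanism-style word substitution: nearby words in
    embedding space are likely replacements, so rough meaning survives
    while any single original word remains deniable."""
    rng = random.Random(seed)
    weights = [math.exp(-eps * math.dist(embed[word], embed[w]) / 2.0)
               for w in vocab]
    return rng.choices(vocab, weights=weights, k=1)[0]

# Tiny hypothetical embedding table for illustration only.
embed = {"cat": (0.0, 1.0), "dog": (0.1, 0.9), "bank": (5.0, 0.0)}
vocab = list(embed)
private = [dp_word_replace(w, vocab, embed, seed=i)
           for i, w in enumerate(["cat", "dog"])]
```

Larger eps keeps the text closer to the original (better translation accuracy); smaller eps gives stronger privacy, which is the trade-off the paper's experiments quantify.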

URL: https://openreview.net/forum?id=A6eqDMttcs

---

Title: Convergences for Minimax Optimization Problems over Infinite-Dimensional Spaces Towards Stability in Adversarial Training

Authors: Takashi Furuya, Satoshi Okuda, Kazuma Suetake, Yoshihide Sawada

Abstract: Training neural networks that require adversarial optimization, such as generative adversarial networks (GANs) and unsupervised domain adaptations (UDAs), suffers from instability. This instability stems from the difficulty of the minimax optimization, and various approaches have been proposed in GANs and UDAs to overcome it. In this study, we tackle the problem theoretically through functional analysis. Specifically, we show the convergence property of the minimax problem under gradient descent over the infinite-dimensional spaces of continuous functions and probability measures, under certain conditions. This setting allows us to discuss GANs and UDAs, which have been studied independently, in a unified way. In addition, we show that the conditions necessary for the convergence property can be interpreted as stabilization techniques for adversarial training, such as spectral normalization and the gradient penalty.

URL: https://openreview.net/forum?id=6LePXHr2f3

---

Title: CoMIX: A Multi-agent Reinforcement Learning Training Architecture for Efficient Decentralized Coordination and Independent Decision-Making

Authors: Giovanni Minelli, Mirco Musolesi

Abstract: Robust coordination skills enable agents to operate cohesively in shared environments, working together towards a common goal and, ideally, individually without hindering each other's progress. To this end, this paper presents Coordinated QMIX (CoMIX), a novel training framework for decentralized agents that enables emergent coordination through flexible policies while allowing independent decision-making at the individual level. CoMIX models selfish and collaborative behavior as incremental steps in each agent's decision process. This allows agents to dynamically adapt their behavior to different situations, balancing independence and collaboration. Experiments in a variety of simulation environments demonstrate that CoMIX outperforms baselines on collaborative tasks. The results validate our incremental approach as an effective technique for improving coordination in multi-agent systems.

URL: https://openreview.net/forum?id=JoU9khOwwr

---


New submissions
===============


Title: Principal Graph Encoder Embedding and Principal Community Detection

Abstract: In this paper, we introduce the concept of principal communities and design a principal graph encoder embedding method to concurrently detect these communities and achieve vertex embedding. Given a graph adjacency matrix with vertex labels, the method computes a sample score for each community, providing a ranking to measure community importance and estimate a set of principal communities. It then produces a vertex embedding by retaining only the dimensions corresponding to the principal communities. We characterize the theoretical properties of the principal graph encoder embedding on the random graph model and prove that the proposed method preserves sufficient information about the vertex labels. The numerical performance of the proposed method is demonstrated through comprehensive simulated and real-data experiments.
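The pipeline described, per-community scoring followed by dimension retention, can be sketched on top of the standard one-hot graph encoder embedding (Z = A @ W, with W the class-size-normalized label indicator). The mean-gap community score below is an assumed, illustrative stand-in for the paper's sample score.

```python
import numpy as np

def graph_encoder_embedding(A, labels, K):
    """One-hot graph encoder embedding: column k of W is the indicator of
    community k scaled by 1/|community k|, and Z = A @ W is (n, K)."""
    W = np.zeros((len(labels), K))
    for k in range(K):
        idx = labels == k
        W[idx, k] = 1.0 / idx.sum()
    return A @ W

def principal_embedding(A, labels, K, r):
    """Sketch of the 'principal' variant: score dimension k by how strongly
    it separates community k's vertices from the rest (a simple mean gap
    here; the paper's score may differ), then keep only the top-r dims."""
    Z = graph_encoder_embedding(A, labels, K)
    scores = np.array([Z[labels == k, k].mean() - Z[labels != k, k].mean()
                       for k in range(K)])
    principal = np.argsort(scores)[::-1][:r]
    return Z[:, principal], principal

# Planted 3-block graph where community 2 carries no connectivity signal.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1, 2], 30)
P = np.array([[.5, .05, .1], [.05, .5, .1], [.1, .1, .1]])
A = (rng.random((90, 90)) < P[labels][:, labels]).astype(float)
A = np.triu(A, 1); A = A + A.T
Zp, principal = principal_embedding(A, labels, K=3, r=2)
```

On this toy block model the uninformative community is dropped, leaving a lower-dimensional vertex embedding over the principal communities only.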

URL: https://openreview.net/forum?id=9hihbE9udx

---

Title: Representation Norm Amplification for Out-of-Distribution Detection in Long-Tail Learning

Abstract: Detecting out-of-distribution (OOD) samples is a critical task for reliable machine learning. However, it becomes particularly challenging when the models are trained on long-tailed datasets, as the models often struggle to distinguish tail-class in-distribution samples from OOD samples. We examine the main challenges in this problem by identifying the trade-offs between OOD detection and in-distribution (ID) classification, faced by existing methods. We then introduce our method, called *Representation Norm Amplification* (RNA), which solves this challenge by decoupling the two problems. The main idea is to use the norm of the representation as a new dimension for OOD detection, and to develop a training method that generates a noticeable discrepancy in the representation norm between ID and OOD data, while not perturbing the feature learning for ID classification. Our experiments show that RNA achieves superior performance in both OOD detection and classification compared to the state-of-the-art methods, by 1.70\% and 9.46\% in FPR95 and 2.43\% and 6.87\% in classification accuracy on CIFAR10-LT and ImageNet-LT, respectively.
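The detection rule itself is just a threshold on the representation norm; what the paper contributes is the training procedure that widens the ID/OOD norm gap, which is not reproduced here. The features below are synthetic stand-ins for representations after such training.

```python
import numpy as np

def rna_style_score(feats):
    """OOD score = L2 norm of the (penultimate-layer) representation:
    larger norm -> more ID-like under norm-amplifying training."""
    return np.linalg.norm(feats, axis=1)

# Hypothetical post-training features: ID norms amplified, OOD norms small.
rng = np.random.default_rng(0)
id_feats = 5.0 * rng.standard_normal((200, 64))
ood_feats = 0.5 * rng.standard_normal((200, 64))
scores = rna_style_score(np.vstack([id_feats, ood_feats]))
is_ood = scores < np.median(scores)  # toy threshold for this balanced mix
```

Because the score is a separate dimension from the class logits, thresholding it leaves the ID classification head untouched, which is the decoupling the abstract describes.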

URL: https://openreview.net/forum?id=z4b4WfvooX

---

Title: DrGNN: Deep Residual Graph Neural Network with Contrastive Learning

Abstract: Recent studies reveal that deep representation learning models without proper regularization can suffer from the _dimensional collapse_ problem, i.e., representation vectors span a lower-dimensional space. In the domain of graph deep representation learning, the phenomenon in which node representations become indistinguishable and even shrink to a constant vector is called _oversmoothing_. Based on an analysis of the rank of node representations, we find that representation oversmoothing and dimensional collapse are highly related in deep graph neural networks, and the oversmoothing problem can be interpreted as dimensional collapse of the node representation matrix. To address dimensional collapse and oversmoothing together in deep graph neural networks, we first find that vanilla _residual connections_ and _contrastive learning_ produce sub-optimal outcomes by ignoring the structural constraints of graph data. Motivated by this, we propose a novel graph neural network named DrGNN to alleviate the oversmoothing issue from the perspective of addressing dimensional collapse. Specifically, in DrGNN, we design a topology-preserving residual connection for graph neural networks that keeps the rank of hidden representations close to that of the full-rank input features. We also propose structure-guided contrastive learning to ensure that only close neighbors sharing similar local connections have similar representations. Empirical experiments on multiple real-world datasets demonstrate that DrGNN outperforms state-of-the-art deep graph representation baseline algorithms.
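For concreteness, here is the vanilla residual GNN layer that the paper takes as its sub-optimal starting point: propagate over the normalized adjacency, transform, then add back a fraction of the input features. This is the baseline being critiqued, not DrGNN's topology-preserving residual connection.

```python
import numpy as np

def norm_adj(A):
    """Symmetrically normalized adjacency with self-loops:
    A_hat = D^{-1/2} (A + I) D^{-1/2}."""
    A = A + np.eye(len(A))
    d_inv_sqrt = np.diag(A.sum(1) ** -0.5)
    return d_inv_sqrt @ A @ d_inv_sqrt

def residual_gcn_layer(A_hat, H, W, alpha=0.2):
    """Vanilla residual GCN layer: ReLU(A_hat @ H @ W) + alpha * H.
    The additive skip slows, but does not structurally prevent, the rank
    collapse of H under repeated smoothing."""
    return np.maximum(A_hat @ H @ W, 0.0) + alpha * H

rng = np.random.default_rng(0)
A = (rng.random((10, 10)) < 0.3).astype(float)
A = np.triu(A, 1); A = A + A.T          # symmetric, no self-loops yet
A_hat = norm_adj(A)
H = rng.standard_normal((10, 4))        # node features
W = rng.standard_normal((4, 4))
H1 = residual_gcn_layer(A_hat, H, W)
```

Stacking many such layers drives the rows of H toward a low-rank subspace, which is the dimensional-collapse reading of oversmoothing that motivates DrGNN's modified residual.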

URL: https://openreview.net/forum?id=frb6sLbACS

---

Title: Risk Bounds for Mixture Density Estimation on Compact Domains via the $h$-Lifted Kullback--Leibler Divergence

Abstract: We consider the problem of estimating probability density functions from sample data using a finite mixture of densities from some component class. To this end, we introduce the $h$-lifted Kullback--Leibler (KL) divergence as a generalization of the standard KL divergence and as a criterion for conducting risk minimization. Under a compact support assumption, we prove an $O(1/{\sqrt{n}})$ bound on the expected estimation error when using the $h$-lifted KL divergence, which extends the results of Rakhlin et al. (2005, ESAIM: Probability and Statistics, Vol. 9) and Li and Barron (1999, Advances in Neural Information Processing Systems, Vol. 12) to permit the risk bounding of density functions that are not strictly positive. We develop a procedure for computing the corresponding maximum $h$-lifted likelihood estimators ($h$-MLLEs) using the Majorization-Maximization framework and provide experimental results in support of our theoretical bounds.

URL: https://openreview.net/forum?id=lAKvQO4vHj

---
