Daily TMLR digest for Jul 14, 2024

TMLR

Jul 14, 2024, 12:00:36 AM
to tmlr-anno...@googlegroups.com

New submissions
===============


Title: Hashing with Uncertainty Quantification via Sampling-based Hypothesis Testing

Abstract: To quantify different types of uncertainty when deriving hash codes for image retrieval, we develop a probabilistic hashing model (ProbHash). Sampling-based hypothesis testing is then derived for hashing with uncertainty quantification (HashUQ) in ProbHash, improving the granularity of hashing-based retrieval by prioritizing data with confident hash codes. HashUQ can drastically improve retrieval performance without sacrificing computational efficiency. For efficient deployment of HashUQ in real-world applications, we discretize the quantified uncertainty to reduce the potential storage overhead. Experimental results show that HashUQ achieves state-of-the-art retrieval performance on three image datasets. Ablation experiments on model hyperparameters, different model components, and the effects of UQ are also provided, with performance comparisons.

URL: https://openreview.net/forum?id=cc4v6v310f
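
Neither ProbHash nor the sampling-based test is specified beyond this summary, but a minimal NumPy sketch of the general idea (sampling hash codes from per-bit Bernoulli probabilities, then voting and discretizing per-bit confidence) might look as follows; all names and the Bernoulli-bit model are illustrative assumptions, not the paper's method:

    import numpy as np

    def sample_hash_codes(bit_logits, n_samples=100, rng=None):
        # Hypothetical reading: each logit parameterizes an independent
        # Bernoulli bit; the paper's actual probabilistic model may differ.
        rng = rng if rng is not None else np.random.default_rng(0)
        p = 1.0 / (1.0 + np.exp(-np.asarray(bit_logits, dtype=float)))
        return (rng.random((n_samples, len(p))) < p).astype(np.uint8)

    def code_with_uncertainty(bit_logits, n_samples=100, n_levels=4):
        # Majority-vote hash code plus a discretized per-bit confidence
        # level, standing in for the paper's hypothesis test and its
        # storage-friendly discretization of the quantified uncertainty.
        samples = sample_hash_codes(bit_logits, n_samples)
        agree = samples.mean(axis=0)               # empirical P(bit = 1)
        code = (agree >= 0.5).astype(np.uint8)
        confidence = 2.0 * np.abs(agree - 0.5)     # distance from 0.5, in [0, 1]
        level = np.minimum((confidence * n_levels).astype(int), n_levels - 1)
        return code, level

    code, level = code_with_uncertainty(np.array([2.1, -0.2, 0.05, -3.0]))

A retrieval system could then compare candidates code-first and filter or re-rank by the stored confidence levels, which is one way to read "prioritizing data with confident hash codes".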

---

Title: Assessing and enhancing robustness of active learning strategies to spurious bias

Abstract: Deep neural networks (DNNs) trained using empirical risk minimization (ERM) tend to rely on spurious features during prediction, particularly when the target label exhibits spurious correlations with certain attributes in the training set. Prior works have proposed methods to mitigate bias caused by spurious correlations in passive learning scenarios. In this work, we investigate the performance of common active learning (AL) algorithms under spurious bias and design an AL algorithm that is robust to it. AL is a framework that iteratively acquires new samples to progressively improve the classifier. In AL loops, sample acquisition is directed by informativeness criteria such as uncertainty and representativeness. The concept behind these criteria shares similarities with approaches to addressing spurious correlations in passive settings (i.e., underrepresented samples are deemed informative and thus given higher value during training). In fact, Tamkin et al. (2022) have demonstrated the potential of AL in addressing out-of-distribution problems. Hence, with an appropriately defined acquisition function, a sample-efficient framework can be established to effectively handle spurious correlations. Inspired by recent work on simplicity bias, we propose Domain-Invariant Active Learning (DIAL), which leverages the disparity in training dynamics between overrepresented and underrepresented samples, selecting samples that exhibit “slow” training dynamics. DIAL involves no excessively resource-intensive computation beyond the standard training process and feedforward inference, making it more scalable for addressing real-world problems with AL. Empirical results demonstrate that DIAL outperforms baselines not only in robustness under spurious-correlation scenarios but also on standard ML datasets.

URL: https://openreview.net/forum?id=2XVECaYiFB
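
The acquisition function itself is not given in the abstract; one plausible reading, scoring unlabeled pool samples by how slowly the model becomes confident on them across epochs and acquiring the slowest, could be sketched as follows (the mean-confidence proxy and all names are assumptions, not necessarily the paper's DIAL):

    import numpy as np

    def slow_dynamics_scores(pool_confidence):
        # pool_confidence: (n_epochs, n_pool) max-softmax confidence on each
        # unlabeled pool sample, recorded by feedforward inference after each
        # training epoch, i.e. nothing beyond standard training plus forward
        # passes. Samples the model fits late stay low-confidence for more
        # epochs, so their mean confidence over training is small.
        return 1.0 - pool_confidence.mean(axis=0)

    def acquire(pool_confidence, labeled_mask, budget):
        # Pick the `budget` pool indices with the slowest dynamics,
        # never re-acquiring samples that are already labeled.
        scores = slow_dynamics_scores(pool_confidence)
        scores[labeled_mask] = -np.inf
        return np.argsort(scores)[::-1][:budget]

    # e.g. acquire(conf_history, labeled_mask, budget=64) after each AL round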

---
