Daily TMLR digest for Aug 15, 2025

TMLR

Aug 15, 2025, 12:06:07 AM
to tmlr-anno...@googlegroups.com

Accepted papers
===============


Title: Combining Machine Learning Defenses without Conflicts

Authors: Vasisht Duddu, Rui Zhang, N. Asokan

Abstract: Machine learning (ML) models require protection against various risks to security, privacy, and fairness. Real-life ML models need simultaneous protection against multiple risks, which necessitates combining multiple defenses effectively without incurring a significant drop in the effectiveness of the constituent defenses. We present a systematization of existing work based on how defenses are combined and how they interact. We then identify unexplored combinations and evaluate combination techniques to identify their limitations. Using these insights, we present Def\Con, a combination technique which is (a) accurate (correctly identifies whether a combination is effective), (b) scalable (allows combining multiple defenses), (c) non-invasive (allows combining existing defenses without modification), and (d) general (is applicable to different types of defenses). We show that Def\Con achieves 90% accuracy on eight combinations from prior work, and 86% on 30 unexplored combinations evaluated empirically.

URL: https://openreview.net/forum?id=C7FgsjfFRC

---

Title: How does overparametrization affect performance on minority groups?

Authors: Saptarshi Roy, Subha Maity, Songkai Xue, Mikhail Yurochkin, Yuekai Sun

Abstract: The benefits of overparameterization for the overall performance of modern machine learning (ML) models are well known. However, the effect of overparameterization at the more granular level of data subgroups is less understood. Recent empirical studies demonstrate encouraging results: (i) when groups are not known, overparameterized models trained with empirical risk minimization (ERM) perform better on minority groups; (ii) when groups are known, ERM on data subsampled to equalize group sizes yields state-of-the-art worst-group accuracy in the overparameterized regime. In this paper, we complement these empirical studies with a theoretical investigation of the risk of overparameterized random feature regression models on minority groups whose feature distribution is identical to that of the majority group. In a setting in which the regression functions for the majority and minority groups are different, we show that overparameterization either improves or does not harm the asymptotic minority group performance under ERM when the features are distributed uniformly over the sphere.
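
A minimal simulation can make the abstract's setting concrete. The sketch below (illustrative only, not the authors' code; all constants are placeholder assumptions) draws majority and minority features from the same uniform-on-the-sphere distribution, gives the two groups different linear regression functions, fits a min-norm random-feature regression by ERM on the pooled data, and reports minority-group test risk as the width grows:

    # Illustrative simulation of the paper's setting; constants are assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    n_maj, n_min, d = 1000, 100, 20

    def sphere(n, d):
        x = rng.standard_normal((n, d))
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    # Identical feature distributions, different regression functions.
    X_maj, X_min = sphere(n_maj, d), sphere(n_min, d)
    w_maj, w_min = rng.standard_normal(d), rng.standard_normal(d)
    X = np.vstack([X_maj, X_min])
    y = np.concatenate([X_maj @ w_maj, X_min @ w_min])
    y = y + 0.1 * rng.standard_normal(n_maj + n_min)

    X_test = sphere(2000, d)             # minority-group test set
    y_test = X_test @ w_min

    for width in (50, 200, 1000, 5000):  # overparameterized: width >> samples
        W = rng.standard_normal((d, width)) / np.sqrt(d)
        phi = lambda Z: np.maximum(Z @ W, 0.0)             # random ReLU features
        theta = np.linalg.lstsq(phi(X), y, rcond=None)[0]  # min-norm ERM fit
        risk = np.mean((phi(X_test) @ theta - y_test) ** 2)
        print(f"width={width:5d}  minority risk={risk:.3f}")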

URL: https://openreview.net/forum?id=POunezXgvF

---


New submissions
===============


Title: Adversarial Fine-tuning of Compressed Neural Networks for Joint Improvement of Robustness and Efficiency

Abstract: As deep learning (DL) models are increasingly integrated into our everyday lives, ensuring their safety by making them robust against adversarial attacks has become increasingly critical. DL models have been found to be susceptible to adversarial attacks that introduce small, targeted perturbations to disrupt the input data. Adversarial training has been presented as a mitigation strategy that can result in more robust models. This adversarial robustness comes with additional computational costs, since adversarial attacks must be generated during training. The two objectives -- adversarial robustness and computational efficiency -- thus appear to be in conflict with each other. In this work, we explore the effects of neural network compression on adversarial robustness. We specifically explore the effects of fine-tuning on compressed models, and present the trade-off between standard fine-tuning and adversarial fine-tuning. Our results show that adversarial fine-tuning of compressed models can yield large improvements in their robustness performance. We present experiments on several benchmark datasets showing that adversarial fine-tuning of compressed models can achieve robustness performance comparable to that of adversarially trained models, while also improving computational efficiency.
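
The recipe is: compress first, then fine-tune on adversarial examples. Below is a hedged PyTorch sketch of one such pipeline; the 80% pruning ratio, the PGD hyperparameters, and the model/loader objects are placeholder assumptions rather than details from the paper:

    # Sketch of "compress, then adversarially fine-tune"; not the authors' code.
    import torch
    import torch.nn.functional as F
    import torch.nn.utils.prune as prune

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(model(x + delta), y)
            loss.backward()
            delta.data = (delta + alpha * delta.grad.sign()).clamp(-eps, eps)
            delta.grad.zero_()
        # Assumes inputs normalized to [0, 1].
        return (x + delta).clamp(0, 1).detach()

    def adversarial_finetune(model, loader, epochs=5, lr=1e-4):
        # 1) Compress: magnitude-prune 80% of weights in every conv layer.
        for m in model.modules():
            if isinstance(m, torch.nn.Conv2d):
                prune.l1_unstructured(m, name="weight", amount=0.8)
        # 2) Fine-tune on adversarial examples instead of clean ones.
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(epochs):
            for x, y in loader:
                x_adv = pgd_attack(model, x, y)
                opt.zero_grad()
                F.cross_entropy(model(x_adv), y).backward()
                opt.step()
        return model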

URL: https://openreview.net/forum?id=a1EIdh3RSD

---

Title: Improving Adversarial Training for Two-player Competitive Games via Episodic Reward Engineering

Abstract: Training adversarial agents to attack neural network policies has proven to be both effective and practical. However, we observe that existing methods can be further enhanced by distinguishing between states that lead to wins and states that lead to losses, and by engineering rewards so that policy training prioritizes winning states. In this paper, we introduce a novel adversarial training method with reward engineering for two-player competitive games. Our method extracts historical evaluations of states from past experiences with an episodic memory, and then incorporates these evaluations into the rewards with our proposed reward revision method to improve adversarial policy optimization. We evaluate our approach on two-player competitive games in MuJoCo simulation environments, demonstrating that among existing adversarial policy training techniques, our method achieves the strongest attack performance against the victims and is the most difficult to defend against.
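
One plausible reading of the episodic-memory mechanism (an illustration based only on this abstract, not the authors' method) is to remember how episodes containing a state tended to end, and to shift rewards toward states historically associated with winning:

    # Speculative sketch of episodic reward revision; not the paper's algorithm.
    from collections import defaultdict

    class EpisodicMemory:
        def __init__(self):
            self.value = defaultdict(float)  # state key -> running win score
            self.count = defaultdict(int)

        def update(self, episode_states, won):
            outcome = 1.0 if won else -1.0
            for s in episode_states:
                self.count[s] += 1
                self.value[s] += (outcome - self.value[s]) / self.count[s]

        def revise(self, state, reward, lam=0.1):
            # Bonus for states that historically precede wins.
            return reward + lam * self.value[state]

    # During training: after each episode, call
    #   memory.update(states_visited, won=episode_won)
    # and optimize the adversarial policy on memory.revise(s, r).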

URL: https://openreview.net/forum?id=z4XtJWJC9K

---

Title: Architecture-Aware Generalization Bounds for Temporal Networks: Theory and Fair Comparison Methodology

Abstract: Deep temporal architectures such as Temporal Convolutional Networks (TCNs) achieve strong predictive performance on sequential data, yet theoretical understanding of their generalization remains limited. We address this gap by providing both the first non-vacuous, architecture-aware generalization bounds for deep temporal models and a principled evaluation methodology.

For exponentially $\beta$-mixing sequences, we derive bounds scaling as $\mathcal{O}\!\bigl(R\sqrt{D\,p\,n\,\log N/N}\bigr)$, where $D$ is the network depth, $p$ the kernel size, $n$ the input dimension, and $R$ the weight norm. Our delayed-feedback blocking mechanism transforms dependent samples into effectively independent ones while discarding only $O(1/\log N)$ of the data, yielding $\sqrt{D}$ scaling instead of exponential, implying that doubling depth requires approximately quadrupling the training data.

We also introduce a fair-comparison methodology that fixes the effective sample size to isolate the effect of temporal structure from information content. Under $N_{\text{eff}}=2{,}000$, strongly dependent sequences ($\rho=0.8$) exhibit $\approx76\%$ smaller generalization gaps than weakly dependent ones ($\rho=0.2$), challenging the intuition that dependence is purely detrimental. Yet convergence rates diverge from theory: weak dependencies follow $N_{\text{eff}}^{-1.21}$ scaling and strong dependencies follow $N_{\text{eff}}^{-0.89}$, both steeper than the predicted $N^{-0.5}$. These findings reveal that temporal dependence can enhance learning under fixed information budgets, while highlighting gaps between theory and practice that motivate future research.
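
A quick numerical evaluation of the stated bound makes its scaling visible; the constants R, p, and n below are arbitrary placeholders, not values from the paper:

    # Evaluate the abstract's bound R*sqrt(D*p*n*log(N)/N) on a small grid.
    import math

    def bound(D, N, R=1.0, p=3, n=16):
        return R * math.sqrt(D * p * n * math.log(N) / N)

    for D in (2, 4, 8):                      # sqrt(D) growth in depth
        for N in (10_000, 40_000, 160_000):  # ~1/sqrt(N) decay in sample size
            print(f"D={D}  N={N:6d}  bound={bound(D, N):.3f}")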

URL: https://openreview.net/forum?id=7GZ0TcV691

---

Title: Multimodal Cultural Safety: Evaluation Framework and Alignment Strategies

Abstract: Large vision-language models (LVLMs) are increasingly deployed in globally distributed applications, such as tourism assistants, yet their ability to produce culturally appropriate responses remains underexplored. Existing multimodal safety benchmarks primarily focus on physical safety and overlook violations rooted in cultural norms, which can result in symbolic harm. For example, suggesting clocks as gifts for a baby’s birthday in China may invoke associations with death, leading to user discomfort and undermining trust. To address this gap, we introduce CROSS, a benchmark designed to assess the cultural safety reasoning capabilities of LVLMs. CROSS includes 1,284 multilingual, visually grounded queries from 16 countries, three everyday domains (i.e., shopping, meal planning, and outdoor activities), and 14 languages, where cultural norm violations emerge only when images are interpreted in context. We propose CROSS-Eval, an intercultural theory-based framework that measures four key dimensions: cultural awareness, norm education, compliance, and helpfulness. Using this framework, we evaluate 21 leading LVLMs, including mixture-of-experts models (e.g., Llama-4-Maverick) and reasoning models (e.g., o1 and Gemini-2.5-Pro). Results reveal significant cultural safety gaps: the best-performing model achieves only 61.79% in awareness and 37.73% in compliance. While some open-source models achieve performance better than or comparable to that of GPT-4o, they still fall notably short of proprietary models. Our results further show that increasing reasoning capacity improves cultural alignment but does not fully resolve the issue. To improve model performance, we develop two enhancement strategies: supervised fine-tuning with culturally grounded, open-ended data and preference tuning with contrastive response pairs that highlight safe versus unsafe behaviors. These methods substantially improve GPT-4o’s cultural awareness (+60.14%) and compliance (+55.2%), while incurring minimal performance reduction on general multimodal understanding benchmarks. This work establishes a framework for evaluating and improving cultural safety in vision-language systems across diverse global contexts.

URL: https://openreview.net/forum?id=mkFBmxgnRh

---

Title: Oscillations Make Neural Networks Robust to Quantization

Abstract: We challenge the prevailing view that weight oscillations observed during Quantization Aware Training (QAT) are merely undesirable side effects and argue instead that they are an essential part of QAT. We show in a linear model with a single weight that the straight-through estimator (STE) results in an additional loss term that causes oscillations by pushing weights away from their nearest quantization level. Based on the mechanism identified in this analysis, we then derive a regularizer that induces oscillations in the weights of neural networks during training. Our empirical results on ResNet-18 and Tiny ViT on the CIFAR-10 and Tiny-ImageNet datasets demonstrate, across a range of quantization levels, that training with oscillations followed by post-training quantization (PTQ) is sufficient to recover the performance of QAT in most cases. With this work, we shed further light on the dynamics of QAT and contribute a novel insight into the role of oscillations in QAT, which until now have been considered to have a primarily negative effect on quantization.
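
The single-weight mechanism is easy to reproduce. In the assumed toy setup below (not the authors' code), the forward pass uses the quantized weight, the straight-through estimator copies the gradient to the latent weight, and the latent weight ends up oscillating across a quantization boundary instead of converging:

    # Minimal single-weight STE oscillation demo; hyperparameters are assumed.
    import numpy as np

    def quantize(w, step=0.5):
        return step * np.round(w / step)

    w, target, lr = 0.1, 0.3, 0.05
    for t in range(20):
        wq = quantize(w)            # forward pass uses the quantized weight
        grad = 2 * (wq - target)    # dL/dwq for L = (wq - target)^2
        w -= lr * grad              # STE: gradient passes straight through
        print(t, round(w, 3), float(quantize(w)))
    # The latent weight hovers around the 0.25 boundary between levels
    # 0.0 and 0.5, oscillating rather than settling.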

URL: https://openreview.net/forum?id=bPwcJ0nkDC

---

Title: Conditional Kernel Quantile Embeddings: A Nonparametric Framework for Conditional Two-Sample Testing

Abstract: Comparing conditional probability distributions, P(Y∣X) and Q(Y∣X), is a fundamental problem in machine learning, crucial for tasks like causal inference, detecting dataset shift, and model validation. The predominant approach, based on Conditional Kernel Mean Embeddings (KCMEs), suffers from significant drawbacks: it relies on strong and often unverifiable assumptions on the kernel in order to be a metric, incurs high computational costs, and may exhibit reduced sensitivity to higher-order distributional differences. We introduce Conditional Kernel Quantile Embeddings (CKQEs), a novel and robust framework for representing conditional distributions in a Reproducing Kernel Hilbert Space (RKHS). Throughout, we assume P_X = Q_X for conditional comparisons, and we require only that the output-space kernel be quantile-characteristic. From CKQEs, we construct the Conditional Kernel Quantile Discrepancy (CKQD), a new family of probability metrics. We prove that CKQD: (1) is a metric under substantially weaker and more practical kernel conditions than KCME-based distances, namely requiring only a quantile-characteristic kernel; (2) possesses a clear geometric interpretation, recovering a conditional version of the Sliced Wasserstein distance in a special case; and (3) admits a computationally efficient, statistically consistent nonparametric estimator with proven finite-sample convergence rates. By addressing the core weaknesses of the KCME framework, CKQEs provide a more versatile and theoretically sound foundation for conditional two-sample testing.
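
As a toy illustration of the special case mentioned above, the sketch below computes a conditional Sliced-Wasserstein-style discrepancy by comparing quantile functions of random 1-D projections of Y within bins of X. It is an intuition aid, not the paper's CKQD estimator, and for simplicity it assumes both samples share the same conditioning values (consistent with the P_X = Q_X assumption):

    # Toy conditional sliced-W1 discrepancy; not the paper's CKQD estimator.
    import numpy as np

    def cond_sliced_w1(X, Y1, Y2, n_proj=50, n_bins=5, n_q=20, seed=0):
        # X: (n,) conditioning values shared by both samples;
        # Y1, Y2: (n, d) responses drawn from P(Y|X) and Q(Y|X).
        rng = np.random.default_rng(seed)
        qs = np.linspace(0.05, 0.95, n_q)
        edges = np.quantile(X, np.linspace(0, 1, n_bins + 1))
        total = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            m = (X >= lo) & (X <= hi)          # condition by binning X
            for _ in range(n_proj):
                theta = rng.standard_normal(Y1.shape[1])
                theta /= np.linalg.norm(theta)  # random 1-D projection
                q1 = np.quantile(Y1[m] @ theta, qs)
                q2 = np.quantile(Y2[m] @ theta, qs)
                total += np.abs(q1 - q2).mean() # compare quantile functions
        return total / (n_bins * n_proj)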

URL: https://openreview.net/forum?id=AuqmLks0T7

---

Title: LZ Penalty: An information-theoretic repetition penalty for autoregressive language models

Abstract: We introduce the Lempel-Ziv (LZ) penalty, a penalty specialized for reducing degenerate repetitions in autoregressive language models without loss of capability. The penalty is based on the codelengths in the LZ77 universal lossless compression algorithm. Through the lens of the prediction-compression duality, decoding with the LZ penalty can be interpreted as sampling from the residual distribution after removing the information that is highly compressible. We demonstrate that the LZ penalty enables open-source reasoning models to operate with greedy decoding without loss of capability and without instances of degenerate repetition. By contrast, both the industry-standard frequency penalty and repetition penalty are ineffective, incurring degenerate repetition rates of 4% or more.
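
A rough sketch of the idea, under the (assumed) reading that a candidate token is penalized in proportion to how far it would extend an LZ77-style match against the already generated context; the paper's actual codelength computation will differ:

    # Toy LZ-style repetition penalty; an illustration, not the paper's method.
    import numpy as np

    def lz_match_len(context, token):
        # Length of the longest suffix of context+[token] that also appears
        # earlier in the context -- a proxy for LZ77 compressibility.
        seq = context + [token]
        hay = seq[:-1]
        for L in range(len(seq) - 1, 0, -1):
            suffix = seq[-L:]
            if any(hay[i:i + L] == suffix for i in range(len(hay) - L + 1)):
                return L
        return 0

    def apply_lz_penalty(logits, context, alpha=0.5):
        # logits: (vocab,) array; context: list of generated token ids.
        penalized = logits.copy()
        for tok in range(len(logits)):
            penalized[tok] -= alpha * lz_match_len(context, tok)
        return penalized

    logits = np.zeros(5)
    print(apply_lz_penalty(logits, context=[1, 2, 3, 1, 2]))
    # Token 3 would extend the repeat "1 2 3", so it is penalized the most.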

URL: https://openreview.net/forum?id=vNzPB4YCHj

---

Title: CANDOR: Counterfactual ANnotated DOubly Robust Off-Policy Evaluation

Abstract: When applying contextual bandit algorithms in high-stakes settings (e.g., medical treatment), practitioners rely on off-policy evaluation (OPE) methods that use historical data to evaluate the behavior of novel policies prior to deployment. Unfortunately, OPE techniques are inherently limited by the breadth of the available data, which may not reflect distribution shifts resulting from the application of a new policy. Recent work attempts to address this challenge by leveraging domain experts to increase dataset coverage by annotating counterfactual samples. However, such annotations are not guaranteed to be free of errors, and incorporating imperfect annotations can lead to worse policy value estimates than not using the annotations at all. To make use of imperfect annotations, we propose a family of OPE estimators based on the doubly robust (DR) principle, which combines importance sampling (IS) with a reward model (direct method, DM) for better statistical guarantees. We introduce three opportunities within the DR estimation framework to incorporate counterfactual annotations. Under mild assumptions, we prove that using annotations within just the DM component yields the most desirable results, providing an unbiased estimator even under noisy annotations. We validate our approaches in several settings, including a real-world medical domain, observing that the theoretical advantages of using annotations within just the DM component hold in practice under realistic conditions. By addressing the challenges posed by imperfect annotations, this work broadens the applicability of OPE methods and facilitates safer and more effective deployment of decision-making systems.
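
Schematically, the "annotations only in the DM component" variant the abstract argues for might look like the sketch below; the variable names, shapes, and the ridge reward model are assumptions for illustration, not the paper's implementation:

    # Sketch of DR off-policy evaluation with annotations in the DM term only.
    import numpy as np
    from sklearn.linear_model import Ridge

    def dr_with_annotations(X, a, r, pi_b, pi_e, X_ann, a_ann, r_ann, n_actions):
        # X: (n,d) contexts; a: (n,) logged actions; r: (n,) logged rewards;
        # pi_b, pi_e: (n, n_actions) behavior/evaluation action probabilities;
        # X_ann, a_ann, r_ann: counterfactual annotations (possibly noisy).
        feats = lambda X_, a_: np.hstack([X_, np.eye(n_actions)[a_]])
        # Reward model fit on logged AND annotated samples (the DM component).
        model = Ridge().fit(np.vstack([feats(X, a), feats(X_ann, a_ann)]),
                            np.concatenate([r, r_ann]))
        q = np.stack([model.predict(feats(X, np.full(len(X), k)))
                      for k in range(n_actions)], axis=1)   # (n, n_actions)
        dm = (pi_e * q).sum(axis=1)
        # IS correction uses only the logged data, as in standard DR.
        idx = np.arange(len(a))
        w = pi_e[idx, a] / pi_b[idx, a]
        return np.mean(dm + w * (r - q[idx, a]))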

URL: https://openreview.net/forum?id=PDRFappNFQ

---

Title: TempFlex: Advancing MLLMs with Temporal Perception and Natively Scalable Resolution Encoding

Abstract: Multimodal large language models (MLLMs) have made significant progress across vision-language tasks, yet many designs still suffer from two core limitations. (i) Excessive visual tokens and broken global context: Tiled Patch Encoding fragments high-resolution images, leading to token overload and disrupting global attention modeling. (ii) Lack of temporal reasoning: Most models process video as independent frames using static image encoders, failing to capture temporal dynamics. We present TempFlex-VL, a token-efficient and temporally aware MLLM that addresses both issues through lightweight architectural enhancements. First, we introduce a resolution-agnostic visual encoder that directly processes full images without tiling, preserving global context while substantially reducing visual tokens. Second, we propose Temporal Fiber Fusion (TFF), a plug-and-play module with three complementary pathways: (1) a dynamic local-convolution branch for fine-grained motion, (2) a gated memory accumulator for long-term dependencies, and (3) a periodic encoder for modeling cyclic patterns. These signals are softly fused, enabling the model to adapt to diverse temporal structures without overfitting. To support large-scale video-language pretraining, we curate TempFlex-2M, a high-quality synthetic video–text corpus generated in a single stage via GPT-4o with direct visual prompting. We instantiate TempFlex-VL using two different language backbones, Gemma3-4B and Qwen3-4B, demonstrating the generality of our design across architectures. Both variants achieve state-of-the-art or competitive results on a wide range of image and video benchmarks while markedly improving token efficiency. We will release all code, models, and data to spur future research in unified multimodal understanding.
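
Based only on the abstract's description, a speculative sketch of a TFF-style block with the three pathways and soft fusion might look as follows (the actual module will differ in its details):

    # Speculative TFF-style block inferred from the abstract; not the real code.
    import torch
    import torch.nn as nn

    class TemporalFiberFusion(nn.Module):
        def __init__(self, dim):
            super().__init__()
            self.local = nn.Conv1d(dim, dim, 3, padding=1, groups=dim)  # motion
            self.memory = nn.GRU(dim, dim, batch_first=True)  # gated accumulator
            self.periodic = nn.Linear(2 * dim, dim)           # cyclic patterns
            self.gate = nn.Linear(3 * dim, 3)                 # soft fusion

        def forward(self, x):                  # x: (batch, frames, dim)
            local = self.local(x.transpose(1, 2)).transpose(1, 2)
            memory, _ = self.memory(x)
            periodic = self.periodic(torch.cat([torch.sin(x), torch.cos(x)], -1))
            paths = torch.stack([local, memory, periodic], dim=-1)
            w = torch.softmax(
                self.gate(torch.cat([local, memory, periodic], -1)), -1)
            return x + (paths * w.unsqueeze(-2)).sum(-1)      # residual fusion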

URL: https://openreview.net/forum?id=ietYdtRB3h

---

Title: MS-IMAP - A Multi-Scale Graph Embedding Approach for Interpretable Manifold Learning

Abstract: Deriving meaningful representations from complex, high-dimensional data in unsupervised settings is crucial across diverse machine learning applications. This paper introduces a framework for multi-scale graph network embedding based on spectral graph wavelets that employs a contrastive learning approach. We theoretically show that, in Paley-Wiener spaces on combinatorial graphs, the spectral graph wavelet operator provides greater flexibility and control over smoothness than the Laplacian operator, motivating our approach. A key advantage of the proposed embedding is its ability to establish a correspondence between the embedding space and the input feature space, enabling the derivation of feature importance. We validate the effectiveness of our graph embedding framework on multiple public datasets across various downstream tasks, including clustering and unsupervised feature importance.
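
For intuition, a minimal version of the spectral graph wavelet construction (a standard building block; the paper's multi-scale contrastive pipeline adds much more, and the kernel below is an assumed choice) is sketched here:

    # Minimal spectral graph wavelet operators; kernel choice is an assumption.
    import numpy as np

    def spectral_graph_wavelets(A, scales=(1.0, 5.0, 10.0)):
        # A: symmetric adjacency matrix (n, n).
        d = A.sum(axis=1)
        L = np.diag(d) - A                     # combinatorial Laplacian
        lam, U = np.linalg.eigh(L)
        # Band-pass kernel g(s*lam) = s*lam * exp(-s*lam), one operator per scale.
        return [U @ np.diag(s * lam * np.exp(-s * lam)) @ U.T for s in scales]

    # Toy 4-node graph; stacking the operators' rows gives each node a
    # multi-scale embedding.
    A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]], float)
    emb = np.hstack(spectral_graph_wavelets(A))   # shape (4, 12)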

URL: https://openreview.net/forum?id=pc6BgWrCjp

---
