Daily TMLR digest for Nov 10, 2025

TMLR

Nov 10, 2025, 12:30:09 AM
to tmlr-anno...@googlegroups.com

Accepted papers
===============


Title: Phase-driven Generalizable Representation Learning for Nonstationary Time Series Classification

Authors: Payal Mohapatra, Lixu Wang, Qi Zhu

Abstract: Pattern recognition is a fundamental task in continuous sensing applications, but real-world scenarios often experience distribution shifts that necessitate learning generalizable representations for such tasks. This challenge is exacerbated with time-series data, which also exhibit inherent nonstationarity—variations in statistical and spectral properties over time. In this work, we offer a fresh perspective on learning generalizable representations for time-series classification by considering the phase information of a signal as an approximate proxy for nonstationarity and propose a phase-driven generalizable representation learning framework for time-series classification, PhASER. It consists of three key elements: 1) Hilbert transform-based augmentation, which diversifies nonstationarity while preserving task-specific discriminatory semantics, 2) separate magnitude-phase encoding, viewing time-varying magnitude and phase as independent modalities, and 3) phase-residual feature broadcasting, integrating 2D phase features with a residual connection to the 1D signal representation, providing inherent regularization to improve distribution-invariant learning. Extensive evaluations on five datasets from sleep-stage classification, human activity recognition, and gesture recognition against 13 state-of-the-art baseline methods demonstrate that PhASER consistently outperforms the best baselines by an average of 5% and up to 11% in some cases. Additionally, the principles of PhASER can be broadly applied to enhance the generalizability of existing time-series representation learning models.
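
For readers unfamiliar with the Hilbert-transform view the abstract relies on: the analytic signal of a 1D sequence yields instantaneous magnitude and phase, which can then be treated as separate modalities. The sketch below illustrates that decomposition only; it is not the authors' code, and the function name is ours.

    import numpy as np
    from scipy.signal import hilbert

    def magnitude_phase_channels(x):
        """Split a 1D signal into magnitude and phase channels via the
        analytic signal (Hilbert transform). Illustrative, not PhASER."""
        analytic = hilbert(x)                  # x + j * H(x)
        magnitude = np.abs(analytic)           # instantaneous amplitude
        phase = np.unwrap(np.angle(analytic))  # instantaneous phase
        return np.stack([magnitude, phase])    # (2, T): two "modalities"

    # A chirp is nonstationary; its phase captures the frequency drift.
    t = np.linspace(0, 1, 1000)
    x = np.sin(2 * np.pi * (5 + 20 * t) * t)
    print(magnitude_phase_channels(x).shape)   # (2, 1000)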

URL: https://openreview.net/forum?id=cb3nwoqLdd

---

Title: Dataset Condensation with Color Compensation

Authors: Huyu Wu, Duo Su, Junjie Hou, Guang Li

Abstract: Dataset condensation faces a fundamental trade-off: balancing performance and fidelity under extreme compression. Existing methods struggle with two bottlenecks: image-level selection methods (Coreset Selection, Dataset Quantization) suffer from inefficient condensation, while pixel-level optimization (Dataset Distillation) introduces semantic distortion due to over-parameterization. Through empirical observation, we find that a critical problem in dataset condensation is the oversight of color's dual role as an information carrier and a basic unit of semantic representation. We argue that improving the colorfulness of condensed images is beneficial for representation learning. Motivated by this, we propose DC3: a Dataset Condensation framework with Color Compensation. After a calibrated selection strategy, DC3 utilizes a latent diffusion model to enhance the color diversity of an image rather than creating a brand-new one. Extensive experiments demonstrate the superior performance and generalization of DC3, which outperforms SOTA methods across multiple benchmarks. To the best of our knowledge, DC3 is the first work to fine-tune pre-trained diffusion models with condensed datasets, beyond evaluating them on downstream tasks. The Fréchet Inception Distance (FID) and Inception Score (IS) results prove that training networks with our high-quality datasets is feasible without model collapse or other degradation issues.
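
The abstract's claim hinges on quantifying colorfulness. One standard measure (our assumption; the paper may use a different one) is the Hasler-Süsstrunk colorfulness statistic:

    import numpy as np

    def colorfulness(img):
        """Hasler-Suesstrunk colorfulness for an HxWx3 RGB array in [0, 255];
        one plausible way to score the color diversity DC3 aims to improve."""
        r, g, b = (img[..., i].astype(float) for i in range(3))
        rg = r - g
        yb = 0.5 * (r + g) - b
        std_root = np.hypot(rg.std(), yb.std())
        mean_root = np.hypot(rg.mean(), yb.mean())
        return std_root + 0.3 * mean_root

    img = np.random.randint(0, 256, (32, 32, 3))
    print(f"colorfulness: {colorfulness(img):.1f}")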

URL: https://openreview.net/forum?id=hIdwvIOiJt

---

Title: Adaptive Group Robust Ensemble Knowledge Distillation

Authors: Patrik Kenfack, Ulrich Aïvodji, Samira Ebrahimi Kahou

Abstract: Neural networks can learn spurious correlations in the data, often leading to performance degradation for underrepresented subgroups. Studies have demonstrated that the disparity is amplified when knowledge is distilled from a complex teacher model to a relatively "simple" student model. Prior work has shown that ensemble deep learning methods can improve the performance of the worst-case subgroups; however, it is unclear if this advantage carries over when distilling knowledge from an ensemble of teachers, especially when the teacher models are debiased. This study demonstrates that traditional ensemble knowledge distillation can significantly drop the performance of the worst-case subgroups in the distilled student model even when the teacher models are debiased. To overcome this, we propose Adaptive Group Robust Ensemble Knowledge Distillation (AGRE-KD), a simple ensembling strategy to ensure that the student model receives knowledge beneficial for unknown underrepresented subgroups. Leveraging an additional biased model, our method selectively chooses teachers whose knowledge would better improve the worst-performing subgroups by upweighting the teachers with gradient directions deviating from the biased model. Our experiments on several datasets demonstrate the superiority of the proposed ensemble distillation technique and show that it can even outperform classic model ensembles based on majority voting. Our source code is available at https://github.com/patrikken/AGRE-KD.
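
A hedged sketch of the weighting rule the abstract describes: each teacher's distillation gradient is compared with the biased model's, and teachers pointing away from the bias direction are upweighted. The softmax weighting and all names are our illustrative choices; see the linked repository for the actual method.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def teacher_weights(student, x, teachers, biased_model):
        """Upweight teachers whose KD gradient deviates from the biased
        model's KD gradient (illustrative sketch, not the repo code)."""
        def kd_grad(target):
            loss = F.kl_div(F.log_softmax(student(x), dim=-1),
                            F.softmax(target(x).detach(), dim=-1),
                            reduction="batchmean")
            grads = torch.autograd.grad(loss, list(student.parameters()))
            return torch.cat([g.flatten() for g in grads])

        g_bias = kd_grad(biased_model)
        scores = torch.stack([-F.cosine_similarity(kd_grad(t), g_bias, dim=0)
                              for t in teachers])
        return torch.softmax(scores, dim=0)    # higher weight = more deviation

    student, biased = nn.Linear(8, 3), nn.Linear(8, 3)
    teachers = [nn.Linear(8, 3) for _ in range(4)]
    print(teacher_weights(student, torch.randn(16, 8), teachers, biased))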

URL: https://openreview.net/forum?id=G2BEBaKd8Y

---

Title: Two-Step Offline Preference-Based Reinforcement Learning on Explicitly Constrained Policies

Authors: Yinglun Xu, Tarun Suresh, Rohan Gumaste, David Zhu, Ruirui Li, Zhengyang Wang, Haoming Jiang, Xianfeng Tang, Qingyu Yin, Monica Xiao Cheng, Qi Zeng, Chao Zhang, Gagandeep Singh

Abstract: Preference-based reinforcement learning (PBRL) in the offline setting has achieved great success in industrial applications such as chatbots. A two-step learning framework, which first learns a reward model from an offline dataset and then optimizes a policy over the learned reward model through online reinforcement learning, has been widely adopted. However, such a method faces challenges from the risk of reward hacking and the complexity of reinforcement learning. Our insight is that both challenges stem from state-action pairs not supported in the dataset: such state-action pairs are unreliable, and they increase the complexity of the reinforcement learning problem. Based on this insight, we develop a novel two-step learning method called PRC: preference-based reinforcement learning on explicitly constrained policies. The high-level idea is to limit the reinforcement learning agent to optimizing over policies supported on an explicitly constrained action space that excludes out-of-distribution state-actions. We empirically verify that our method has high learning efficiency on various datasets in robotic control environments.
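
To make the "explicitly constrained action space" concrete, here is a toy sketch under our own assumptions: a behavior model fitted on the dataset supplies pi_beta(a|s), and actions below a support threshold are excluded before the RL agent optimizes.

    import numpy as np

    def constrained_argmax(q_values, behavior_probs, support_eps=0.05):
        """Greedy action choice restricted to in-support actions: anything
        the dataset-fitted behavior policy deems unlikely is masked out.
        Illustrative sketch, not the paper's exact construction."""
        mask = behavior_probs >= support_eps
        if not mask.any():                     # fall back if nothing survives
            mask = behavior_probs == behavior_probs.max()
        return int(np.argmax(np.where(mask, q_values, -np.inf)))

    q = np.array([1.0, 3.0, 2.0])
    pi_beta = np.array([0.60, 0.01, 0.39])     # action 1 is out-of-support
    print(constrained_argmax(q, pi_beta))      # -> 2, not the reward-hacked 1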

URL: https://openreview.net/forum?id=LxPg5GJuY3

---

Title: An Evolutionary Algorithm for Black-Box Adversarial Attack Against Explainable Methods

Authors: Phoenix Neale Williams, Jessica Schrouff, Lea Goetz

Abstract: The explainability of deep neural networks (DNNs) remains a major challenge in developing trustworthy AI, particularly in high-stakes domains such as medical imaging. Although explainable AI (XAI) techniques have advanced, they remain vulnerable to adversarial perturbations, underscoring the need for more robust evaluation frameworks. Existing adversarial attacks often focus on specific explanation strategies, while recent research has introduced black-box attacks capable of targeting multiple XAI methods. However, these approaches typically craft pixel-level perturbations that require a large number of queries and struggle to effectively attack less granular XAI methods such as Grad-CAM and LIME. To overcome these limitations, we propose a novel attack that generates perturbations using semi-transparent, RGB-valued circles optimized via an evolutionary strategy. This design reduces the number of tunable parameters, improves attack efficiency, and is adaptable to XAI methods with varying levels of granularity. Extensive experiments on medical and natural image datasets demonstrate that our method outperforms state-of-the-art techniques, exposing critical vulnerabilities in current XAI systems and highlighting the need for more robust interpretability frameworks.
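
A minimal sketch of the perturbation parameterization the abstract describes: semi-transparent RGB circles whose parameters are searched by a simple evolution strategy. The (1+1)-ES, circle count, and placeholder fitness below are our assumptions, not the paper's exact setup.

    import numpy as np

    def render_circles(base, genome, radius=4):
        """Overlay semi-transparent circles on an HxWx3 image in [0, 1].
        Each genome row is (cx, cy, r, g, b, alpha), all in [0, 1]."""
        img, (h, w, _) = base.copy(), base.shape
        yy, xx = np.mgrid[0:h, 0:w]
        for cx, cy, r, g, b, a in genome:
            mask = (xx - cx * w) ** 2 + (yy - cy * h) ** 2 <= radius ** 2
            img[mask] = (1 - a) * img[mask] + a * np.array([r, g, b])
        return img

    def evolve(base, fitness, n_circles=10, iters=200, sigma=0.1):
        """(1+1) evolution strategy over the circle genome; `fitness` is any
        black-box attack objective, e.g. against an XAI explanation."""
        genome = np.random.rand(n_circles, 6)
        best = fitness(render_circles(base, genome))
        for _ in range(iters):
            child = np.clip(genome + sigma * np.random.randn(*genome.shape), 0, 1)
            score = fitness(render_circles(base, child))
            if score > best:
                genome, best = child, score
        return genome, best

    base = np.random.rand(32, 32, 3)
    genome, best = evolve(base, lambda im: -abs(im.mean() - 0.8), iters=50)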

URL: https://openreview.net/forum?id=MlUP5Euj6S

---

Title: Causal Dynamic Variational Autoencoder for Counterfactual Regression in Longitudinal Data

Authors: Mouad El Bouchattaoui, Myriam Tami, Benoit Lepetit, Paul-Henry Cournède

Abstract: Accurately estimating treatment effects over time is crucial in fields such as precision medicine, epidemiology, economics, and marketing. Many current methods for estimating treatment effects over time assume that all confounders are observed or attempt to infer unobserved ones. In contrast, our approach focuses on unobserved adjustment variables—variables that specifically have a causal effect on the outcome sequence. Under the assumption of unconfoundedness, we address the estimation of Conditional Average Treatment Effects (CATEs) while accounting for unobserved heterogeneity in response to treatment due to these unobserved adjustment variables. Our proposed Causal Dynamic Variational Autoencoder (CDVAE) is grounded in theoretical guarantees concerning the validity of latent adjustment variables and generalization bounds on the CATE estimation error. Extensive evaluations on synthetic and real-world datasets show that CDVAE outperforms existing baselines. Moreover, we demonstrate that state-of-the-art models significantly improve their CATE estimates when augmented with the latent substitutes learned by CDVAE—approaching oracle-level performance without direct access to the true adjustment variables.

URL: https://openreview.net/forum?id=atf9q49DeF

---


New submissions
===============


Title: Scalable physical source-to-field inference with hypernetworks

Abstract: We present a generative model that amortises computation for the field and potential around, e.g., gravitational or electromagnetic sources. Exact numerical calculation has either computational complexity $\mathcal{O}(M\times{}N)$ in the number of sources $M$ and evaluation points $N$, or requires a fixed evaluation grid to exploit fast Fourier transforms. Using an architecture where a hypernetwork produces an implicit representation of the field or potential around a source collection, our model instead scales as $\mathcal{O}(M + N)$, achieves relative error of $\sim\!4\%-6\%$, and allows evaluation at arbitrary locations for arbitrary numbers of sources, greatly increasing the speed of, e.g., physics simulations. We compare with existing models and develop two-dimensional examples, including cases where sources overlap or have more complex geometries, to demonstrate its application.
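
A sketch of how the $\mathcal{O}(M + N)$ split can be realized (sizes and architecture are our assumptions, not the paper's): a permutation-invariant encoder summarizes the M sources once, a hypernetwork head turns that summary into the last-layer weights of an implicit field network, and the N query points are then evaluated independently.

    import torch
    import torch.nn as nn

    class SourceToField(nn.Module):
        """Hypernetwork sketch: O(M) source encoding, O(N) field queries."""
        def __init__(self, source_dim=3, hidden=64):
            super().__init__()
            self.encode = nn.Sequential(nn.Linear(source_dim, hidden), nn.ReLU(),
                                        nn.Linear(hidden, hidden))
            self.to_weights = nn.Linear(hidden, hidden + 1)  # last-layer w and b
            self.point_mlp = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(),
                                           nn.Linear(hidden, hidden), nn.ReLU())

        def forward(self, sources, points):
            # sources: (M, source_dim), e.g. (x, y, charge); points: (N, 2)
            summary = self.encode(sources).mean(dim=0)   # permutation-invariant, O(M)
            wb = self.to_weights(summary)
            w, b = wb[:-1], wb[-1]
            return self.point_mlp(points) @ w + b        # potential per point, O(N)

    model = SourceToField()
    print(model(torch.randn(5, 3), torch.rand(1000, 2)).shape)  # torch.Size([1000])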

URL: https://openreview.net/forum?id=EvfwGpo135

---

Title: HAPEns: Hardware-Aware Post-Hoc Ensembling for Tabular Data

Abstract: Ensembling is commonly used in machine learning on tabular data to boost predictive performance and robustness, but larger ensembles often lead to increased hardware demand. We introduce HAPEns, a post-hoc ensembling method that explicitly balances accuracy against hardware efficiency. Inspired by multi-objective and quality diversity optimization, HAPEns constructs a diverse set of ensembles along the Pareto front of predictive performance and resource usage. Experiments on 83 tabular classification datasets show that HAPEns significantly outperforms baselines, achieving superior accuracy–efficiency trade-offs. Ablation studies further reveal that memory usage is a particularly effective objective metric. Further, we show that even a greedy ensembling algorithm can be significantly improved in this task with a static multi-objective weighting scheme.
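
A toy sketch of the accuracy-vs-hardware trade-off with the static multi-objective weighting the abstract mentions (the weighting form and names are ours, not the HAPEns algorithm itself):

    import numpy as np

    def greedy_hw_aware_ensemble(val_probs, y_val, costs, k=3, alpha=0.7):
        """Greedy post-hoc ensembling scored by alpha * accuracy minus
        (1 - alpha) * normalized hardware cost (e.g. memory). Sketch only."""
        chosen, ens = [], None
        for _ in range(k):
            best, best_i = -np.inf, None
            for i, p in enumerate(val_probs):
                if i in chosen:
                    continue
                mix = p if ens is None else (ens * len(chosen) + p) / (len(chosen) + 1)
                score = alpha * (mix.argmax(1) == y_val).mean() - (1 - alpha) * costs[i]
                if score > best:
                    best, best_i = score, i
            chosen.append(best_i)
            p = val_probs[best_i]
            ens = p if ens is None else (ens * (len(chosen) - 1) + p) / len(chosen)
        return chosen

    rng = np.random.default_rng(0)
    y = rng.integers(0, 3, 200)
    probs = [rng.dirichlet(np.ones(3), 200) for _ in range(10)]
    print(greedy_hw_aware_ensemble(probs, y, costs=rng.random(10)))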

URL: https://openreview.net/forum?id=FbuhDKWyx9

---

Title: Qini Curve Estimation under Clustered Network Interference

Abstract: Qini curves are a widely used tool for assessing treatment policies under allocation constraints as they visualize the incremental gain of a new treatment policy versus the cost of its implementation. Standard Qini curve estimation assumes no interference between units: that is, that treating one unit does not influence the outcome of any other unit. In many real-life applications such as public policy or marketing, however, the presence of interference is common. Ignoring interference in these scenarios can lead to systematically biased Qini curves that over- or under-estimate a treatment policy's cost-effectiveness. In this paper, we address the problem of Qini curve estimation under clustered network interference, where interfering units form independent clusters. We propose a formal description of the problem setting with an experimental study design under which we can account for clustered network interference. Within this framework, we describe three estimation strategies, each suited to different conditions, and provide guidance for selecting the most appropriate approach by highlighting the inherent bias-variance trade-offs. To complement our theoretical analysis, we introduce a marketplace simulator that replicates clustered network interference in a typical e-commerce environment, allowing us to evaluate and compare the proposed strategies in practice.
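
For context, the standard no-interference Qini curve the paper starts from can be computed in a few lines; under clustered interference, this is exactly the estimate that becomes biased. Names below are ours.

    import numpy as np

    def qini_curve(uplift_scores, treated, outcome):
        """Classic Qini curve: rank units by predicted uplift, then accumulate
        treated gain minus control gain rescaled to treated volume."""
        order = np.argsort(-uplift_scores)
        t, y = treated[order], outcome[order]
        n_t, n_c = np.cumsum(t), np.cumsum(1 - t)
        y_t, y_c = np.cumsum(y * t), np.cumsum(y * (1 - t))
        return y_t - y_c * np.divide(n_t, np.maximum(n_c, 1))

    rng = np.random.default_rng(0)
    scores, t = rng.normal(size=1000), rng.integers(0, 2, 1000)
    y = rng.binomial(1, 0.3 + 0.1 * t * (scores > 0))
    print(qini_curve(scores, t, y)[-1])        # overall incremental gain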

URL: https://openreview.net/forum?id=iYsLwAuCY5

---

Title: A Simple and Effective Reinforcement Learning Method for Text-to-Image Diffusion Fine-tuning

Abstract: Reinforcement learning (RL)-based fine-tuning has emerged as a powerful approach for aligning diffusion models with black-box objectives. Proximal policy optimization (PPO) is the most popular choice of method for policy optimization. While effective in terms of performance, PPO is highly sensitive to hyper-parameters and involves substantial computational overhead. REINFORCE, on the other hand, mitigates some computational complexities such as high memory overhead and sensitive hyper-parameter tuning, but has suboptimal performance due to high variance and sample inefficiency. While the variance of REINFORCE can be reduced by sampling multiple actions per input prompt and using a baseline correction term, it still suffers from sample inefficiency. To address these challenges, we systematically analyze the efficiency-effectiveness trade-off between REINFORCE and PPO, and propose leave-one-out PPO (LOOP), a novel RL method for diffusion fine-tuning. LOOP combines variance reduction techniques from REINFORCE, such as sampling multiple actions per input prompt and a baseline correction term, with the robustness and sample efficiency of PPO via clipping and importance sampling. Our results demonstrate that LOOP effectively improves diffusion models on various black-box objectives, and achieves a better balance between computational efficiency and performance.
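
The abstract's ingredients compose naturally: K generations per prompt give a leave-one-out REINFORCE baseline, and PPO-style clipping handles off-policy updates. A hedged sketch (shapes and names are ours, not the paper's API):

    import torch

    def loop_loss(logps, old_logps, rewards, clip_eps=0.2):
        """LOOP-style objective: leave-one-out baseline over K samples per
        prompt plus PPO clipped importance sampling. All inputs are (B, K)."""
        K = rewards.shape[1]
        baseline = (rewards.sum(dim=1, keepdim=True) - rewards) / (K - 1)
        adv = rewards - baseline               # variance-reduced advantage
        ratio = torch.exp(logps - old_logps)   # importance weights
        clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps)
        return -torch.min(ratio * adv, clipped * adv).mean()

    logps = torch.randn(4, 8, requires_grad=True)
    loss = loop_loss(logps, logps.detach(), torch.rand(4, 8))
    loss.backward()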

URL: https://openreview.net/forum?id=i8WJhKn455

---

Title: On Almost Surely Safe Alignment of Large Language Models at Inference-Time

Abstract: We introduce a novel inference-time alignment approach for LLMs that aims to generate safe responses almost surely, i.e., with probability approaching one w.r.t. a given cost model. Our approach models the generation of safe responses as a constrained Markov Decision Process (MDP) within the LLM's latent space. We augment the MDP with a safety state that tracks the evolution of safety constraints, and we dynamically penalize unsafe generations to ensure safe responses. Consequently, we demonstrate formal safety guarantees w.r.t. the given cost model upon solving the MDP in the latent space with sufficiently large penalties. Building on this foundation, we propose $\texttt{InferenceGuard}$, a practical implementation that safely aligns LLMs without modifying the model weights. Empirically, we demonstrate that $\texttt{InferenceGuard}$ effectively balances safety and task performance, outperforming existing inference-time alignment methods in generating safe and aligned responses. Our findings contribute to the advancement of safer LLM deployment through alignment at inference-time, thus presenting a promising alternative to resource-intensive, overfitting-prone alignment techniques like RLHF.
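
To illustrate the augmented-state idea on a single decoding step (a toy sketch; the cost model, penalty form, and names are our assumptions, not InferenceGuard itself):

    import numpy as np

    def penalized_step_scores(logits, token_costs, safety_state, budget, penalty=10.0):
        """One decoding step with an explicit safety state: the cumulative
        cost so far is tracked, and tokens that would push it past the
        budget are penalized so unsafe continuations are suppressed."""
        would_exceed = safety_state + token_costs > budget
        return logits - penalty * would_exceed

    logits = np.array([2.0, 1.5, 0.5])
    costs = np.array([0.9, 0.1, 0.0])          # per-token cost under the cost model
    print(penalized_step_scores(logits, costs, safety_state=0.5, budget=1.0))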

URL: https://openreview.net/forum?id=FlnokjaSEu

---

Title: LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization

Abstract: We introduce LLM-Lasso, a novel framework that leverages large language models (LLMs) to guide feature selection in Lasso $\ell_1$ regression. Unlike traditional methods that rely solely on numerical data, LLM-Lasso incorporates domain-specific knowledge extracted from natural language, optionally enhanced through a retrieval-augmented generation (RAG) pipeline, to seamlessly integrate data-driven modeling with contextual insights. Specifically, the LLM generates penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model. This is, to our knowledge, the first embedded LLM-driven feature selector. By design, LLM-Lasso addresses the key robustness challenge of LLM-driven feature selection: the risk of LLM hallucinations or low-quality responses. An internal cross-validation step is crucial to LLM-Lasso’s robustness, determining how heavily the prediction pipeline relies on the LLM’s outputs. Consequently, irrespective of the LLM’s generation quality, LLM-Lasso is guaranteed never to perform worse than standard Lasso. In various biomedical case studies, LLM-Lasso outperforms standard Lasso and existing feature selection baselines.
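
The core mechanics are easy to sketch: per-feature penalty factors (here placeholders standing in for LLM-assigned scores) fold into a weighted Lasso by rescaling columns, since penalizing w_j * |beta_j| is equivalent to running plain Lasso on X_j / w_j. The alpha and factors below are illustrative.

    import numpy as np
    from sklearn.linear_model import Lasso

    def llm_weighted_lasso(X, y, penalty_factors, alpha=0.1):
        """Weighted Lasso via column rescaling; penalty_factors play the role
        of LLM-derived per-feature penalties (sketch, not the paper's code)."""
        w = np.asarray(penalty_factors, dtype=float)
        model = Lasso(alpha=alpha).fit(X / w, y)
        return model.coef_ / w                 # map back to the original scale

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=100)
    print(llm_weighted_lasso(X, y, penalty_factors=[0.2, 5, 5, 5, 5]).round(2))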

URL: https://openreview.net/forum?id=AJPl6rwus3

---

Title: LLM-based Contrastive Self-Supervised AMR Learning with Masked Graph Autoencoders for Fake News Detection

Abstract: The proliferation of misinformation in the digital age has led to significant societal challenges. Existing approaches often struggle with capturing long-range dependencies, complex semantic relations, and the social dynamics influencing news dissemination. Furthermore, these methods require extensive labelled datasets, making their deployment resource-intensive. In this study, we propose a novel self-supervised misinformation detection framework that integrates both complex semantic relations using Abstract Meaning Representation (AMR) and news propagation dynamics. We introduce an LLM-based graph contrastive loss (LGCL) that utilizes negative anchor points generated by a Large Language Model (LLM) to enhance feature separability in a zero-shot manner. To incorporate social context, we employ a multi-view graph masked autoencoder, which learns news propagation features from the social context graph. By combining these semantic and propagation-based features, our approach effectively differentiates between fake and real news in a self-supervised manner. Extensive experiments demonstrate that our self-supervised framework achieves superior performance compared to other state-of-the-art methodologies, even with limited labelled data, while improving generalizability.
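
A hedged sketch of what a contrastive loss with LLM-generated negative anchors could look like: standard InfoNCE where the extra negatives are embeddings of LLM-written negative texts. This formulation is our assumption, not necessarily the paper's.

    import torch
    import torch.nn.functional as F

    def lgcl_loss(z, z_pos, z_llm_neg, tau=0.1):
        """InfoNCE with LLM-generated negative anchors.
        z, z_pos: (B, D) anchor/positive embeddings; z_llm_neg: (M, D)."""
        z, z_pos, z_llm_neg = (F.normalize(t, dim=-1) for t in (z, z_pos, z_llm_neg))
        pos = (z * z_pos).sum(-1, keepdim=True) / tau      # (B, 1)
        neg = z @ z_llm_neg.T / tau                        # (B, M)
        logits = torch.cat([pos, neg], dim=1)
        labels = torch.zeros(len(z), dtype=torch.long)     # positive at index 0
        return F.cross_entropy(logits, labels)

    loss = lgcl_loss(torch.randn(8, 32), torch.randn(8, 32), torch.randn(16, 32))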

URL: https://openreview.net/forum?id=puFYjgDXz6

---

Title: When Does LoRA Reuse Work? Theoretical Limits and Mechanisms for Recycling LoRAs Without Data Access

Abstract: Reusing low-rank adapters (LoRAs) by merging or routing is a common strategy for adapting large language models to new tasks, especially when training data is unavailable but many fine-tuned LoRAs are accessible. While the availability of publicly shared LoRA weights has inspired new algorithms for composing them to solve new tasks, recent findings highlight limitations in LoRA’s ability to integrate new knowledge. This work investigates when LoRA reuse could be viable without direct access to training data. Through theoretical analysis and experiments on synthetic two-hop reasoning and math word problems, we show that data-agnostic methods, such as parameter averaging and dynamic selection, often fail to combine knowledge from logically disjoint fine-tuning datasets. This challenge is particularly pronounced when the relevant knowledge is underrepresented during pretraining. However, reuse can succeed when fine-tuning datasets share solution templates, such as reasoning patterns or reusable code, which serve as bridges among tasks. Our results suggest that LoRA reuse relies more on shallow pattern matching than on logical integration of existing knowledge. This mechanism-based perspective offers practical guidance for curating datasets and designing systems that enable LoRA reuse to overcome data-access limitations. Findings indicate that future research should focus on the mechanisms enabling effective adapter reuse rather than solely on developing new reuse algorithms.
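
The simplest data-agnostic reuse strategy the abstract analyzes, parameter averaging, fits in a few lines (a sketch under our conventions, where each LoRA stores (A, B) with delta_W = B @ A):

    import torch

    def average_loras(loras):
        """Merge LoRAs by averaging their weight updates. Note the average of
        rank-r updates is generally only low-rank up to the sum of ranks."""
        merged = {}
        for name in loras[0]:
            deltas = [B @ A for (A, B) in (l[name] for l in loras)]
            merged[name] = torch.stack(deltas).mean(dim=0)
        return merged                          # {name: delta_W} to add to base weights

    lora1 = {"q_proj": (torch.randn(4, 64), torch.randn(64, 4))}
    lora2 = {"q_proj": (torch.randn(4, 64), torch.randn(64, 4))}
    print(average_loras([lora1, lora2])["q_proj"].shape)  # torch.Size([64, 64])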

URL: https://openreview.net/forum?id=lVqUJlsnRy

---
