J2C Certification: Dimension-free error estimate for diffusion model and optimal scheduling
Valentin De Bortoli, Romuald Elie, Anna Kazeykina, Zhenjie Ren, Jiacheng Zhang
https://openreview.net/forum?id=uArYtsvW8o
---
J2C Certification: Forgetting: A New Mechanism Towards Better Large Language Model Fine-tuning
Ali Taheri, Alireza Taban, Qizhou Wang, Shanshan Ye, Abdolreza Mirzaei, Tongliang Liu, Bo Han
https://openreview.net/forum?id=s36smEoUoX
---
Featured Certification: Maximal Brain Damage Without Data or Optimization: Disrupting Neural Networks via Sign-Bit Flips
Ido Galil, Moshe Kimhi, Ran El-Yaniv
https://openreview.net/forum?id=kN1s53X3zl
---
Accepted papers
===============
Title: Explainable Error Detection in Integrated Circuits Image Segmentation via Graph Neural Networks
Authors: MA XIAOYU, Jingyang Dai, Feng Ji, Deruo Cheng, Yiqiong Shi, Bah-Hwee Gwee
Abstract: Automated segmentation of integrated circuit (IC) images plays a critical role in hardware assurance, yet remains challenging due to nanoscale structural complexity, extremely low error tolerance, and the limited interpretability of existing deep learning–based approaches. Most convolutional neural network (CNN)–based error detection methods operate at the whole-image level, making it difficult to localize specific faults or explain their structural causes. In this work, we propose an explainable graph neural network (GNN) framework for component-level error detection in IC segmentation masks. Each connected component in the binary mask is converted into a feature-annotated graph that captures both topological connectivity and geometric properties. Error detection is then formulated as graph-based classification, enabling the identification of anomalous components and precise localization of erroneous regions. Experiments on multiple IC layouts under diverse imaging conditions demonstrate that the proposed method achieves robust and generalizable performance. In addition to accurate detection, the graph-based formulation provides improved interpretability by explicitly linking predictions to structural deviations at the component level.
URL: https://openreview.net/forum?id=q68B9mTom5
---
Title: On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds
Authors: Matteo Vilucchio, Nikolaos Tsilivis, Bruno Loureiro, Julia Kempe
Abstract: Regularization, whether explicit in terms of a penalty in the loss or implicit in the choice of algorithm, is a cornerstone of modern machine learning. Indeed, controlling the complexity of the model class is particularly important when data is scarce, noisy or contaminated, as it encodes a statistical belief about the underlying structure of the data. This work investigates the question of how to choose the regularization norm $\lVert \cdot \rVert$ in the context of high-dimensional adversarial training for binary classification. To this end, we first derive an exact asymptotic description of the robust, regularized empirical risk minimizer for various types of adversarial attacks and regularization norms (including non-$\ell_p$ norms). We complement this analysis with a uniform convergence analysis, deriving bounds on the Rademacher Complexity for this class of problems. Leveraging our theoretical results, we quantitatively characterize the relationship between perturbation size and the optimal choice of $\lVert \cdot \rVert$, confirming the intuition that, in the data-scarce regime, the type of regularization becomes increasingly important for adversarial training as perturbations grow in size.
URL: https://openreview.net/forum?id=vkmvuranbm
---
Title: Retrieval as a Decision: Training-Free Adaptive Gating for Efficient RAG
Authors: Yufeng Wang, Lu Wei, Haibin Ling
Abstract: Retrieval-Augmented Generation (RAG) improves factuality but retrieving for every query often hurts quality while inflating tokens and latency. We propose Training-free Adaptive Retrieval Gating (\textbf{TARG}), a single-shot policy that decides when to retrieve using only a short, no-context draft from the base model. From the draft’s prefix logits, TARG computes lightweight uncertainty scores—mean token entropy, a margin signal derived from the top-1/top-2 logit gap via a monotone link, or small-$N$ variance across a handful of stochastic prefixes—and triggers retrieval only when the score exceeds a threshold. The gate is model-agnostic, adds only tens to hundreds of draft tokens, and requires no additional training or auxiliary heads. On NQ-Open, TriviaQA, and PopQA, TARG consistently pushes the accuracy–efficiency frontier: compared with Always-RAG\footnote{\textsc{Always-RAG}: retrieve for every query; \textsc{Never-RAG}: never retrieve.}, TARG matches or improves EM/F1 while reducing retrieval by 70–90\% and cutting end-to-end latency, and it remains close to Never-RAG in overhead. A central empirical finding is that under modern instruction-tuned LLMs the margin signal is a robust default (entropy compresses as backbones sharpen), with small-$N$ variance offering a conservative, budget-first alternative. We provide ablations over gate type and prefix length and use a $\Delta$-latency view to make budget trade-offs explicit.
URL: https://openreview.net/forum?id=L8gYtUZfVU
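For concreteness, the gate described in the abstract can be sketched as a short scoring function over the draft's prefix logits. The sketch below is reconstructed from the abstract alone: the function names, the particular monotone link (1/(1+gap)), and the threshold value are illustrative assumptions, not TARG's actual code.

```python
import math

def _softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def mean_token_entropy(prefix_logits):
    """Average per-token entropy over the draft prefix (one row of logits per token)."""
    ents = []
    for row in prefix_logits:
        p = _softmax(row)
        ents.append(-sum(pi * math.log(pi + 1e-12) for pi in p))
    return sum(ents) / len(ents)

def mean_margin_uncertainty(prefix_logits):
    """Top-1/top-2 logit gap per token, mapped to (0, 1] by a monotone link.
    A small gap (low confidence) yields a high uncertainty score."""
    scores = []
    for row in prefix_logits:
        top = sorted(row, reverse=True)
        gap = top[0] - top[1]
        scores.append(1.0 / (1.0 + gap))
    return sum(scores) / len(scores)

def should_retrieve(prefix_logits, gate="margin", threshold=0.5):
    """Single-shot decision: trigger retrieval only when the draft looks uncertain."""
    score = (mean_margin_uncertainty(prefix_logits) if gate == "margin"
             else mean_token_entropy(prefix_logits))
    return score > threshold
```

A sharply peaked draft (large top-1/top-2 gap) falls below the threshold and skips retrieval; a flat, uncertain draft triggers it.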
---
Title: CatScreen: A Large MultiModal Benchmark Dataset for Cataract Screening
Authors: Mahapara Khurshid, Sonam Kumar, Dr Anusuya Bhattacharyya, Dhruve Kiyawat, Anshul Chauhan, Suklengmung Buragohain, Harsha Bhattacharjee, Limalemla Jamir, Vishali Gupta, Mona Duggal, Mayank Vatsa, Richa Singh
Abstract: Low-cost slit-lamp imaging holds significant potential for transforming eye care by facilitating affordable and scalable cataract diagnosis. However, the development of robust, generalizable AI-based cataract screening solutions is currently constrained by the limited availability of large-scale, richly annotated datasets. To address this critical gap, we introduce CatScreen, a comprehensive multimodal benchmark dataset specifically designed for cataract screening, comprising approximately 18,000 slit-lamp images collected from 2,251 subjects using a portable slit-lamp camera. CatScreen is structured into three subsets: (i) a clean set meticulously annotated using a structured multi-tier framework involving trained optometrists with final validation by an experienced ophthalmologist across clinically relevant dimensions, including image gradability, quality assessment, illumination type, diagnostic classification, cataract subtype, and severity grading according to established standards; (ii) a noisy-labeled set that simulates real-world annotation inaccuracies; and (iii) an unlabeled set intended to foster the development of self-supervised and semi-supervised learning approaches. Furthermore, CatScreen integrates extensive subject-level metadata encompassing demographics, lifestyle factors, and detailed clinical histories, and includes a subset with anatomical and pathological annotations to support multimodal modeling and anatomically grounded analysis. We present baseline experiments under independent, structured sequential, and multitask prediction settings in both unimodal and multimodal configurations. These results establish initial benchmarks for CatScreen and demonstrate the value of metadata for selected diagnostic tasks, while also highlighting open challenges, such as class imbalance and fine-grained subtype discrimination. 
CatScreen is intended as a benchmark resource for future research in cataract screening, robust learning, semi-supervised learning, and interpretability-oriented analysis. The database is available at: https://iab-rubric.org/resources/healthcare-datasets/catscreen.
URL: https://openreview.net/forum?id=cF7tSNAVQ6
---
Title: LinMU: Multimodal Understanding Made Linear
Authors: Hongjie Wang, Niraj Jha
Abstract: Modern Vision-Language Models (VLMs) achieve impressive performance but are limited by the quadratic complexity of self-attention, which prevents their deployment on edge devices and makes their understanding of high-resolution images and long-context videos prohibitively expensive. To address this challenge, we introduce LinMU (Linear-complexity Multimodal Understanding), a VLM design that achieves linear complexity for the language model decoder without using any quadratic-complexity modules while maintaining the performance of global-attention-based VLMs. LinMU replaces every self-attention layer in the language model decoder with an M-MATE block: a dual-branch module that combines a bidirectional state-space model for global context (Flex-MA branch) with localized Swin-style window attention (Local-Swin branch) for adjacent correlations. To transform a pre-trained VLM into the LinMU architecture, we propose a three-stage distillation framework that (i) initializes both branches with self-attention weights and trains the Flex-MA branch alone, (ii) unfreezes the Local-Swin branch and fine-tunes it jointly with the Flex-MA branch, and (iii) unfreezes the remaining blocks and fine-tunes them using LoRA adapters, while regressing on hidden states and token-level logits of the frozen VLM teacher. On MMMU, TextVQA, LongVideoBench, Video-MME, and other benchmarks, LinMU matches the performance of teacher models, yet reduces Time-To-First-Token (TTFT) by up to 2.7$\times$ and improves token throughput by up to 9.0$\times$ on minute-length videos. Ablations confirm the importance of each distillation stage and the necessity of the two branches of the M-MATE block. We also conduct distillation on various VLM backbones to validate the universality of LinMU. 
The proposed framework demonstrates that state-of-the-art multimodal reasoning can be achieved without quadratic attention, thus opening up avenues for long-context VLMs that can deal with high-resolution images and long videos.
URL: https://openreview.net/forum?id=6BYdTSNrab
---
Title: ClimateAgent: Multi-Agent Orchestration for Complex Climate Data Science Workflows
Authors: Chenyue Li, Hyeonjae Kim, Wen Deng, Mengxi Jin, HUANG Wen, Mengqian Lu, Binhang Yuan
Abstract: Climate science demands automated workflows to transform comprehensive questions into data-driven statements across massive, heterogeneous datasets. However, generic LLM agents and static scripting pipelines lack climate-specific context and flexibility, and thus perform poorly in practice. We present ClimateAgent, an autonomous multi-agent framework that orchestrates end-to-end climate data analytic workflows. ClimateAgent decomposes user questions into executable sub-tasks coordinated by an Orchestrate-Agent and a Plan-Agent; acquires data via specialized Data-Agents that dynamically introspect APIs to synthesize robust download scripts; and completes analysis and reporting with a Coding-Agent that generates Python code, visualizations, and a final report with a built-in self-correction loop. To enable systematic evaluation, we introduce Climate-Agent-Bench-85, a benchmark of 85 real-world tasks spanning atmospheric rivers, drought, extreme precipitation, heat waves, sea surface temperature, and tropical cyclones. On Climate-Agent-Bench-85, ClimateAgent achieves $100\%$ task completion and a report quality score of $8.32$, outperforming GitHub-Copilot ($6.27$) and a GPT-5 baseline ($3.26$). These results demonstrate that our multi-agent orchestration with dynamic API awareness and self-correcting execution substantially advances reliable, end-to-end automation for climate science analytic tasks. The source code of ClimateAgent is available at https://github.com/Relaxed-System-Lab/ClimateAgent.
URL: https://openreview.net/forum?id=XLWvXNumGa
---
Title: Dimension-free error estimate for diffusion model and optimal scheduling
Authors: Valentin De Bortoli, Romuald Elie, Anna Kazeykina, Zhenjie Ren, Jiacheng Zhang
Abstract: Diffusion generative models have emerged as powerful tools for producing synthetic data from an empirically observed distribution. A common approach involves simulating the time-reversal of an Ornstein–Uhlenbeck (OU) process initialized at the true data distribution. Since the score function associated with the OU process is typically unknown, it is approximated using a trained neural network. This approximation, along with finite-time simulation, time discretization, and statistical approximation, introduces several sources of error whose impact on the generated samples must be carefully understood.
Previous analyses have quantified the error between the generated and the true data distributions in terms of Wasserstein distance or Kullback–Leibler (KL) divergence. However, both metrics present limitations: KL divergence requires absolute continuity between distributions, while Wasserstein distance, though more general, leads to error bounds that scale poorly with dimension, rendering them impractical in high-dimensional settings.
In this work, we derive an explicit, dimension-free bound on the discrepancy between the generated and the true data distributions. The bound is expressed in terms of a smooth test functional with bounded first and second derivatives. The key novelty lies in the use of this weaker, functional metric to obtain dimension-independent guarantees, at the cost of higher regularity on the test functions. As an application, we formulate and solve a variational problem to minimize the time-discretization error, leading to the derivation of an optimal time-scheduling strategy for the reverse-time diffusion. Interestingly, this scheduler has appeared previously in the literature in a different context; our analysis provides a new justification for its optimality, now grounded in minimizing the discretization bias in generative sampling.
URL: https://openreview.net/forum?id=uArYtsvW8o
---
Title: LLM-RankFusion: Mitigating Intrinsic Inconsistency in LLM-based Ranking
Authors: Yifan Zeng, Ojas Tendolkar, Raymond Baartmans, Qingyun Wu, Lizhong Chen, Huazheng Wang
Abstract: Ranking passages by prompting a large language model (LLM) can achieve promising performance in modern information retrieval (IR) systems. A common approach to sorting the ranking list is to prompt LLMs for pairwise or setwise comparisons, often in combination with sorting algorithms. However, sorting-based methods require consistent comparisons to sort the passages correctly, a requirement we show LLMs often violate. We identify two kinds of intrinsic inconsistency in LLM-based pairwise comparisons: order inconsistency, which leads to conflicting results when switching the passage order, and transitive inconsistency, which leads to non-transitive triads among all preference pairs. Our study of these inconsistencies is relevant for understanding and improving the stability of any ranking scheme based on relative preferences. In this paper, we propose LLM-RankFusion, an LLM-based ranking framework that mitigates these inconsistencies and produces a robust ranking list. LLM-RankFusion mitigates order inconsistency using in-context learning (ICL) to demonstrate order-agnostic comparisons and calibration to estimate the underlying preference probability between two passages. We then address transitive inconsistency by aggregating the ranking results from multiple rankers. In our experiments, we empirically show that LLM-RankFusion can significantly reduce inconsistent comparison results, improving the ranking quality by making the final ranking list more robust. Our code is available at https://github.com/XHMY/LLM-RankFusion
URL: https://openreview.net/forum?id=VUY0j74Yes
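The two inconsistencies from the abstract can be made concrete in a few lines. This is a minimal sketch reconstructed from the abstract: the function names and the simple average-based calibration are assumptions, and LLM-RankFusion's actual calibration and aggregation are more involved.

```python
from itertools import combinations

def calibrated_preference(p_ab, p_ba):
    """Mitigate order inconsistency: average the two order-swapped estimates.
    p_ab = P(a preferred | prompt order a,b); p_ba = P(b preferred | order b,a)."""
    return 0.5 * (p_ab + (1.0 - p_ba))

def nontransitive_triads(pref):
    """Count triads forming a directed 3-cycle (transitive inconsistency).
    pref[i][j] == 1 means passage i is preferred over passage j."""
    n = len(pref)
    bad = 0
    for a, b, c in combinations(range(n), 3):
        # a triad is non-transitive iff its three preferences form a cycle
        if (pref[a][b] and pref[b][c] and pref[c][a]) or \
           (pref[b][a] and pref[c][b] and pref[a][c]):
            bad += 1
    return bad
```

With preferences 0>1, 1>2, 2>0 the triad counter reports one cycle, which is exactly the situation that makes comparison-sort-based ranking unreliable.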
---
Title: Incremental3D: Real-time Incremental 3D Scene Generation with Scene Graphs
Authors: Penggang GAO, Yonas Teodros Tefera, Darwin G. Caldwell, Nikhil Deshpande
Abstract: Realistic 3D environments are important for a wide range of applications, including robotics, simulation, virtual reality, and video games.
The goal of 3D scene generation is to create spatially structured, semantically meaningful, and visually realistic environments that capture objects and their relationships in space. Graph-based 3D scene generation approaches represent environments as scene graphs, where nodes correspond to objects and edges encode their semantic and spatial relationships. However, existing methods become inefficient when the 3D scene graph evolves incrementally, because they are fundamentally single-shot: inserting even a single new object requires regenerating the entire scene. This global re-computation incurs prohibitive latency and limits scalability. To address this, we propose Incremental3D, a framework for incremental 3D scene generation in real-time from evolving scene graphs. Incremental3D augments the scene graph with a global context node that captures a holistic representation of the evolving environment. At each update step, this node aggregates information from new nodes and edges to form a global embedding. Newly inserted objects are then generated by conditioning on both this embedding and their local features, enabling geometry synthesis and spatial prediction without recomputing unchanged regions. Extensive experiments demonstrate that Incremental3D achieves a generation rate of 38 Hz, while maintaining high spatial and geometric accuracy, indicating its potential for real-time and latency-sensitive applications.
URL: https://openreview.net/forum?id=am8Zv3R8GW
---
Title: Conformal Calibration of Statistical Confidence Sets
Authors: Luben Miguel Cruz Cabezas, Guilherme Soares, Thiago Ramos, Rafael Bassi Stern, Rafael Izbicki
Abstract: Constructing valid confidence sets is a crucial task in statistical inference, yet traditional methods often face challenges when dealing with complex models or limited observed sample sizes. These challenges are frequently encountered in modern applications, such as Likelihood-Free Inference (LFI). In these settings, confidence sets may fail to maintain a confidence level close to the nominal value. In this paper, we introduce two novel methods, TRUST and TRUST++, for calibrating confidence sets to achieve distribution-free conditional coverage. These methods rely entirely on simulated data from the statistical model to perform calibration. Leveraging insights from conformal prediction techniques adapted to the statistical inference context, our methods ensure both finite-sample local coverage and asymptotic conditional coverage as the number of simulations increases, even if the observed (real) sample size n is small. They effectively handle nuisance parameters and provide computationally efficient uncertainty quantification for the estimated confidence sets. This allows users to assess whether additional simulations are necessary for robust inference. Through theoretical analysis and experiments on models with tractable and intractable likelihoods, we demonstrate that our methods outperform existing approaches, particularly in small-sample regimes. This work bridges the gap between conformal prediction and statistical inference, offering practical tools for constructing valid confidence sets in complex models.
URL: https://openreview.net/forum?id=J4lK62PVE6
---
Title: Revisit, Extend, and Enhance Hessian-Free Influence Functions
Authors: Ziao Yang, Han Yue, Jian Chen, Hongfu Liu
Abstract: Influence functions serve as crucial tools for assessing sample influence. By employing the first-order Taylor expansion, sample influence can be estimated without the need for expensive model retraining. However, applying influence functions directly to deep models presents challenges, primarily due to the non-convex nature of the loss function and the large size of model parameters. This difficulty not only makes computing the inverse of the Hessian matrix costly but also renders it non-existent in some cases. In this paper, we revisit a Hessian-free method, which substitutes the inverse of the Hessian matrix with an identity matrix, and offer deeper insights into why this straightforward approximation method is effective. Furthermore, we extend its applications beyond measuring model utility to include considerations of fairness and robustness. Finally, we enhance this method through an ensemble strategy. To validate its effectiveness, we conduct experiments on synthetic data and extensive evaluations on noisy label detection, sample selection for large language model fine-tuning, and defense against adversarial attacks.
URL: https://openreview.net/forum?id=ijL2681Tau
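The core substitution is simple enough to state in code: with the inverse Hessian replaced by the identity, the influence of a training point on a test point collapses to a negative gradient inner product. The toy linear model below is an illustrative assumption made for this sketch; the paper applies the idea to deep networks.

```python
def grad_squared_loss(w, x, y):
    """Gradient of 0.5 * (w . x - y)^2 for a linear model (toy stand-in)."""
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [err * xi for xi in x]

def hessian_free_influence(w, train_point, test_point):
    """Influence with H^{-1} ~ I:  I(z_tr, z_te) = -grad L(z_te) . grad L(z_tr).
    A negative score means upweighting z_tr would reduce the test loss
    (a helpful sample); a score near zero means the sample is irrelevant."""
    g_tr = grad_squared_loss(w, *train_point)
    g_te = grad_squared_loss(w, *test_point)
    return -sum(a * b for a, b in zip(g_te, g_tr))
```

Ranking training samples by this score is what makes noisy-label detection and sample selection cheap: it needs only per-sample gradients, never a Hessian inverse.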
---
Title: Data Compressibility Quantifies LLM Memorization
Authors: Yizhan Huang, Zhe YANG, Meifang Chen, HUANG Nianchen, Jianping Zhang, Michael R. Lyu
Abstract: Large Language Models (LLMs) are known to memorize portions of their training data, sometimes even reproducing content verbatim when prompted appropriately. Despite substantial interest, existing LLM memorization research has offered limited insight into how training data influences memorization and largely lacks quantitative characterization. In this work, we build upon the line of research that seeks to quantify memorization through data compressibility. We analyze why prior attempts fail to yield a reliable quantitative measure and show that a surprisingly simple shift from instance-level to set-level metrics uncovers a robust phenomenon, which we term the \textit{Entropy--Memorization (EM) Linearity}. This law states that a set-level data entropy estimator exhibits a linear correlation with memorization scores.
We validate the EM Linearity through extensive experiments across a wide range of open-source models and experimental configurations. We further investigate the role of the token space—an implicit yet pivotal factor in our method—and identify an additional variant of the EM Linearity. As a side observation, we show that EM Linearity enables a simple application: distinguishing an LLM's training data from its test data.
URL: https://openreview.net/forum?id=6L4UXc7P3h
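A rough way to see the set-level idea: estimate compressibility over a set of samples and correlate it with memorization scores. The zlib compression ratio below is a stand-in entropy proxy chosen for this sketch, not the paper's estimator, and the helper names are invented for illustration.

```python
import zlib

def set_level_compressibility(samples):
    """Set-level entropy proxy: mean compressed-to-raw byte ratio over a set of
    text samples. Lower ratio = more redundant (lower-entropy) data."""
    ratios = []
    for s in samples:
        raw = s.encode("utf-8")
        ratios.append(len(zlib.compress(raw)) / max(len(raw), 1))
    return sum(ratios) / len(ratios)

def pearson(xs, ys):
    """Pearson correlation, used to check how linear the entropy-memorization
    relationship is across sets."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy)
```

Under the EM Linearity, plotting per-set compressibility against per-set memorization scores should give a correlation near one in magnitude.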
---
Title: Multimodal Deception in Explainable AI: Concept-Level Backdoor Attacks on Concept Bottleneck Models
Authors: Songning Lai, Jiayu Yang, Yu Huang, Lijie Hu, TianlangXue, Zhangyi Hu, Jiaxu Li, Haicheng Liao, Zongyang Liu, Yutao Yue
Abstract: Deep learning has demonstrated transformative potential across domains, yet its inherent opacity has driven the development of Explainable Artificial Intelligence (XAI). Concept Bottleneck Models (CBMs), which enforce interpretability through human-understandable concepts, represent a prominent advancement in XAI. However, despite their semantic transparency, CBMs remain vulnerable to security threats such as backdoor attacks—malicious manipulations that induce controlled misbehaviors during inference. While CBMs leverage multimodal representations (visual inputs and textual concepts) to enhance interpretability, their dual-modality structure introduces unique, unexplored attack surfaces. To address this risk, we propose CAT (Concept-level Backdoor ATtacks), a methodology that injects stealthy triggers into conceptual representations during training. Unlike naive attacks that randomly corrupt concepts, CAT employs a sophisticated filtering mechanism to enable precise prediction manipulation without compromising clean-data performance. We further propose CAT+, an enhanced variant incorporating a concept correlation function to iteratively optimize trigger-concept associations, thereby maximizing attack effectiveness and stealthiness. Crucially, we validate our approach through a rigorous two-stage evaluation framework. First, we establish the fundamental vulnerability of the concept bottleneck layer in a controlled setting, showing that CAT+ achieves high attack success rates (ASR) while remaining statistically indistinguishable from natural data. Second, we demonstrate practical end-to-end feasibility via our proposed Image2Trigger_c method, which translates visual perturbations into concept-level triggers, achieving an end-to-end ASR of 53.29%. Extensive experiments show that CAT outperforms random-selection baselines significantly, and standard defenses like Neural Cleanse fail to detect these semantic attacks. 
This work highlights critical security risks in interpretable AI systems and provides a robust methodology for future security assessments of CBMs.
URL: https://openreview.net/forum?id=bntZBG9fBY
---
Title: Topology-Guided Graph Pre-training and Prompt Learning on Directed Graphs
Authors: Peiyu Liang, Chenguang Yang, Yixuan He, Rong Pan, Yuzhou Chen
Abstract: In recent years, graph neural networks (GNNs) have been the dominant approach for graph representation learning, leading to new state-of-the-art results on many classification and prediction tasks. However, they are limited by the fact that they cannot effectively learn expressive node representations without the guidance of labels, thus suffering from the labeled-data scarcity problem. To address the challenges of labeling costs and improve robustness in few-shot scenarios, pre-training on self-supervised tasks has garnered significant attention. Additionally, numerous prompting methods have been proposed as effective ways to bridge the gap between pretext tasks and downstream applications. Although graph pre-training and prompt tuning methods have explored various downstream tasks on undirected graphs, directed graphs have been largely under-explored, and these models suffer limitations in capturing directional and topological information in directed graphs. In this paper, we propose a novel topology-guided directed graph pre-training and prompt tuning model, named TopoDIG, that can effectively capture intrinsic directional structural and local topological features in directed graphs. These features play essential roles in transferring knowledge from a pre-trained model to downstream tasks. TopoDIG consists of an encoder in the form of a magnetic Laplacian matrix, a topological encoder, and a graph prompt learning function. Experimental results on both real-world and synthetic directed graphs demonstrate the superior performance of TopoDIG compared to prominent baseline methods.
URL: https://openreview.net/forum?id=kMIdkLTys8
---
Title: LoDAdaC: a unified local training-based decentralized framework with adaptive gradients and compressed communication
Authors: Wei Liu, Anweshit Panda, Ujwal Pandey, Haven Cook, George Slota, Naigang Wang, Jie Chen, Yangyang Xu
Abstract: In decentralized distributed learning, achieving fast convergence and low communication cost is essential for scalability and high efficiency. Adaptive gradient methods, such as Adam, have demonstrated strong practical performance in deep learning and centralized distributed settings. However, their convergence properties remain largely unexplored in decentralized settings involving multiple local training steps, such as federated learning. To address this limitation, we propose LoDAdaC, a unified multiple \textbf{Lo}cal Training (MLT) \textbf{D}ecentralized framework with \textbf{Ada}m-type updates and \textbf{C}ompressed communication (CC). LoDAdaC accommodates a broad class of optimizers for its local adaptive updates, including AMSGrad, Adam, and AdaGrad; it is compatible with standard (possibly biased) compressors such as low-bit quantization and sparsification. MLT and CC enable LoDAdaC to achieve a multiplicative reduction in communication cost, while the adaptive updates enable fast convergence.
We rigorously prove the combined advantage through complexity analysis. In addition, experiments on image classification and GPT-style language model training validate our theoretical findings and show that LoDAdaC significantly outperforms existing decentralized algorithms in terms of convergence speed and communication efficiency.
URL: https://openreview.net/forum?id=0qoy9usvnm
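As one concrete example of a biased compressor of the kind the framework is stated to support, top-k sparsification keeps only the k largest-magnitude coordinates of a gradient vector. This is a generic sketch of the standard operator, not LoDAdaC's implementation.

```python
def topk_compress(vec, k):
    """Biased top-k sparsification: zero out all but the k largest-magnitude
    entries. Only the k surviving (index, value) pairs need to be communicated,
    cutting per-round bandwidth from len(vec) to k values."""
    keep = set(sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]
```

The operator is biased (its output is not an unbiased estimate of the input), which is exactly why compatibility with biased compressors matters for the convergence analysis.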
---
Title: Combinatorial Capacity of modReLU Complex Networks: VC-Dimension Bounds and Lower Limits
Authors: Mehmet Altunören
Abstract: Complex-valued neural networks (CVNNs) are increasingly used in settings where both magnitude and phase of the signal carry information; see, e.g., (10; 11; 13; 15). In particular, deep networks with the modReLU activation (18) have been used extensively in applications such as MRI reconstruction, radar, and complex-valued time-series modeling. While approximation properties of such networks have recently been analyzed in detail (6), their statistical capacity in the sense of VC-dimension has not, to the best of our knowledge, been studied. In this paper we formalize a natural class of fully connected deep complex-valued networks with modReLU activation and real sign output. Via the standard identification $\mathbb{C}^d \cong \mathbb{R}^{2d}$, we view these models as binary classifiers on $\mathbb{R}^{2d}$. Let $W$ denote the total number of real-valued trainable parameters, including the real and imaginary parts of all weights and biases as well as the real modReLU bias parameters. Using tools from real algebraic geometry and a VC-dimension bound for semi-algebraic concept classes due to Goldberg and Jerrum (8), together with quantitative bounds for quantifier elimination (4), we prove that for any architecture with $W$ parameters and depth $L$, the VC-dimension of the corresponding hypothesis class is at most on the order of $W^2 \log W$, with a universal constant independent of the particular architecture. On the other hand, by restricting to real inputs and parameters and exploiting results of Harvey, Liaw, and Mehrabian (9) and Bartlett et al. (3) on deep networks with piecewise-linear activations, we obtain lower bounds of order $WL \log(W/L)$ for suitable depth-$L$ architectures within the modReLU class. Thus the VC-dimension of these networks grows at least linearly in both $W$ and $L$, and at most quadratically in $W$ up to a logarithmic factor. Closing this gap is an interesting open problem.
URL: https://openreview.net/forum?id=jfeJnfST36
---
Title: A Lower Bound for the Number of Linear Regions of Ternary ReLU Regression Neural Networks
Authors: Yuta Nakahara, Manabu Kobayashi, Toshiyasu Matsushima
Abstract: With the advancement of deep learning, reducing computational complexity and memory consumption has become a critical challenge, and ternary neural networks (NNs) that restrict parameters to $\{-1, 0, +1\}$ have attracted attention as a promising approach. While ternary NNs demonstrate excellent performance in practical applications such as image recognition and natural language processing, their theoretical understanding remains insufficient. In this paper, we theoretically analyze the expressivity of ternary NNs from the perspective of the number of linear regions. Specifically, we evaluate the number of linear regions of ternary regression NNs with Rectified Linear Unit (ReLU) for activation functions and prove that the number of linear regions increases polynomially with respect to network width and exponentially with respect to depth, similar to standard NNs. Moreover, we show that it suffices to first double the width, then either square the width or double the depth of ternary NNs with alternating ReLU and identity layers to achieve a lower bound on the maximum number of linear regions comparable to that of general ReLU regression NNs. When using ReLU in all the layers, a similar bound is obtained by further doubling the width. This provides a theoretical explanation, in some sense, for the practical success of ternary NNs.
URL: https://openreview.net/forum?id=Yg7tt1hWiF
---
Title: Curvature-Aware Safety Restoration In LLMs Fine-Tuning
Authors: Thong Bach, Thanh Nguyen-Tang, Dung Nguyen, Thao Minh Le, Truyen Tran
Abstract: Fine-tuning Large Language Models (LLMs) for downstream tasks often compromises safety alignment, even when using parameter-efficient methods like LoRA. In this work, we uncover a notable property: fine-tuned models preserve the geometric structure of their loss landscapes concerning harmful content, regardless of the fine-tuning method employed. This suggests that safety behaviors are not erased but shifted to less influential regions of the parameter space. Building on this insight, we propose a curvature-aware alignment restoration method that leverages influence functions and second-order optimization to selectively increase loss on harmful inputs while preserving task performance. By navigating the shared geometry between base and fine-tuned models, our method discourages unsafe outputs while preserving task-relevant performance, avoiding full reversion and enabling precise, low-impact updates. Extensive evaluations across multiple model families and adversarial settings show that our approach efficiently reduces harmful responses while maintaining or even improving utility and few-shot learning performance.
URL: https://openreview.net/forum?id=FSUehLhGyl
---
Title: Thought-Retriever: Don't Just Retrieve Raw Data, Retrieve Thoughts for Memory-Augmented Agentic Systems
Authors: Tao Feng, Pengrui Han, Guanyu Lin, Ge Liu, Jiaxuan You
Abstract: Large language models (LLMs) have transformed AI research thanks to their powerful internal capabilities and knowledge. However, existing LLMs still fail to effectively incorporate massive external knowledge when interacting with the world. Although retrieval-augmented LLMs have been proposed to mitigate the issue, they are still fundamentally constrained by the context length of LLMs, as they can only retrieve the top-K raw data chunks from an external knowledge base that often consists of millions of data chunks. Here we propose Thought-Retriever, a novel model-agnostic algorithm that helps LLMs generate output conditioned on arbitrarily long external data, without being constrained by the context length or number of retrieved data chunks. Our key insight is to let an LLM fully leverage its intermediate responses generated when solving past user queries (thoughts), filtering meaningless and redundant thoughts, organizing them in thought memory, and retrieving the relevant thoughts when addressing new queries. This effectively equips LLM-based agents with a self-evolving long-term memory that grows more capable through continuous interaction. Besides algorithmic innovation, we further meticulously prepare a novel benchmark, AcademicEval, which requires an LLM to faithfully leverage ultra-long context to answer queries based on real-world academic papers. Extensive experiments on AcademicEval and two other public datasets validate that Thought-Retriever remarkably outperforms state-of-the-art baselines, achieving an average increase of at least 7.6% in F1 score and 16% in win rate across various tasks. More importantly, we further demonstrate two exciting findings: (1) Thought-Retriever can indeed help the LLM self-evolve after solving more user queries; (2) Thought-Retriever learns to leverage deeper thoughts to answer more abstract user queries.
URL: https://openreview.net/forum?id=emCcuhtENL
---
Title: WAREX: Web Agent Reliability Evaluation on Existing Benchmarks
Authors: Su Kara, Fazle Elahi Faisal, Suman Nath
Abstract: Recent advances in browser-based LLM agents have shown promise for automating tasks ranging from simple form filling to hotel booking or online shopping. Current benchmarks measure agent performance in controlled environments, such as containers or stable networks, where websites behave deterministically. However, in the real world, users access websites over networks and HTTPS connections that introduce instability from multiple sources: client-side or server-side issues, or broader system failures. Moreover, live websites are prone to web attacks such as Cross-Site Scripting, as well as general site modifications, which can cause unexpected or malicious pop-ups or improper functionality. To address this gap, we present WAREX, a plug-and-play, network-layer tool that integrates with existing web agent benchmarks by simulating common website failures. We measure the impact of WAREX across three popular benchmarks: WebArena, WebVoyager, and REAL. Our experiments show that introducing WAREX leads to significant drops in task success rates, highlighting the limited robustness of state-of-the-art agents. We demonstrate that WAREX serves as more than a diagnostic tool. By fine-tuning an open-source model (Qwen3-8B) on WAREX-generated "failure-recovery" trajectories, we achieve an 88.9% relative improvement in error recovery rates, validating WAREX as a core component for training the next generation of reliable web agents.
URL: https://openreview.net/forum?id=o4pXVP8RCD
---
Title: Forgetting: A New Mechanism Towards Better Large Language Model Fine-tuning
Authors: Ali Taheri, Alireza Taban, Qizhou Wang, Shanshan Ye, Abdolreza Mirzaei, Tongliang Liu, Bo Han
Abstract: Supervised fine-tuning (SFT) plays a critical role for pretrained large language models (LLMs), notably enhancing their capacity to acquire domain-specific knowledge while preserving or potentially augmenting their general-purpose capabilities. However, the efficacy of SFT hinges on data quality as well as data volume; otherwise, it may result in limited performance gains or even degradation relative to the associated baselines. To mitigate such reliance, we suggest categorizing tokens within each corpus into two parts---positive and negative tokens---based on whether they are useful for improving model performance. Positive tokens can be trained in common ways, whereas negative tokens, which may lack essential semantics or be misleading, should be explicitly forgotten. Overall, token categorization keeps the model from learning less informative content, and forgetting guides the model more precisely on what to learn. We conduct experiments across diverse and well-established benchmarks using various model architectures, demonstrating that this forgetting mechanism enhances model performance.
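The abstract does not spell out the forgetting objective. One common and minimal way to realize "explicit forgetting" is a sign-flipped negative log-likelihood on negative tokens (i.e., gradient ascent); the sketch below uses that as an illustrative assumption, not the paper's exact loss.

```python
def token_objective(logprob, is_positive):
    # Positive tokens: standard negative log-likelihood (learn them).
    # Negative tokens: sign-flipped NLL, i.e. gradient ascent (forget them).
    # The sign-flip is an illustrative assumption, not the paper's loss.
    nll = -logprob
    return nll if is_positive else -nll

# Toy sequence: (token log-probability under the model, positive flag).
tokens = [(-0.5, True), (-2.0, True), (-0.1, False)]
loss = sum(token_objective(lp, pos) for lp, pos in tokens) / len(tokens)
print(round(loss, 4))  # (0.5 + 2.0 - 0.1) / 3 = 0.8
```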
URL: https://openreview.net/forum?id=s36smEoUoX
---
Title: Improving LLM Unlearning Robustness via Random Perturbations
Authors: Dang Huu-Tien, Hoang Thanh-Tung, Anh Tuan Bui, Phuong Minh Nguyen, Le-Minh Nguyen, Naoya Inoue
Abstract: Here, we show that current LLM unlearning methods inherently reduce models' robustness, causing them to misbehave even when a single non-adversarial forget-token is present in the retain-query. Toward understanding the underlying causes, we propose a novel theoretical framework that reframes the unlearning process as a backdoor attack and defense problem: we formulate how the forgetting process inadvertently learns to align forget-tokens (backdoor triggers) with the target-representations (target labels). As a result, forget-tokens act as backdoor triggers that, when activated in retain-queries, cause disruptions in unlearned models' behaviors, similar to successful backdoor attacks. In this sense, LLM unlearning methods themselves poison the model, making it more vulnerable to forget-tokens, and hide rather than erase the target knowledge. To mitigate the vulnerability caused by the forgetting process, we reinterpret the retaining process as a backdoor defense and propose Random Noise Augmentation (RNA), a lightweight, model- and method-agnostic approach with theoretical guarantees for improving the robustness of unlearned models. Extensive experiments demonstrate that RNA significantly improves the robustness of unlearned models while preserving forget and retain performance. This backdoor attack-defense framework offers insights into the mechanism of unlearning that can shed light on future research directions for improving unlearning robustness.
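A minimal sketch of what a Random Noise Augmentation step could look like; the injection site (input embeddings) and the noise scale are illustrative assumptions, not the paper's exact recipe.

```python
import random

def random_noise_augmentation(embedding, sigma=0.05):
    # RNA sketch (assumption): perturb retain-query embeddings with small
    # i.i.d. Gaussian noise during the retaining phase, so the model does
    # not over-align forget-tokens with target representations.
    return [e + random.gauss(0.0, sigma) for e in embedding]

emb = [0.2, -1.3, 0.7]
noisy = random_noise_augmentation(emb)
print(len(noisy) == len(emb))  # shape is preserved
```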
URL: https://openreview.net/forum?id=QYw192hTdH
---
Title: CP-POL + PPI: Conformal Guarantees in Partially-Observed Label Space
Authors: Christian NGNIE
Abstract: We study Conformal Prediction (CP) in the practical and challenging regime where labeled training and calibration data observe only a subset of the label space. In this setting, classical conformal guarantees no longer control marginal risk, and naive unseen-label detection methods are either overconservative or uninformative. We introduce CP-POL, a simple operational pipeline that couples Split CP over observed labels with a calibrated novelty test and integrates Prediction-Powered Inference (PPI) for finite-sample population estimation. We provide a non-asymptotic theory that (i) proves a Le Cam-style impossibility result: novelty testing from features alone is hopeless without structural assumptions, (ii) derives tight finite-sample coverage decompositions that isolate the role of the non-conforming event $s(X)>q$, (iii) gives Dvoretzky-Kiefer-Wolfowitz (DKW)-based conservative estimators and anytime martingale analogues for the novel mass function $\pi_{nov}$, (iv) identifies practically meaningful structural conditions under which strong guarantees for novel-region prediction hold, and (v) proves finite-sample PPI bounds that cleanly separate sampling fluctuation, trained-model error, and novel-mass effects. We validate the theory with reproducible simulations. All bounds are non-asymptotic and designed for immediate use in deployed monitoring pipelines.
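For context, the standard split-CP calibration step that CP-POL builds on (and whose marginal guarantee breaks under partially observed labels) is essentially a one-liner:

```python
import math

def split_cp_quantile(cal_scores, alpha):
    # Split conformal prediction: the ceil((n+1)(1-alpha))-th smallest
    # calibration score gives marginal 1-alpha coverage under
    # exchangeability -- valid only when the test label space matches
    # calibration, which is exactly what fails with unseen labels.
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return sorted(cal_scores)[min(k, n) - 1]

q = split_cp_quantile(list(range(1, 101)), alpha=0.1)
print(q)  # 91: ceil(101 * 0.9) = 91st smallest of 1..100
```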
URL: https://openreview.net/forum?id=GEy2BtBQKa
---
Title: BalancedDPO: Adaptive Multi-Metric Alignment
Authors: Dipesh Tamboli, Souradip Chakraborty, Aditya Malusare, Biplab Banerjee, Amrit Singh Bedi, Vaneet Aggarwal
Abstract: Diffusion models have achieved remarkable progress in text-to-image generation, yet aligning them with human preference remains challenging due to the presence of multiple, sometimes conflicting, evaluation metrics (e.g., semantic consistency, aesthetics, and human preference scores). Existing alignment methods typically optimize for a single metric or rely on scalarized reward aggregation, which can bias the model toward specific evaluation criteria. To address this challenge, we propose BalancedDPO, a framework that achieves multi-metric preference alignment within the Direct Preference Optimization (DPO) paradigm. Unlike prior DPO variants that rely on a single metric, BalancedDPO introduces a majority-vote consensus over multiple preference scorers and integrates it directly into the DPO training loop with dynamic reference model updates. This consensus-based formulation avoids reward-scale conflicts and ensures more stable gradient directions across heterogeneous metrics. Experiments on Pick-a-Pic, PartiPrompt, and HPD datasets demonstrate that BalancedDPO consistently improves preference win rates over the baselines across Stable Diffusion 1.5, Stable Diffusion 2.1 and SDXL backbones. Comprehensive ablations further validate the benefits of majority-vote aggregation and dynamic reference updating, highlighting the method's robustness and generalizability across diverse alignment settings.
URL: https://openreview.net/forum?id=8HRID5VLQw
---
Title: DASB - Discrete Audio and Speech Benchmark
Authors: Pooneh Mousavi, Jarod Duret, Darius Petermann, Artem Ploujnikov, Luca Della Libera, Anastasia Kuznetsova, Cem Subakan, Mirco Ravanelli
Abstract: Discrete audio tokens have recently gained considerable attention for their potential to bridge audio and language processing, enabling multimodal language models that can both generate and understand audio. However, preserving key information such as phonetic content, speaker identity, and paralinguistic cues remains a major challenge.
Identifying the optimal tokenizer and configuration is further complicated by inconsistent evaluation settings across existing studies. To address this, we introduce the Discrete Audio and Speech Benchmark (DASB), a comprehensive framework for benchmarking discrete audio tokens across speech, general audio, and music domains on a range of discriminative and generative tasks. Our results show that discrete representations are less robust than continuous ones and require careful tuning of factors such as model architecture, data size, learning rate, and capacity. Semantic tokens generally outperform acoustic tokens, but a gap remains between discrete tokens and continuous features, highlighting the need for further research. DASB codes, evaluation setup, and leaderboards are publicly available at https://poonehmousavi.github.io/DASB-website/.
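As background (not code from DASB), discrete audio tokens are typically produced by vector-quantizing continuous feature frames against a learned codebook; a toy nearest-neighbour lookup illustrates the idea, with a hypothetical 2-D codebook standing in for a real tokenizer's.

```python
# Toy vector quantization: map each continuous feature frame to the index
# of its nearest codebook entry (the "audio token"). The codebook here is
# a hypothetical example, not any tokenizer benchmarked in DASB.

CODEBOOK = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

def tokenize(frame):
    # Squared Euclidean distance to each codebook entry; return argmin.
    def dist2(code):
        return sum((f - c) ** 2 for f, c in zip(frame, code))
    return min(range(len(CODEBOOK)), key=lambda i: dist2(CODEBOOK[i]))

frames = [(0.1, 0.1), (0.9, 0.2), (0.2, 0.8)]
tokens = [tokenize(f) for f in frames]
print(tokens)  # [0, 1, 2]
```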
URL: https://openreview.net/forum?id=vGWrp0NjaE
---
Title: State Design Matters: How Representations Shape Dynamic Reasoning in Large Language Models
Authors: Annie Wong, Aske Plaat, Thomas Bäck, Niki van Stein, Anna V Kononova
Abstract: As large language models (LLMs) move from static reasoning tasks toward dynamic environments, their success depends on the ability to navigate and respond to an environment that changes as they interact at inference time. An underexplored factor in these settings is the representation of the state. Holding model parameters fixed, we systematically vary three key aspects: (1) state granularity (long form versus summary), (2) structure (natural language versus symbolic), and (3) spatial grounding (text-only versus images or textual map encodings) across sequential decision-making benchmarks. First, we find that trajectory summarisation improves performance by reducing noise and stabilising long-horizon reasoning. Second, natural language representations are the most robust across models, whereas structured encodings help mainly for models with strong code or structured output priors, such as JSON schemas. Third, while image inputs show some benefit, text-based spatial encodings prove most effective. This advantage stems not from the spatial information itself, but from the act of construction, which compels the model to perform the spatial reasoning that static input does not elicit. Overall, we demonstrate that design choices for representing state are a decisive factor in performance, distinct from the availability of information itself. We note, however, that even with improved representations, current LLMs and VLMs remain brittle over long horizons, particularly when they must synthesise information to manage multiple subtasks to reach a goal.
URL: https://openreview.net/forum?id=sKoazMNH84
---
Title: A tale of two goals: leveraging short term goals performs best in multi-goal scenarios
Authors: Olivier Serris, Stephane Doncieux, Olivier Sigaud
Abstract: When an agent must learn to reach far away goals, several hierarchical reinforcement learning methods leverage planning to create a sequence of intermediate goals guiding a lower-level goal-conditioned policy. The low-level policy is typically conditioned on the current goal, with the aim of reaching it as quickly as possible. However, this approach can fail when intermediate goals can be reached in multiple ways, some of which may prevent continuing toward subsequent goals. To address this issue, we introduce an enriched Markov Decision Process (MDP) framework where the optimization objective not only considers reaching the current goal, but also subsequent ones. Using this framework, we can specify which goals the agent prepares to achieve ahead of time. To study the impact of this design, we conduct a series of experiments on navigation, balancing and locomotion tasks in which sequences of intermediate goals are given. By evaluating policies trained with an off-policy actor-critic algorithm on both the standard goal-conditioned MDP framework and ours in these tasks, we show that learning policies conditioned on the next two goals generally requires less interaction data than all other conditioning variants to reach the same or a better level of performance.
URL: https://openreview.net/forum?id=qsUeLwbErp
---
Title: Adapt via Bayesian Nonparametric Clustering: Fine-Grained Classification for Model Recycling Under Domain and Category Shift
Authors: Zeya Wang, Longwen Shang, Yang Ni, Chenglong Ye
Abstract: Recycling pretrained classification models for new domains, known as Source-Free Domain Adaptation (SFDA), has been extensively studied under the closed-set assumption that source and target domains share identical label spaces. However, this assumption does not hold when unseen classes appear in the target domain. Addressing this category shift is challenging, as unknown target classes usually arise with no prior knowledge of their identities or number, and becomes particularly difficult in the source-free setting, where access to source data is unavailable. Most existing methods treat all unknown classes as a single group during both training and evaluation, limiting their capacity to model the underlying structure within the unknown class space. In this work, we present Adapt via Bayesian Nonparametric Clustering (ABC), a novel framework designed for SFDA scenarios where unknown target classes are present. Unlike prior methods, ABC explicitly achieves fine-grained classification of unknown target classes, offering a more structured vision of the problem. Our method first identifies high-confidence target samples likely to belong to known source classes. Using these as guidance, we develop a guided Bayesian nonparametric clustering approach that learns distinct prototypes for both known and unknown classes without requiring the number of unknown classes a priori, and assigns target samples accordingly. We further introduce a training objective that refines the source model by encouraging prototype-based discriminability and local prediction consistency. Experiments show that our method achieves competitive performance on standard benchmarks while simultaneously providing effective clustering of unknown classes.
URL: https://openreview.net/forum?id=J5B4yt7C37
---
Title: deCIFer: Crystal Structure Prediction from Powder Diffraction Data using Autoregressive Language Models
Authors: Frederik Lizak Johansen, Ulrik Friis-Jensen, Erik B Dam, Kirsten M. Ø. Jensen, Rocío Mercado, Raghavendra Selvan
Abstract: Novel materials drive advancements in fields ranging from energy storage to electronics, with crystal structure characterization forming a crucial yet challenging step in materials discovery. In this work, we introduce \emph{deCIFer}, an autoregressive language model designed for powder X-ray diffraction (PXRD)-conditioned crystal structure prediction (PXRD-CSP). Unlike traditional CSP methods that rely primarily on composition or symmetry constraints, deCIFer explicitly incorporates PXRD data, directly generating crystal structures in the widely adopted Crystallographic Information File (CIF) format. The model is trained on nearly 2.3 million crystal structures, with PXRD conditioning augmented by basic forms of synthetic experimental artifacts, specifically Gaussian noise and instrumental peak broadening, to reflect fundamental real-world conditions. Validated across diverse synthetic datasets representative of challenging inorganic materials, deCIFer achieves a 94\% structural match rate. The evaluation is based on metrics such as the residual weighted profile ($R_{wp}$) and structural match rate (MR), chosen explicitly for their practical relevance in this inherently underdetermined problem. deCIFer establishes a robust baseline for future expansion toward more complex experimental scenarios, bridging the gap between computational predictions and experimental crystal structure determination.
URL: https://openreview.net/forum?id=LftFQ35l47
---
Title: Maximal Brain Damage Without Data or Optimization: Disrupting Neural Networks via Sign-Bit Flips
Authors: Ido Galil, Moshe Kimhi, Ran El-Yaniv
Abstract: Deep Neural Networks (DNNs) can be catastrophically disrupted by flipping only a handful of sign bits in their parameters. We introduce Deep Neural Lesion (DNL), a data-free, lightweight method that locates these critical parameters and triggers massive accuracy drops. We validate its efficacy on a wide variety of computer vision models and datasets. The method requires no training data or optimization and can be carried out via common exploits: software-, firmware-, or hardware-based attack vectors. An enhanced variant that uses a single forward and backward pass further amplifies the damage beyond DNL's zero-pass approach. Flipping just two sign bits in ResNet50 on ImageNet reduces accuracy by 99.8%. We also show that selectively protecting a small fraction of vulnerable sign bits provides a practical defense against such attacks.
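The core primitive is cheap to illustrate: XOR-ing bit 31 of an IEEE-754 float32 negates the weight exactly, with no arithmetic on the value. The target-selection heuristic below (largest-magnitude weights) is an illustrative assumption, not DNL's actual criterion.

```python
import struct

def flip_sign_bit(w):
    # Reinterpret the float32 bit pattern and XOR the sign bit (bit 31),
    # turning w into exactly -w.
    bits = struct.unpack("<I", struct.pack("<f", w))[0]
    return struct.unpack("<f", struct.pack("<I", bits ^ 0x80000000))[0]

# Toy "attack": flip the sign bits of the k largest-magnitude weights
# (a hypothetical selection rule; DNL's criterion is more involved).
weights = [0.5, -3.5, 0.25, 2.75, -0.125]   # exact float32 values
k = 2
targets = sorted(range(len(weights)), key=lambda i: -abs(weights[i]))[:k]
attacked = [flip_sign_bit(w) if i in targets else w
            for i, w in enumerate(weights)]
print(attacked)  # [0.5, 3.5, 0.25, -2.75, -0.125]
```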
URL: https://openreview.net/forum?id=kN1s53X3zl
---
Title: AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting
Authors: Shijue Huang, Hongru WANG, Wanjun Zhong, Zhaochen Su, Jiazhan Feng, Bowen Cao, Yi R. Fung
Abstract: With the advent of test-time scaling, Large Reasoning Models have achieved remarkable performance. However, the reinforcement learning process used to unlock these capabilities often leads to uncontrolled generation length, resulting in substantial computational overhead and unnecessary "overthinking" on simple tasks. Current methods either uniformly minimize reasoning tokens, thereby neglecting the necessity for more intricate reasoning on complex tasks, or employ precise token-level control, which often hinges on accurate difficulty estimation and suffers from unreliable model interpretation for nuanced instructions. To address these limitations, we introduce AdaCtrl,
a novel framework that can dynamically adjust its reasoning length based on the model’s self-assessed problem difficulty and also allow human-in-the-loop control of the budget to prioritize either efficiency or effectiveness. Specifically, we carefully develop a two-stage training pipeline: 1) Cold-start fine-tuning stage, where we first design explicit difficulty-aware tags (e.g., "[Easy]" or "[Hard]") to indicate the difficulty of problems, and train the model on a curated dataset to align its reasoning behavior with these difficulty levels; and 2) Difficulty-aware reinforcement learning stage, which further refines the model’s adaptive reasoning behavior and calibrates its self-assessment of problem difficulty. In this way, AdaCtrl not only empowers the model to adaptively assess the difficulty of a problem and adjust reasoning budget allocation, but also enables the user to explicitly control the desired reasoning mode by injecting the specific difficulty-aware tag.
Empirical results across four benchmarks show that, compared to different types of baselines, AdaCtrl effectively balances performance and computational efficiency, leading to performance improvements while dynamically reducing response lengths by up to 90%.
URL: https://openreview.net/forum?id=4J2Ako20V4
---
Title: V-OCBF: Learning Safety Filters from Offline Data via Value-Guided Offline Control Barrier Functions
Authors: Mumuksh Tayal, Manan Tayal, Aditya Singh, Shishir Kolathaya, Ravi Prakash
Abstract: Ensuring safety in autonomous systems requires controllers that aim to satisfy state-wise constraints without relying on online interaction. While existing Safe Offline RL methods typically enforce soft expected-cost constraints, they struggle to ensure strict state-wise safety. Conversely, Control Barrier Functions (CBFs) offer a principled mechanism to enforce forward invariance, but often rely on expert-designed barrier functions or knowledge of the system dynamics. We introduce Value-Guided Offline Control Barrier Functions (V-OCBF), a framework that learns a neural CBF entirely from offline demonstrations. Unlike prior approaches, V-OCBF does not assume access to the dynamics model; instead, it derives a recursive finite-difference barrier update, enabling model-free learning of a barrier that propagates safety information over time. Moreover, V-OCBF incorporates an expectile-based objective that avoids querying the barrier on out-of-distribution actions and restricts updates to the dataset-supported action set. The learned barrier is then used with a Quadratic Program (QP) formulation to synthesize real-time safe control. Across multiple case studies, V-OCBF yields substantially fewer safety violations than baseline methods while maintaining strong task performance, highlighting its scalability for offline synthesis of safety-critical controllers without online interaction or hand-engineered barriers.
URL: https://openreview.net/forum?id=PGO9mpIyyb
---
Title: Incorporating New Knowledge into Federated Learning: Advances, Insights, and Future Directions
Authors: Lixu Wang, Sun Yinggang, Yang Zhao, Jiaqi Wu, Jiahua Dong, Ating Yin, Qinbin Li, Qingqing Ye, Dusit Niyato, Tianwei Zhang, Kwok-Yan Lam, Yu Haining, Haibo Hu, Wei Dong
Abstract: Federated Learning (FL) is a distributed learning approach that allows participants to collaboratively train machine learning models without sharing the raw data. It is rapidly developing in an era where privacy protection is increasingly valued. It is this rapid development trend, along with the continuous emergence of new demands for FL in the real world, that prompts us to focus on a very important problem: How to Incorporate New Knowledge into Federated Learning? The primary challenge here is to effectively and timely incorporate various new knowledge into existing FL systems and evolve these systems to reduce costs, upgrade functionalities, and facilitate sustainable development. In the meantime, established FL systems should preserve existing functionalities during the incorporation of new knowledge. In this paper, we systematically define the main sources of new knowledge in FL, including new features, tasks, models, and algorithms. For each source, we thoroughly analyze and discuss the technical approaches for incorporating new knowledge into existing FL systems and examine the impact of the form and timing of new knowledge arrival on the incorporation process. Unlike prior surveys that primarily catalogue FL techniques under a fixed system specification, we adopt a lifecycle evolution perspective and synthesize methods that enable time-varying integration of new features, tasks, models, and aggregation algorithms while preserving existing functionality. Furthermore, we comprehensively discuss the potential future directions for FL, incorporating new knowledge and considering a variety of factors, including scenario setups, security and privacy threats, and incentives.
URL: https://openreview.net/forum?id=BWBfK3B3b7
---
Title: Unsupervised Domain Adaptation for Binary Classification with an Unobservable Source Subpopulation
Authors: chao ying, Jun Jin, Haotian Zhang, Qinglong Tian, Yanyuan Ma, Sharon Li, Jiwei Zhao
Abstract: We study an unsupervised domain adaptation problem where the source domain consists of subpopulations defined by the binary label $Y$ and a binary background (or environment) $A$. We focus on a challenging setting in which one such subpopulation in the source domain is unobservable. Naively ignoring this unobserved group can result in biased estimates and degraded predictive performance. Despite this structured missingness, we show that the prediction in the target domain can still be recovered. Specifically, we rigorously derive both background-specific and overall predictive probabilities for the target domain. For practical implementation, we propose the distribution matching method to estimate the subpopulation proportions. We provide theoretical guarantees for the asymptotic behavior of our estimator, and establish an upper bound on the prediction error. Experiments on both synthetic and real-world datasets show that our method outperforms the naive benchmarks that do not account for this unobservable source subpopulation properly.
URL: https://openreview.net/forum?id=aOKcvMt8xE
---
Title: Variational Visual Question Answering for Uncertainty-Aware Selective Prediction
Authors: Tobias Jan Wieczorek, Nathalie Daun, Mohammad Emtiyaz Khan, Marcus Rohrbach
Abstract: Despite remarkable progress in recent years, Vision Language Models (VLMs) remain prone to overconfidence and hallucinations on tasks such as Visual Question Answering (VQA) and Visual Reasoning. Bayesian methods can potentially improve reliability by helping models predict selectively, that is, models respond only when they are sufficiently confident. Unfortunately, such approaches can be costly and ineffective for large models, and there exists little evidence to show otherwise for multimodal applications. Here, we show for the first time the effectiveness and competitive edge of variational Bayes for selective prediction in VQA. We build on recent advances in variational methods for deep learning and propose an extension called "Variational VQA". This method improves calibration and yields significant gains for selective prediction on VQA and Visual Reasoning, particularly when the error tolerance is low (≤ 1%). Often, just one posterior sample yields more reliable answers than those given by models trained with AdamW. In addition, we propose a new risk-averse selector that outperforms standard sample averaging by considering the variance of predictions. Overall, we present compelling evidence that variational learning is a viable option to make large VLMs safer and more trustworthy.
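The abstract's risk-averse selector can be sketched as a variance-penalized mean over posterior samples; the exact penalty form and coefficient below are assumptions for illustration, not the paper's formula.

```python
def risk_averse_confidence(sampled_probs, lam=1.0):
    # Confidence of a predicted answer across posterior samples:
    # mean probability penalized by its variance (assumed penalty form).
    # High disagreement between samples lowers confidence, so the
    # selector abstains on uncertain inputs first.
    n = len(sampled_probs)
    mean = sum(sampled_probs) / n
    var = sum((p - mean) ** 2 for p in sampled_probs) / n
    return mean - lam * var

agree = risk_averse_confidence([0.8, 0.8, 0.8])     # consistent samples
disagree = risk_averse_confidence([0.5, 1.0, 0.9])  # same mean, high variance
print(agree > disagree)  # the selector prefers the consistent prediction
```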
URL: https://openreview.net/forum?id=jtnMIbJIso
---
Title: Optimal Pattern Detection Tree for Symbolic Rule-Based Classification
Authors: Young-Chae Hong, Yangho Chen
Abstract: Pattern discovery in data plays a crucial role across diverse domains, including healthcare, risk assessment, and machinery maintenance. In contrast to black-box deep learning models, symbolic rule discovery emerges as a key data mining task, generating human-interpretable rules that offer both transparency and intuitive explainability. This paper introduces the Optimal Pattern Detection Tree (OPDT), a rule-based machine learning model based on novel mixed-integer programming to discover a single optimal pattern in data through binary classification. To incorporate prior knowledge and compliance requirements, we further introduce the Branching Structure Constraints (BSC) framework, which enables decision makers to encode domain knowledge and constraints directly into the model. This optimization-based approach discovers a hidden underlying pattern in datasets, when it exists, by identifying an optimal rule that maximizes coverage while minimizing the false positive rate due to misclassification. Our computational experiments show that OPDT discovers a pattern with optimality guarantees on moderately sized datasets within reasonable runtime.
URL: https://openreview.net/forum?id=RJ6eMDcDCv
---
Title: Improved Sample Complexity Bounds For Diffusion Model Training Without Empirical Risk Minimizer Access
Authors: Mudit Gaur, Prashant Trivedi, Sasidhar Kunapuli, Amrit Singh Bedi, Vaneet Aggarwal
Abstract: Diffusion models have demonstrated state-of-the-art performance across vision, language, and scientific domains. Despite their empirical success, prior theoretical analyses of the sample complexity suffer from poor scaling with input data dimension or rely on unrealistic assumptions such as access to exact empirical risk minimizers. In this work, we provide a principled analysis of score estimation, establishing a sample complexity bound of $\mathcal{O}(\epsilon^{-4})$. Our approach leverages a structured decomposition of the score estimation error into statistical, approximation, and optimization errors, enabling us to eliminate the exponential dependence on neural network parameters that arises in prior analyses. This is the first such result to achieve sample complexity bounds without assuming access to the empirical risk minimizer of the score-function estimation loss.
URL: https://openreview.net/forum?id=CFdNqqlqOv
---
New submissions
===============
Title: Revealing Positive and Negative Role Models to Help People Make Good Decisions
Abstract: We consider a setting where agents take action by following their role models in a social network, and study strategies for a social planner to help agents by revealing whether the role models are positive or negative. Specifically, agents observe a local neighborhood of possible role models they can emulate, but do not know their true labels. Revealing a positive label encourages emulation, while revealing a negative one redirects agents toward alternative options. The social planner observes all labels, but operates under a limited disclosure budget that it selectively allocates to maximize social welfare (the expected number of agents who emulate adjacent positive role models). We consider both algorithms and hardness results for welfare maximization, and provide a sample-complexity guarantee when the planner observes a sampled subset of agents. We also consider fairness guarantees when agents belong to different groups. It is a technical challenge that the ability to reveal negative role models breaks submodularity. We thus introduce a proxy welfare function that remains submodular even when revealed targets include negative ones. When each agent has at most a constant number of negative target neighbors, we use this proxy to achieve a constant-factor approximation to the true optimal welfare gain. When agents belong to different groups, we also show that each group's welfare gain is within a constant factor of the optimum achievable if the full budget were allocated to that group. Beyond this basic model, we also propose an intervention model that directly connects high-risk agents to positive role models, and a coverage radius model that expands the visibility of selected positive role models. Lastly, we conduct extensive experiments on four real-world datasets to support our theoretical results and assess the effectiveness of the proposed algorithms.
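The constant-factor guarantee rests on greedy maximization of the submodular proxy welfare. A toy sketch with positive reveals only (negative reveals, which break submodularity, are omitted); the graph, budget, and coverage objective here are hypothetical examples.

```python
# Budgeted greedy maximization of a monotone submodular coverage proxy:
# f(S) = number of agents adjacent to at least one revealed positive
# role model. Toy instance; not the paper's full model.

NEIGHBORS = {            # positive role model -> agents who can see them
    "r1": {"a", "b"},
    "r2": {"b", "c", "d"},
    "r3": {"d"},
}

def coverage(revealed):
    covered = set()
    for r in revealed:
        covered |= NEIGHBORS[r]
    return len(covered)

def greedy_reveal(budget):
    # Classic greedy: repeatedly reveal the role model with the largest
    # marginal coverage gain, giving a (1 - 1/e) approximation for
    # monotone submodular objectives.
    chosen = []
    for _ in range(budget):
        rest = [r for r in NEIGHBORS if r not in chosen]
        gains = {r: coverage(chosen + [r]) - coverage(chosen) for r in rest}
        best = max(gains, key=gains.get)
        if gains[best] == 0:
            break
        chosen.append(best)
    return chosen

print(greedy_reveal(budget=2))  # ['r2', 'r1'] covers all four agents
```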
URL: https://openreview.net/forum?id=jdcXfoENf0
---
Title: Cross-Domain Offline Policy Adaptation via Selective Transition Correction
Abstract: It remains a critical challenge to adapt policies across domains with mismatched dynamics in reinforcement learning (RL). In this paper, we study cross-domain offline RL, where an offline dataset from another similar source domain can be accessed to enhance policy learning upon a target domain dataset. Directly merging the two datasets may lead to suboptimal performance due to potential dynamics mismatches. Existing approaches typically mitigate this issue through source domain transition filtering or reward modification, which, however, may lead to insufficient exploitation of the valuable source domain data. Instead, we propose to modify the source domain data into the target domain data. To that end, we leverage an inverse policy model and a reward model to correct the actions and rewards of source transitions, explicitly achieving alignment with the target dynamics. Since limited data may result in inaccurate model training, we further employ a forward dynamics model to retain corrected samples that better match the target dynamics than the original transitions. Consequently, we propose the Selective Transition Correction (STC) algorithm, which enables reliable usage of source domain data for policy adaptation. Experiments on various environments with dynamics shifts demonstrate that STC achieves superior performance against existing baselines.
URL: https://openreview.net/forum?id=TupiNRpgHw
---
Title: ENIGMA: EEG-to-Image in 15 Minutes Using Less Than 1% of the Parameters
Abstract: To be practical for real-life applications, models for brain-computer interfaces must be easily and quickly deployable on new subjects, effective on affordable scanning hardware, and small enough to run locally on accessible computing resources. To directly address these current limitations, we introduce ENIGMA, a multi-subject electroencephalography (EEG)-to-Image decoding model that reconstructs seen images from EEG recordings and achieves state-of-the-art (SOTA) performance on the research-grade THINGS-EEG2 and consumer-grade AllJoined-1.6M benchmarks, while fine-tuning effectively on new subjects with as little as 15 minutes of data. ENIGMA boasts a simpler architecture and requires less than 1% of the trainable parameters necessary for previous approaches. Our approach integrates a subject-unified spatio-temporal backbone along with a set of multi-subject latent alignment layers and an MLP projector to map raw EEG signals to a rich visual latent space. We evaluate our approach using a broad suite of image reconstruction metrics that have been standardized in the adjacent field of fMRI-to-Image research, and we describe the first EEG-to-Image study to conduct extensive behavioral evaluations of our reconstructions using human raters. Our simple and robust architecture provides a significant performance boost across both research-grade and consumer-grade EEG hardware, and a substantial improvement in fine-tuning efficiency and inference cost. Finally, we provide extensive ablations to determine the architectural choices most responsible for our performance gains in both single and multi-subject cases across multiple benchmark datasets. Collectively, our work provides a substantial step towards the development of practical brain-computer interface applications.
URL: https://openreview.net/forum?id=F0sRVEGjGm
---
Title: Microcanonical Hamiltonian Monte Carlo and the Helmholtz Theorem
Abstract: The recently proposed Microcanonical Hamiltonian Monte Carlo algorithm has not yet been studied in detail from a thermodynamic point of view; this work aims to fill that gap. We demonstrate how thermodynamical state variables and potentials can be derived, and clarify the relation between the microcanonical entropy and the corresponding information entropy of the posterior distribution. In particular, we demonstrate that the algorithm fulfils the Helmholtz theorem, an alternative formulation of the first law of thermodynamics. Taking a more general look at the thermodynamic ensembles corresponding to sampling algorithms, we argue that canonical Markov Chain Monte Carlo algorithms are more natural than Microcanonical Hamiltonian Monte Carlo from the thermodynamic and information-theoretic points of view.
URL: https://openreview.net/forum?id=jbVVrks50D
---
Title: Towards Near-Real-Time Telemetry-Aware Routing with Neural Routing Algorithms
Abstract: Routing algorithms are crucial for efficient computer network operations, and in many settings they must be able to react to traffic bursts within milliseconds. Live telemetry data can provide informative signals to routing algorithms, and recent work has trained neural networks to exploit such signals for traffic-aware routing. Yet, aggregating network-wide information is subject to communication delays, and existing neural approaches either assume unrealistic delay-free global states, or restrict routers to purely local telemetry. This leaves their deployability in real-world environments unclear. We cast telemetry-aware routing as a delay-aware closed-loop control problem and introduce a framework that trains and evaluates neural routing algorithms, while explicitly modeling communication and inference delays. On top of this framework, we propose LOGGIA, a scalable graph neural routing algorithm that predicts log-space link weights from attributed topology-and-telemetry graphs. It utilizes a data-driven pre-training stage, followed by on-policy Reinforcement Learning. Across synthetic and real network topologies, and unseen mixed TCP/UDP traffic sequences, LOGGIA consistently outperforms shortest-path baselines, whereas neural baselines fail once realistic delays are enforced. Our experiments further suggest that neural routing algorithms like LOGGIA perform best when deployed fully locally, i.e., observing network states and inferring actions at every router individually, as opposed to centralized decision making.
URL: https://openreview.net/forum?id=jmk2SoQklg
---
Title: Signet: MoE Routing Signatures as Domain Fingerprints for Training-Free LoRA Dispatch
Abstract: Multi-adapter serving on a shared Mixture-of-Experts (MoE) backbone requires a dispatch mechanism that identifies the correct LoRA adapter before decoding begins. Existing approaches either impose latency on the critical serving path or require retraining when adapters are updated, coupling dispatch infrastructure to the adapter lifecycle. We present Signet, a training-free framework that repurposes MoE router logits (computed during every forward pass) as domain-discriminative routing signatures, with no gradient updates, auxiliary models, or additional inference passes.
On an 11-domain benchmark spanning semantically similar sub-domains, Signet achieves 90.5% accuracy (outperforming the best sentence-encoder baseline by +8.7pp) at under 2.1% inference overhead. The performance stems from a structural property of MoE expert specialization: routing signature covariance is inherently anisotropic, and Signet exploits this with a shrinkage-based Mahalanobis classifier suited to the few-shot, high-dimensional calibration regime. The framework extends to open-world deployment via a percentile-calibrated out-of-distribution (OOD) gate and a parameter-free adaptive window for mid-session domain shifts, reaching 93.1% end-to-end accuracy on mixed in-distribution and OOD traffic.
Because router weights are architecturally excluded from standard LoRA target modules, one-time base-model calibration transfers to all fine-tuned variants without recalibration, permanently decoupling dispatch from the adapter lifecycle. These results establish that MoE routing logits are reliable domain-discriminative signals at zero marginal model cost: adapter dispatch becomes a fixed, one-time calibration step compatible with rapidly evolving adapter registries.
URL: https://openreview.net/forum?id=JgplkdntOI
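The shrinkage-covariance Mahalanobis classification described above can be sketched generically. This is a minimal illustration under assumptions of my own (a simple trace-scaled identity shrinkage target, toy "routing signatures"), not Signet's actual calibration procedure:

```python
import numpy as np

def shrink_cov(X, alpha=0.3):
    """Shrinkage covariance: blend the sample covariance with a scaled
    identity, keeping the estimate well-conditioned in a few-shot,
    high-dimensional calibration regime."""
    S = np.cov(X, rowvar=False)
    d = S.shape[0]
    return (1 - alpha) * S + alpha * (np.trace(S) / d) * np.eye(d)

def mahalanobis_classify(x, means, cov_inv):
    """Assign x to the domain whose mean signature has the smallest
    squared Mahalanobis distance."""
    dists = [(x - m) @ cov_inv @ (x - m) for m in means]
    return int(np.argmin(dists))

rng = np.random.default_rng(0)
# Toy "routing signatures": two domains in 8 dimensions, 10 shots each.
d = 8
mu = [np.zeros(d), np.full(d, 2.0)]
calib = [mu[k] + rng.normal(size=(10, d)) for k in range(2)]
cov = shrink_cov(np.vstack([c - c.mean(axis=0) for c in calib]))
cov_inv = np.linalg.inv(cov)
means = [c.mean(axis=0) for c in calib]

# A probe signature drawn near domain 1 should be dispatched to it.
probe = mu[1] + rng.normal(size=d) * 0.1
print(mahalanobis_classify(probe, means, cov_inv))
```

The shrinkage step is what makes the inverse covariance usable when calibration samples are scarce relative to the signature dimension, which is the regime the abstract highlights.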
---
Title: Influencing Humans to Conform to Preference Models for RLHF
Abstract: Designing a reinforcement learning from human feedback (RLHF) algorithm to approximate a human's unobservable reward function requires assuming, implicitly or explicitly, a model of human preferences. In sequential decision making tasks, a preference model that poorly describes how humans generate preferences risks learning a poor approximation of the human’s reward function. In this paper, we conduct human studies to assess whether one can influence the expression of real human preferences to more closely conform to a desired preference model. Importantly, our approach does not seek to alter the human's unobserved reward function. Rather, we change how humans use this reward function to generate preferences, such that they better match whatever preference model is assumed by a particular RLHF algorithm. We introduce three interventions: showing humans the quantities that underlie a preference model, which is normally unobservable information derived from the reward function; training people to follow a specific preference model; and modifying the preference elicitation question. All intervention types show significant effects for at least one preference model, providing practical tools to improve preference data quality and the resultant alignment of learned reward functions. Overall we establish a novel research direction in model alignment: designing interfaces and training interventions to increase human conformance with the modeling assumptions of the algorithm that will learn from their input.
URL: https://openreview.net/forum?id=7YPlw1nUmW
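A concrete example of the "quantities that underlie a preference model" is the widely assumed Bradley-Terry (partial-return) model, where the probability of preferring one trajectory segment over another is a logistic function of the difference in summed rewards. The sketch below illustrates that commonly assumed model; it is not claimed to be the specific model any intervention in the paper targets:

```python
import math

def partial_return_preference(return_a, return_b, beta=1.0):
    """Probability that segment A is preferred over segment B under a
    Bradley-Terry (partial-return) preference model: a logistic
    function of the difference in summed rewards."""
    return 1.0 / (1.0 + math.exp(-beta * (return_a - return_b)))

# Equal returns give indifference; a higher return is preferred.
print(partial_return_preference(3.0, 3.0))       # 0.5
print(partial_return_preference(5.0, 3.0) > 0.5)
```

Showing humans these underlying returns, as one of the interventions does, makes the quantity that the model assumes drives their choice directly observable.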
---
Title: NeuroSymAD: A Neuro-Symbolic Framework for Interpretable Alzheimer's Disease Diagnosis
Abstract: Early diagnosis of Alzheimer's disease (AD) by integrating neuroimaging and clinical data shows great potential for accurate assessment. While deep learning techniques achieve tremendous success, they often function as black boxes, limiting interpretability and lacking mechanisms to effectively integrate critical clinical data such as biomarkers, medical history, and demographic information.
To bridge this gap, we propose NeuroSymAD, a neuro-symbolic framework that synergizes neural networks with symbolic reasoning. A neural network perceives brain MRI scans, while a large language model (LLM) distills medical rules to guide a symbolic system in reasoning over biomarkers and medical history. This structured integration enhances both diagnostic accuracy and explainability.
Experiments on the ADNI dataset demonstrate that NeuroSymAD outperforms state-of-the-art methods by up to 2.91% in accuracy and 3.43% in F1-score while providing transparent and interpretable diagnosis.
URL: https://openreview.net/forum?id=1Ay4lRpmcQ
---
Title: Probing and Controlling Self-Reflection in Language Models
Abstract: Self-reflection, the ability of a large language model (LLM) to revisit, evaluate, and revise its own reasoning, has recently emerged as a powerful behavior enabled by reinforcement learning with verifiable rewards (RLVR). While self-reflection correlates with improved reasoning accuracy, its origin and underlying mechanisms remain poorly understood. In this work, we first show that self-reflection is not exclusive to RLVR fine-tuned models: it already emerges, albeit rarely, in pretrained models. To probe this latent ability, we introduce Reflection-Inducing Probing, a method that injects reflection-triggering reasoning traces from fine-tuned models into pretrained models. This intervention raises the self-reflection frequency of Qwen2.5 from 0.6% to 38.9%, revealing a hidden capacity for reflection. Moreover, our analysis of internal representations shows that both pretrained and fine-tuned models maintain hidden states that distinctly separate self-reflective from non-reflective contexts. Leveraging this observation, we then construct a self-reflection vector, a direction in activation space associated with self-reflective reasoning. By manipulating this vector, we enable bidirectional control over the self-reflective behavior for both pretrained and fine-tuned models. Experiments across multiple reasoning benchmarks show that amplifying this vector improves reasoning performance by up to 13.7, while suppressing it reduces computational cost, providing a flexible mechanism to navigate the trade-off between reasoning quality and efficiency without requiring additional training. Our findings further our understanding of self-reflection and support a growing body of work showing that understanding model internals can enable precise behavioral control.
URL: https://openreview.net/forum?id=AwVIfBZwy0
---
Title: Conformal Prediction for Generative Models via Adaptive Cluster-Based Density Estimation
Abstract: Conditional generative models map input variables to complex, high-dimensional distributions, enabling realistic sample generation in a diverse set of domains. A critical challenge with these models is the absence of calibrated uncertainty, which undermines trust in individual outputs for high-stakes applications. To address this issue, we propose a systematic conformal prediction approach tailored to conditional generative models, leveraging density estimation on model-generated samples. We introduce a novel method called CP4Gen, which utilizes cluster-based density estimation to construct prediction sets that are less sensitive to outliers, more interpretable, and of lower structural complexity than existing methods. Extensive experiments on synthetic datasets and real-world applications, including climate emulation tasks, demonstrate that CP4Gen consistently achieves superior performance in terms of prediction set volume and structural simplicity. Our approach offers practitioners a powerful tool for uncertainty estimation associated with conditional generative models, particularly in scenarios demanding rigorous and interpretable prediction sets.
URL: https://openreview.net/forum?id=goxeVsh9Po
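The general split-conformal recipe behind such methods can be sketched as follows. This is a deliberately simplified illustration with a nearest-sample nonconformity score and a toy Gaussian "generative model" of my own choosing, not CP4Gen's cluster-based density estimator:

```python
import numpy as np

def conformal_radius(cal_scores, alpha=0.1):
    """Split-conformal threshold: the (1 - alpha) finite-sample-adjusted
    empirical quantile of calibration nonconformity scores."""
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(q, 1.0))

rng = np.random.default_rng(1)

def sample_model(x, m=200):
    """Stand-in conditional generative model (hypothetical): y | x is
    Gaussian around x."""
    return x + rng.normal(size=m)

# Nonconformity: distance from the true y to its nearest generated
# sample -- a crude sample-based density surrogate.
cal_x = rng.normal(size=500)
cal_y = cal_x + rng.normal(size=500)
scores = np.array([np.min(np.abs(sample_model(x) - y))
                   for x, y in zip(cal_x, cal_y)])
r = conformal_radius(scores, alpha=0.1)

# Prediction set for a new x: the union of radius-r balls around the
# generated samples. Empirical coverage should be close to 90%.
test_x = rng.normal(size=500)
test_y = test_x + rng.normal(size=500)
cover = np.mean([np.min(np.abs(sample_model(x) - y)) <= r
                 for x, y in zip(test_x, test_y)])
print(round(cover, 2))
```

Replacing the nearest-sample score with a cluster-based density estimate is what, per the abstract, makes the resulting sets less outlier-sensitive and structurally simpler.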
---
Title: Pathway to $O(\sqrt{d})$ Complexity bound under Wasserstein metric of flow-based models
Abstract: We develop general analytical tools to estimate the error of flow-based generative models under the Wasserstein metric and to establish an upper bound on the sampling iteration complexity, measured in terms of $\mathcal{O}(\sqrt{\mathrm{Tr}\, C})$, where $C$ is the covariance matrix of the prior distribution. When using the standard Gaussian as the prior, this bound scales as $O(\sqrt{d})$, which is shown to be optimal when the target distribution is Gaussian.
More precisely, we show that the error under $W_2$ metric can be explicitly controlled by two parts: the Lipschitzness of the push-forward maps of the backward flow, which scales independently of the dimension, and a local discretization error that scales with the square root of the trace of the covariance. The former one is related to the existence of Lipschitz changes of variables induced by the (heat) flow. The latter one consists of the regularity of the score function in both spatial and temporal directions.
We validate the pathway in the flow-based generative model associated with the F\"{o}llmer process and $1$-rectified flow under the Gaussian tail assumption, and demonstrate their wide applicability to Bayesian inverse problems and to general data distributions with bounded support. Furthermore, we design numerical experiments to validate the optimality of the analysis with respect to discretization and spatial dimension.
URL: https://openreview.net/forum?id=5Z7RRnNQmC
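The two-part error control described above can be written schematically as follows. This is a paraphrase with constants, step counts, and regularity terms suppressed, not the paper's exact statement:

```latex
W_2\big(\mathrm{Law}(\hat{X}_0),\, p_{\mathrm{data}}\big)
  \;\lesssim\;
  \underbrace{\mathrm{Lip}\big(\text{backward push-forward maps}\big)}_{\text{dimension-independent}}
  \;\times\;
  \underbrace{\varepsilon_{\mathrm{disc}}}_{\propto\,\sqrt{\mathrm{Tr}\,C}}
```

The first factor reflects the existence of Lipschitz changes of variables induced by the (heat) flow; the second collects the local discretization error, governed by the spatial and temporal regularity of the score.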
---
Title: Small Updates, Big Doubts: Does Parameter-Efficient Fine-tuning Enhance Uncertainty Awareness for Large Language Models?
Abstract: Parameter-efficient fine-tuning (PEFT) is the de facto approach for adapting large language models (LLMs), yet its effect on model uncertainty—the very signal that hallucination detectors rely on—remains poorly understood. We present a controlled empirical study of how PEFT reshapes uncertainty-aware hallucination detection in fact-seeking question answering. Across three open-weight LLMs (LLaMA-3.2-3B, Qwen-2.5-3B, Mistral-7B), three QA benchmarks (TriviaQA, NQ-Open, SQuAD), and seven answer-level detectors spanning semantic-consistency, confidence, and entropy families, we find a striking asymmetry: PEFT improves answer accuracy by only 1–3% on average, yet boosts detection AUROC by up to 8.7%, with consistent gains across most black-box detectors. Behavioral analysis reveals that this improvement stems largely from PEFT shifting uncertainty scores away from overconfident error regimes, with gains from improved factual knowledge appearing to play a more limited role. In contrast, white-box linear probes on hidden states show inconsistent results, indicating that PEFT reshapes how uncertainty is expressed more than how correctness is linearly encoded. Our findings demonstrate that, in fact-seeking QA, PEFT acts primarily as an uncertainty reshaper that makes incorrect answers more detectable, and we caution that these results concern answer-level detection with external verification rather than open-ended generation.
URL: https://openreview.net/forum?id=BaW14onJX2
---
Title: Judging the Judges: A Systematic Evaluation of Bias Mitigation Strategies in LLM-as-a-Judge Pipelines
Abstract: LLM-as-a-Judge has become the dominant paradigm for evaluating language model outputs, yet LLM judges exhibit systematic biases that compromise evaluation reliability. We present a comprehensive empirical study comparing nine debiasing strategies across five judge models from four provider families (Google, Anthropic, OpenAI, Meta), three benchmarks (MT-Bench n=400, LLMBar n=200, custom n=225), and four bias types. Our key findings: (1) Style bias is the dominant bias (0.76–0.92 across all models), far exceeding position bias (≤ 0.04), yet has received minimal research attention. (2) All models show a conciseness preference on expansion pairs, but truncation controls confirm they correctly distinguish quality from length (0.92–1.00 accuracy), suggesting quality-sensitive evaluation rather than a simple length bias. (3) Debiasing is beneficial but model-dependent: the combined budget strategy significantly improves Claude Sonnet 4 by +11.2 pp (p < 0.0001), with directionally positive trends for other models. Only 2 of 20 non-baseline configurations show decreased agreement. We release our evaluation framework, controlled dataset, and all experimental artifacts.
URL: https://openreview.net/forum?id=QF4lAmG4zc
---
Title: Quantitative Introspection in Language Models: Tracking Emotive States Across Conversation
Abstract: Tracking the internal states of large language models across conversations is important for safety, interpretability, and model welfare, yet current methods are limited. Linear probes and other white-box methods compress high-dimensional representations imperfectly and are harder to apply with increasing model size. Taking inspiration from human psychology, where numeric self-report is a widely used tool for tracking internal states, we ask whether LLMs' own numeric self-reports can track probe-defined emotive states over time. We study four concept pairs (wellbeing, interest, focus, and impulsivity) in 40 ten-turn conversations, operationalizing introspection as the causal informational coupling between a model's self-report and a concept-matched probe-defined internal state. We find that greedy-decoded self-reports collapse outputs to a few uninformative values, but introspective capacity can be unmasked by calculating logit-based self-reports. This metric tracks interpretable internal states (Spearman $\rho = 0.40$--$0.76$; isotonic $R^2 = 0.12$--$0.54$ in LLaMA-3.2-3B-Instruct), follows how those states change over time, and activation steering confirms the coupling is causal. Furthermore, we find that introspection is present at turn 1 but evolves through conversation, and can be selectively improved by steering along one concept to boost introspection for another ($\Delta R^2$ up to $0.30$). Crucially, these phenomena scale with model size in some cases, approaching $R^2 \approx 0.93$ in LLaMA-3.1-8B-Instruct, and partially replicate in other model families. Together, these results position numeric self-report as a viable, complementary tool for tracking internal emotive states in conversational AI systems.
URL: https://openreview.net/forum?id=KKLMDrMLPe
---
Title: Conditional Independence Tests for Constraint-Based Causal Discovery: A Survey
Abstract: Conditional Independence (CI) tests are the statistical engine of constraint-based causal discovery: in algorithms such as PC (Peter-Clark) and FCI (Fast Causal Inference), skeleton pruning and key orientations follow directly from CI decisions. This survey reviews CI testing with emphasis on assumptions, robustness, and scalability in high-dimensional and heterogeneous settings common in biomedical domains. The survey organizes widely used CI methods into six families: partial-correlation, contingency-table, regression, nearest-neighbor, kernel, and machine-learning-based. Special emphasis is provided on the robustness layers that address the limitations of these families. For each family, the survey examines when CI decisions reflect the data-generating distribution and when they fail. By this, we link test-level properties, including power decay with conditioning set size and asymmetric type I/II error consequences, to graph-level errors in skeleton recovery and collider orientation. The survey also compares adoption across major R and Python libraries and summarizes open challenges, including mixed-type CI testing without discretization, small-sample error control, and strategies for improving scalability of CI-testing.
URL: https://openreview.net/forum?id=3jzafJK8Tz
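The partial-correlation family the survey opens with is well illustrated by the Fisher z-transform test, the workhorse CI test of the PC algorithm under Gaussian assumptions. A minimal self-contained sketch on synthetic chain data:

```python
import numpy as np
from math import sqrt, erf

def fisher_z_ci_test(data, i, j, cond, alpha=0.05):
    """Partial-correlation CI test via the Fisher z-transform.

    Returns True if X_i independent of X_j given X_cond is NOT
    rejected at level alpha (i.e., the CI hypothesis is retained).
    """
    idx = [i, j] + list(cond)
    sub = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.inv(sub)  # precision matrix of the submatrix
    # Partial correlation of X_i, X_j given the conditioning set:
    r = -prec[0, 1] / sqrt(prec[0, 0] * prec[1, 1])
    n, k = data.shape[0], len(cond)
    z = 0.5 * np.log((1 + r) / (1 - r)) * sqrt(n - k - 3)
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # two-sided normal
    return p > alpha

rng = np.random.default_rng(0)
# Chain X0 -> X1 -> X2: X0 and X2 are marginally dependent, but
# conditionally independent given X1.
x0 = rng.normal(size=2000)
x1 = x0 + rng.normal(size=2000)
x2 = x1 + rng.normal(size=2000)
D = np.column_stack([x0, x1, x2])
print(fisher_z_ci_test(D, 0, 2, []))   # dependent -> False
print(fisher_z_ci_test(D, 0, 2, [1]))  # typically retained -> True
```

This also illustrates the survey's point about power decay: as the conditioning set grows, the effective sample term $\sqrt{n - k - 3}$ shrinks, and CI decisions at the test level propagate into skeleton and orientation errors at the graph level.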
---
Title: Do We Really Need to Approach the Entire Pareto Front in Many-Objective Bayesian Optimisation?
Abstract: Many-objective optimisation, a subset of multi-objective optimisation, involves optimisation problems with more than three objectives. As the number of objectives increases, the number of solutions needed to adequately represent the entire Pareto front typically grows substantially. This makes it challenging, if not infeasible, to design a search algorithm capable of effectively exploring the entire Pareto front. This difficulty is particularly acute in the Bayesian optimisation paradigm, where sample efficiency is critical and only a limited number of solutions (often a few hundred) are evaluated. Moreover, after the optimisation process, the decision-maker eventually selects just one solution for deployment, regardless of how many high-quality, diverse solutions are available. In light of this, we argue that, under a very limited evaluation budget, it may be more useful to focus on finding a single solution of the highest possible quality for the decision-maker, rather than aiming to approximate the entire Pareto front as existing many-/multi-objective Bayesian optimisation methods typically do. Bearing this idea in mind, this paper proposes a \underline{s}ingle \underline{p}oint-based \underline{m}ulti-\underline{o}bjective search framework (SPMO) that aims to improve the quality of solutions along a direction that leads to a good tradeoff between objectives. Within SPMO, we present a simple acquisition function, called expected single-point improvement (ESPI), working under both noiseless and noisy scenarios. We show that ESPI can be optimised effectively with gradient-based methods via the sample average approximation (SAA) approach and theoretically prove its convergence guarantees under the SAA. We also empirically demonstrate that the proposed SPMO is computationally tractable and outperforms state-of-the-art methods on a wide range of benchmark and real-world problems.
URL: https://openreview.net/forum?id=ZsI4bmqUD8
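The SAA idea used to optimise ESPI can be illustrated on the classic expected improvement acquisition: draw posterior samples at a candidate point and average the improvement over the incumbent. This is a generic EI sketch under a univariate Gaussian posterior assumption, not the paper's ESPI:

```python
import numpy as np
from math import erf, exp, pi, sqrt

def saa_expected_improvement(mu, sigma, best, n_samples=100_000, seed=0):
    """Sample average approximation (SAA) of expected improvement:
    draw f ~ N(mu, sigma^2) at a candidate point and average
    max(f - best, 0). Because each sample is differentiable in
    (mu, sigma), the SAA objective admits gradient-based optimisation."""
    rng = np.random.default_rng(seed)
    f = rng.normal(mu, sigma, size=n_samples)
    return np.mean(np.maximum(f - best, 0.0))

# Sanity check against the closed form EI = sigma * (u*Phi(u) + phi(u)),
# with u = (mu - best) / sigma.
mu, sigma, best = 1.0, 0.5, 0.8
u = (mu - best) / sigma
closed = sigma * (u * 0.5 * (1 + erf(u / sqrt(2)))
                  + exp(-u * u / 2) / sqrt(2 * pi))
print(abs(saa_expected_improvement(mu, sigma, best) - closed) < 0.01)
```

Fixing the sample set (a "common random numbers" trick) is what makes the SAA objective a deterministic, smooth function amenable to the gradient-based optimisation and convergence analysis the abstract mentions.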
---
Title: Bigger Isn’t Always Memorizing: Early Stopping Overparameterized Diffusion Models
Abstract: Diffusion probabilistic models have become a cornerstone of modern generative AI, yet the mechanisms underlying their generalization remain poorly understood. In fact, if these models were perfectly minimizing their training loss, they would just generate data belonging to their training set, i.e., memorize, as empirically found in the overparameterized regime. We revisit this view by showing that, in highly overparameterized diffusion models, generalization in natural data domains is progressively achieved during training before the onset of memorization. Our results, ranging from image to language diffusion models, systematically support the empirical law that memorization time is proportional to the dataset size, consistent with a kernel-regression scaling argument for fitting the empirical score at low noise. Generalization vs. memorization is then best understood as a competition between time scales. We show that this phenomenology is recovered in diffusion models learning a simple probabilistic context-free grammar with random rules, where generalization corresponds to the hierarchical acquisition of deeper grammar rules as training time grows, and the generalization cost of early stopping can be characterized. We summarize these results in a phase diagram. Overall, our results support that a principled early-stopping criterion -- scaling with dataset size -- can effectively optimize generalization while avoiding memorization, with direct implications for hyperparameter transfer and privacy-sensitive applications.
URL: https://openreview.net/forum?id=P6r9OxZ1vm
---
Title: REVEAL: Reconstructing Video Amodal Content via Language and Flow Guidance
Abstract: Amodal perception enables humans to perceive entire objects even when parts are occluded, a remarkable cognitive skill that artificial intelligence struggles to replicate. Recent diffusion-based methods have extended amodal completion from images to videos, yet they lack auxiliary priors to guide the reconstruction of heavily occluded content and to maintain temporal consistency over long sequences. We present REVEAL (REconstructing VidEo Amodal content via Language and flow guidance), a unified diffusion-based framework that integrates complementary motion and semantic priors for video amodal completion. A user-provided text query serves a dual role: it drives open-vocabulary video segmentation to obtain visible masks, and it provides semantic guidance for texture reconstruction. For amodal segmentation, we introduce optical flow guidance: by warping visible masks from previous frames to the current frame, the flow-warped mask propagates visible content into occluded regions and approximates the object's current shape, providing a strong shape prior even under simultaneous deformation and occlusion. For texture completion, text guidance constrains the reconstruction of occluded appearance while providing a stable semantic reference that helps maintain visual coherence throughout the reconstruction. We also contribute LAVAT, a new benchmark featuring long video sequences paired with text descriptions, enabling evaluation of text-guided video amodal completion under heavy occlusion. Extensive experiments demonstrate that REVEAL achieves state-of-the-art performance on existing benchmarks while maintaining high-quality temporal consistency over extended sequences. The project page is available at: https://lancof.github.io.
URL: https://openreview.net/forum?id=pbTFy1bNyZ
---
Title: Improving Sampling Efficiency in RLVR through Adaptive Rollout and Response Reuse
Abstract: Large language models (LLMs) have achieved impressive reasoning performance, with reinforcement learning with verifiable rewards (RLVR) emerging as a standard paradigm for post-training. A representative algorithm, group relative policy optimization (GRPO) (Shao et al., 2024), computes advantages by normalizing outcome rewards within response groups, but suffers from a vanishing advantage issue when all responses in a group receive identical rewards. To address this issue, we propose Adaptive Rollout and Response Reuse Policy Optimization (AR3PO), a sampling efficient RLVR algorithm that introduces two novel techniques: adaptive rollout, which dynamically allocates more responses to difficult prompts while saving computation on easier ones, and response reuse, which leverages previously generated correct responses to provide useful training signals. We compare AR3PO with strong RLVR baselines on multiple representative benchmarks using two different families of base models. Across the 7B and 8B models, AR3PO consistently outperforms GRPO and matches or surpasses DAPO (Yu et al., 2025), reducing rollout cost by up to $4.2\times$. On the larger 32B model, AR3PO achieves comparable performance to DAPO at similar training steps while maintaining substantially lower rollout cost.
URL: https://openreview.net/forum?id=LQZ1053D17
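The vanishing-advantage issue motivating AR3PO follows directly from GRPO's group normalization, which can be sketched in a few lines (a minimal illustration of the advantage computation, not the AR3PO algorithm):

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: normalize outcome rewards within a group
    of responses to the same prompt. If every response receives the
    same reward, the group standard deviation is zero and all
    advantages vanish, leaving no training signal for that prompt."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# Mixed rewards yield a useful signal...
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
# ...but identical rewards (all-correct or all-wrong) give ~zero
# advantages, which is what adaptive rollout and response reuse target.
print(group_relative_advantages([1.0, 1.0, 1.0, 1.0]))
```

Adaptive rollout spends the extra responses precisely on prompts likely to land in the degenerate all-same-reward case, while response reuse recycles past correct responses to break the tie.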
---
Title: Federated Learning with Projected Trajectory Regularization
Abstract: Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients, which leads to deteriorated model training performance. Prior works in this line of research mainly focus on utilizing last-step global model parameters/gradients or linear combinations of past model parameters/gradients, which do not fully exploit the potential of global information from the model training trajectory. In this paper, we propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data heterogeneity issue, offering a unique way to better extract the essential global information from the model training trajectory. Specifically, FedPTR allows local clients or the server to optimize an auxiliary (synthetic) dataset that mimics the learning dynamics of the recent model update and utilizes it to project the next-step model trajectory for local training regularization. We conduct rigorous theoretical analysis for our proposed framework under nonconvex stochastic settings to verify its fast convergence under heterogeneous data distributions. Experiments on various benchmark datasets and non-i.i.d. settings validate the effectiveness of our proposed framework.
URL: https://openreview.net/forum?id=vfCztZvcP3
---
Title: Towards Reasonable Concept Bottleneck Models
Abstract: We propose a novel, flexible, and efficient framework for designing Concept Bottleneck Models (CBMs) that enables practitioners to explicitly encode and extend their prior knowledge and beliefs about the concept-concept ($C-C$) and concept-task ($C \to Y$) relationships within the model's reasoning when making predictions. The resulting $\textbf{C}$oncept $\textbf{REA}$soning $\textbf{M}$odels (CREAMs) architecturally encode arbitrary types of $C-C$ relationships such as mutual exclusivity, hierarchical associations, and/or correlations, as well as potentially sparse $C \to Y$ relationships. Moreover, CREAM can optionally incorporate a regularized side-channel to complement potentially incomplete concept sets, achieving competitive task performance while encouraging predictions to be concept-grounded. To evaluate CBMs in such settings, we introduce a $C \to Y$-agnostic metric that quantifies interpretability when predictions partially rely on the side-channel. In our experiments, we show that, without additional computational overhead, CREAM models support efficient interventions, can avoid concept leakage, and achieve black-box-level performance under missing concepts. We further analyze how an optional side-channel affects interpretability and intervenability. Importantly, the side-channel enables CBMs to remain effective even in scenarios where only a limited number of concepts are available.
URL: https://openreview.net/forum?id=0TxwJMhXBY
---
Title: Training Verifiably Robust Agents Using Set-Based Reinforcement Learning
Abstract: Reinforcement learning policies parametrized by deep neural networks have achieved strong performance for continuous control, yet even small input perturbations may lead to unpredictable behavior. This sensitivity limits their use in safety-critical domains, where formal verifiability and robustness guarantees are required. Our work bridges the gap between state-of-the-art adversarial training methods and formal verification to train verifiably robust agents. Previous works train networks with individual adversarial perturbations, making them robust only against the specific adversarial attacks used. In contrast, our approach propagates entire perturbed input sets, enclosing all possible adversarial attacks within a single network pass. We leverage this to explicitly penalize the size of the output set (minimizing closed-loop uncertainty) and thereby make the actor robust against all possible attacks. This is realized through set-based policy gradients, where each output within the set has a different gradient, thereby balancing the accuracy and robustness of the network. Doing so, we achieve formal verifiability across different verification frameworks for up to 9 times larger input perturbations and improve certified worst-case performance, making our agents applicable in safety-critical environments.
URL: https://openreview.net/forum?id=DhRPPhSjGA
---
Title: Scaffold-Conditioned Preference Triplets for Controllable Molecular Optimization with Large Language Models
Abstract: Molecular property optimization is central to drug discovery, yet many deep learning methods rely on black-box scoring and offer limited control over scaffold preservation, often producing unstable or biologically implausible edits. While large language models (LLMs) are promising molecular generators, optimization remains constrained by the lack of chemistry-grounded preference supervision and principled data curation.
We introduce \textbf{Scaffold-Conditioned Preference Triplets (SCPT)}, a pipeline that constructs similarity-constrained triplets $\langle\text{scaffold}, \text{better}, \text{worse}\rangle$ via scaffold alignment and chemistry-driven filters for validity, synthesizability, and meaningful property gains. Using these preferences, we align a pretrained molecular LLM as a conditional editor, enabling property-improving edits that retain the scaffold.
Across single- and multi-objective benchmarks, SCPT improves optimization success and property gains while maintaining higher scaffold similarity than competitive baselines. Compared with representative non-LLM molecular optimization methods, SCPT-trained LLMs are better suited to scaffold-constrained and multi-objective optimization. In addition, models trained on single-property and two-property supervision generalize effectively to three-property tasks, indicating promising extrapolative generalization under limited higher-order supervision. SCPT also provides controllable data-construction knobs that yield a predictable similarity--gain frontier, enabling systematic adaptation to diverse optimization regimes.
URL: https://openreview.net/forum?id=KwszuDv9ow
---
Title: DS-STAR: Data Science Agent for Solving Diverse Tasks across Heterogeneous Formats and Open-Ended Queries
Abstract: While large language models (LLMs) have shown promise in automating data science, existing agents often struggle with the complexity of real-world workflows that require exploring multiple sources and synthesizing open-ended insights. In this paper, we introduce DS-STAR, a specialized agent designed to bridge this gap. Unlike prior approaches, DS-STAR is designed to (1) seamlessly process and integrate data across diverse, heterogeneous formats, and (2) move beyond simple QA to generate comprehensive research reports for open-ended queries. Extensive evaluation shows that DS-STAR achieves state-of-the-art performance on four benchmarks: DABStep, DABStep-Research, KramaBench, and DA-Code. Most notably, it significantly outperforms existing baselines, especially on hard-level QA tasks requiring multi-file processing, and generates high-quality data science reports that are preferred over those of the best baseline in over 88% of cases.
URL: https://openreview.net/forum?id=Yz3ZPLzYaU
---
Title: Beyond Freezing the Router: Rank-Aligned Post-Training Quantization for Mixture-of-Experts Models
Abstract: Quantizing Mixture-of-Experts language models remains a challenging problem because quantization noise propagates across layers and distorts downstream expert selection. Although common practice keeps the router in full precision, we show that this strategy is insufficient: quantization-induced errors in expert outputs still shift the logits of the next-layer router, and freezing the router removes the opportunity to compensate for these shifts. Motivated by this finding, we propose RouteQuant, a post-training quantization framework that explicitly embraces router quantization to correct for expert-level distortion. We analyze how quantization alters router rankings and induces rank flips, and provide a theoretical proof showing that deviations in expert outputs are bounded by both expert-selection and gap-preservation errors. These insights motivate two router-alignment objectives: (i) a Rank-Aware Jaccard Loss, which aligns the top-$k$ routing sets between full-precision and quantized models, and (ii) a Gap Hinge Loss, which preserves the margin between consecutive expert logits to suppress rank flipping. Beyond router alignment, we further introduce an Expert-Aware Smoothing Factor, which assigns separate activation smoothing factors to heterogeneous experts. Across OLMoE, DeepSeek-MoE, and Qwen3-MoE, RouteQuant consistently improves perplexity on C4 and WikiText-2 and enhances zero-shot accuracy under W4A4 and W4A8 across diverse downstream tasks, demonstrating the effectiveness of the proposed framework. Code is available at https://anonymous.4open.science/r/route-quant.
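The two router-alignment objectives lend themselves to a compact illustration. The sketch below is our own construction, not the paper's implementation (`topk_jaccard` and `gap_hinge` are hypothetical names): it computes a hard top-k Jaccard overlap between full-precision and quantized router logits, and a hinge penalty on the gap between the k-th and (k+1)-th logits. Actual training losses would require differentiable relaxations of these quantities.

```python
import numpy as np

def topk_jaccard(fp_logits, q_logits, k=2):
    """Jaccard overlap between top-k expert sets of full-precision and
    quantized router logits (illustrative, non-differentiable)."""
    fp_top = set(np.argsort(fp_logits)[-k:])
    q_top = set(np.argsort(q_logits)[-k:])
    return len(fp_top & q_top) / len(fp_top | q_top)

def gap_hinge(q_logits, k=2, margin=0.1):
    """Penalize a small gap between the k-th and (k+1)-th largest logits,
    since a small gap makes rank flips under quantization noise likely."""
    s = np.sort(q_logits)[::-1]
    gap = s[k - 1] - s[k]
    return max(0.0, margin - gap)

fp = np.array([2.0, 1.5, 0.3, -0.2])   # full-precision router logits
q  = np.array([1.9, 0.2, 1.6, -0.1])   # quantization swapped experts 1 and 2
print(topk_jaccard(fp, q))  # 1/3: only expert 0 survives in the top-2 set
print(gap_hinge(q))         # 0.0: the top-2 margin here exceeds the margin
```

A perfect top-k match gives a Jaccard score of 1; the hinge term is zero whenever the routing margin is comfortably preserved.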
URL: https://openreview.net/forum?id=bPsPPI65hf
---
Title: ODC: Orthogonal Drift Correction for Improved Text-to-Image Semantic Alignment at Inference
Abstract: Text-to-image models have achieved remarkable success in generating high-quality images from textual descriptions. However, they often struggle with ``semantic drift,'' where the generated output fails to precisely align with complex or nuanced text prompts. While recent approaches have attempted to address semantic errors regarding attribute binding or object presence, there remains a gap for a more holistic method that addresses these issues by directly refining the text embeddings of the initial user prompt. In this work, we introduce Orthogonal Drift Correction (ODC), an inference-time guidance method designed to mitigate semantic drift without requiring model retraining or additional user inputs. ODC guides image generation through a two-stage process. In the first stage, it identifies the semantic drift by evaluating the initially generated image against the user prompt in a shared vision-language embedding space. It then isolates the component of this drift vector that is orthogonal to the prompt's direction and translates it back into text via a vocabulary-based surrogate mechanism. In the second stage, it produces refined text conditioning for a second generation pass by feeding both the initial text embedding and the re-embedded drift representation into an adaptive rank-reduced concept removal module. Our experiments demonstrate the effectiveness of ODC in enhancing prompt-image alignment, yielding images that more accurately reflect detailed compositional instructions. As a plug-and-play module, ODC offers a practical and efficient method for improving the reliability of state-of-the-art text-to-image models.
URL: https://openreview.net/forum?id=YJBjtfatfK
---
Title: Multi-Dimensional Knowledge Profiling with Large-Scale Literature Database and Hierarchical Retrieval
Abstract: The rapid expansion of research across machine learning, vision, and language has produced a volume of publications that is increasingly difficult to synthesize. Traditional bibliometric tools rely mainly on metadata and offer limited visibility into the semantic content of papers, making it hard to track how research themes evolve over time or how different areas influence one another. To obtain a clearer picture of recent developments, we compile a unified corpus of more than 100,000 papers from 22 major conferences between 2020 and 2025 and construct a multidimensional profiling pipeline to organize and analyze their textual content. By combining topic clustering, LLM-assisted parsing, and structured retrieval, we derive a comprehensive representation of research activity that supports the study of topic lifecycles, methodological transitions, dataset and model usage patterns, and institutional research directions. Our analysis highlights several notable shifts, including the growth of safety, multimodal reasoning, and agent-oriented studies, as well as the gradual stabilization of areas such as neural machine translation and graph-based methods. These findings provide an evidence-based view of how AI research is evolving and offer a resource for understanding broader trends and identifying emerging directions.
URL: https://openreview.net/forum?id=VtqV0zMoah
---
Title: On the Convergence Analysis of Muon
Abstract: The majority of parameters in neural networks are naturally represented as matrices. However, most commonly used optimizers treat these matrix parameters as flattened vectors during optimization, potentially overlooking their inherent structural properties. Recently, an optimizer called Muon has been proposed, specifically designed to optimize matrix-structured parameters. Extensive empirical evidence shows that Muon can significantly outperform traditional optimizers when training neural networks. Nonetheless, the theoretical understanding of Muon’s convergence behavior and the reasons behind its superior performance remain limited. In this work, we present a comprehensive convergence rate analysis of Muon and its comparison with Gradient Descent (GD). We characterize the conditions under which Muon can outperform GD. Our theoretical results reveal that Muon can benefit from the low-rank structure of Hessian matrices, a phenomenon widely observed in practical neural network training. Our experimental results support and corroborate the theoretical findings.
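The abstract does not restate Muon's update rule, but public descriptions of the optimizer replace the momentum-accumulated gradient matrix with an orthogonalized version, i.e. $UV^\top$ from its SVD (computed in practice with a Newton–Schulz iteration rather than an exact SVD). A minimal sketch using an exact SVD; the function names, learning rate, and momentum coefficient here are illustrative, not taken from the paper:

```python
import numpy as np

def orthogonalize(M):
    """Return U V^T from the reduced SVD of M: same singular directions,
    all singular values set to 1."""
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ Vt

def muon_step(W, G, momentum, lr=0.02, beta=0.95):
    """One Muon-style update (sketch): accumulate momentum, then take a
    step along the orthogonalized momentum matrix."""
    momentum = beta * momentum + G
    W = W - lr * orthogonalize(momentum)
    return W, momentum

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))
G = rng.normal(size=(4, 3))           # a (stand-in) gradient matrix
W, m = muon_step(W, G, np.zeros_like(W))

O = orthogonalize(G)
print(np.allclose(O.T @ O, np.eye(3)))  # True: columns are orthonormal
```

Treating the parameter as a matrix in this way is exactly what distinguishes Muon from optimizers that flatten matrices into vectors.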
URL: https://openreview.net/forum?id=4nH4CulGaP
---
Title: Wiring the ‘Why’: A Unified Taxonomy and Survey of Abductive Reasoning in LLMs
Abstract: Despite its foundational role in human discovery and sense-making, abductive reasoning—the inference of the most plausible explanation for an observation—has been relatively underexplored in Large Language Models (LLMs). Even amid the rapid advancement of LLMs, the exploration of abductive reasoning and its diverse facets has thus far been disjointed rather than cohesive. This paper presents the first survey of abductive reasoning in LLMs, tracing its trajectory from philosophical foundations to contemporary AI implementations. To address the widespread conceptual confusion and disjointed task definitions prevalent in the field, we establish a unified two-stage definition that formally categorizes prior work. This definition disentangles abduction into Hypothesis Generation, where models bridge epistemic gaps to produce candidate explanations, and Hypothesis Selection, where the generated candidates are evaluated and the most plausible explanation is chosen. Building upon this foundation, we present a comprehensive taxonomy of the literature, categorizing prior work based on abductive tasks, datasets, underlying methodologies, and evaluation strategies. To ground our framework empirically, we conduct a compact benchmark study of current LLMs on abductive tasks, together with targeted comparative analyses across task stage (generation vs. selection), model sizes and families, metric choices, and dataset/task characteristics such as domain, context length, and output structure. Moreover, by synthesizing recent empirical results, we examine how LLM performance on abductive reasoning relates to deductive and inductive tasks, providing insights into their broader reasoning capabilities. Our analysis reveals critical gaps in current approaches—from static benchmark design and narrow domain coverage to narrow training frameworks and limited mechanistic understanding of abductive processes.
Finally, we propose research directions spanning richer evaluation frameworks, reinforcement learning for explanatory virtues, multi-agentic architectures, and circuit-level interpretability to advance the field toward more rigorous abductive reasoning capabilities.
URL: https://openreview.net/forum?id=oeVkugH0WB
---
Title: A Survey on Hallucination in Video Understanding: Taxonomy, Causes, and Mitigation Techniques
Abstract: Video Large Language Models (Vid-LLMs) have recently achieved strong performance across a wide range of video understanding tasks, including question answering, captioning, and multimodal reasoning. However, these models frequently produce outputs that are not faithfully grounded in the underlying video content, a phenomenon commonly referred to as hallucination. Compared with hallucination in text-only or image-based models, hallucination in video understanding is further complicated by temporal dynamics, motion interpretation, long-context dependencies, and event-level reasoning. In this survey, we present a comprehensive review of hallucination in Vid-LLMs. We begin with a unified taxonomy, which categorizes different hallucination phenomena, and then organize existing mitigation strategies according to the failure mechanisms they address. We also discuss key open challenges and outline some promising research directions.
URL: https://openreview.net/forum?id=qbO71rVrIG
---
Title: Patch-based Memory Gate Model in Time Series Foundation Model
Abstract: Recently reconstruction-based deep models have been widely used for time series anomaly detection, but as their capacity and generalization capability increase, these models tend to over-generalize, often reconstructing unseen anomalies accurately. Prior works have attempted to mitigate this by incorporating a memory architecture that stores prototypes of normal patterns. Nevertheless, these approaches suffer from high training costs and have yet to be effectively integrated with time series foundation models (TSFMs).
To address these challenges, we propose MOMEMTO, an improved variant of TSFM for anomaly detection, enhanced with a patch-based memory module to mitigate over-generalization. The memory module is designed to capture representative normal patterns from multiple domains and enables a single model to be jointly fine-tuned across multiple datasets through a multi-domain training strategy. MOMEMTO initializes memory items with latent representations from a pre-trained encoder, organizes them into patch-level units, and updates them via an attention mechanism. We evaluate our method using 23 univariate benchmark datasets. Experimental results demonstrate that MOMEMTO, as a single model, achieves higher scores on AUC and VUS metrics compared to baseline methods, and further enhances the performance of its backbone TSFM, particularly in few-shot learning scenarios.
URL: https://openreview.net/forum?id=Fm2ddpR0aw
---
Title: From Uniform to Learned Knots: A Study of Spline-Based Numerical Encodings for Tabular Deep Learning
Abstract: Numerical preprocessing remains a critical component of tabular deep learning, where the representation of continuous features can strongly affect downstream performance. Although this is well understood for classical statistical and machine learning models, the extent to which explicit numerical preprocessing systematically benefits tabular deep learning remains less well understood. In this work, we study this question with a particular focus on spline-based numerical encodings. We investigate three spline families for encoding numerical features, namely B-splines, M-splines, and integrated splines (I-splines), under uniform, quantile-based, target-aware, and learnable-knot (gradient-based) placement. For the learnable-knot variants, we adopt a differentiable knot parameterization that enables stable end-to-end optimization of knot locations jointly with the backbone. We evaluate these numerical encodings on a diverse collection of public regression and classification datasets using MLP, ResNet, and FT-Transformer backbones, and compare them against common numerical preprocessing baselines. Our results show that the effect of numerical encodings depends strongly on the task, the output size of the encoding, and the backbone. For classification, piecewise-linear encoding (PLE) is the most robust choice overall, while spline-based encodings remain competitive. For regression, no single encoding dominates uniformly. Instead, performance depends on the spline family, knot-placement strategy, and the output size of the encoding, with larger gains typically observed for MLP and ResNet than for FT-Transformer. We further find that learnable-knot variants can be optimized stably under the proposed parameterization, but may substantially increase training cost, especially for M-spline and I-spline expansions. 
Overall, the results show that numerical encodings should be assessed not only in terms of predictive performance, but also in terms of computational overhead. An anonymized implementation is publicly available at https://anonymous.4open.science/r/tdl-numerical-encodings-881C/.
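Among the encodings compared, piecewise-linear encoding (PLE) is singled out as the most robust choice for classification. A minimal sketch of quantile-binned PLE in its commonly used formulation (our own code, not the authors' implementation): each scalar feature expands into one component per bin, saturating at 0 below the bin, at 1 above it, and interpolating linearly inside it.

```python
import numpy as np

def ple_encode(x, bins):
    """Piecewise-linear encoding: for bin edges b_0 < ... < b_T, component
    t is 0 below b_{t-1}, 1 above b_t, and linear in between."""
    x = np.asarray(x, dtype=float)[:, None]          # (n, 1)
    lo, hi = bins[:-1][None, :], bins[1:][None, :]   # (1, T) each
    return np.clip((x - lo) / (hi - lo), 0.0, 1.0)   # (n, T)

# knots from the quantiles of a (toy) training column
train_col = np.linspace(0.0, 10.0, 101)
bins = np.quantile(train_col, [0.0, 0.25, 0.5, 0.75, 1.0])  # [0, 2.5, 5, 7.5, 10]

enc = ple_encode(np.array([0.1, 2.5, 7.0]), bins)
print(enc.shape)  # (3, 4): 3 samples, 4 bins
# x = 7.0 encodes as [1, 1, 0.8, 0]: two full bins, one partial, one empty
```

The spline families studied in the paper (B-, M-, and I-splines) generalize this idea with smoother basis functions and, in the learnable-knot variants, trainable edge positions.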
URL: https://openreview.net/forum?id=str7wQt9Qc
---
Title: Deep Neural Nets in Low Dimensions with Sign Activations are Convex Lasso Models
Abstract: We consider neural networks with sign activations, depths ranging from 2 to an arbitrary but finite number of layers, and rectangular architectures (parallel structures with constant but arbitrary and finite width). We prove that training such neural networks with weight regularization on 1-D data is equivalent to solving convex Lasso problems with discrete, explicitly defined dictionary matrices. The Lasso dictionaries grow richer for 3-layer networks compared to 2-layer networks, but saturate thereafter. We show that a tree architecture overcomes this depth limitation, allowing the dictionary to expand with every layer. The Lasso model provides intuition and insight, including closed-form solution paths for 1-D data with binary, periodic labels and extensions to certain 2-D data. Numerical simulations support the theory.
URL: https://openreview.net/forum?id=weh3w6KPs6
---
Title: Automated Membership Inference Attacks: Discovering MIA Signal Computations using LLM Agents
Abstract: Membership inference attacks (MIAs), which enable adversaries to determine whether specific data points were part of a model's training dataset, have emerged as an important framework to understand, assess, and quantify the potential information leakage associated with machine learning systems. Designing effective MIAs is a challenging task that usually requires extensive manual exploration of model behaviors to identify potential vulnerabilities. In this paper, we introduce AutoMIA -- a novel framework that leverages large language model (LLM) agents to automate the design and implementation of new MIA signal computations. By utilizing LLM agents, we can systematically explore a vast space of potential attack strategies, enabling the discovery of novel ones. Our experiments demonstrate that AutoMIA can successfully discover new MIAs that are specifically tailored to the user-configured target model and dataset, resulting in improvements of up to 0.18 in absolute AUC over existing MIAs. This work provides the first demonstration that LLM agents can serve as an effective and scalable paradigm for designing and implementing MIAs with SOTA performance, opening up new avenues for future exploration.
URL: https://openreview.net/forum?id=N3VOIYIqo9
---
Title: RPP: A Certified Poisoned-Sample Detection Framework for Backdoor Attacks under Dataset Imbalance
Abstract: Deep neural networks are highly susceptible to backdoor attacks, yet most defense methods to date rely on balanced data, overlooking the pervasive class imbalance in real-world scenarios that can amplify backdoor threats. This paper presents the first in-depth investigation of how the dataset imbalance amplifies backdoor vulnerability, showing that (i) the imbalance induces a majority-class bias that increases susceptibility and (ii) conventional defenses degrade significantly as the imbalance grows. To address this, we propose Randomized Probability Perturbation (RPP), a certified poisoned-sample detection framework that operates in a black-box setting using only model output probabilities. For any inspected sample, RPP determines whether the input has been backdoor-manipulated, while offering provable within-domain detectability guarantees and a probabilistic upper bound on the false positive rate. Extensive experiments on five benchmarks (MNIST, SVHN, CIFAR-10, TinyImageNet and ImageNet10) covering 10 backdoor attacks and 12 baseline defenses show that RPP achieves significantly higher detection accuracy than state-of-the-art defenses, particularly under dataset imbalance. RPP establishes a theoretical and practical foundation for defending against backdoor attacks in real-world environments with imbalanced data.
URL: https://openreview.net/forum?id=WyPkotlgqQ
---
Title: Low-Rank Filtering & Smoothing for Sequential Deep Learning
Abstract: Learning multiple tasks sequentially requires neural networks to balance retaining knowledge with remaining flexible enough to adapt to new tasks. Regularizing network parameters is a common approach, but it rarely incorporates prior knowledge about task relationships, and limits information flow to future tasks only. We propose a Bayesian framework that treats the network's parameters as the state space of a nonlinear Gaussian model, unlocking two key capabilities: (1) A principled way to encode domain knowledge about task relationships, allowing, e.g., control over which layers should adapt between tasks. (2) A novel application of Bayesian smoothing, allowing task-specific models to also incorporate knowledge from models learned later. This does not require direct access to their data, which is crucial, e.g., for privacy-critical applications. These capabilities rely on efficient filtering and smoothing operations, for which we propose diagonal plus low-rank approximations of the precision matrix in the Laplace approximation (LR-LGF). Empirical results demonstrate the efficiency of LR-LGF and the benefits of the unlocked capabilities.
URL: https://openreview.net/forum?id=1TJXpLHLKG
---
Title: Enhancing Graph Representations with Neighborhood-Contextualized Message-Passing
Abstract: Graph neural networks (GNNs) have become an indispensable tool for analyzing relational data. Classical GNNs are broadly classified into three variants: convolutional, attentional, and message-passing. While the standard message-passing variant is expressive, its typical pair-wise messages only consider the features of the center node and each neighboring node individually. This design fails to incorporate contextual information contained within the broader local neighborhood, potentially hindering its ability to learn meaningful relationships within the entire set of neighboring nodes. To address this limitation, this work first formalizes the concept of neighborhood-contextualization, rooted in a key property of the attentional variant. This then serves as the foundation for generalizing the message-passing variant to the proposed neighborhood-contextualized message-passing (NCMP) framework. To demonstrate its utility, a simple, practical, and efficient method to parametrize and operationalize NCMP is presented, leading to the development of the proposed Soft-Isomorphic Neighborhood-Contextualized Graph Convolution Network (SINC-GCN). Across a diverse set of synthetic and benchmark datasets, SINC-GCN strikes an optimal balance between expressivity and efficiency. Notably, while more complex models incur significant computational overhead, SINC-GCN delivers substantial, statistically significant performance gains over baseline GNN models in graph property prediction tasks while maintaining its highly efficient asymptotic complexity, further underscoring the distinctive utility of neighborhood-contextualization. Overall, the paper lays the foundation for the NCMP framework as a practical path toward enhancing the graph representational power of classical GNNs.
URL: https://openreview.net/forum?id=6nxdSt5pSn
---
Title: From Decorative to Load-Bearing: Task Difficulty Shapes Chain-of-Thought Faithfulness
Abstract: Chain-of-thought (CoT) prompting is increasingly treated as a window into how language models reason, yet this only works if the CoT actually drives the answer. We propose \textbf{continuation-based causal testing}---perturb a reasoning step, truncate, and force the model to continue from the corrupted prefix---to directly measure whether CoT tokens constrain what comes next. Across Gemma-2-9B-IT and Llama-3.1-8B-Instruct on GSM8K, MMLU, and BIG-Bench Hard, we find that \textbf{CoT faithfulness tracks task difficulty}: error propagation rises from 8--22\% on easy MMLU subjects to 41--66\% on hard tasks, varying continuously across domains and replicating across both models. This gradient poses a problem for CoT monitoring---on easy tasks the model ignores its own reasoning, so there is nothing to monitor; on hard tasks it faithfully follows corrupted steps, so errors slip through. Linear probes over hidden states can tell these regimes apart, but activation steering along the same directions cannot flip one into the other, suggesting the signal is readable but not controllable.
URL: https://openreview.net/forum?id=TiZQnKDIHq
---
Title: Measuring Procedural Reuse Judgments with Response Curves
Abstract: Procedural knowledge such as recipes, protocols, and workflows is often adapted rather than executed directly. Understanding such reuse behavior requires analyzing how alternative procedures are comparatively evaluated for their applicability. We introduce a measurement framework for studying reuse judgments, defined as comparative decisions about which of two candidate procedures is preferred for reuse with respect to a reference procedure under a given context. We construct controlled differences between candidate procedures along predefined structural axes and aggregate pairwise decisions into response curves, which characterize how alignment probabilities vary with the magnitude of differences along each axis. We instantiate this framework in the domain of cooking recipes and use a large language model as a controlled decision-maker to generate comparative judgments at scale. The resulting response curves are stable under sampling variation, vary across structural axes and reference procedures, form structured but overlapping patterns, and shift systematically under contextual constraints. These results indicate that, under this controlled evaluation setup, reuse judgments cannot be fully captured by a single similarity score, but instead exhibit structured variation in alignment probability over procedural differences. More broadly, this work proposes a measurement-oriented perspective on procedural reuse, enabling direct observation of how reuse judgments vary under controlled conditions.
URL: https://openreview.net/forum?id=ZLciojSpoX
---
Title: On the Fundamental Limitations of Dual Static CVaR Decompositions in Markov Decision Processes
Abstract: It was recently shown that dynamic programming (DP) methods for finding static CVaR-optimal policies in Markov Decision Processes (MDPs) can fail when based on the dual formulation, yet the root cause of this failure remains unclear. We expand on these findings by shifting focus from policy optimization to the seemingly simpler task of policy evaluation. We show that evaluating the static CVaR of a given policy can be framed as two distinct minimization problems. We introduce a set of ``risk-assignment consistency constraints'' that must be satisfied for their solutions to match and we demonstrate that an empty intersection of these constraints is the source of previously observed evaluation errors. Quantifying the evaluation error as the CVaR evaluation gap, we demonstrate that the issues observed when optimizing over the dual-based CVaR DP are explained by the returned policy having a non-zero CVaR evaluation gap. Finally, we leverage our proposed risk-assignment constraints perspective to prove that the search for a single, uniformly optimal policy on the dual CVaR decomposition is fundamentally limited, identifying an MDP where no single policy can be optimal across all initial risk levels.
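For readers unfamiliar with the object being decomposed: static CVaR is usually defined through the Rockafellar–Uryasev variational form, written here for a cost random variable $Z$ at risk level $\alpha \in (0, 1]$ (the paper's sign and tail conventions may differ):

```latex
\mathrm{CVaR}_{\alpha}(Z) \;=\; \min_{w \in \mathbb{R}} \Big\{\, w + \tfrac{1}{\alpha}\, \mathbb{E}\big[(Z - w)_{+}\big] \Big\},
\qquad (z)_{+} := \max(z, 0).
```

Dual-based DP methods distribute the risk level across successor states, and the abstract's risk-assignment consistency constraints characterize when such a decomposition can actually reproduce this static quantity.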
URL: https://openreview.net/forum?id=Fn7c8eEquz
---
Title: Beyond Imitation: A Framework and Benchmark for LLM-Assisted Peer Review
Abstract: The rapid growth of scientific publishing has strained peer review, particularly in machine learning, raising concerns about declining review quality and increasing reviewer workload. Large language models (LLMs) have been proposed as automated review assistants, yet their evaluation has focused largely on imitating human-written reviews rather than supporting the core functions of peer review. Here, we introduce a verification-centric perspective on LLM-assisted peer review, emphasizing error detection as a critical and resource-intensive task. We present a scalable benchmark that evaluates review systems' ability to identify logical contradictions, constructed through synthetic insertion of errors into conference papers—yielding unambiguous evaluation targets and enabling systematic comparison. We further propose a Multi-Layered Review (MLR) framework that prioritizes detailed manuscript comprehension before review generation, aligning more closely with human reviewing practices while improving token efficiency. Across evaluations, our approach demonstrates strong alignment with human review scores, achieves high error detection performance, and provides complementary perspectives on reviewer focus. At the same time, we corroborate persistent vulnerabilities to adversarial manipulation, underscoring the need for robustness in automated review systems. Our findings highlight the importance of rigorous, error-focused evaluation to guide responsible deployment of LLM-based tools in peer review and other critical scientific workflows.
URL: https://openreview.net/forum?id=7iX2Z2bPFB
---
Title: Depth Over Diversity: Same-Expert Iteration Outperforms Expert Communication in a Translation MoE
Abstract: Mixture-of-Experts (MoE) models achieve scalability through sparse expert routing, but experts process tokens independently. A natural hypothesis is that enabling expert communication---through learned topologies, message passing, or sequential chains---should improve performance. We investigate this hypothesis by testing ten communication approaches across seven experimental axes on WMT14 En-De translation. We find no clear evidence that any communication variant improves over standard MoE, though we note that sample sizes (up to five seeds for the main result, three for most communication variants) limit power for detecting small effects. Several variants degrade validation perplexity. We hypothesize that parallel experts lack information asymmetry: they all receive the same input, so inter-expert messages carry information that the routing layer already integrates through soft-ensembling. We then propose an alternative: rather than combining diverse expert outputs, we apply the \emph{same} expert twice sequentially. This modification---\textbf{expert depth}---achieves 7.4\% lower validation perplexity, an effect consistent across five seeds and growing over training. A capacity-matched ablation suggests the improvement stems from iterative refinement rather than increased capacity, as doubling expert width provides no benefit. On this benchmark, our findings suggest that the value of sparse expert routing may lie in selecting the right expert and giving it deeper processing, rather than in combining multiple expert opinions.
URL: https://openreview.net/forum?id=hPD4MjMfoN
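The "expert depth" idea above — route to one expert, then iterate that same expert rather than mixing several — can be sketched in a few lines. This is a toy numpy illustration under assumed shapes (two-layer ReLU experts, top-1 routing), not the paper's Transformer MoE:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts = 8, 4

# Toy experts: each a small two-layer ReLU MLP (hypothetical stand-ins).
experts = [
    (rng.standard_normal((d, d)) / np.sqrt(d),
     rng.standard_normal((d, d)) / np.sqrt(d))
    for _ in range(n_experts)
]
router = rng.standard_normal((d, n_experts)) / np.sqrt(d)

def expert_forward(e, x):
    w1, w2 = e
    return np.maximum(x @ w1, 0.0) @ w2

def moe_depth_forward(x, depth=2):
    """Route to the top-1 expert, then apply that *same* expert `depth` times."""
    idx = int(np.argmax(x @ router))
    h = x
    for _ in range(depth):
        h = expert_forward(experts[idx], h)
    return h

x = rng.standard_normal(d)
y1 = moe_depth_forward(x, depth=1)  # standard top-1 MoE
y2 = moe_depth_forward(x, depth=2)  # "expert depth": same expert iterated twice
```

With `depth=1` this reduces to ordinary top-1 routing; `depth=2` reuses the selected expert's parameters for a second refinement pass, adding compute but no parameters.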
---
Title: PoTRE: Scaling Test-Time Reasoning via Cognitive Heterogeneity
Abstract: While Large Language Models (LLMs) excel at many tasks, they frequently struggle with complex reasoning that requires long-horizon planning and iterative error correction. Furthermore, standard single-stream prompting proves brittle when models encounter novel abstractions or rigorous domain constraints. We introduce PoTRE (Poly-Topological Reasoning Ensembles), a heterogeneous framework that decouples inference into four orthogonal agents: (1) Adversarial Refinement Agent, (2) Hierarchical Strategic Planning Agent, (3) Spectrum Search Agent, and (4) Direct Chain Agent. A final Task-Adaptive Aggregation Layer dynamically reconciles these perspectives—via instinct-driven selection, semantic synthesis, or neuro-symbolic verification—to produce a robust global solution. We evaluate PoTRE on three frontier benchmarks: ARC-AGI-2, Humanity’s Last Exam (HLE), and PRBench Finance. PoTRE achieves state-of-the-art accuracy of 49.92% on HLE, surpassing the previous best official score. We demonstrate that architectural heterogeneity is an effective scaling path.
URL: https://openreview.net/forum?id=wApf83NZmh
---
Title: ImpMIA: Leveraging Implicit Bias for Membership Inference Attack
Abstract: Determining which data samples were used to train a model, known as Membership Inference Attack (MIA), is a well-studied and important problem with implications on data privacy. SotA methods (which are black-box attacks) rely on training many auxiliary reference models to imitate the behavior of the attacked model. As such, they rely on assumptions which rarely hold in real-world settings: (i) the attacker knows the training hyperparameters; (ii) all available non-training samples come from the same distribution as the training data; and (iii) the fraction of training data in the evaluation set is known. We show that removing these assumptions significantly harms the performance of black-box attacks. We introduce ImpMIA, a Membership Inference Attack that exploits the Implicit Bias of neural networks. Building on the maximum-margin implicit bias theory, ImpMIA uses the Karush–Kuhn–Tucker (KKT) optimality conditions to identify training samples -- those whose gradients most strongly reconstruct the trained model’s parameters. Our approach is optimization-based and requires no training of reference models, thus removing the need for any knowledge or assumptions regarding the attacked model’s training procedure. While ImpMIA is a white-box attack (a setting which assumes access to model weights), this is becoming increasingly realistic given that many models are publicly available (e.g., via Hugging Face). ImpMIA achieves SotA performance compared to both black-box and white-box attacks in settings where only the model weights are known and a superset of the training data is available.
URL: https://openreview.net/forum?id=34bnVED6EZ
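The core KKT intuition — training samples are those whose gradients best reconstruct the trained parameters as a nonnegative combination — can be illustrated on synthetic data. This is a simplified sketch (plain least squares instead of the paper's full KKT-constrained optimization, with synthetic stand-in gradients):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 100  # n candidate samples, p model parameters

# Hypothetical per-sample gradients at the trained parameters (rows). In the
# attack these come from the white-box model; here they are synthetic.
G = rng.standard_normal((n, p))
members = np.arange(10)  # the first 10 samples are the "training" samples

# Construct parameters as a positive combination of member gradients,
# mimicking the stationarity (KKT) condition of max-margin training.
theta = G[members].T @ rng.uniform(0.5, 1.5, size=10)

# Recover coefficients lam with theta ~= G.T @ lam; large |lam_i| flags
# likely members. (The paper enforces nonnegativity and KKT structure.)
lam, *_ = np.linalg.lstsq(G.T, theta, rcond=None)
scores = np.abs(lam)
top = np.argsort(scores)[::-1][:10]
recovered = len(set(top.tolist()) & set(members.tolist()))
```

Because the parameters lie exactly in the span of the member gradients here, the least-squares coefficients concentrate on the members; real models only satisfy this approximately, which is why the paper's constrained formulation matters.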
---
Title: AA-SVD: Anchored and Adaptive SVD for Large Language Model Compression
Abstract: We introduce a fast low-rank factorization-based framework for compressing large language models that enables rapid compression of billion-parameter models without retraining. Unlike existing factorization-based approaches that optimize only on the original inputs, ignoring distribution shifts from upstream compression and thus propagating errors forward, or those that rely only on shifted inputs and risk drifting away from the original outputs, our approach accounts for both. Beyond individual layer compression, we further refine each transformer block end-to-end, minimizing block-level output distortion and allowing compressed layers to jointly compensate for accumulated errors. By anchoring each compressed layer to the original outputs while explicitly modeling input distribution shifts, our method finds a low-rank approximation that maintains functional equivalence with the original model. Experiments on large language models show that our method consistently outperforms existing SVD-based baselines across compression ratios, with the advantage becoming increasingly pronounced at aggressive compression budgets, where competing methods degrade substantially or collapse entirely, offering a practical solution for efficient, large-scale model deployment.
URL: https://openreview.net/forum?id=STfiVfnJtX
---
Title: ARC-Encoder: learning compressed text representations for large language models
Abstract: Recent techniques such as retrieval-augmented generation or chain-of-thought reasoning have led to longer contexts and increased inference costs. Context compression techniques can reduce these costs, but the most effective approaches require fine-tuning the target model or even modifying its architecture. This can degrade its general abilities when not used for this specific purpose. Here we explore an alternative approach: an encoder that compresses the context into continuous representations which replace token embeddings in decoder LLMs. First, we perform a study of training strategies and architecture choices for the encoder. Our findings led to the design of an Adaptable text Representations Compressor, named ARC-Encoder, which outputs $x$-times fewer continuous representations (typically $x \in \{4,8\}$) than text tokens. We evaluate ARC-Encoder across a variety of LLM usage scenarios, ranging from in-context learning to context window extension, on both instruct and base decoders. Results show that ARC-Encoder achieves strong performance on several benchmarks and tasks while improving computational efficiency at inference. Finally, we demonstrate that our models can be adapted to multiple decoders simultaneously, allowing a single encoder to generalize across different decoder LLMs. This makes ARC-Encoder a flexible and efficient solution for portable encoders that can support multiple LLMs with only small MLPs.
URL: https://openreview.net/forum?id=lU1P9dsqfn
---
Title: Personalized Content Restriction for Large Language Models
Abstract: Large Language Models (LLMs) have achieved remarkable success across diverse applications, yet enforcing user-specific and personalized content restrictions remains challenging due to their vast generation space. Existing alignment methods such as supervised fine-tuning (SFT) are often impractical for rapidly changing or highly customized needs that vary across users, applications, and deployment scenarios. In this work, we study the practical problem of personalized content restriction for already-deployed LLMs without any model modification. We propose Suffix Optimization (SOP), a lightweight plug-and-play method that appends a short optimized suffix to any prompt, effectively suppressing a user-specified set of restricted terms while preserving output quality and semantic relevance. To enable systematic evaluation of such personalized safety approaches, we introduce CoReBench, a benchmark comprising 400 prompts designed to elicit 80 restricted terms across 8 categories. Extensive experiments on open-source models (Gemma2-2B, Mistral-7B, Llama-3-8B, and Llama-3.1-8B) and the POE online platform demonstrate that SOP consistently outperforms strong system-prompt baselines, highlighting its strong generalization and real-world practicality.
URL: https://openreview.net/forum?id=ISudz0Jh17
---
Title: Harmonizing Gradient Matching For Fairness
Abstract: Ensuring fairness across demographic groups is critical for machine learning systems deployed in high-stakes applications. Most existing approaches enforce fairness by directly minimizing disparities in predefined fairness metrics between groups, focusing primarily on the final model outcome. However, differences in distributions across groups can lead to heterogeneous optimization signals during training, resulting in imbalanced parameter updates and unstable fairness–performance trade-offs. In this work, we propose Fair Gradient Matching (FairGM), a fairness-aware optimization framework that harmonizes group-conditioned optimization signals. Instead of focusing solely on fairness metric disparities, FairGM aligns gradient signals of the fairness objective across groups at multiple levels of moments, including the zeroth-moment fairness metric itself, the first-moment mean gradients, and the second-moment gradient variances. These regularizations encourage similar optimization behavior across groups and lead to more stable fairness outcomes. To balance predictive performance and fairness objectives, we further formulate training as a multi-objective optimization problem and solve it using a Pareto-based optimization scheme. The resulting framework is compatible with a range of differentiable fairness metrics and gradient-based classifiers, supported by theoretical analysis connecting gradient alignment for fairness. Experiments on synthetic and real-world datasets demonstrate that FairGM achieves favorable fairness–accuracy trade-offs compared with existing fairness-aware learning methods, demonstrating the effectiveness and scalability of our approach.
URL: https://openreview.net/forum?id=HMlK36YWWt
---
Title: Fast Sharpness-escaping Optimization for Long-tailed Learning
Abstract: Deep neural networks often suffer from poor generalization in long-tailed settings. From a loss landscape perspective, this degradation is largely attributed to the tendency of the optimization process to converge into sharp, unstable minima for underrepresented data. We investigate the recently proposed Muon optimizer, providing theoretical evidence that its gradient orthogonalization amplifies updates along directions of negative curvature to facilitate escaping these sharp minima. While effective, the Muon optimizer imposes heavy computational overhead in long-tailed scenarios. To reconcile efficiency with optimization quality in long-tailed learning, we propose Fast Sharpness-escaping Optimization (FaSO). FaSO employs a compositional probabilistic schedule that couples lightweight exploration with increasingly frequent computationally intensive orthogonalized updates. This design targets the escape from sharp minima precisely when it is critical for tail-class generalization. Extensive experiments demonstrate that FaSO outperforms existing methods without incurring substantial computational costs, effectively securing flatter minima for tail classes.
URL: https://openreview.net/forum?id=I19SJW09zn
---
Title: A Player Selection Network for Scalable Game-Theoretic Prediction and Planning
Abstract: While game-theoretic planning frameworks are effective at modeling multi-agent interactions, they require solving large optimization problems where the number of variables increases with the number of agents, resulting in long computation times that limit their use in large-scale, real-time systems. To address this issue, we propose i) PSN Game—a learning-based, game-theoretic prediction and planning framework that reduces game size by learning a Player Selection Network (PSN); and ii) a Goal Inference Network (GIN) that makes it possible to use the PSN in incomplete-information games where other agents’ intentions are unknown to the ego agent. A PSN outputs a player selection mask that distinguishes influential players from less relevant ones, enabling the ego player to solve a smaller, masked game involving only selected players. By reducing the number of players included in the game, PSN shrinks the corresponding optimization problems, leading to faster solve times. The PSN Game framework is more flexible than existing player selection methods as it i) relies solely on observations of players’ past trajectories, without requiring full state, action, or other game-specific information; and ii) requires no online parameter tuning. Experiments in both simulated scenarios and real-world pedestrian trajectory datasets show that PSN is competitive with, and often improves upon, the evaluated explicit game-theoretic selection baselines in i) prediction accuracy and ii) planning safety. Across scenarios, PSN typically selects substantially fewer players than are present in the full game, thereby reducing game size and planning complexity. PSN also generalizes to settings in which agents’ objectives are unknown, via the GIN, without test-time fine-tuning. By selecting only the most relevant players for decision-making, PSN Game provides a practical mechanism for reducing planning complexity that can be integrated into existing multi-agent planning frameworks.
URL: https://openreview.net/forum?id=YvvB78ILSP
---
Title: Causal Scene Narration with Runtime Safety Supervision for Vision-Language-Action Driving
Abstract: Vision-Language-Action (VLA) models for autonomous driving must integrate diverse textual inputs, including navigation commands, hazard warnings, and traffic state descriptions, yet current systems often present these as disconnected fragments, forcing the model to discover on its own which environmental constraints are relevant to the current maneuver. We introduce Causal Scene Narration (CSN), which restructures VLA text inputs through intent-constraint alignment, quantitative grounding, and structured separation, at inference time with zero GPU cost. We complement CSN with Simplex-based runtime safety supervision and training-time alignment via Plackett-Luce DPO with negative log-likelihood (NLL) regularization. A multi-town closed-loop CARLA evaluation shows that CSN improves Driving Score by +31.1\% on original LMDrive and +24.5\% on the preference-aligned variant. A controlled ablation reveals that causal structure accounts for 39.1\% of this gain, with the remainder attributable to information content alone. A perception noise ablation confirms that CSN's benefit is robust to realistic sensing errors. Semantic safety supervision improves Infraction Score, while reactive Time-To-Collision monitoring degrades performance, demonstrating that intent-aware monitoring is needed for VLA systems.
URL: https://openreview.net/forum?id=sJATIdx6jG
---
Title: Graph Coloring via Learning and Metric-Guided Independent Set Extraction
Abstract: Recent advances in Graph Neural Networks (GNNs) have enabled learning-based approaches for a wide range of combinatorial optimization problems. In this paper, we address one such classical and challenging problem—graph coloring. We introduce an algorithmic framework for graph coloring based on repeated extraction of large independent sets using GNNs. The selection of independent sets for coloring is further guided by metrics that favor using the minimum number of colors. To improve scalability, we further enhance this framework with a value-aware GNN that operates on reduced graphs, learning effective independent set extraction directly on compact representations before lifting solutions to the original graph. We also present a new formulation for quantum circuit depth optimization as a mixed graph coloring problem. The presented approach is evaluated on standard DIMACS benchmark graphs and citation network datasets. Experimental results show that the proposed algorithm achieves significant performance improvements over several traditional and state-of-the-art solvers and scales effectively to larger graphs. The results for benchmark quantum circuit instances are also encouraging.
URL: https://openreview.net/forum?id=0uvGZZZNxn
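The underlying loop — color a graph by repeatedly extracting an independent set and assigning it one color — is classical and can be sketched in plain Python. This greedy version replaces the paper's GNN-guided, metric-driven extraction with a simple degree heuristic:

```python
def greedy_mis(adj, nodes):
    """Greedy maximal independent set within `nodes` (adj: node -> neighbor set)."""
    chosen, blocked = set(), set()
    for v in sorted(nodes, key=lambda v: len(adj[v] & nodes)):  # low degree first
        if v not in blocked:
            chosen.add(v)
            blocked |= adj[v] | {v}
    return chosen

def color_by_mis(adj):
    """Proper coloring: each extracted independent set gets one fresh color."""
    remaining = set(adj)
    colors, c = {}, 0
    while remaining:
        mis = greedy_mis(adj, remaining)
        for v in mis:
            colors[v] = c
        remaining -= mis
        c += 1
    return colors

# 5-cycle: chromatic number 3.
adj = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {0, 3}}
coloring = color_by_mis(adj)
```

Each extracted set is independent, so vertices sharing a color never share an edge; the quality of the coloring (number of colors used) depends entirely on which independent sets are chosen, which is exactly where the learned, metric-guided selection enters.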
---
Title: Arithmetic OOD Failure Unfolds in Stages in Minimal GPTs
Abstract: Arithmetic benchmarks are often reduced to a single held-out score, but that score can conflate qualitatively different failures. We study a controlled minimal GPT trained on exhaustive 2-digit addition, where all local digit transitions are already present in training, and ask why 3-digit generalization still fails. The failure is staged. First, there is a layout barrier: a learned absolute-position model collapses under a pure 3-digit layout shift, and mixed-layout exposure is the only intervention that materially weakens this barrier. Second, after layout repair, the hundreds position behaves like a carry flag rather than a semantic hundreds digit; targeted carry probes reverse the relevant logit margin, whereas a matched extra-data control does not. Third, after carry repair, the main remaining bottleneck is conditional recomposition: high-conditioned tail data outperforms a matched control, high-only data, and tail-only data on all true-3-digit suites, and the same ordering reappears in a larger 2-layer bridge experiment. The residual errors after recomposition are then overwhelmingly tens-only, and a separate 10-seed late-stage study shows that a sign-aware tens repair raises exact match on the hardest thousands-carry suite from 0.664 to 0.822. We therefore provide an experimentally testable decomposition of arithmetic OOD failure into layout, carry-semantics, recomposition, and late tens-residual stages.
URL: https://openreview.net/forum?id=ZDRlkYHW0L
---
Title: iLOCO: Distribution-Free Inference for Feature Interactions
Abstract: Feature importance measures are widely studied and are essential for understanding model behavior, guiding feature selection, and enhancing interpretability. However, many machine learning fitted models involve complex interactions between features. Existing feature importance metrics fail to capture these pairwise or higher-order effects, while existing interaction metrics often suffer from limited applicability or excessive computation; no methods exist to conduct statistical inference for feature interactions. To bridge this gap, we first propose a new model-agnostic metric, interaction Leave-One-Covariate-Out (iLOCO), for measuring the importance of pairwise feature interactions, with extensions to higher-order interactions. Next, we leverage recent advances in LOCO inference to develop distribution-free and assumption-light confidence intervals for our iLOCO metric. To address computational challenges, we also introduce an ensemble learning method for calculating the iLOCO metric and confidence intervals that we show is both computationally and statistically efficient. We validate our iLOCO metric and our confidence intervals on both synthetic and real data sets, showing that our approach outperforms existing methods and provides the first inferential approach to detecting feature interactions.
URL: https://openreview.net/forum?id=pUGr1uA99f
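A LOCO-style interaction contrast can be sketched as follows: compare the error increase from removing features j and k together against the sum of their individual removals. This is one plausible form of the metric (the paper's exact estimator, splitting, and inference procedure differ), on a toy regression with a genuine x0*x1 interaction:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
X = rng.standard_normal((n, 3))
y = X[:, 0] * X[:, 1] + 0.1 * X[:, 2] + 0.05 * rng.standard_normal(n)

def fit_error(cols):
    """In-sample MSE of a least-squares fit using `cols` (plus the product
    feature when both interacting columns are available)."""
    if not cols:
        return float(np.mean((y - y.mean()) ** 2))
    feats = [X[:, j] for j in cols]
    if 0 in cols and 1 in cols:
        feats.append(X[:, 0] * X[:, 1])  # model can express the interaction
    A = np.column_stack(feats + [np.ones(n)])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((y - A @ beta) ** 2))

def iloco(j, k, all_cols=(0, 1, 2)):
    """LOCO(j) + LOCO(k) - LOCO({j,k}): zero when errors are additive,
    large when j and k only matter jointly."""
    s = set(all_cols)
    full = fit_error(sorted(s))
    return (fit_error(sorted(s - {j})) + fit_error(sorted(s - {k}))
            - full - fit_error(sorted(s - {j, k})))
```

Here `iloco(0, 1)` is large because neither feature predicts y alone, while `iloco(0, 2)` is near zero since x2 contributes additively.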
---
Title: Bayes with No Shame: Admissibility Geometries of Predictive Inference
Abstract: Four distinct admissibility geometries govern sequential and distribution-free inference: Blackwell risk dominance over convex risk sets, anytime-valid admissibility within the nonnegative supermartingale cone, marginal coverage validity over exchangeable prediction sets, and Cesàro approachability (CAA) admissibility, which reaches the risk-set boundary via approachability-style arguments rather than explicit priors. We prove a criterion separation theorem: the four classes of admissible procedures are pairwise non-nested. Each geometry carries a different certificate of optimality: a supporting-hyperplane prior (Blackwell), a nonnegative supermartingale (anytime-valid), an exchangeability rank (coverage), or a Cesàro steering argument (CAA). Martingale coherence is necessary for Blackwell admissibility and necessary and sufficient for anytime-valid admissibility within e-processes, but is not sufficient for Blackwell admissibility and is not necessary for coverage validity or CAA-admissibility. All four criteria can be viewed through a common schematic template (minimize Bayesian risk subject to a feasibility constraint), but the decision spaces, partial orders, and performance metrics differ by criterion, making them geometrically incompatible. Admissibility is irreducibly criterion-relative.
URL: https://openreview.net/forum?id=nZVEXyfvyM
---
Title: FastAvatar: Rapid 3D Gaussian Splatting Face Avatar Generation from a Single Image
Abstract: This paper presents a method to infer a 3D face avatar model from a single arbitrarily posed image, using the 3D Gaussian Splatting (3DGS) framework. Inference of a full 3DGS face model from one image is a highly ill-posed problem, requiring the estimation of hundreds of thousands, often well over a million, per-Gaussian appearance and structural parameters. To address this challenge, we draw inspiration from the classical morphable face models literature, in which individual identities are well-described as compact deformations (residuals) with respect to a canonical template face model, thereby easing the learning task. We propose leveraging such a template-plus-residuals strategy, but in the unstructured 3DGS parameter space. Rather than predicting absolute 3DGS parameters from scratch given an input face image, our proposed algorithm, FastAvatar, learns to map a face image to residual parameter values with respect to a canonical 3DGS template learned over prior multi-view face data. We couple the feed-forward prediction with a rapid inference-time latent refinement to maximize appearance fidelity to the observed image. Our evaluations on the Nersemble benchmark demonstrate that FastAvatar can generate 3DGS face models ($\sim$600K parameters) in approximately 3 seconds, with state-of-the-art reconstruction accuracy (24.01 dB PSNR and 0.91 SSIM) compared to existing feed-forward, optimization, and diffusion baselines. Our work demonstrates that residual learning offers a tractable and high-fidelity approach to image synthesis in the popular 3DGS framework.
URL: https://openreview.net/forum?id=4WnbCj0v0K
---
Title: MLLM4TS: Leveraging Vision and Multimodal Language Models for General Time-Series Analysis
Abstract: Effective analysis of time series data presents significant challenges due to the complex temporal dependencies and cross-channel interactions in multivariate data. Inspired by the way human analysts visually inspect time series to uncover hidden patterns, we ask: can incorporating visual representations enhance automated time-series analysis? Recent advances in multimodal large language models have demonstrated impressive generalization and visual understanding capability, yet their application to time series remains constrained by the modality gap between continuous numerical data and discrete natural language. To bridge this gap, we introduce MLLM4TS, a novel framework that leverages multimodal large language models for general time-series analysis by integrating a dedicated vision branch. Each time-series channel is rendered as a horizontally stacked color-coded line plot in one composite image to capture spatial dependencies across channels, and a temporal-aware visual patch alignment strategy then aligns visual patches with their corresponding time segments. MLLM4TS fuses fine-grained temporal details from the numerical data with global contextual information derived from the visual representation, providing a unified foundation for multimodal time-series analysis. Extensive experiments on standard benchmarks show that MLLM4TS consistently outperforms its unimodal counterpart across both predictive (e.g., classification) and generative (e.g., anomaly detection and forecasting) tasks, ranking among the top time-series backbones. These results highlight the effectiveness of introducing visual modalities and pretrained models for robust and generalizable time-series analysis.
URL: https://openreview.net/forum?id=bhd6naKDoL
---
Title: CDG-MAE: Cross-view Masked Modeling using Diffusion Generated Views
Abstract: Cross-view masked autoencoding has emerged as a powerful pretext task for learning dense correspondences, which are essential for applications such as video label propagation. The cross-view pretext task is modeled with a masked autoencoder, where a masked target view is reconstructed from an anchor view. However, acquiring effective training data remains a challenge - collecting diverse video datasets is costly, while simple image crops lack the necessary pose variations, underperforming video-based methods. This paper introduces CDG-MAE, a novel MAE-based self-supervised method that uses diverse synthetic views generated from static images via an image-conditioned diffusion model. We present a quantitative method to evaluate the local and global consistency of the generated views to choose the right diffusion model for cross-view self-supervised pretraining. These generated views exhibit substantial changes in pose and perspective, providing a rich training signal that overcomes the limitations of video and crop-based anchors. Furthermore, we enhance the standard single-anchor MAE setting to a multi-anchor masking strategy to increase the difficulty of the pretext task. CDG-MAE substantially narrows the gap to video-based MAE methods, while maintaining the data advantages of image-only MAEs.
URL: https://openreview.net/forum?id=7XIymKIA0v
---
Title: Can Test-time Computation Mitigate Reproduction Bias in Neural Symbolic Regression?
Abstract: Mathematical expressions play a central role in scientific discovery. Symbolic regression aims to automatically discover such expressions from given numerical data. Recently, neural symbolic regression (NSR) methods that involve Transformers pre-trained on synthetic datasets have gained attention for their fast inference, but they often perform poorly, especially with many input variables. In this study, we analyze NSR from both theoretical and empirical perspectives and show that (1) ordinary token-by-token generation is ill-suited for NSR, as Transformers cannot compositionally generate tokens while validating numerical consistency, and (2) the search space of NSR methods is greatly restricted due to reproduction bias, where the majority of generated expressions are merely copied from the training data. We further examine whether tailored test-time strategies can reduce reproduction bias and show that providing additional information at test time effectively mitigates it. These findings contribute to a deeper understanding of the limitations of NSR approaches and provide guidance for designing more robust and generalizable methods.
URL: https://openreview.net/forum?id=FlavYtUdf1
---
Title: Conformal Data Contamination Tests for In-distribution Data Acquisition
Abstract: The amount of quality data in many machine learning tasks is limited to what is available locally to data owners. The set of quality data can be expanded through trading or sharing with external data agents. However, external data may be contaminated or introduce undesirable sample diversity which can degrade performance of personalized machine learning tasks, as in diagnosis of a rare disease or recommendation systems. Therefore, data buyers need quality guarantees prior to data acquisition. Previous works primarily rely on distributional assumptions about data from different agents, relegating quality checks to post-hoc steps involving costly data valuation procedures. We propose a distribution-free, contamination-aware data-sharing framework that, by inspecting only a small volume of data, identifies external data agents whose data is most valuable for model personalization. To achieve this, we introduce novel two-sample testing procedures, preceding full data acquisition, grounded in rigorous theoretical foundations for conformal outlier detection, to determine whether an agent’s data exceeds a contamination threshold. The proposed tests, termed conformal data contamination tests, remain valid under arbitrary contamination levels while enabling false discovery rate control via the Benjamini-Hochberg procedure. Empirical evaluations across diverse collaborative learning scenarios demonstrate the robustness and effectiveness of our approach. Overall, the conformal data contamination test distinguishes itself as a generic procedure for aggregating data with statistically rigorous quality guarantees.
URL: https://openreview.net/forum?id=7HNbXnhtfR
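The two main ingredients — conformal p-values computed against a clean local calibration set, and Benjamini-Hochberg FDR control over the resulting discoveries — can be sketched on toy 1-D data. This illustrates generic conformal outlier detection, not the paper's agent-level contamination test with its threshold guarantees:

```python
import numpy as np

rng = np.random.default_rng(3)

# Nonconformity score: distance to the mean of the buyer's clean data (toy choice).
local = rng.normal(0.0, 1.0, size=200)
center = local.mean()
calib = np.abs(local - center)

# External agent's batch: 100 in-distribution samples plus 25 contaminated ones.
agent = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(6.0, 1.0, 25)])
scores = np.abs(agent - center)

# Conformal p-value: rank of each external score among calibration scores.
pvals = (1 + (calib[None, :] >= scores[:, None]).sum(axis=1)) / (len(calib) + 1)

def benjamini_hochberg(p, alpha=0.1):
    """Boolean rejection mask controlling the false discovery rate at alpha."""
    m = len(p)
    order = np.argsort(p)
    thresh = alpha * np.arange(1, m + 1) / m
    below = np.asarray(p)[order] <= thresh
    k = below.nonzero()[0].max() + 1 if below.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject

flagged = benjamini_hochberg(pvals, alpha=0.1)
```

The conformal p-values are valid without any distributional assumption beyond exchangeability with the calibration data, which is what makes the distribution-free guarantee possible.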
---
Title: Deep Generative Spatiotemporal Engression for Probabilistic Forecasting of Epidemics
Abstract: Accurate and reliable forecasting of epidemic incidences is critical for public health preparedness, yet it remains a challenging task due to complex nonlinear temporal dependencies and heterogeneous spatial interactions. Often, point forecasts generated by spatiotemporal models are unreliable in assigning uncertainty to future epidemic events. Probabilistic forecasting of epidemics is therefore crucial for providing the best or worst-case scenarios rather than a simple, often inaccurate, point estimate. We present deep spatiotemporal engression methods to generate accurate and reliable probabilistic forecasts on low-frequency epidemic datasets. The proposed methods act as distributional lenses, and out-of-sample probabilistic forecasts are generated by sampling from the trained models. Our frameworks encapsulate lightweight deep generative architectures, wherein uncertainty is quantified endogenously, driven by a pre-additive noise component during model construction. We establish geometric ergodicity and asymptotic stationarity of the spatiotemporal engression processes under mild assumptions on the network weights and pre-additive noise process. Comprehensive evaluations across six epidemiological datasets over three forecast horizons demonstrate that the proposal consistently outperforms several temporal and spatiotemporal benchmarks in both point and probabilistic forecasting. Additionally, we explore the explainability of the proposal to enhance the models' practical application for informed, timely public health interventions.
URL: https://openreview.net/forum?id=7AfAztCd5A
---
Title: Gradient-Based Multi-Objective Deep Learning: Algorithms, Theories, Applications, and Beyond
Abstract: Many modern deep learning applications require balancing multiple objectives that are often conflicting. Examples include multi-task learning, fairness-aware learning, and the alignment of Large Language Models (LLMs). This leads to multi-objective deep learning, which tries to find optimal trade-offs or Pareto-optimal solutions by adapting mathematical principles from the field of Multi-Objective Optimization (MOO). However, directly applying gradient-based MOO techniques to deep neural networks presents unique challenges, including high computational costs, optimization instability, and the difficulty of effectively incorporating user preferences. This paper provides a comprehensive survey of gradient-based techniques for multi-objective deep learning, with a primary focus on supervised learning settings. We systematically categorize existing algorithms based on their outputs: (i) methods that find a single, well-balanced solution, (ii) methods that generate a finite set of diverse Pareto-optimal solutions, and (iii) methods that learn a continuous Pareto set of solutions. In addition to this taxonomy, the survey covers theoretical analyses, key applications, practical resources, and highlights open challenges and promising directions for future research.
URL: https://openreview.net/forum?id=eCUcXXH3PS
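A canonical example of the "single well-balanced solution" family surveyed here is the multiple-gradient descent algorithm (MGDA), which steps along the min-norm element of the convex hull of the objective gradients. For two objectives this has a closed form; a minimal sketch:

```python
import numpy as np

def mgda_two(g1, g2):
    """Min-norm point of the segment between two gradients (2-objective MGDA).
    Its negative is a common descent direction: it has nonnegative inner
    product with both g1 and g2."""
    diff = g1 - g2
    denom = float(diff @ diff)
    alpha = 0.5 if denom == 0 else float(np.clip((g2 - g1) @ g2 / denom, 0.0, 1.0))
    return alpha * g1 + (1 - alpha) * g2

g1 = np.array([1.0, 0.0])
g2 = np.array([0.0, 1.0])
d = mgda_two(g1, g2)  # balanced direction between the two objectives
```

With more objectives the min-norm problem becomes a small quadratic program over the simplex; the scaling challenges the survey discusses arise because the gradients themselves are full network-sized vectors.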
---
Title: The Delta Rule Dominates: A Factorial Analysis of Decay in Linear Attention
Abstract: In a controlled factorial study of linear attention decay mechanisms, we find that the delta rule explains more variation than gating granularity, conditioning, or their interaction. We establish this by systematically evaluating all four quadrants of the {scalar, channel-wise} × {data-dependent, data-independent} decay space, crossed with the delta rule, yielding 8 controlled variants tested on two datasets (TinyStories, WikiText-103) with 3 random seeds each at 18M, and on TinyStories with 3 seeds at 125M and 5 variants at 42M. All delta variants (ranks 1–4) beat all non-delta variants (ranks 5–8) at 18M (both datasets) and 125M (TinyStories), and the gap is consistently larger than the within-group spread (6× at 125M). A granularity × delta interaction provides the second key finding: channel-wise decay hurts without the delta rule but helps with it, suggesting that the delta rule is especially beneficial when multiple decay timescales are present. Within the top-4 delta variants, rankings are scale-dependent: data-independent StaticChannelDelta leads at 18M, while data-dependent KDA overtakes it at 125M as the top-3 gap compresses to 0.009 nats. A synthetic recall probe reveals that gating granularity also creates a task-dependent tradeoff: scalar+delta solves exact retrieval while channel-wise+delta does not, indicating that channel-wise decay trades retrieval precision for representational richness.
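For readers unfamiliar with the mechanism, the delta rule replaces uniform decay with an error-correcting state update; the NumPy sketch below contrasts the two updates in their generic textbook form (the paper's trained variants are not reproduced here):

```python
import numpy as np

def scalar_decay_step(S, k, v, gamma=0.95):
    # Gated linear attention: uniformly decay the state, then write the new KV outer product.
    return gamma * S + np.outer(v, k)

def delta_rule_step(S, k, v, beta=0.5):
    # Delta rule: subtract the state's current prediction for key k before writing,
    # so the memory is corrected rather than uniformly decayed.
    pred = S @ k                          # what the state currently retrieves for k
    return S + beta * np.outer(v - pred, k)

rng = np.random.default_rng(0)
d = 8
k = rng.standard_normal(d); k /= np.linalg.norm(k)   # unit-norm key
v = rng.standard_normal(d)

# Repeatedly writing the same (k, v) pair: the delta rule converges to exact recall.
S = np.zeros((d, d))
for _ in range(50):
    S = delta_rule_step(S, k, v)
assert np.allclose(S @ k, v, atol=1e-6)

# The decayed sum instead accumulates a rescaled copy of v, not exact recall.
Sg = np.zeros((d, d))
for _ in range(50):
    Sg = scalar_decay_step(Sg, k, v)
assert not np.allclose(Sg @ k, v)
```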
URL: https://openreview.net/forum?id=FdXB4u27LD
---
Title: Learning to Prompt for Generalizable Instance Segmentation via Bi-Level Optimization
Abstract: The Segment Anything Model has revolutionized image segmentation with its zero-shot capabilities, yet its reliance on manual prompts hinders fully automated deployment. While integrating object detectors as prompt generators offers a pathway to automation, existing pipelines suffer from two fundamental limitations: objective mismatch, where detectors optimized for geometric localization do not correspond to the optimal prompting context required by SAM, and alignment overfitting in standard joint training, where the detector simply memorizes specific prompt adjustments for training samples rather than learning a generalizable policy. To bridge this gap, we introduce BLO-Inst, a unified framework that aligns detection and segmentation objectives by bi-level optimization. We formulate the alignment as a nested optimization problem over disjoint data splits. In the lower level, the SAM is fine-tuned to maximize segmentation fidelity given the current detection proposals on a subset ($D_1$). In the upper level, the detector is updated to generate bounding boxes that explicitly minimize the validation loss of the fine-tuned SAM on a separate subset ($D_2$). This effectively transforms the detector into a segmentation-aware prompt generator, optimizing the bounding boxes not just for localization accuracy, but for downstream mask quality. Extensive experiments demonstrate that BLO-Inst achieves superior performance, outperforming standard baselines on tasks in general and biomedical domains.
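The nested optimization structure described above can be illustrated on a toy scalar problem; the code below is a generic bilevel sketch in which a hypothetical offset parameter stands in for the detector's prompt adjustment, not the BLO-Inst implementation:

```python
import numpy as np

# Toy bilevel problem mirroring the nested structure: the lower level fits an
# inner parameter w on split D1 given an outer parameter theta; the upper level
# tunes theta to minimize the fitted model's loss on a disjoint split D2.

def lower_level(theta, D1, steps=100, lr=0.1):
    w = 0.0
    x, y = D1
    for _ in range(steps):                # inner gradient descent on D1
        grad = np.mean(2 * (w * (x + theta) - y) * (x + theta))
        w -= lr * grad
    return w

def upper_loss(theta, D1, D2):
    w = lower_level(theta, D1)            # solve the lower level first
    x, y = D2
    return np.mean((w * (x + theta) - y) ** 2)

rng = np.random.default_rng(1)
x = rng.standard_normal(50)
y = 3.0 * (x + 0.7)                       # data generated with a true offset of 0.7
D1, D2 = (x[:25], y[:25]), (x[25:], y[25:])

theta = 0.0
for _ in range(200):                      # upper level: finite-difference descent
    h = 1e-4
    g = (upper_loss(theta + h, D1, D2) - upper_loss(theta - h, D1, D2)) / (2 * h)
    theta -= 0.05 * g
# The upper level recovers the offset that makes the fitted lower-level model
# generalize to the held-out split.
assert abs(theta - 0.7) < 1e-2
```

Evaluating the upper objective on a split disjoint from the lower-level data is what discourages the "alignment overfitting" failure mode the abstract describes.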
URL: https://openreview.net/forum?id=zN1yKIIVxN
---
Title: Learning to Understand Videos From Encoded Bytes
Abstract: We present an approach to understand video from encoded bytes, e.g., mp4s. These compressed videos are 99% smaller than the RGB pixel representations currently used for video understanding. Encoded videos are able to compress the pixels by taking advantage of the redundant information across frames, using special encodings such as key frames and motion residuals. However, standard video understanding models do not take advantage of this significant compression already available for each video, and instead either heavily subsample the frames or only work on short segments of the video. Here, we present an approach to understanding video from encoded bytes directly. We note that simply applying existing models, e.g., Transformers or State-Space models, to video byte sequences does not work, both due to difficulty in handling very long video byte sequences and easy overfitting. To address these challenges, we design a State-Space model with sequence parallelism to handle very long byte sequences, reaching 15 million tokens in training, and essentially unlimited tokens in inference. We also propose a multilevel SSM activation fusion that reduces sequence length, which we find also benefits video understanding. We evaluate on six video understanding benchmarks including long, high-fps and video + audio understanding tasks and demonstrate competitive performance, illustrating, for the first time, the feasibility of learning from compressed video byte representations.
URL: https://openreview.net/forum?id=LXCs9GFkst
---
Title: RAWDet-7: A Multi-Scenario Benchmark for Object Detection and Description on Quantized RAW Images
Abstract: Most vision models operate on 8-bit standard RGB (sRGB) images produced by dedicated image sensor processing pipelines designed for human perception rather than machine reasoning. In contrast, RAW images preserve sensor measurements, dynamic range, and fine-grained scene structure that can be critical for downstream understanding. Yet, progress in RAW-domain vision remains limited by the lack of large-scale, high-quality benchmarks. Thus, we introduce RawDet-7, a multi-scenario benchmark for object detection and object description on quantized RAW images, comprising ~25k training and 7.6k test images consolidated from four datasets across diverse cameras, sensors, bit-depth precision, lighting conditions, and environments. RawDet-7 provides dense, standardized annotations for seven object categories, correcting missing labels, hallucinations, and inconsistencies in prior datasets, especially for small and partially-occluded instances. Beyond detection, we introduce an object-description track with detailed object-level descriptions derived from high-resolution sRGB references, enabling the study of how well different RAW processing pipelines preserve fine-grained semantic, spatial, and contextual information. Finally, RawDet-7 supports controlled benchmarking under simulated 4-bit, 6-bit, and 8-bit quantization.
Across standard detectors and a large grounding model, we show that suitable RAW-aware input mappings make low-bit RAW on par with, and in some settings superior to, sRGB pipelines, while preserving substantially richer object-level descriptions than naive quantization. The dataset and code will be released upon acceptance.
URL: https://openreview.net/forum?id=UHTJrsYieo
---
Title: The GRPO Tax is Smaller Than You Think: A Longitudinal Study of Capability Preservation During Reasoning Training
Abstract: Reinforcement learning methods such as Group Relative Policy Optimization (GRPO) have emerged as a dominant paradigm for training language models to reason, following the success of DeepSeek-R1. A persistent concern is the ``alignment tax'': the degradation of general-purpose capabilities that accompanies post-training optimization for a specific objective. We conduct the first dense-checkpoint longitudinal study of capability evolution during GRPO training across five instruction-tuned models from four families (Qwen, Phi, Gemma, and Llama) at the 1.5B to 3.8B parameter scale. Each model is evaluated at 75 training checkpoints across 13 benchmarks. Our findings reveal that the GRPO tax is substantially smaller than commonly assumed: 85\% of non-target capabilities remain within $\pm$2\% of their baseline values, 8\% improve, and only 7\% degrade. We further observe directional evidence of a cross-family divergence in safety behavior: Qwen and Phi models maintain stable safety refusal rates, while Gemma and Llama models each show one additional refusal failure out of 30 safety prompts ($-$3.3\%), a pattern warranting further investigation with larger evaluation sets. A comparative analysis with Direct Preference Optimization (DPO) reveals that GRPO produces larger variance in capability shifts than DPO, which acts as a more conservative optimization method. These results challenge the prevailing narrative of catastrophic capability loss and provide practitioners with concrete evidence that single-epoch LoRA-based GRPO training largely preserves the instruction-tuned model's general capabilities.
URL: https://openreview.net/forum?id=e0UVcimXdK
---
Title: Toward Unified Source-Free Domain Adaptation via Latent Causal Factors Discovery
Abstract: In the pursuit of transferring a source model to a target domain without access to the source training data, Source-Free Domain Adaptation (SFDA) has been extensively explored across various scenarios, including Closed-set, Open-set, Partial-set, Open-partial-set, and Generalized settings. Existing methods, focusing on specific scenarios, not only address a limited subset of challenges but also necessitate prior knowledge of the target domain, significantly limiting their practical utility and deployability. In light of these considerations, we introduce a more practical yet challenging problem, termed unified SFDA, which comprehensively incorporates all specific scenarios in a unified manner. In this paper, we propose a novel approach, latent Causal factors discovery for unified SFDA (CausalDA). In contrast to previous alternatives that emphasize learning the statistical description of reality, we formulate CausalDA from a causality perspective. The objective is to uncover potential causality between latent variables and model decisions, enhancing the reliability and robustness of the learned model against domain shifts. To integrate extensive world knowledge, we leverage a pre-trained vision-language model such as CLIP. This aids in the formation and discovery of latent causal factors, without supervision, under variation in distribution and semantics, coupled with a newly designed information bottleneck with theoretical guarantees. Extensive experiments demonstrate that CausalDA can achieve new state-of-the-art results in distinct SFDA settings, as well as source-free out-of-distribution generalization.
URL: https://openreview.net/forum?id=bc9ayd478S
---
Title: Feedback-Driven Black-Box Safety Alignment Testing of Large Language Models via Reinforcement Learning
Abstract: Large language models (LLMs) are equipped with safety alignment mechanisms to reduce harmful outputs, yet systematically evaluating the effectiveness of these safeguards remains challenging.
Existing methods mainly rely on manually curated prompts or stochastic mutation-based search, which provide limited exploration efficiency.
We propose SEAT-RL, a feedback-driven black-box framework that uses deep reinforcement learning (DRL) to generate adversarial prompts against safety-aligned LLMs.
We formulate prompt generation as a sequential decision-making problem, where an agent iteratively refines prompts based on target model feedback.
To improve effectiveness and efficiency, we design (1) an LLM-facilitated action space that enables diverse yet constrained prompt transformations, and (2) a dense, automated reward function to guide exploration toward safety violations.
The learned policy is reusable and transfers across target models without retraining.
Experiments on six representative LLMs show that SEAT-RL discovers substantially more safety failures under the same query budget than existing automated baselines, such as stochastic search methods based on genetic algorithms.
SEAT-RL also exhibits stronger stability, cross-model transferability, and robustness against multiple defense mechanisms.
Ablation studies further validate the key design choices. These results suggest that RL provides an effective framework for black-box red-teaming evaluation of LLM safety alignment.
URL: https://openreview.net/forum?id=GWslY31w2b
---
Title: DeepSpike: Foundation Model-based Pipeline for Large-Scale Spike Sorting of Neural Activity
Abstract: Spike sorting of high-resolution neural recordings is essential for understanding brain activity, but it remains challenging when multiple units are recorded, due to overlapping spike timing, low signal-to-noise ratios and overlapping clusters. Here, we introduce DeepSpike, a self-supervised deep learning model that automates spike sorting and overcomes key limitations of conventional spike sorting methods. Pretrained on large-scale unlabeled spiking events as a reusable self-supervised encoder, it generalizes to new recordings without retraining. DeepSpike uses a self-supervised autoencoder to learn robust low-dimensional spike embeddings that facilitate accurate clustering and effective noise filtering. The model is trained on a new, large-scale dataset consisting of $255M$ spiking events (SpikeVault-255M) derived from real in vivo recordings of about $4560$ minutes' total duration. The dataset contains $15M$ ground truth spikes that are manually verified by an expert user. DeepSpike outperformed state-of-the-art spike sorting algorithms in both accuracy and robustness in our experiments on SpikeVault-255M and two public benchmark datasets. Our results demonstrate that large-scale, self-supervised pre-training yields a powerful and generalizable encoder for automated spike sorting. The SpikeVault-255M dataset and the pre-trained DeepSpike model are made publicly available to facilitate further research and development.
URL: https://openreview.net/forum?id=CJuguscfUy
---
Title: Fair and Private Approximate Kernel Ridge Regression
Abstract: Even though Kernel Ridge Regression (KRR) is a well-studied nonparametric problem, its applicability is hindered by its time and space complexity. One of the most common methods for circumventing this issue is the Nystr\"{o}m approximation method, which approximates the full kernel matrix by a low rank matrix. The implementation of approximate KRR for real-world systems, such as medical data, further raises concerns regarding protection of privacy of the patients, as well as ensuring patients are treated fairly, irrespective of their background. In this paper, we study the problem of private and fair KRR that is scalable. To the best of our knowledge, this is the first work that considers the privacy and fairness aspects of approximate KRR. Building upon well-known techniques in Nystr\"{o}m approximation and differential privacy, we propose a technique for computing the Nystr\"{o}m approximation in such a way that each demographic group is represented in the basis. We also ensure that the landmarks do not reveal any information regarding the private dataset.
Further, the coefficients of the KRR problem are learned in a privacy preserving manner. We compare different variations of this framework empirically on a set of real-world datasets.
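As a generic illustration of the Nyström construction with group-stratified landmarks (the paper's differentially private landmark mechanism and fairness constraints are not reproduced here), one might sketch:

```python
import numpy as np

def rbf(X, Y, gamma=1.0):
    # Standard RBF kernel between two point sets.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_krr_fit(X, y, groups, m_per_group=10, lam=1e-2, seed=0):
    # Sample landmarks from every demographic group so each is represented
    # in the Nystrom basis (the stratification idea; privacy is not modeled).
    rng = np.random.default_rng(seed)
    idx = np.concatenate([
        rng.choice(np.flatnonzero(groups == g), m_per_group, replace=False)
        for g in np.unique(groups)
    ])
    L = X[idx]                                   # landmark points
    Knm = rbf(X, L)                              # n x m cross-kernel
    Kmm = rbf(L, L)                              # m x m landmark kernel
    # Nystrom-approximate ridge system: (Knm^T Knm + lam Kmm) alpha = Knm^T y.
    A = Knm.T @ Knm + lam * Kmm
    alpha = np.linalg.solve(A + 1e-8 * np.eye(len(idx)), Knm.T @ y)
    return L, alpha

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))
groups = rng.integers(0, 2, size=200)            # two synthetic demographic groups
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(200)
L, alpha = nystrom_krr_fit(X, y, groups)
pred = rbf(X, L) @ alpha                         # predictions via the landmark basis
```

The m × m landmark system is what replaces the n × n full kernel solve, which is where the scalability gain comes from.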
URL: https://openreview.net/forum?id=bBECP53Wzn
---
Title: LiteHall: A Three-Stage, Modular and Lightweight Pipeline for End-to-End Hallucination Detection
Abstract: Large Language Models (LLMs) are increasingly applied in high-stakes domains such as
medicine and law, where hallucinations can have serious consequences. Existing detection
approaches either depend on costly proprietary LLMs with limited adaptability, or on
monolithic open-source models that require full retraining, struggle with long evidence
contexts, and lack transparency. We introduce LiteHall, a lightweight, fully open-source,
three-stage hallucination detection pipeline designed for modularity, domain adaptability, and interpretability. Each stage leverages a 1.7B-parameter Small Language Model (SLM) trained independently with stage-specific Reinforcement Learning with Verifiable Rewards (RLVR) over a high-quality synthetic corpus of 120K+ examples, enabling efficient specialization without reliance on large monolithic models. To advance rigorous evaluation, we present HaFin500, a fine-grained benchmark of 500 long-form QA pairs spanning 30 fact-seeking domains, annotated with 6K claims and 3.5M evidence tokens. Extensive experiments show that LiteHall consistently surpasses both open-source and proprietary detectors. On out-of-domain benchmarks, LiteHall achieves substantial gains over strong baselines, including +6.4% / +10.0% (Accuracy/F1) against MiniCheck-7B, +6.1% / +4.8% over SAFE (GPT-3.5-turbo), +11.5% / +13.0% over AlignScore, and +9.8% / +15.2% over FAVA. Even compared to GPT-4o, LiteHall delivers +4.7% / +3.0% improvements in zero-shot mode, while retaining an additional +2.0% / +0.9% advantage when GPT-4o is integrated as a backbone. These results demonstrate that LiteHall not only matches or exceeds in-domain performance but also generalizes robustly out-of-domain, establishing it as a practical,
transparent, and reproducible solution for trustworthy LLM deployments.
URL: https://openreview.net/forum?id=TRR4EUnKmD
---
Title: Towards Scalable and Robust Filtration Learning for Point Clouds via Principal Persistence Measure
Abstract: Topological features in persistent homology extracted via a filtration process have been shown to enhance the performance of machine learning tasks on point clouds. The performance is highly related to the choice of filtration, thereby underscoring the critical significance of filtration learning. However, the current supervised filtration learning method for point clouds cannot scale well. We identify that this shortcoming stems from the utilization of Persistence Diagrams (PD) for encoding topological features such as connected components, rings, or voids.
To address this issue, we propose to use Principal Persistence Measure (PPM), an existing statistical approximation of PD, as an alternative representation and adapt existing network for PPM-based filtration learning. Experimental results on point cloud classification tasks demonstrate that our PPM-based framework achieves comparable accuracy to the PD-based approach while offering better scalability and robustness against outliers.
URL: https://openreview.net/forum?id=8NkZxJFQ3I
---
Title: Cross-Fitted Clipped Covariance Estimation with a Data-Driven Tail-Energy Criterion
Abstract: Heavy-tailed data make covariance estimation sensitive to the clipping level: stronger clipping reduces variance but increases bias. We study this tuning problem for the centered clipped covariance estimator in the known-mean setting. We propose QTES, a data-driven rule that selects the clipping level by combining a cross-fitted variance certificate with a held-out estimate of the tail energy removed by clipping. The key observation is that, for Euclidean clipping, the operator-norm bias can be upper bounded by a scalar tail-energy quantity. Under a finite $L_4$ moment condition, together with a mild feasibility condition for the QTES block construction, we prove a uniform finite-sample calibration bound over a positive clipping grid and a finite-sample guarantee for the estimator selected by QTES, relative to the best candidate on that grid under the same bias-variance criterion. This result is a guarantee for data-driven calibration within the clipped family; it does not establish the sharp direct effective-rank rates known for other robust covariance estimators under stronger assumptions. Experiments on clean heavy-tailed and contaminated benchmarks show that QTES improves over uninformed clipping choices and is often close to an oracle-tuned Wei-Minsker benchmark serving as a best-case oracle baseline within the centered clipped family.
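The centered clipped covariance family tuned by QTES is easy to sketch in its generic form; the code below shows Euclidean clipping and the scalar tail-energy quantity (the cross-fitting and selection rule themselves are not reproduced, and a zero known mean is assumed):

```python
import numpy as np

def clipped_covariance(X, tau):
    # Euclidean clipping: psi_tau(x) = x * min(1, tau / ||x||), known mean zero.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    scale = np.minimum(1.0, tau / np.maximum(norms, 1e-12))
    Xc = X * scale
    return Xc.T @ Xc / len(X)

def tail_energy(X, tau):
    # Scalar tail energy removed by clipping: average of max(||x||^2 - tau^2, 0).
    # This is the quantity that upper-bounds the operator-norm clipping bias.
    norms = np.linalg.norm(X, axis=1)
    return np.mean(np.maximum(norms ** 2 - tau ** 2, 0.0))

rng = np.random.default_rng(0)
# Heavy-tailed scale mixture: Gaussian directions with Pareto-distributed radii.
X = rng.standard_normal((2000, 5)) * rng.pareto(4.0, size=(2000, 1))

S_inf = clipped_covariance(X, tau=np.inf)    # no clipping = sample covariance
S_tau = clipped_covariance(X, tau=5.0)
# Clipping can only shrink the estimate in operator norm (S_inf - S_tau is PSD).
assert np.linalg.norm(S_tau, 2) <= np.linalg.norm(S_inf, 2) + 1e-9
```

A useful sanity check: the trace removed by clipping equals the tail energy exactly, since trace(S_inf) - trace(S_tau) averages ||x||² - min(||x||, τ)².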
URL: https://openreview.net/forum?id=MyNXLdRFJ3
---
Title: Refining Heuristic-Based Bitcoin Address Clustering with Graph Neural Networks
Abstract: Bitcoin’s pseudonymous nature makes it challenging to analyze user-level activity, since a single user may control multiple identifiers (addresses). Existing heuristic-based methods attempt to identify addresses belonging to the same user, but they often produce flat cluster assignments with limited modularity and are prone to errors such as merging different users together. In this work, we propose a method for refining heuristic-obtained clusters by grounding our clustering on contrastive embeddings yielded by graph neural networks. Our contribution is threefold: (i) we release a publicly available dataset of Bitcoin transaction graphs containing a substantial number of clusters; (ii) we propose a methodology for learning address embeddings consistent with heuristics, and back it up with solid theoretical foundations and empirical results; (iii) through hierarchical clustering, we allow a finer analysis of heuristic clusters and provide a quantitative criterion for flagging suspicious merges.
URL: https://openreview.net/forum?id=IDlyp9nz2I
---
Title: When Answers Change: Testing Sensitivity of Language Models as Decision Makers
Abstract: Large language models are typically evaluated on static benchmarks before deployment. Based on their accuracy across a variety of tasks, models are often claimed to exhibit incremental improvements in intelligence. However, this evaluation paradigm poses deployment risks given the growing use of language models in critical domains such as medicine, law, and scientific analysis. In this study, we identify gaps in current evaluation practices and conduct a systematic analysis across four models and seven domains using three types of perturbations: (1) invisible, (2) syntactic, and (3) occlusion; they vary the presentation of information to the model. Invisible perturbations introduce changes imperceptible to humans, syntactic perturbations modify the surface form while preserving semantics (meaning), and occlusion perturbations alter the available information by masking important tokens.
Reliable assistive decision-making involves two key properties: decision invariance and confidence calibration. Decision invariance requires that model predictions remain stable when the same questions are asked of the model with minor presentation changes. Confidence calibration measures whether a model appropriately expresses uncertainty when it is unsure about its predictions.
Across seven domains, the four models displayed substantial instability in our experiments. The invisible and syntactic perturbations, to which the models should have been invariant, produced flip rates of 2.9% to 18.3%. However, the models' accuracy does not reflect this instability, since the correct-to-incorrect and incorrect-to-correct transitions are nearly equal. With occlusion perturbation, which hides key information from the model, the models remained highly confident (between 80% and 90%) even when they changed their answers. These results demonstrate that benchmark correctness alone does not capture the reliability of language models and suggest that stability and calibrated uncertainty should be treated as primary evaluation criteria for deployment in decision-support settings.
URL: https://openreview.net/forum?id=0AeW5ZZ1Ua
---
Title: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
Abstract: Reinforcement finetuning (RFT) has shown great potential for enhancing the mathematical reasoning capabilities of large language models (LLMs), but it is often sample- and compute-inefficient, requiring extensive training. In this work, we introduce AdaRFT (Adaptive Curriculum Reinforcement Finetuning), a method that significantly improves both the efficiency and final accuracy of RFT through adaptive curriculum learning. AdaRFT dynamically adjusts the difficulty of training problems based on the model’s recent reward signals, ensuring that the model consistently trains on tasks that are challenging but solvable. This adaptive sampling strategy accelerates learning by maintaining an optimal difficulty range, avoiding wasted computation on problems that are too easy or too hard. AdaRFT requires only a lightweight extension to standard RFT algorithms like Proximal Policy Optimization (PPO), without modifying the reward function or model architecture. Experiments on competition-level math datasets—including AMC, AIME, and IMO-style problems—demonstrate that AdaRFT significantly improves both training efficiency and reasoning performance. We evaluate AdaRFT across multiple data distributions and model sizes, showing that it reduces training time by up to 2× and improves accuracy by a considerable margin, offering a more scalable and effective RFT framework.
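A minimal difficulty controller in this spirit might look as follows; the class, step size, and target reward are illustrative assumptions, not AdaRFT's actual sampling rule:

```python
import random

# Toy reward-driven curriculum controller: the target difficulty rises when
# recent rewards exceed a target success rate and falls when they are below it,
# keeping training near the "challenging but solvable" band.

class CurriculumSampler:
    def __init__(self, n_levels, target_reward=0.5, step=0.2):
        self.level = 0.0                  # continuous difficulty target
        self.n_levels = n_levels
        self.target = target_reward
        self.step = step

    def update(self, recent_reward):
        # Raise difficulty when the policy succeeds more often than the target rate.
        self.level += self.step * (recent_reward - self.target)
        self.level = min(max(self.level, 0.0), self.n_levels - 1)

    def sample_level(self, rng):
        # Sample a problem difficulty near the current target.
        return min(max(int(round(self.level + rng.gauss(0, 0.5))), 0),
                   self.n_levels - 1)

sampler = CurriculumSampler(n_levels=10)
rng = random.Random(0)
for _ in range(200):
    sampler.update(recent_reward=0.9)     # consistently high reward: difficulty climbs
assert sampler.level == 9.0               # saturates at the hardest level
```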
URL: https://openreview.net/forum?id=UEhpyq41b9
---
Title: Feedback-Driven Vision-Language Alignment via Sampling-based Visual Projection
Abstract: Vision-language models (VLMs) combine image understanding and language generation, yet they frequently produce descriptions inconsistent with the visual input, leading to hallucinated objects and reduced reliability. We propose Sampling-based Visual Projection (SVP), a training framework that improves vision-language alignment by using a pretrained grounding model as feedback during data generation rather than as supervision. For each unlabeled seed image, the base model generates draft descriptions, a grounding model evaluates those drafts to return spatial feedback, and the VLM generates refined descriptions conditioned on that feedback. SVP selects the most informative candidates and fine-tunes the base model using only these natural-language outputs, effectively distilling spatial reasoning capabilities without injecting explicit grounding tokens or coordinates. Across ten benchmarks, SVP yields broad performance gains, including a 14% average improvement in captioning and a 12% increase in object recall, significantly reducing hallucinations while preserving robust question-answering capabilities.
URL: https://openreview.net/forum?id=vt0bX7QmyX
---
Title: On Convergence of the Alternating Directions Stochastic Gradient Hamiltonian Monte Carlo (SGHMC) Algorithms
Abstract: We study convergence rates of practical Hamiltonian Monte Carlo (HMC) style algorithms where the Hamiltonian motion is approximated with leapfrog integration and where gradients of the log target density are accessed via a stochastic gradient (SG) oracle.
Importantly, our analysis extends to allowing the use of general auxiliary distributions via a novel HMC procedure of alternating directions (AD).
The convergence analysis is based on the investigation of the Dirichlet forms associated with the underlying Markov chain driving the algorithms. For this purpose, we provide a detailed analysis of the error of the leapfrog integrator for Hamiltonian motions when both the kinetic and potential energy functions are in general form. We characterize the explicit dependence of the convergence rates on key parameters such as the problem dimension, functional properties of the target and auxiliary distributions, and the quality of the SG oracle. Our analysis also identifies a crucial derivative condition on the log density of the auxiliary distribution, and we show that Gaussians (auxiliaries for standard HMC) as well as common choices of general auxiliaries for ADHMC satisfy this condition.
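The leapfrog integrator analyzed here is standard; a minimal sketch on a one-dimensional Gaussian target (with exact rather than stochastic gradients) illustrates the near-conservation of the Hamiltonian that the error analysis quantifies:

```python
import numpy as np

# Standard leapfrog integrator for a separable Hamiltonian H(q, p) = U(q) + K(p).
# This sketch uses the Gaussian auxiliary of plain HMC and exact gradients;
# the paper's general auxiliaries and SG oracle are not modeled.

def leapfrog(q, p, grad_U, grad_K, eps, n_steps):
    p = p - 0.5 * eps * grad_U(q)         # initial half step in momentum
    for _ in range(n_steps - 1):
        q = q + eps * grad_K(p)           # full step in position
        p = p - eps * grad_U(q)           # full step in momentum
    q = q + eps * grad_K(p)
    p = p - 0.5 * eps * grad_U(q)         # final half step in momentum
    return q, p

# Standard Gaussian target and kinetic energy: H = q^2/2 + p^2/2.
grad_U = lambda q: q
grad_K = lambda p: p

q, p = 1.0, 0.0
H0 = 0.5 * q ** 2 + 0.5 * p ** 2
q, p = leapfrog(q, p, grad_U, grad_K, eps=0.1, n_steps=100)
H1 = 0.5 * q ** 2 + 0.5 * p ** 2
# Leapfrog nearly conserves the Hamiltonian (error O(eps^2)), which is what
# keeps Metropolis acceptance rates high in HMC-style algorithms.
assert abs(H1 - H0) < 1e-2
```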
URL: https://openreview.net/forum?id=TiETXsziE0
---
Title: Benchmarking Tabular Foundation Models for Conditional Density Estimation in Regression
Abstract: Conditional density estimation (CDE) - recovering the full conditional distribution of a response given tabular covariates - is essential in settings with heteroscedasticity, multimodality, or asymmetric uncertainty. Recent tabular foundation models, such as TabPFN and TabICL, naturally produce predictive distributions, but their effectiveness as general-purpose CDE methods has not been systematically evaluated, unlike their performance for point prediction, which is well studied. We benchmark three tabular foundation model variants against a diverse set of parametric, tree-based, and neural CDE baselines on 39 real-world datasets, across training sizes from 50 to 20,000, using six metrics covering density accuracy, calibration, and computation time. Across all sample sizes, foundation models achieve the best CDE loss, log-likelihood, and CRPS on the large majority of datasets tested. Calibration is competitive at small sample sizes but, for some metrics and datasets, lags behind task-specific neural baselines at larger sample sizes, suggesting that post-hoc recalibration may be a valuable complement. In a photometric redshift case study using SDSS DR18, TabPFN exposed to 50,000 training galaxies outperforms all baselines trained on the full 500,000-galaxy dataset. Taken together, these results establish tabular foundation models as strong off-the-shelf conditional density estimators.
URL: https://openreview.net/forum?id=KWsWHpp5Do
---
Title: ROSS-PCA: Robust Sparse PCA via Fully Differentiable Penalties
Abstract: Sparse PCA finds low-dimensional structure that loads on few features. Existing methods
couple learning and feature selection by applying non-differentiable penalties that force
retain-or-zero decisions during optimization. This eliminates features before the optimizer
has established which ones matter, limits scalability to serial coordinate-update solvers, and
fails when components share support. We introduce ROSS-PCA, which decouples learning
and feature selection. It learns by optimizing a fully differentiable objective combining
robust reconstruction, a smooth sparsity penalty, and an orthogonality term. It selects
features post-hoc via cosine-preserving pruning. ROSS-PCA is GPU-accelerated, scaling
sparse PCA to currently inaccessible large datasets. All loss terms are calibrated to be
independent of data dimensionality and scale, giving penalty weights consistent meaning
across datasets. On synthetic benchmarks, ROSS-PCA matches or outperforms established
methods on support recovery with particular advantage under severe outlier contamination.
The smooth formulation of ROSS-PCA enables stability analysis via random initializations
since no feature is eliminated during training. We find that the method’s feature selection is
stable when the generating basis is theoretically identifiable and shows increased uncertainty
precisely when it is not. On the Human Lung Cell Atlas (584,944 × 27,402), gene selection
is 97% consistent across independent random initializations, and consensus gene modules
identified from reproducibility across runs are enriched for known biological pathways.
URL: https://openreview.net/forum?id=CZwpmO1ird
---
Title: Weight-Space Teleportation: Discovering LLM Reasoning via Bilevel Evolution
Abstract: On-policy reinforcement learning can substantially improve language model reasoning, yet all such methods share a fundamental exploration ceiling: reasoning strategies to which the current policy assigns negligible probability receive no gradient signal, regardless of sampling budget. Token-level perturbations cannot bridge this gap because they redistribute probability mass locally rather than transporting the model to qualitatively different regions of the reward landscape. We introduce $\textbf{EGPO}$ ($\textbf{E}$volutionary-$\textbf{G}$uided $\textbf{P}$olicy $\textbf{O}$ptimization), a bilevel framework that overcomes this ceiling by coupling Differential Evolution (DE) in the weight space of Low-Rank Adaptation (LoRA) adapters with a policy-gradient inner loop. A frozen base model hosts a population of adapters; an outer DE loop mutates their singular values, reducing search dimensionality by ${\sim}500\times$ relative to full weight-space evolution, while an inner gradient step refines each candidate locally, enabling both the magnitude and the direction of learned features to co-evolve. Population diversity is maintained through fitness sharing, perplexity-based gating, and evolutionary culling. On GSM8K with Qwen2.5-3B-Instruct, EGPO achieves 77.6% pass@1 accuracy (+5.8 percentage points over the sample-matched baseline, $p < 0.001$), with gains that scale with problem difficulty (+9.6pp on problems requiring $\geq$8 reasoning steps) and transfer to the held-out MATH benchmark (+3.7pp).
URL: https://openreview.net/forum?id=NQPkA6YjSh
---
Title: Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think
Abstract: Large Language Models (LLMs) leverage step-by-step reasoning to solve complex problems. Standard evaluation practice involves generating a complete reasoning trace and assessing the correctness of the final answer presented at its conclusion. In this paper, we challenge the reliance on the final answer by posing the following two questions: Does the final answer reliably represent the model's optimal conclusion? Can alternative reasoning paths yield different results? To answer these questions, we analyze intermediate reasoning steps, termed subthoughts, and propose a method based on our findings. Our approach involves segmenting a reasoning trace into sequential subthoughts based on linguistic cues. We start by prompting the model to generate continuations from the end-point of each intermediate subthought. We extract a potential answer from every completed continuation originating from different subthoughts. We find that aggregating these answers by selecting the most frequent one (the mode) often yields significantly higher accuracy compared to relying solely on the answer derived from the original complete trace. Analyzing the consistency among the answers derived from different subthoughts reveals characteristics that correlate with the model's confidence and correctness, suggesting potential for identifying less reliable answers. Our experiments across various LLMs and challenging mathematical reasoning datasets (AIME2024 and AIME2025) show consistent accuracy improvements, with gains reaching up to 13% and 10% respectively.
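The mode-based aggregation step can be sketched directly; the answer strings below are illustrative, and answer extraction from continuations is assumed to have already happened:

```python
from collections import Counter

# Aggregate the answers extracted from continuations of each subthought by
# taking the mode; the agreement rate doubles as a simple confidence signal.

def aggregate_answers(answers):
    counts = Counter(answers)
    mode, freq = counts.most_common(1)[0]
    consistency = freq / len(answers)     # fraction of subthoughts agreeing
    return mode, consistency

# Hypothetical answers extracted from seven subthought continuations.
answers = ["42", "42", "17", "42", "42", "35", "42"]
mode, consistency = aggregate_answers(answers)
assert mode == "42"
assert abs(consistency - 5 / 7) < 1e-12
```

Low consistency across subthoughts is exactly the signal the paper suggests may flag less reliable final answers.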
URL: https://openreview.net/forum?id=jZA2lnmUT7
---
Title: Only relative ranks matter in weight-clustered large language models
Abstract: Large language models (LLMs) contain billions of parameters, yet many exact values are not essential. We show that what matters most is the \emph{relative rank} of weights---whether one connection is stronger or weaker than another---rather than precise magnitudes. To reduce the number of unique weight values, we apply weight clustering to pretrained models, replacing every weight matrix with $K$ shared values from $K$-means. For Llama~3.1-8B-Instruct and SmolLM2-135M, reducing each matrix to only 16--64 distinct values preserves strong accuracy \emph{without retraining}, providing a simple, training-free method to compress LLMs on disk. Optionally fine-tuning only the cluster means (centroids) recovers 30--40\% of the remaining accuracy gap at minimal cost. We then systematically randomize cluster means while keeping assignments fixed. Scrambling the \emph{relative ranks} of the clusters degrades quality sharply---perplexity can increase by orders of magnitude---even when global statistics such as mean and variance are preserved. In contrast, rank-preserving randomizations cause almost no loss at mid and late layers. On the other hand, when many layers are perturbed simultaneously, progressive layer-by-layer replacement reveals that \emph{scale drift}---not rank distortion---is the dominant collapse mechanism; however, an affine correction $w' = aw + b$ with $a > 0$ (which preserves both rank order and overall weight distribution) can substantially delay this drift. This rank-based perspective offers a new lens on model compression and robustness.
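The clustering step above can be illustrated with a toy 1-D $K$-means over a flat list of weights; a real model would run a library $K$-means per weight matrix. The weight values below are invented. Note that 1-D $K$-means assigns each value to its nearest centroid, so it preserves exactly the relative ranks the paper argues matter.

```python
# Minimal 1-D K-means quantization of a weight matrix's values:
# every weight is replaced by one of K shared centroid values.
def kmeans_1d(values, k, iters=25):
    lo, hi = min(values), max(values)
    # evenly spaced initial centroids over the weight range
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            j = min(range(k), key=lambda i: (v - centers[i]) ** 2)
            buckets[j].append(v)
        centers = [sum(b) / len(b) if b else centers[i]
                   for i, b in enumerate(buckets)]
    assign = [min(range(k), key=lambda i: (v - centers[i]) ** 2)
              for v in values]
    return centers, assign

# toy "weight matrix" flattened to a sorted list, K = 3 shared values
weights = [-0.9, -0.85, -0.1, 0.0, 0.05, 0.8, 0.9, 0.95]
centers, assign = kmeans_1d(weights, k=3)
quantized = [centers[j] for j in assign]
```

The paper's rank-scrambling experiments correspond to permuting `centers` while keeping `assign` fixed, which is what destroys quality.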
URL: https://openreview.net/forum?id=pCrkw7h4su
---
Title: A Spectral Bound on Effective Sharpness for Fisher-Preconditioned Gradient Descent
Abstract: An explicit stability characterization of effective sharpness $\lambda_{\max}(F^{-1}H)$ under Fisher preconditioning is provided, decomposing stability into residual curvature and model misspecification components. When the Gauss-Newton matrix $G$ equals the Fisher $F$ (the correctly specified negative log-likelihood setting), it is shown that the effective sharpness satisfies $S_{\text{eff}} \leq 1 + \epsilon/\mu_{\min}(F)$, where $\epsilon = \|H - G\|_2$ is the spectral norm of the residual curvature and $\mu_{\min}(F)$ is the minimum eigenvalue of the Fisher Information Matrix. When $G \neq F$, a relaxed bound $S_{\text{eff}} \leq 1 + (\epsilon + \delta)/\mu_{\min}(F)$ is established, with $\delta = \|G - F\|_2$ measuring model misspecification, thereby separating the two sources of curvature error. An alignment-aware Rayleigh quotient analysis reveals that the worst-case bound is loose by 1.3--7.1$\times$ due to favorable alignment between the residual curvature $Q$ and the Fisher eigenvectors. Experiments on deep linear networks (55--3,240 parameters, 5 seeds per configuration) verify the general misspecification-aware bound at all tested scales. On a 110-parameter deep linear network where all quantities are computed exactly, the idealized bound is confirmed to hold when $G \approx F$ but is violated when model misspecification is substantial, while the general bound correctly holds at all measured iterations. The experimental range is too limited to draw conclusions about scaling behavior. K-FAC at CIFAR-10 ResNet-18 scale (11.2M parameters) achieves 90.5\% test accuracy (vs. SGD 86.2\%), operating in regions of 41$\times$ higher raw Hessian sharpness while converging stably, consistent with the spectral flattening mechanism, though direct measurement of $S_{\text{eff}}$ at this scale remains intractable.
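The idealized bound $S_{\text{eff}} \leq 1 + \epsilon/\mu_{\min}(F)$ can be checked numerically on a toy case. The sketch below is not the paper's experiment: it uses diagonal $2\times 2$ matrices (so eigenvalues and spectral norms are just diagonal entries) with invented values, in the correctly specified setting $G = F$.

```python
# Toy numeric check of S_eff <= 1 + eps/mu_min(F) when G = F.
# Diagonal matrices keep the linear algebra trivial: eigenvalues of a
# diagonal matrix are its entries, and F^{-1}H is entrywise division.
F = [2.0, 1.0]                  # Fisher (diagonal, SPD); illustrative
R = [0.3, 0.2]                  # residual curvature H - G (diagonal)
H = [f + r for f, r in zip(F, R)]

eps = max(abs(r) for r in R)                # spectral norm of diagonal R
mu_min = min(F)                             # smallest Fisher eigenvalue
S_eff = max(h / f for h, f in zip(H, F))    # lambda_max(F^{-1} H)
bound = 1 + eps / mu_min
```

Here $S_{\text{eff}} = 1.2 \leq 1.3 = 1 + 0.3/1$, matching the bound; the paper's misspecified case would add a $\delta/\mu_{\min}(F)$ term for $G \neq F$.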
URL: https://openreview.net/forum?id=EabuvggEbb
---
Title: Relational Representations Mitigation of Catastrophic Forgetting: Graph Attention Networks for Online Continual Learning
Abstract: Online Continual Learning (OCL) aims to enable models to learn from non-stationary data streams while preserving knowledge acquired from previously observed tasks. A fundamental challenge in OCL for image classification is catastrophic forgetting. Existing approaches predominantly rely on Convolutional Neural Networks (CNNs), whose grid-based representations of images emphasize local features but insufficiently model relational structures that may be re-used and built upon even under distribution shifts. In this work, we propose ReReM, a Relational Representations Mitigation framework for OCL that integrates Graph Attention Networks (GATs) with hierarchical features extracted from pretrained vision backbones. Images are transformed into multi-scale graphs, allowing models to learn interactions between semantic regions and enabling continual learning through attention-based message passing over relational structures.
This design allows the model to selectively update and emphasize contextually relevant components of its learned representations. To improve graph-level aggregation, we introduce a learnable weighted global pooling mechanism that adaptively prioritizes nodes. Furthermore, we propose a rehearsal duplication strategy that re-balances the influence of past and current data during training, improving knowledge retention without increasing memory storage requirements. Extensive experiments on CIFAR10, CIFAR100, and MiniImageNet demonstrate consistent improvements over existing pretrained CNN-based continual learning methods. Additional evaluations using Vision Transformer (ViT) feature extractors further show that the proposed framework generalizes across backbone architectures. Our results suggest that relational graph representations provide an effective inductive bias for mitigating forgetting in online continual learning.
URL: https://openreview.net/forum?id=jFufBoCNdA
---
Title: Recursive Structure Discovery as an Inductive Bias for Symbolic Regression
Abstract: Symbolic regression (SR) can recover analytic laws from data, but its search space is enormous. Many scientific targets are structurally simple, for example additively or multiplicatively separable, yet most SR pipelines do not exploit this. We introduce a recursive structure discovery step that tests for separability using accurate derivatives from a small neural model trained with second-order updates. The method decomposes $y=f(\mathbf{x})$ into a hierarchy of simpler subfunctions, which we feed to SR as a structure prior. The plug-in can attach to any SR backend; here we pair it with a deep RL generator. It substantially reduces search complexity, improves interpretability, and remains robust to noise, maintaining reliable separability detection under challenging conditions. On SRBench (Feynman, 120 equations), the structure-aware pipeline achieves state-of-the-art exact recovery, outperforming separability-only, pure RL, and prior hybrid baselines.
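The separability test rests on a standard fact: $f(x, y)$ is additively separable iff the mixed partial $\partial^2 f/\partial x\,\partial y$ vanishes. The paper estimates derivatives with a neural surrogate; the sketch below substitutes plain finite differences, with invented test functions and sample points.

```python
# Hedged sketch of additive-separability detection: f(x,y) = g(x) + h(y)
# iff d2f/dxdy == 0 everywhere. Finite differences stand in for the
# paper's neural-surrogate derivatives.
def mixed_partial(f, x, y, h=1e-4):
    """Central finite-difference estimate of d2f/dxdy at (x, y)."""
    return (f(x + h, y + h) - f(x + h, y - h)
            - f(x - h, y + h) + f(x - h, y - h)) / (4 * h * h)

def looks_additively_separable(f, points, tol=1e-3):
    return all(abs(mixed_partial(f, x, y)) < tol for x, y in points)

pts = [(0.3, 0.7), (1.1, -0.4), (-0.8, 0.2)]          # arbitrary probes
sep = looks_additively_separable(lambda x, y: x**2 + 3 * y, pts)
not_sep = looks_additively_separable(lambda x, y: x * y, pts)
```

A positive test lets the pipeline split $f$ into $g(x) + h(y)$ and recurse on each piece, which is what shrinks the SR search space.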
URL: https://openreview.net/forum?id=cEvo02f0ZL
---
Title: When Perplexity Lies: Generation-Focused Distillation of Hybrid Sequence Models
Abstract: Converting a pretrained Transformer into a more efficient hybrid model through distillation offers a promising approach to reducing inference costs. However, achieving high-quality generation in distilled models requires careful joint design of both the student architecture and the distillation process. Many prior distillation works evaluate downstream multiple-choice benchmarks by ranking candidate answers with log-likelihood rather than requiring autoregressive generation, which can obscure important differences in model quality. For example, we show that a 7B parameter distilled model that nearly matches its teacher to within 0.2 pp under log-likelihood scoring actually falls behind by 20.8 pp when the model must generate answers autoregressively.
We propose a Hybrid Kimi Delta Attention (Hybrid-KDA) architecture paired with GenDistill, a multi-stage distillation pipeline, and use generation-based evaluation throughout to guide design decisions. Applying this approach to Qwen3-0.6B, we systematically ablate six design axes: training objective, loss masking, training duration, dataset selection, parameter freezing, and architecture choice. We find that log-likelihood-based evaluation consistently underestimates the gap between teacher and student, and can in some cases reverse the ranking of design choices, meaning that conclusions drawn from perplexity-only evaluation may be misleading. Among the factors we study, dataset selection, completion-only masking, and freezing attention layers during post-training have the largest impact on generation quality.
Our best Hybrid-KDA model retains 86–90% of teacher accuracy on knowledge benchmarks while reducing KV cache memory by up to 75% and improving time-to-first-token by 2–4× at 128K-token contexts.
URL: https://openreview.net/forum?id=u4sfTcn6Tx
---
Title: Scalable Learning from Probability Measures with Mean Measure Quantization
Abstract: We consider statistical learning problems in which data are observed as a set of probability measures. Optimal transport (OT) is a popular tool to compare and manipulate such objects, but its computational cost becomes prohibitive when the measures have large support. We study a quantization-based approach in which all input measures are approximated by $K$-point discrete measures sharing a common support. We establish consistency of the resulting quantized measures. We further derive convergence guarantees for several OT-based downstream tasks computed from the quantized measures. Numerical experiments on synthetic and real datasets demonstrate that the proposed approach achieves performance comparable to individual quantization while substantially reducing runtime.
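The shared-support idea can be made concrete in one dimension: pool the support points of all input measures, pick $K$ common support points (a toy $K$-means here), then re-express each measure as a weight vector over that shared support. The point clouds below are invented.

```python
# Sketch of mean-measure quantization with a shared K-point support.
def kmeans_1d(values, k, iters=20):
    """Toy 1-D Lloyd's algorithm; real use would call a library."""
    centers = sorted(values)[:: max(1, len(values) // k)][:k]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            j = min(range(k), key=lambda i: abs(v - centers[i]))
            buckets[j].append(v)
        centers = [sum(b) / len(b) if b else centers[i]
                   for i, b in enumerate(buckets)]
    return centers

# three toy input measures, given as 1-D point clouds
measures = [[0.1, 0.2, 0.15], [0.9, 1.0, 0.95], [0.1, 0.9]]
support = kmeans_1d([x for m in measures for x in m], k=2)

def quantize(measure, support):
    """Push each point's mass to its nearest shared support point."""
    w = [0.0] * len(support)
    for x in measure:
        j = min(range(len(support)), key=lambda i: abs(x - support[i]))
        w[j] += 1 / len(measure)
    return w

quantized = [quantize(m, support) for m in measures]
```

Because all quantized measures now live on the same $K$ support points, downstream OT computations reduce to comparing $K$-dimensional weight vectors, which is where the runtime savings come from.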
URL: https://openreview.net/forum?id=eA4cQkY7Ug
---
Title: Black-Box Reliability Certification for AI Agents via Self-Consistency Sampling and Conformal Calibration
Abstract: Given a black-box AI system and a task, at what confidence level can a practitioner trust the system’s output? We answer with a reliability level—a single number per system–task pair, derived from self-consistency sampling and conformal calibration, that serves as a black-box deployment gate with exact, finite-sample, distribution-free guarantees. Self-consistency sampling reduces uncertainty exponentially; conformal calibration guarantees correctness within 1/(n+1) of the target level, regardless of the system’s errors—made transparently visible through larger answer sets for harder questions. Weaker models earn lower reliability levels (not accuracy—see Definition 2.4): GPT-4.1 earns 94.6% on GSM8K and 96.8% on TruthfulQA, while GPT-4.1-nano earns 89.8% on GSM8K and 66.5% on MMLU. We validate across five benchmarks, five models from three families, and both synthetic and real data. Conditional coverage on solvable items exceeds 0.93 across all configurations; sequential stopping reduces API costs by ∼50%.
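The conformal-calibration step behind the $1/(n+1)$ guarantee is the standard split-conformal quantile. The sketch below is generic, not the paper's pipeline: it assumes nonconformity scores have already been computed on a calibration set, and the scores themselves are synthetic.

```python
import math

# Minimal split-conformal threshold: with n calibration scores, use the
# ceil((n+1)(1-alpha))-th smallest score; coverage is then within
# 1/(n+1) of the target level, distribution-free.
def conformal_threshold(cal_scores, alpha=0.1):
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))   # rank of the quantile
    return sorted(cal_scores)[min(k, n) - 1]

cal = [i / 100 for i in range(1, 100)]     # 99 synthetic scores
thr = conformal_threshold(cal, alpha=0.1)

# a point is "covered" if its score is <= thr; empirical coverage on
# the calibration scores should sit near 1 - alpha
coverage = sum(s <= thr for s in cal) / len(cal)
```

In the paper's setting, the score would come from self-consistency sampling, and harder questions earn larger answer sets precisely because their scores exceed the threshold more often.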
URL: https://openreview.net/forum?id=H5jbbiFzrF
---
Title: Securing The Model Context Protocol (MCP): A Dual-Axis Survey with a Mitigation-Oriented Threat Taxonomy
Abstract: Agentic AI systems are increasingly adopting the Model Context Protocol (MCP) to integrate external tools and data sources. However, MCP’s extensibility and multi-agent design introduce a broader and more complex attack surface. This exposes these systems to a diverse set of security risks. Prior research provides only a partial view of these risks, focusing on either temporal aspects (MCP system lifecycle) or spatial aspects (MCP components) in isolation, without capturing how the associated threats emerge, evolve, and propagate across agentic workflows. This missing linkage obscures causal chains across components and lifecycle phases, hindering timely root cause localization and effective prevention of cross-phase threat cascades. To address this gap, we present a spatio-temporal aligned taxonomy that maps each threat to both its affected MCP component and the lifecycle phase in which it arises. This dual-axis view highlights where threats occur, when they emerge, and how they propagate across stages of deployment and operation. Grounded in this taxonomy, our survey catalogs over 50 MCP-specific threats and classifies each by lifecycle phase and affected component, providing, to our knowledge, one of the broadest phase-by-component mappings of MCP security risks to date. Each threat is also cross-referenced to STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, and Elevation of Privilege) and MAESTRO (Modeling Adversarial Events for Systematic Threat Representation and Organization) to align with widely adopted threat-modeling frameworks for agentic AI, covering core security properties (via STRIDE) and layers specific to agentic AI (via MAESTRO). We then pair each taxonomy category with actionable controls, explicit verification checks, and runtime signals, supporting structured prioritization, verification, and monitoring across both design time and runtime.
We further validate this control-oriented view through a compact benchmark of verifiable checks in a real MCP deployment, showing how representative failures can be detected, localized, and tied to concrete audit artifacts. Together, the taxonomy and its control mappings provide an evidence-informed framework for organizing MCP security priorities, verification checks, and reviewable security evidence across the MCP lifecycle. The benchmark implementation, scenario definitions, verification scripts, and reproduction instructions are available at https://anonymous.4open.science/r/mcp-security-benchmark-22C9/
URL: https://openreview.net/forum?id=Aqn9Wdr2wN
---
Title: TS-Reasoner: Aligning Time Series Foundation Models with LLM Reasoning
Abstract: Time series reasoning is crucial to decision-making in diverse domains, including finance, energy usage, traffic, weather, and scientific discovery. While existing time series foundation models (TSFMs) can capture low-level dynamic patterns and provide accurate forecasting, further analysis usually requires additional background knowledge and sophisticated reasoning, which are lacking in most TSFMs but can be achieved through Large Language Models (LLMs). On the other hand, without expensive post-training, LLMs often struggle with the numerical understanding of time series data. Although it is intuitive to integrate the two types of models, developing effective training recipes that align the two modalities for reasoning tasks is still an open challenge.
To this end, we propose TS-Reasoner that aligns the latent representations of TSFMs with the textual inputs of LLMs for downstream understanding/reasoning tasks.
Specifically, we propose a simple yet effective method to curate diverse, synthetic pairs of time series and textual captions for alignment training. We then develop a two-stage training recipe that applies instruction finetuning after the alignment pretraining. Unlike existing works that train an LLM to take time series as inputs, we leverage a pretrained TSFM and freeze it during training.
Extensive experiments on several benchmarks demonstrate that TS-Reasoner not only outperforms a wide range of prevailing LLMs, Vision Language Models (VLMs), and Time Series LLMs, but also achieves this with remarkable data efficiency, e.g., using less than half the training data.
URL: https://openreview.net/forum?id=d6TD0f2xXq
---
Title: A Late Semantic Repair Pathway in I-JEPA’s Visual World Model
Abstract: We study the internal mechanism by which the vision-only I-JEPA model converts masked visual inputs into robust global representations. Using a suite of mechanistic interpretability tools on a fixed ImageNet-pretrained checkpoint, we find a depth-structured transition from local occlusion sensitivity into a late semantic repair regime. The dominant causal bottleneck lies in MLP expansion states around encoder layer 29, where patching only visible-context tokens substantially restores the clean representation, while patching only masked tokens helps very little. Segmentation-guided object masks reveal that this late pathway is more strongly engaged by missing semantic object structure than by matched-area background occlusions, even under strict zero-overlap background-token controls. Late attention outputs at layers 30–31 add a narrower but nonredundant final-stage rescue on top of the MLP bottleneck; a three-site attention-plus-MLP pathway performs best for medium and full token budgets and generalizes across object sizes, balanced class slices of up to sixteen breeds, and a second segmentation dataset (Pascal VOC).
URL: https://openreview.net/forum?id=Ifi0P8im0c
---