Weekly TMLR digest for Feb 15, 2026


TMLR

Feb 15, 2026, 12:00:11 AM
to tmlr-annou...@googlegroups.com


New certifications
==================

Survey Certification: A Survey on Federated Fine-Tuning of Large Language Models

Yebo Wu, Chunlin Tian, Jingguang Li, He Sun, KaHou Tam, Zhanting Zhou, Haicheng Liao, Jing Xiong, Zhijiang Guo, Li Li, Cheng-zhong Xu

https://openreview.net/forum?id=rnCqbuIWnn

---


Survey Certification: A Survey of Token Compression for Efficient Multimodal Large Language Models

Kele Shao, Keda TAO, Kejia Zhang, Sicheng Feng, Mu Cai, Yuzhang Shang, Haoxuan You, Can Qin, Yang Sui, Huan Wang

https://openreview.net/forum?id=G2od9JVHkE

---


J2C Certification: Diffusion posterior sampling for simulation-based inference in tall data settings

Julia Linhart, Gabriel Cardoso, Alexandre Gramfort, Sylvain Le Corff, Pedro L. C. Rodrigues

https://openreview.net/forum?id=cdhfoS6Gyo

---


Survey Certification: The Five Ws of Multi-Agent Communication: Who Talks to Whom, When, What, and Why - A Survey from MARL to Emergent Language and LLMs

Jingdi Chen, Hanqing Yang, Zongjun Liu, Carlee Joe-Wong

https://openreview.net/forum?id=LGsed0QQVq

---


J2C Certification: BiSSL: Enhancing the Alignment Between Self-Supervised Pretraining and Downstream Fine-Tuning via Bilevel Optimization

Gustav Wagner Zakarias, Lars Kai Hansen, Zheng-Hua Tan

https://openreview.net/forum?id=GQAGlqOpyA

---


J2C Certification: The Internal Growth Function: A More General PAC Framework for Scenario Decision Making

Guillaume O Berger, Raphael Jungers

https://openreview.net/forum?id=HqPKJSAkrp

---


J2C Certification: Segmentation From Attention: Training-Free Layer Selection and One-Shot Tuning for Segmentation in VLMs

Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal, James J. Little

https://openreview.net/forum?id=a5lAwubXro

---


Accepted papers
===============


Title: A simple connection from loss flatness to compressed neural representations

Authors: Shirui Chen, Stefano Recanatesi, Eric Todd Shea-Brown

Abstract: Despite extensive study, the fundamental significance of sharpness---the trace of the loss Hessian at local minima---remains unclear. While often associated with generalization, recent work reveals inconsistencies in this relationship. We explore an alternative perspective by investigating how sharpness relates to the geometric structure of neural representations in feature space. Specifically, we build from earlier work by Ma and Ying to broadly study compression of representations, defined as the degree to which neural activations concentrate when inputs are locally perturbed. We introduce three quantitative measures: the Local Volumetric Ratio (LVR), which captures volume contraction through the network; the Maximum Local Sensitivity (MLS), which measures maximum output change normalized by the magnitude of input perturbations; and Local Dimensionality, which captures uniformity of compression across directions.

We derive upper bounds showing that LVR and MLS are mathematically constrained by sharpness: flatter minima necessarily limit these compression metrics. These bounds extend to reparametrization-invariant sharpness (measures unchanged under layer rescaling), addressing a key limitation of standard sharpness. We introduce network-wide variants (NMLS, NVR) that account for all layer weights, providing tighter and more stable bounds than prior single-layer analyses. Empirically, we validate these predictions across feedforward, convolutional, and transformer architectures, demonstrating consistent positive correlation between sharpness and compression metrics. Our results suggest that sharpness fundamentally quantifies representation compression rather than generalization directly, offering a resolution to contradictory findings on the sharpness-generalization relationship and establishing a principled mathematical link between parameter-space geometry and feature-space structure. Code is available at \url{https://github.com/chinsengi/sharpness-compression}.
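The paper's central quantity is easy to probe numerically. Below is a hedged sketch (our own finite-difference estimator, not the authors' code) of the Maximum Local Sensitivity for a toy map, where the true value is known:

```python
import numpy as np

rng = np.random.default_rng(0)

def mls_estimate(f, x, eps=1e-3, n_dirs=64):
    """Monte-Carlo estimate of Maximum Local Sensitivity: the largest
    output change per unit norm of a small input perturbation."""
    base = f(x)
    best = 0.0
    for _ in range(n_dirs):
        d = rng.standard_normal(x.shape)
        d *= eps / np.linalg.norm(d)          # perturbation of norm eps
        best = max(best, np.linalg.norm(f(x + d) - base) / eps)
    return best

# Toy "network": a linear map, whose exact MLS is its largest singular value.
W = np.array([[2.0, 0.0], [0.0, 0.5]])
est = mls_estimate(lambda x: W @ x, np.ones(2))
print(est)  # close to 2.0 (the top singular value), never above it
```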

URL: https://openreview.net/forum?id=GgpQbU9bFR

---

Title: When Does LoRA Reuse Work? Theoretical Limits and Mechanisms for Recycling LoRAs Without Data Access

Authors: Mei-Yen Chen, Thi Thu Uyen Hoang, Michael Hahn, M. Saquib Sarfraz

Abstract: Reusing low-rank adapters (LoRAs) by merging or routing is a common strategy for adapting large language models to new tasks, especially when training data is unavailable but many fine-tuned LoRAs are accessible. While the availability of publicly shared LoRA weights has inspired new algorithms for composing them to solve new tasks, recent findings highlight limitations in LoRA’s ability to integrate new knowledge. This work investigates when LoRA reuse can be successful for compositional factual and reasoning tasks. Through theoretical analysis in a simplified setup and experiments on a controlled synthetic two-hop reasoning task with extensions to math word problems, cross-lingual code generation, and history/geography QA, we show that data-agnostic methods, such as parameter averaging and dynamic selection, often fail to combine knowledge from logically disjoint fine-tuning datasets. This challenge is particularly pronounced when the relevant knowledge is underrepresented during pretraining. However, reuse can succeed when fine-tuning datasets share solution templates, such as reasoning patterns or reusable code, which serve as bridges among tasks. Our results suggest that LoRA reuse relies more on shallow pattern matching than on logical integration of existing knowledge. This mechanism-based perspective offers practical guidance for curating datasets and designing systems that enable LoRA reuse to overcome data-access limitations. Findings indicate that future research should focus on the mechanisms enabling effective adapter reuse rather than solely on developing new reuse algorithms.

URL: https://openreview.net/forum?id=lVqUJlsnRy

---

Title: DiffusionRollout: Uncertainty-Aware Rollout Planning in Long-Horizon PDE Solving

Authors: Seungwoo Yoo, Juil Koo, Daehyeon Choi, Minhyuk Sung

Abstract: We propose DiffusionRollout, a novel selective rollout planning strategy for autoregressive diffusion models, aimed at mitigating error accumulation in long-horizon predictions of physical systems governed by partial differential equations (PDEs). Building on the recently validated probabilistic approach to PDE solving, we further explore its ability to quantify predictive uncertainty and demonstrate a strong correlation between prediction errors and standard deviations computed over multiple samples—supporting their use as a proxy for the model’s predictive confidence. Based on this observation, we introduce a mechanism that adaptively selects step sizes during autoregressive rollouts, improving long-term prediction reliability by reducing the compounding effect of conditioning on inaccurate prior outputs. Extensive evaluation on long-trajectory PDE prediction benchmarks validates the effectiveness of the proposed uncertainty measure and adaptive planning strategy, as evidenced by lower prediction errors and longer predicted trajectories that retain a high correlation with their ground truths.

URL: https://openreview.net/forum?id=OCzcGOzgzz

---

Title: On-Policy Policy Gradient Reinforcement Learning Without On-Policy Sampling

Authors: Nicholas E. Corrado, Josiah P. Hanna

Abstract: On-policy reinforcement learning (RL) algorithms are typically characterized as algorithms that perform policy updates using i.i.d. trajectories collected by the agent's current policy. However, after observing only a finite number of trajectories, such on-policy sampling may produce data that fails to match the expected on-policy data distribution. This \textit{sampling error} leads to high-variance gradient estimates that yield data inefficient on-policy learning. Recent work in the policy evaluation setting has shown that non-i.i.d.\@, off-policy sampling can produce data with lower sampling error w.r.t. the expected on-policy distribution than on-policy sampling can produce~\citep{zhong2022robust}. Motivated by this observation, we introduce an adaptive, off-policy sampling method to reduce sampling error during on-policy policy gradient RL training. Our method, Proximal Robust On-Policy Sampling (PROPS), reduces sampling error by collecting data with a \textit{behavior policy} that increases the probability of sampling actions that are under-sampled w.r.t. the current policy. We empirically evaluate PROPS on both continuous-action MuJoCo benchmark tasks as well as discrete-action tasks and demonstrate that (1) PROPS decreases sampling error throughout training and (2) increases the data efficiency of on-policy policy gradient algorithms.
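The idea of boosting under-sampled actions can be made concrete with a toy discrete-action illustration (our own sketch; the boost rule and the constant k are invented, and this is not the PROPS algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)

# Target (current) policy over 3 discrete actions.
pi = np.array([0.5, 0.3, 0.2])
counts = np.zeros(3)

def behavior_probs(pi, counts, k=5.0):
    """Hypothetical adaptive sampler in the spirit of the abstract: boost
    actions that are under-sampled relative to pi, then renormalize."""
    total = counts.sum()
    emp = counts / total if total > 0 else np.full_like(pi, 1 / len(pi))
    w = pi * np.exp(k * np.clip(pi - emp, 0, None))  # upweight deficits only
    return w / w.sum()

for _ in range(2000):
    a = rng.choice(3, p=behavior_probs(pi, counts))
    counts[a] += 1

sampling_error = np.abs(counts / counts.sum() - pi).sum()
print(sampling_error)  # far below the typical i.i.d. deviation at n=2000
```

The feedback loop keeps the empirical action distribution pinned to pi, whereas i.i.d. sampling from pi would drift on the order of 1/sqrt(n).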

URL: https://openreview.net/forum?id=nCoyFp8uO1

---

Title: A Survey on Federated Fine-Tuning of Large Language Models

Authors: Yebo Wu, Chunlin Tian, Jingguang Li, He Sun, KaHou Tam, Zhanting Zhou, Haicheng Liao, Jing Xiong, Zhijiang Guo, Li Li, Cheng-zhong Xu

Abstract: Large Language Models (LLMs) have demonstrated impressive success across various tasks. Integrating LLMs with Federated Learning (FL), a paradigm known as FedLLM, offers a promising avenue for collaborative model adaptation while preserving data privacy. This survey provides a systematic and comprehensive review of FedLLM. We begin by tracing the historical development of both LLMs and FL, summarizing relevant prior research to set the context. Subsequently, we delve into an in-depth analysis of the fundamental challenges inherent in deploying FedLLM. Addressing these challenges often requires efficient adaptation strategies; therefore, we conduct an extensive examination of existing Parameter-Efficient Fine-tuning (PEFT) methods and explore their applicability within the FL framework. To rigorously evaluate the performance of FedLLM, we undertake a thorough review of existing fine-tuning datasets and evaluation benchmarks. Furthermore, we discuss FedLLM's diverse real-world applications across multiple domains. Finally, we identify critical open challenges and outline promising research directions to foster future advancements in FedLLM. This survey aims to serve as a foundational resource for researchers and practitioners, offering valuable insights into the rapidly evolving landscape of federated fine-tuning for LLMs. It also establishes a roadmap for future innovations in privacy-preserving AI. We actively maintain a GitHub repo to track cutting-edge advancements in this field.

URL: https://openreview.net/forum?id=rnCqbuIWnn

---

Title: A Survey of Token Compression for Efficient Multimodal Large Language Models

Authors: Kele Shao, Keda TAO, Kejia Zhang, Sicheng Feng, Mu Cai, Yuzhang Shang, Haoxuan You, Can Qin, Yang Sui, Huan Wang

Abstract: Multimodal large language models (MLLMs) have made remarkable strides, largely driven by their ability to process increasingly long and complex contexts, such as high-resolution images, extended video sequences, and lengthy audio input. While this ability significantly enhances MLLM capabilities, it introduces substantial computational challenges, primarily due to the quadratic complexity of self-attention mechanisms with numerous input tokens. To mitigate these bottlenecks, token compression has emerged as an auspicious and critical approach, efficiently reducing the number of tokens during both training and inference. In this paper, we present the first systematic survey and synthesis of the burgeoning field of multimodal long context token compression. Recognizing that effective compression strategies are deeply tied to the unique characteristics and redundancies of each modality, we categorize existing approaches by their primary data focus, enabling researchers to quickly access and learn methods tailored to their specific area of interest: (1) image-centric compression, which addresses spatial redundancy in visual data; (2) video-centric compression, which tackles spatio-temporal redundancy in dynamic sequences; and (3) audio-centric compression, which handles temporal and spectral redundancy in acoustic signals. Beyond this modality-driven categorization, we further dissect methods based on their underlying mechanisms, including transformation-based, similarity-based, attention-based, and query-based approaches. By providing a comprehensive and structured overview, this survey aims to consolidate current progress, identify key challenges, and inspire future research directions in this rapidly evolving domain.
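Many of the surveyed mechanisms reduce to scoring tokens and keeping the best ones while preserving order. A minimal, hedged sketch of that shared core (our own code, not any specific method from the survey):

```python
import numpy as np

def prune_tokens(tokens, scores, keep_ratio=0.5):
    """Generic score-based token pruning: keep the top-scoring fraction
    of tokens, preserving their original sequence order. The scores could
    come from attention, similarity, or a learned query, per the survey's
    taxonomy."""
    k = max(1, int(len(tokens) * keep_ratio))
    keep = np.sort(np.argsort(scores)[::-1][:k])  # top-k, back in order
    return [tokens[i] for i in keep]

toks = ["t0", "t1", "t2", "t3"]
print(prune_tokens(toks, np.array([0.1, 0.9, 0.05, 0.4]), 0.5))  # ['t1', 't3']
```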

URL: https://openreview.net/forum?id=G2od9JVHkE

---

Title: Understanding Guidance Scale in Diffusion Models from a Geometric Perspective

Authors: Zhiyuan Zhan, Liuzhuozheng Li, Masashi Sugiyama

Abstract: Conditional diffusion models have become a leading approach for generating condition-consistent samples, such as class-specific images. In practice, the guidance scale is a key hyperparameter in conditional diffusion models, used to adjust the strength of the guidance term. While empirical studies have demonstrated that appropriately choosing the scale can significantly enhance generation quality, the theoretical understanding of its role remains limited. In this work, we analyze the probabilistic guidance term from a geometric view under the linear manifold assumption and, based on this analysis, construct a geometric guidance model that enables tractable theoretical study. To address regularity issues arising from multi-modal data, we introduce a mollification technique that ensures well-posed dynamics. Our theoretical results show that increasing the guidance scale improves alignment with the target data manifold, thereby enhancing generation performance. We further extend our framework to nonlinear manifolds, and empirical results on real-world datasets validate the effectiveness of the proposed model and are consistent with our theories.
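For context, the guidance scale enters sampling through the standard score combination, which takes a few lines to state (a generic classifier-free-guidance sketch; the paper's geometric model is more involved):

```python
import numpy as np

def guided_score(s_uncond, s_cond, w):
    """Standard guidance combination: extrapolate from the unconditional
    score toward the conditional one; w=1 recovers the conditional score,
    w>1 strengthens the guidance term."""
    return s_uncond + w * (s_cond - s_uncond)

s_u = np.array([0.0, 0.0])
s_c = np.array([1.0, -1.0])
assert np.allclose(guided_score(s_u, s_c, 1.0), s_c)  # w=1: conditional
print(guided_score(s_u, s_c, 3.0))  # w=3 pushes further along the guidance direction
```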

URL: https://openreview.net/forum?id=nfHimL6g8G

---

Title: Bayesian Network Structure Discovery Using Large Language Models

Authors: Yinghuan Zhang, Yufei Zhang, Parisa Kordjamshidi, Zijun Cui

Abstract: Understanding probabilistic dependencies among variables is central to analyzing complex systems. Traditional structure learning methods often require extensive observational data or are limited by manual, error-prone incorporation of expert knowledge. Recent studies have explored using large language models (LLMs) for structure learning, but most treat LLMs as auxiliary tools for pre-processing or post-processing, leaving the core learning process data-driven. In this work, we introduce a unified framework for Bayesian network structure discovery that places LLMs at the center, supporting both data-free and data-aware settings. In the data-free regime, we introduce \textbf{PromptBN}, which leverages LLM reasoning over variable metadata to generate a complete directed acyclic graph (DAG) in a single call. PromptBN effectively enforces global consistency and acyclicity through dual validation, achieving constant $\mathcal{O}(1)$ query complexity. When observational data are available, we introduce \textbf{ReActBN} to further refine the initial graph. ReActBN combines statistical evidence with LLM by integrating a novel ReAct-style reasoning with configurable structure scores (e.g., Bayesian Information Criterion). Experiments demonstrate that our method outperforms prior data-only, LLM-only, and hybrid baselines, particularly in low- or no-data regimes and on out-of-distribution datasets.
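Acyclicity validation of an LLM-proposed edge list is a well-defined subroutine; a hedged sketch using Kahn's algorithm (our own code, not the PromptBN implementation):

```python
from collections import defaultdict, deque

def is_dag(nodes, edges):
    """Return True iff the directed edge list forms a DAG, via Kahn's
    topological sort: repeatedly peel off zero-in-degree nodes; a cycle
    leaves some nodes unpeeled."""
    indeg = {n: 0 for n in nodes}
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        indeg[v] += 1
    q = deque(n for n in nodes if indeg[n] == 0)
    seen = 0
    while q:
        u = q.popleft()
        seen += 1
        for v in adj[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                q.append(v)
    return seen == len(nodes)

print(is_dag("ABC", [("A", "B"), ("B", "C")]))              # True
print(is_dag("ABC", [("A", "B"), ("B", "C"), ("C", "A")]))  # False: cycle
```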

URL: https://openreview.net/forum?id=G4mrO8LVix

---

Title: Holistic Continual Learning under Concept Drift with Adaptive Memory Realignment

Authors: Alif Ashrafee, Jędrzej Kozal, Michał Woźniak, Bartosz Krawczyk

Abstract: Traditional continual learning methods prioritize knowledge retention and focus primarily on mitigating catastrophic forgetting, implicitly assuming that the data distribution of previously learned tasks remains static. This overlooks the dynamic nature of real-world data streams, where concept drift permanently alters previously seen data and demands both stability and rapid adaptation. We introduce a holistic framework for continual learning under concept drift that simulates realistic scenarios by evolving task distributions. As a baseline, we consider Full Relearning (FR), in which the model is retrained from scratch on newly labeled samples from the drifted distribution. While effective, this approach incurs substantial annotation and computational overhead. To address these limitations, we propose Adaptive Memory Realignment (AMR), a lightweight alternative that equips rehearsal-based learners with a drift-aware adaptation mechanism. AMR selectively removes outdated samples of drifted classes from the replay buffer and repopulates it with a small number of up-to-date instances, effectively realigning memory with the new distribution. This targeted resampling matches the performance of FR while reducing the need for labeled data and computation by orders of magnitude. To enable reproducible evaluation, we introduce four concept drift variants of standard vision benchmarks: Fashion-MNIST-CD, CIFAR10-CD, CIFAR100-CD, and Tiny-ImageNet-CD, where previously seen classes reappear with shifted representations. Comprehensive experiments on these datasets using several rehearsal-based baselines show that AMR consistently counters concept drift, maintaining high accuracy with minimal overhead. These results position AMR as a scalable solution that reconciles stability and plasticity in non-stationary continual learning environments. 
Full implementation of our framework and concept drift benchmark datasets are available at: https://github.com/AlifAshrafee/CL-Under-Concept-Drift.
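The buffer operation at the heart of AMR can be sketched in a few lines (a toy reading of ours; function and argument names are invented, not the paper's API):

```python
def realign_buffer(buffer, drifted_classes, fresh_samples):
    """Toy realignment in the spirit of AMR: drop stored (x, y) pairs
    whose class has drifted, then refill with a small number of
    up-to-date instances from the new distribution."""
    kept = [(x, y) for x, y in buffer if y not in drifted_classes]
    return kept + list(fresh_samples)

buffer = [("img0", 0), ("img1", 1), ("img2", 0), ("img3", 2)]
new = realign_buffer(buffer, drifted_classes={0}, fresh_samples=[("img4", 0)])
print(new)  # [('img1', 1), ('img3', 2), ('img4', 0)]
```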

URL: https://openreview.net/forum?id=1drDlt0CLM

---

Title: Diffusion posterior sampling for simulation-based inference in tall data settings

Authors: Julia Linhart, Gabriel Cardoso, Alexandre Gramfort, Sylvain Le Corff, Pedro L. C. Rodrigues

Abstract: Identifying the parameters of a non-linear model that best explain observed data is a core task across scientific fields. When such models rely on complex simulators, evaluating the likelihood is typically intractable, making traditional inference methods such as MCMC inapplicable. Simulation-based inference (SBI) addresses this by training deep generative models to approximate the posterior distribution over parameters using simulated data. In this work, we consider the tall data setting, where multiple independent observations provide additional information, allowing sharper posteriors and improved parameter identifiability. Building on the flourishing score-based diffusion literature, F-NPSE (Geffner et al., 2023) estimates the tall data posterior by composing individual scores from a neural network trained only for a single context observation. This enables more flexible and simulation-efficient inference than alternative approaches for tall datasets in SBI. However, it relies on costly Langevin dynamics during sampling. We propose a new algorithm that eliminates the need for Langevin steps by explicitly approximating the diffusion process of the tall data posterior. Our method retains the advantages of compositional score-based inference while being significantly faster and more stable than F-NPSE. We demonstrate its improved performance on toy problems and standard SBI benchmarks, and showcase its scalability by applying it to a complex real-world model from computational neuroscience.
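The score composition underlying F-NPSE has a simple closed form; a minimal sanity check on a conjugate-Gaussian toy (our own sketch, assuming a standard-normal prior and unit-variance likelihoods):

```python
import numpy as np

def tall_data_score(single_scores, prior_score):
    """Compositional posterior score for n i.i.d. observations: sum the
    per-observation posterior scores and subtract the prior score n-1
    times, since the prior is otherwise counted n times."""
    n = len(single_scores)
    return np.sum(single_scores, axis=0) - (n - 1) * prior_score

# Conjugate check: prior N(0,1), likelihood N(theta,1).
theta = 0.7
xs = np.array([1.0, -0.5, 2.0])
singles = np.array([-2.0 * theta + x for x in xs])   # score of N(x/2, 1/2)
composed = tall_data_score(singles, prior_score=-theta)
exact = -(len(xs) + 1) * theta + xs.sum()            # score of N(sum/(n+1), 1/(n+1))
print(np.isclose(composed, exact))  # True: the composition is exact for Gaussians
```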

URL: https://openreview.net/forum?id=cdhfoS6Gyo

---

Title: A Survey on Deep Learning Approaches for Tabular Data Generation: Utility, Alignment, Fidelity, Privacy, Diversity, and Beyond

Authors: Mihaela C. Stoian, Eleonora Giunchiglia, Thomas Lukasiewicz

Abstract: Generative modelling has become the standard approach for synthesising tabular data. However, different use cases demand synthetic data to comply with different requirements to be useful in practice. In this survey, we review deep generative modelling approaches for tabular data from the perspective of five types of requirements: utility of the synthetic data, alignment of the synthetic data with domain-specific knowledge, statistical fidelity of the synthetic data distribution compared to the real data distribution, privacy-preserving capabilities, and sampling diversity. We group the approaches along two levels of granularity: (i) based on the requirements they address and (ii) according to the underlying model they utilise. Additionally, we summarise the appropriate evaluation methods for each requirement, the relationships among the requirements, and the specific characteristics of each model type. Finally, we discuss future directions for the field, along with opportunities to improve the current evaluation methods. Overall, this survey can be seen as a user guide to tabular data generation: helping readers navigate available models and evaluation methods to find those best suited to their needs.

URL: https://openreview.net/forum?id=RoShSRQQ67

---

Title: Forget Less, Retain More: A Lightweight Regularizer for Rehearsal-Based Continual Learning

Authors: Lama Alssum, Hasan Abed Al Kader Hammoud, Motasem Alfarra, Juan C Leon Alcazar, Bernard Ghanem

Abstract: Deep neural networks suffer from catastrophic forgetting, where performance on previous tasks degrades after training on a new task. This issue arises due to the model’s tendency to overwrite previously acquired knowledge with new information. We present a novel approach to address this challenge, focusing on the intersection of memory-based methods and regularization approaches. We formulate a regularization strategy, termed Information Maximization (IM) regularizer, for memory-based continual learning methods, which is based exclusively on the expected label distribution, thus making it class-agnostic. As a consequence, IM regularizer can be directly integrated into various rehearsal-based continual learning methods, reducing forgetting and favoring faster convergence. Our empirical validation shows that, across datasets and regardless of the number of tasks, our proposed regularization strategy consistently improves baseline performance at the expense of a minimal computational overhead. The lightweight nature of IM ensures that it remains a practical and scalable solution, making it applicable to real-world continual learning scenarios where efficiency is paramount. Finally, we demonstrate the data-agnostic nature of our regularizer by applying it to video data, which presents additional challenges due to its temporal structure and higher memory requirements. Despite the significant domain gap, our experiments show that IM regularizer also improves the performance of video continual learning methods.
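One way to read a class-agnostic regularizer based on the expected label distribution is as an entropy term on the batch-averaged prediction; a hedged toy sketch (our reading, not the paper's exact IM loss):

```python
import numpy as np

def im_penalty(probs):
    """Toy information-maximization-style penalty: negative entropy of
    the batch-averaged predicted label distribution. Lower (more
    negative) is better; collapse onto few classes is penalized more.
    This is our illustrative reading, not the paper's formulation."""
    p_bar = probs.mean(axis=0)
    return float(np.sum(p_bar * np.log(p_bar + 1e-12)))

collapsed = np.tile([0.98, 0.01, 0.01], (8, 1))  # predictions collapsed on one class
balanced = np.tile([1 / 3, 1 / 3, 1 / 3], (8, 1))
print(im_penalty(collapsed) > im_penalty(balanced))  # True: collapse costs more
```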

URL: https://openreview.net/forum?id=CJw1ZjkJMG

---

Title: Moment Constrained Optimal Transport for Control Applications

Authors: Thomas Le Corre, Ana Busic, Sean P. Meyn

Abstract: This paper concerns the application of techniques from optimal transport (OT) to mean field control, in which the probability measures of interest in OT correspond to empirical distributions associated with a large collection of controlled agents. The control objective of interest motivates a one-sided relaxation of OT, in which the first marginal is fixed and the second marginal is constrained to a “moment class”: a set of probability measures defined by generalized moment constraints. This relaxation is particularly interesting for control problems as it enables the coordination of agents without the need to know the desired distribution beforehand. The inclusion of an entropic regularizer is motivated by both computational considerations, and also to impose hard constraints on agent behavior. A computational approach inspired by the Sinkhorn algorithm is proposed to solve this problem. This new approach to distributed control is illustrated with an application of charging a fleet of electric vehicles while satisfying grid constraints. An online version is proposed and applied in a case study on the ElaadNL dataset containing 10,000 electric vehicle charging sessions in the Netherlands. This empirical validation demonstrates the applicability of the proposed approach to optimizing flexibility while respecting grid constraints.
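For readers new to the entropic-OT machinery the paper builds on, the classical Sinkhorn iteration (with both marginals fixed, unlike the paper's moment-constrained relaxation of the second marginal) takes a few lines:

```python
import numpy as np

def sinkhorn(C, a, b, reg=0.1, iters=500):
    """Classical entropy-regularized OT: alternate scaling of the Gibbs
    kernel K = exp(-C/reg) so the transport plan's marginals match a and b."""
    K = np.exp(-C / reg)
    u = np.ones_like(a)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

C = np.array([[0.0, 1.0], [1.0, 0.0]])  # cost matrix
a = b = np.array([0.5, 0.5])            # fixed marginals
P = sinkhorn(C, a, b)
print(P.sum(axis=1))  # row marginals match a
```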

URL: https://openreview.net/forum?id=2hAtSpnat9

---

Title: Concept Flow Models: Anchoring Concept-Based Reasoning with Hierarchical Bottlenecks

Authors: Ya Wang, Adrian Paschke

Abstract: Concept Bottleneck Models (CBMs) enhance interpretability by projecting learned features into a human-understandable concept space. Recent approaches leverage vision-language models to generate concept embeddings, reducing the need for manual concept annotations. However, these models suffer from a critical limitation: as the number of concepts approaches the embedding dimension, information leakage increases, enabling the model to exploit spurious or semantically irrelevant correlations and undermining interpretability. In this work, we propose Concept Flow Models (CFMs), which replace the flat bottleneck with a hierarchical, concept-driven decision tree. Each internal node in the hierarchy focuses on a localized subset of discriminative concepts, progressively narrowing the prediction scope. Our framework automatically constructs decision hierarchies from visual embeddings, distributes semantic concepts at each hierarchy level, and trains differentiable concept weights through probabilistic tree traversal. Extensive experiments on diverse benchmarks demonstrate that CFMs match the predictive performance of flat CBMs, while substantially reducing effective concept usage and information leakage. Furthermore, CFMs yield stepwise decision flows that enable transparent and auditable model reasoning.

URL: https://openreview.net/forum?id=TNYLf65I3I

---

Title: Semantic-aware Adversarial Fine-tuning for CLIP

Authors: Jiacheng Zhang, Jinhao Li, Hanxun Huang, Sarah Monazam Erfani, Benjamin I. P. Rubinstein, Feng Liu

Abstract: Recent studies have shown that the CLIP model's adversarial robustness in zero-shot classification tasks can be enhanced by adversarially fine-tuning its image encoder with adversarial examples (AEs), which are generated by minimizing the cosine similarity between images and a hand-crafted template (e.g., "A photo of a {label}"). However, it has been shown that the cosine similarity between a single image and a single hand-crafted template is insufficient to measure the similarity for image-text pairs. Building on this, in this paper, we find that the AEs generated using cosine similarity may fail to fool CLIP when the similarity metric is replaced with semantically enriched alternatives, making the image encoder fine-tuned with these AEs less robust. To overcome this issue, we first propose a semantic-ensemble attack to generate semantic-aware AEs by minimizing the average similarity between the original image and an ensemble of refined textual descriptions. These descriptions are initially generated by a foundation model to capture core semantic features beyond hand-crafted templates and are then refined to reduce hallucinations. To this end, we propose Semantic-aware Adversarial Fine-Tuning (SAFT), which fine-tunes CLIP's image encoder with semantic-aware AEs. Extensive experiments show that SAFT outperforms current methods, achieving substantial improvements in zero-shot adversarial robustness across 16 datasets. Our code is available at: https://github.com/tmlr-group/SAFT.

URL: https://openreview.net/forum?id=SzZOBzueK0

---

Title: Byzantine-Robust Gossip: Insights from a Dual Approach

Authors: Renaud Gaucher, Hadrien Hendrikx, Aymeric Dieuleveut

Abstract: Distributed learning has many computational benefits but is vulnerable to attacks from a subset of devices transmitting incorrect information. This paper investigates Byzantine-resilient algorithms in a decentralized setting, where devices communicate directly in a peer-to-peer manner within a communication network. We leverage the so-called dual approach for decentralized optimization and propose a Byzantine-robust algorithm. We provide convergence guarantees in the average consensus subcase, discuss the potential of the dual approach beyond this subcase, and re-interpret existing algorithms using the dual framework. Lastly, we experimentally show the soundness of our method.

URL: https://openreview.net/forum?id=wrLiUpfk4s

---

Title: Learning to Defer with an Uncertain Rejector via Conformal Prediction

Authors: Yizirui Fang, Eric Nalisnick

Abstract: Learning to defer (L2D) aims to optimize human-AI collaboration by allocating prediction tasks to either a machine learning model or a human expert, depending on which is most likely to be correct. This allocation decision is governed by a rejector: a meta-model that routes inputs based on estimated success probabilities. In practice, a poorly fit or otherwise misspecified rejector can jeopardize the entire L2D workflow due to its crucial role in allocating prediction tasks. In this work, we perform uncertainty quantification for the rejector. We use conformal prediction to allow the rejector to output prediction sets or intervals instead of just the binary outcome of ‘defer’ or not. On tasks ranging from image to hate speech classification, we demonstrate that the uncertainty in the rejector translates to safer decisions via two forms of selective prediction.
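The conformal machinery the paper applies to the rejector is standard split conformal prediction; a hedged sketch on synthetic residuals (our own toy, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

def conformal_radius(cal_scores, alpha=0.1):
    """Split conformal prediction: the ceil((n+1)(1-alpha))-th smallest
    calibration residual yields an interval radius with finite-sample
    (1 - alpha) marginal coverage."""
    n = len(cal_scores)
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return np.sort(cal_scores)[k - 1]

cal = np.abs(rng.normal(0.0, 1.0, 1000))  # synthetic |y - y_hat| residuals
q = conformal_radius(cal, alpha=0.1)
y_hat = 0.3
print((y_hat - q, y_hat + q))  # a 90%-coverage interval around the point estimate
```

The same recipe applies to the rejector's success-probability estimates: calibrate on held-out data, then report sets or intervals instead of a bare defer/no-defer bit.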

URL: https://openreview.net/forum?id=SZQJ8K2DUe

---

Title: CEPAE: Conditional Entropy-Penalized Autoencoders for Time Series Counterfactuals

Authors: Tomas Garriga, Gerard Sanz, Eduard Serrahima de Cambra, Axel Brando

Abstract: The ability to accurately perform counterfactual inference on time series is crucial for decision-making in fields like finance, healthcare, and marketing, as it allows us to understand the impact of events or treatments on outcomes over time. In this paper, we introduce a new counterfactual inference approach tailored to time series data impacted by market events, which arises from an industrial context. Utilizing the abduction-action-prediction procedure and the Structural Causal Model framework, we begin employing methods based on variational autoencoders and adversarial autoencoders, both previously used in counterfactual works although not in time series settings. Then, we present the Conditional Entropy-Penalized Autoencoder (CEPAE), a novel autoencoder-based approach for counterfactual inference, which employs an entropy penalization loss over the latent space to achieve disentangled data representations. We validate our approach both theoretically and experimentally on synthetic, semi-synthetic, and real-world datasets, showing that CEPAE outperforms the other approaches in the evaluated metrics.

URL: https://openreview.net/forum?id=X6lrzqOtQo

---

Title: Benchmarking Missing Data Imputation Methods in Socioeconomic Surveys

Authors: Siyi Sun, David Antony Selby, Yunchuan Huang, Ayush Patnaik, Sebastian Josef Vollmer, Seth Flaxman, Anisoara Calinescu

Abstract: Missing data imputation is a core challenge in socioeconomic surveys, where data is often longitudinal, hierarchical, high-dimensional, not independent and identically distributed, and missing under complex mechanisms. Socioeconomic datasets like the Consumer Pyramids Household Survey (CPHS), the largest continuous household survey in India since 2014, covering 174,000 households, highlight the importance of robust imputation, which can reduce survey costs, preserve statistical power, and enable timely policy analysis. This paper systematically evaluates imputation methods under three missingness mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR), across five missingness ratios ranging from 10% to 50%. We evaluate imputation performance on both continuous and categorical variables, assess the impact on downstream tasks, and compare the computational efficiency of each method. Our results indicate that classical machine learning methods such as MissForest and HyperImpute remain strong baselines with favorable trade-offs between accuracy and efficiency, while deep learning methods perform better under complex missingness patterns and higher missingness ratios, but face scalability challenges. We ran experiments on CPHS and multiple synthetic survey datasets, and found consistent patterns across them. Our framework aims to provide a reliable benchmark for structured socioeconomic surveys, and addresses the critical gap in reproducible, domain-specific evaluation of imputation methods. The open-source code is provided.
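Of the three mechanisms, MCAR is the simplest to simulate; a small sketch of mask generation at a given ratio (our own code, not the benchmark's framework):

```python
import numpy as np

rng = np.random.default_rng(0)

def mcar_mask(shape, ratio):
    """MCAR: each cell is masked independently with the same probability,
    regardless of observed or unobserved values (unlike MAR/MNAR)."""
    return rng.random(shape) < ratio

X = rng.normal(size=(1000, 5))
mask = mcar_mask(X.shape, 0.3)
X_missing = np.where(mask, np.nan, X)
print(round(np.isnan(X_missing).mean(), 2))  # close to the 0.30 target ratio
```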

URL: https://openreview.net/forum?id=HLhi9xhRw6

---

Title: Implicit Probabilistic Reasoning Does Not Reflect Explicit Answers in Large Language Models

Authors: Manuel Mondal, Ljiljana Dolamic, Gérôme Bovet, Philippe Cudre-Mauroux, Julien Audiffren

Abstract: The handling of probabilities in the form of uncertainty or partial information is an essential task for LLMs in many settings and applications. A common approach to evaluate an LLM's probabilistic reasoning capabilities is to assess its ability to answer questions pertaining to probability through the use of multiple-choice questions (MCQs). However, this paradigm, which we refer to as explicit probabilistic reasoning, has been shown in the literature to yield significant limitations (e.g., sensitivity to answer ordering). In this work, we introduce an alternative approach, named implicit probabilistic reasoning, which evaluates the models' ability to integrate probabilistic reasoning into their text generation process. To achieve this, we rephrase MCQs as text-completion scenarios with a determined set of outcomes and compare the model's next-token probability assignments to the true likelihood of the outcomes. In line with previous work, we find that models exhibit solid performance in their explicit probabilistic reasoning (i.e., answers to MCQs). However, during text completion (i.e., implicit probabilistic reasoning), where the same information must be taken into account to generate text, the models' predictions often significantly diverge from the known ground truth. For instance, our evaluation method reveals that implicit probabilistic reasoning is improperly influenced by many factors, such as independent prior events, partial observations about a result, or statistical background information. All of these issues can cause erroneous results to be produced in text generation, which are not detected by conventional MCQ-based evaluation.
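The core comparison the abstract describes, model next-token probabilities against known outcome likelihoods, can be sketched with a total-variation distance over a fixed outcome set (a toy illustration with made-up numbers, not the paper's evaluation pipeline):

```python
import math

def outcome_divergence(token_logprobs, true_probs):
    """Total-variation distance between a model's next-token probabilities
    over a fixed outcome set and the ground-truth likelihoods. The outcome
    logprobs are renormalized over the outcome set before comparison."""
    probs = [math.exp(lp) for lp in token_logprobs]
    z = sum(probs)
    probs = [p / z for p in probs]
    return 0.5 * sum(abs(p - q) for p, q in zip(probs, true_probs))

# A fair-coin completion: outcomes {"heads", "tails"} should each get 0.5,
# but the (hypothetical) model assigns 0.7 / 0.3.
tv = outcome_divergence([math.log(0.7), math.log(0.3)], [0.5, 0.5])
```

A divergence of 0 would mean the model's implicit probabilities exactly match the ground truth; here the toy model is off by 0.2 in total variation.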

URL: https://openreview.net/forum?id=HaaAY4ZXPa

---

Title: CARINOX: Inference-time Scaling with Category-Aware Reward-based Initial Noise Optimization and Exploration

Authors: Seyed Amir Kasaei, Ali Aghayari, Arash Marioriyad, Niki Sepasian, Shayan Baghayi Nejad, MohammadAmin Fazli, Mahdieh Soleymani Baghshah, Mohammad Hossein Rohban

Abstract: Text-to-image diffusion models, such as Stable Diffusion, can produce high-quality and diverse images but often fail to achieve compositional alignment, particularly when prompts describe complex object relationships, attributes, or spatial arrangements. Recent inference-time approaches address this by optimizing or exploring the initial noise under the guidance of reward functions that score text–image alignment—without requiring model fine-tuning. While promising, each strategy has intrinsic limitations when used alone: optimization can stall due to poor initialization or unfavorable search trajectories, whereas exploration may require a prohibitively large number of samples to locate a satisfactory output. Our analysis further shows that neither single reward metrics nor ad-hoc combinations reliably capture all aspects of compositionality, leading to weak or inconsistent guidance. To overcome these challenges, we present Category-Aware Reward-based Initial Noise Optimization and EXploration (CARINOX), a unified framework that combines noise optimization and exploration with a principled reward selection procedure grounded in correlation with human judgments. Evaluations on two complementary benchmarks—covering diverse compositional challenges—show that CARINOX raises average alignment scores by +16% on T2I-CompBench++ and +11% on the HRS benchmark, consistently outperforming state-of-the-art optimization and exploration-based methods across all major categories, while preserving image quality and diversity.

URL: https://openreview.net/forum?id=XB1cwXHV0c

---

Title: Steering Large Reasoning Models towards Concise Reasoning via Flow Matching

Authors: Yawei Li, Benjamin Bergner, Yinghan Zhao, Vihang Prakash Patil, Bei Chen, Cheng Wang

Abstract: Large Reasoning Models (LRMs) excel at complex reasoning tasks, but their efficiency is often hampered by overly verbose outputs. Prior steering methods attempt to address this issue by applying a single, global vector to hidden representations—an approach grounded in the restrictive linear representation hypothesis. In this work, we introduce FlowSteer, a nonlinear steering method that goes beyond uniform linear shifts by learning a complete transformation between the distributions associated with verbose and concise reasoning. This transformation is learned via Flow Matching as a velocity field, enabling precise, input-dependent control over the model's reasoning process. By aligning steered representations with the distribution of concise-reasoning activations, FlowSteer yields more compact reasoning than the linear shifts. Across diverse reasoning benchmarks, FlowSteer demonstrates strong task performance and token efficiency compared to leading inference-time baselines. Our work demonstrates that modeling the full distributional transport with generative techniques offers a more effective and principled foundation for controlling LRMs.
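The flow-matching ingredient can be illustrated with the standard linear-path construction used to train velocity fields; the sketch below only builds the regression targets on synthetic "activations" (all names and data are illustrative, not the paper's model or training code):

```python
import numpy as np

def cfm_pairs(x0, x1, t):
    """Conditional flow-matching training pairs: along the linear path
    x_t = (1 - t) * x0 + t * x1, the regression target for a learned
    velocity field v(x_t, t) is the constant displacement x1 - x0."""
    xt = (1 - t)[:, None] * x0 + t[:, None] * x1
    v_target = x1 - x0
    return xt, v_target

rng = np.random.default_rng(0)
x0 = rng.normal(loc=0.0, size=(8, 4))   # stand-in "verbose" activations
x1 = rng.normal(loc=2.0, size=(8, 4))   # stand-in "concise" activations
t = rng.random(8)                        # one time per sampled pair
xt, v_target = cfm_pairs(x0, x1, t)
```

A network fitted to these (xt, t) → v_target pairs by mean-squared error defines the transport that steering would then follow at inference time.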

URL: https://openreview.net/forum?id=qwcJMdGerK

---

Title: The Five Ws of Multi-Agent Communication: Who Talks to Whom, When, What, and Why - A Survey from MARL to Emergent Language and LLMs

Authors: Jingdi Chen, Hanqing Yang, Zongjun Liu, Carlee Joe-Wong

Abstract: Multi-agent sequential decision-making underpins many real-world systems, from autonomous vehicles and robotics to collaborative AI assistants. In dynamic and partially observable environments, effective communication is essential for reducing uncertainty and enabling coordination. Although research on multi-agent communication (MA-Comm) spans diverse paradigms, we organize this survey explicitly around the Five Ws of communication: who communicates with whom, what is communicated, when communication occurs, and why communication is beneficial. This lens provides a coherent structure for synthesizing diverse approaches and exposing shared design principles across paradigms. Within Multi-Agent Reinforcement Learning (MARL), early work relied on hand-designed or implicit communication protocols, followed by trainable, end-to-end mechanisms optimized for reward and control. While effective, these approaches often yield task-specific and weakly interpretable communication, motivating research on Emergent Language (EL), where agents develop more structured or symbolic protocols through interaction. EL methods, however, still face challenges in grounding, generalization, and scalability, which have driven recent interest in large language models (LLMs) as a means to leverage natural language priors for reasoning, planning, and coordination in open-ended multi-agent settings. This progression motivates our survey: we analyze how communication paradigms evolve in response to the limitations of earlier approaches and how MARL, EL, and LLM-based systems address complementary aspects of multi-agent communication. This paper provides a unified survey of MA-Comm across MARL, EL, and LLM-based multi-agent systems. Organized around the Five Ws, we examine how different paradigms motivate, structure, and operationalize communication, reveal cross-paradigm trade-offs, and identify open challenges in communication, coordination, and learning. By offering systematic comparisons and design-oriented insights, this survey helps the community extract effective communication design patterns and supports the development of hybrid systems that combine learning, language, and control to meet diverse task, scalability, and interpretability requirements.

URL: https://openreview.net/forum?id=LGsed0QQVq

---

Title: Weakly-Supervised Disentangled Representation Learning via Filter-Based Adaptive Swapping

Authors: Zhenyu Zong, Qidi Wang, Simon Yu, Hongpeng Cao, Yanbing Mao, Han Zhao, Lui Sha, Huajie Shao

Abstract: Disentangled representation learning (DRL) aims to uncover semantically meaningful latent factors from observed data, thereby improving both interpretability and generalization of machine learning (ML) models. Despite remarkable progress, unsupervised DRL cannot achieve complete disentanglement without inductive biases or supervision. To address this challenge, existing approaches either rely on full supervision, which demands extensive manual labeling, or weak supervision, which involves complex training strategies that often result in unstable training. To address these limitations, we propose Filter-VAE, a weakly supervised variational autoencoder (VAE) that introduces a filter-based adaptive swapping strategy to learn stable and meaningful disentangled representations. Specifically, a relevance filter removes semantically meaningless latent factors, while an adaptive swapping filter exchanges those latent factors that have reached stability. With these two filters, Filter-VAE adaptively swaps only stable and semantically aligned latent factors, leading to robust and meaningful representations. We evaluate Filter-VAE on three standard benchmarks and a traffic sign dataset we created, in two downstream tasks: disentanglement and adversarial robustness. Experimental results demonstrate that Filter-VAE achieves strong disentanglement performance with reduced supervision and delivers remarkable robustness against diverse adversarial attacks and corruptions. The code is released at https://github.com/ZY-Zong/Filter-VAE.git.

URL: https://openreview.net/forum?id=K69rKKozZU

---

Title: Graph Coarsening using Game Theoretic Approach

Authors: Sonali Raj, Manoj Kumar, Sumit Kumar, Ruchir Gupta, Amit Kumar Jaiswal

Abstract: Graph coarsening is a method for reducing the size of an original graph while preserving its structural and feature-related properties. In graph machine learning, it is often employed as a preprocessing step to improve efficiency and scalability when handling large graph datasets. In this study, we address the challenge of coarsening an original graph into a coarsened graph that retains these characteristics. We propose a Cooperative-Based Graph Coarsening (CGC) algorithm, which leverages cooperative game theory as a framework for combinatorial optimization, aiming to minimize the total Dirichlet energy of the graph through localized optimizations. We prove that the proposed coarsening game is a potential game that guarantees convergence to a stable coarsened graph. Tests on real-world datasets demonstrate that the CGC algorithm surpasses prior state-of-the-art techniques in terms of coarsened graph accuracy and achieves reduced time complexity. These results highlight the potential of game-theoretic approaches in the advancement of graph coarsening techniques.
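The Dirichlet energy that the abstract's objective minimizes has a standard closed form, trace(Xᵀ L X) with L = D − A the combinatorial graph Laplacian; a minimal sketch (illustrative only, not the paper's code):

```python
import numpy as np

def dirichlet_energy(A, X):
    """Dirichlet energy of node features X on a graph with adjacency A:
    0.5 * sum_{ij} A_ij * ||x_i - x_j||^2 == trace(X^T L X),
    where L = D - A is the combinatorial graph Laplacian."""
    L = np.diag(A.sum(axis=1)) - A
    return float(np.trace(X.T @ L @ X))

# Triangle graph with scalar node features 0, 1, 2; pairwise squared
# differences sum to 12 over ordered pairs, so the energy is 6.
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
X = np.array([[0.0], [1.0], [2.0]])
energy = dirichlet_energy(A, X)
```

Smooth features on a well-connected graph give low energy, which is why this quantity is a natural target for structure-preserving coarsening.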

URL: https://openreview.net/forum?id=5vLBjQJCln

---

Title: Pave Your Own Path: Graph Gradual Domain Adaptation on Fused Gromov-Wasserstein Geodesics

Authors: Zhichen Zeng, Ruizhong Qiu, Wenxuan Bao, Tianxin Wei, Xiao Lin, Yuchen Yan, Tarek F. Abdelzaher, Jiawei Han, Hanghang Tong

Abstract: Graph neural networks, despite their impressive performance, are highly vulnerable to distribution shifts on graphs.
Existing graph domain adaptation (graph DA) methods often implicitly assume a mild shift between source and target graphs, limiting their applicability to real-world scenarios with large shifts.
Gradual domain adaptation (GDA) has emerged as a promising approach for addressing large shifts by gradually adapting the source model to the target domain via a path of unlabeled intermediate domains.
Existing GDA methods exclusively focus on independent and identically distributed (IID) data with a predefined path, leaving their extension to non-IID graphs without a given path an open challenge.
To bridge this gap, we present Gadget, the first GDA framework for non-IID graph data.
First (theoretical foundation), the Fused Gromov-Wasserstein (FGW) distance is adopted as the domain discrepancy for non-IID graphs, based on which we derive an error bound for node-, edge- and graph-level tasks, showing that the target domain error is proportional to the length of the path.
Second (optimal path), guided by the error bound, we identify the FGW geodesic as the optimal path, which can be efficiently generated by our proposed algorithm.
The generated path can be seamlessly integrated with existing graph DA methods to handle large shifts on graphs, improving state-of-the-art graph DA methods by up to 6.8% in accuracy on real-world datasets.

URL: https://openreview.net/forum?id=dTPBqTKGPs

---

Title: Clus-UCB: A Near-Optimal Algorithm for Clustered Bandits

Authors: Aakash Gore, Prasanna Chaporkar

Abstract: We study a stochastic multi-armed bandit setting where arms are partitioned into known clusters, such that the parameters of arms within a cluster differ by at most a known threshold. While the clustering structure is known a priori, the arm parameters are unknown. We derive an asymptotic lower bound on the regret that improves upon the classical bound of Lai & Robbins (1985). We then propose Clus-UCB, an efficient algorithm that closely matches this lower bound asymptotically by exploiting the clustering structure and introducing a new index to evaluate an arm, which depends on other arms within the cluster. In this way, arms share information among each other. We present simulation results of our algorithm and compare its performance against KL-UCB and other well-known algorithms for bandits with dependent arms. We discuss the robustness of the proposed algorithm under misspecified prior information, address some limitations of this work, and conclude by outlining possible directions for future research.
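The idea of letting arms share information within a cluster can be sketched by clipping each arm's vanilla UCB with its cluster-mates' bounds (an illustrative index we made up for exposition, not the paper's Clus-UCB index):

```python
import math

def clustered_ucb(means, counts, t, clusters, eps):
    """Illustrative cluster-tightened UCB index: because true means within
    a cluster differ by at most eps, arm i's upper bound can be clipped by
    any cluster-mate's UCB plus eps."""
    ucb = [m + math.sqrt(2 * math.log(t) / n) for m, n in zip(means, counts)]
    tightened = []
    for i, c in enumerate(clusters):
        mates = [ucb[j] + eps for j, cj in enumerate(clusters) if cj == c]
        tightened.append(min(ucb[i], min(mates)))
    return tightened

# Two arms in one cluster: arm 1's index gets clipped to arm 0's UCB + eps,
# even though arm 1's own empirical mean is higher.
idx = clustered_ucb(means=[0.5, 0.9], counts=[10, 10], t=100,
                    clusters=[0, 0], eps=0.1)
```

The tightening is what lets cluster structure reduce exploration: an arm whose cluster-mates already look mediocre cannot keep an inflated upper bound.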

URL: https://openreview.net/forum?id=QDMvPO9WJT

---

Title: MetaSeal: Defending Against Image Attribution Forgery Through Content-Dependent Cryptographic Watermarks

Authors: Tong Zhou, Ruyi Ding, Gaowen Liu, Charles Fleming, Ramana Rao Kompella, Yunsi Fei, Xiaolin Xu, Shaolei Ren

Abstract: The rapid growth of digital and AI-generated images has amplified the need for secure and verifiable methods of image attribution. While digital watermarking offers more robust protection than metadata-based approaches—which can be easily stripped—current watermarking techniques remain vulnerable to forgery, creating risks of misattribution that can damage the reputations of AI model developers and the rights of digital artists. These vulnerabilities arise from two key issues: (1) content-agnostic watermarks, which, once learned or leaked, can be transferred across images to fake attribution, and (2) reliance on detector-based verification, which is unreliable since detectors can be tricked. We present MetaSeal, a novel framework for content-dependent watermarking with cryptographic security guarantees to safeguard image attribution. Our design provides (1) forgery resistance, preventing unauthorized replication and enforcing cryptographic verification; (2) robust, self-contained protection, embedding attribution directly into images while maintaining resilience against benign transformations; and (3) evidence of tampering, making malicious alterations visually detectable. Experiments demonstrate that MetaSeal effectively mitigates forgery attempts and applies to both natural and AI-generated images, establishing a new standard for secure image attribution.
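The forgery-resistance property, a tag cryptographically bound to the exact image content under a key, can be illustrated with a plain HMAC (a toy stand-in: MetaSeal embeds its cryptographic material into the image itself, and its scheme details differ):

```python
import hmac, hashlib

def content_dependent_tag(image_bytes, key):
    """A keyed, content-dependent tag: unlike a content-agnostic watermark,
    the tag is bound to the exact image bytes, so copying it onto a
    different image fails verification."""
    return hmac.new(key, image_bytes, hashlib.sha256).hexdigest()

key = b"attribution-key"
tag = content_dependent_tag(b"image-A-pixels", key)

# Verification succeeds on the original content...
ok = hmac.compare_digest(tag, content_dependent_tag(b"image-A-pixels", key))
# ...and fails when the same tag is transplanted onto different content.
forged = hmac.compare_digest(tag, content_dependent_tag(b"image-B-pixels", key))
```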

URL: https://openreview.net/forum?id=8i3ErmCfdJ

---

Title: Diversity Sampling Regularization for Multi-Domain Generalization

Authors: Lakpa Tamang, Mohamed Reda Bouadjenek, Sunil Aryal, Richard Dazeley

Abstract: Domain Generalization (DG) seeks to create models that can successfully generalize to new, unseen target domains without the need for target domain data during training. Traditional approaches often rely on data augmentation or feature mixing techniques, such as MixUp; however, these methods may fall short in capturing the essential diversity within the feature space, resulting in limited robustness against domain shifts. In this research, we revisit the importance of diversity in DG tasks and propose a simple yet effective method to improve DG performance through diversity-sampling regularization. Specifically, we calculate entropy values for input data to assess their prediction uncertainty, and use these values to guide sampling through a Determinantal Point Process (DPP), which prioritizes selecting data subsets with high diversity. By incorporating DPP-based diversity sampling as a regularization strategy, our framework enhances the standard Empirical Risk Minimization (ERM) objective, promoting the learning of domain-agnostic features without relying on explicit data augmentation. We empirically validate the effectiveness of our method on standard DG benchmarks, including PACS, VLCS, OfficeHome, TerraIncognita, and DomainNet, and through extensive experiments show that it consistently improves generalization to unseen domains and outperforms widely used baselines and state-of-the-art methods without relying on any task-specific heuristics.
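The DPP selection step can be sketched with standard greedy MAP inference on a similarity kernel (the paper additionally builds the kernel from entropy-scored inputs; this sketch omits that and uses made-up features):

```python
import numpy as np

def greedy_dpp_map(K, k):
    """Greedy MAP inference for a DPP with PSD kernel K: at each step add
    the item that maximizes log det of the selected submatrix, which
    trades off item quality (diagonal) against similarity (off-diagonal)."""
    selected = []
    for _ in range(k):
        best, best_val = None, -np.inf
        for i in range(K.shape[0]):
            if i in selected:
                continue
            S = selected + [i]
            sign, logdet = np.linalg.slogdet(K[np.ix_(S, S)])
            if sign > 0 and logdet > best_val:
                best, best_val = i, logdet
        selected.append(best)
    return selected

# Items 0 and 1 are near-duplicates; item 2 is orthogonal to both.
feats = np.array([[1.0, 0.0], [0.99, 0.14], [0.0, 1.0]])
K = feats @ feats.T + 1e-6 * np.eye(3)   # jitter keeps K positive definite
picked = greedy_dpp_map(K, k=2)          # prefers the diverse pair over duplicates
```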

URL: https://openreview.net/forum?id=nXqMt7X2RX

---

Title: The Speed-up Factor: A Quantitative Multi-Iteration Active Learning Performance Metric

Authors: Hannes Kath, Thiago S. Gouvêa, Daniel Sonntag

Abstract: Machine learning models excel with abundant annotated data, but annotation is often costly and time-intensive.
Active learning (AL) aims to improve the performance-to-annotation ratio by using query methods (QMs) to iteratively select the most informative samples.
While AL research focuses mainly on QM development, the evaluation of this iterative process lacks appropriate performance metrics.
This work reviews eight years of AL evaluation literature and formally introduces the speed-up factor, a quantitative multi-iteration QM performance metric that indicates the fraction of samples needed to match random sampling performance.
Using four datasets from diverse domains and seven QMs of various types, we empirically evaluate the speed-up factor and compare it with state-of-the-art AL performance metrics.
The results confirm the assumptions underlying the speed-up factor, demonstrate its accuracy in capturing the described fraction, and reveal its superior stability across iterations.
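One way to read "fraction of samples needed to match random sampling performance" is as a label-budget ratio between two learning curves; the sketch below is our illustrative reading on made-up curves, not the paper's formal definition:

```python
def speedup_factor(n_labels, acc_qm, acc_rand):
    """Illustrative speed-up factor: for each random-sampling operating
    point, find the smallest label budget at which the query method (QM)
    reaches the same accuracy, and average the budget ratios. Values
    below 1 mean the QM needs only that fraction of the labels random
    sampling uses."""
    ratios = []
    for n, a in zip(n_labels, acc_rand):
        reached = [m for m, b in zip(n_labels, acc_qm) if b >= a]
        if reached:
            ratios.append(reached[0] / n)
    return sum(ratios) / len(ratios)

# Learning curves (accuracy vs. number of labels) on made-up data.
n_labels = [100, 200, 300, 400]
acc_rand = [0.60, 0.70, 0.75, 0.78]
acc_qm   = [0.70, 0.78, 0.82, 0.84]   # the QM matches random with fewer labels
factor = speedup_factor(n_labels, acc_qm, acc_rand)
```

Unlike single-iteration metrics, a curve-level ratio like this summarizes the whole iterative process in one number, which is the property the paper argues for.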

URL: https://openreview.net/forum?id=q6hRb6fETo

---

Title: BiSSL: Enhancing the Alignment Between Self-Supervised Pretraining and Downstream Fine-Tuning via Bilevel Optimization

Authors: Gustav Wagner Zakarias, Lars Kai Hansen, Zheng-Hua Tan

Abstract: Models initialized from self-supervised pretraining may suffer from poor alignment with downstream tasks, limiting the extent to which subsequent fine-tuning can adapt relevant representations acquired during the pretraining phase. To mitigate this, we introduce BiSSL, a novel bilevel training framework that enhances the alignment of self-supervised pretrained models with downstream tasks by explicitly incorporating both the pretext and downstream tasks into a preparatory training stage prior to fine-tuning. BiSSL solves a bilevel optimization problem in which the lower-level adheres to the self-supervised pretext task, while the upper-level encourages the lower-level backbone to align with the downstream objective. The bilevel structure facilitates enhanced information sharing between the tasks, ultimately yielding a backbone model that is more aligned with the downstream task, providing a better initialization for subsequent fine-tuning. We propose a general training algorithm for BiSSL that is compatible with a broad range of pretext and downstream tasks. We demonstrate that our proposed framework significantly improves accuracy on the vast majority of a broad selection of image-domain downstream tasks, and that these gains are consistently retained across a wide range of experimental settings. In addition, exploratory alignment analyses further underpin that BiSSL enhances downstream alignment of pretrained representations.

URL: https://openreview.net/forum?id=GQAGlqOpyA

---

Title: The Internal Growth Function: A More General PAC Framework for Scenario Decision Making

Authors: Guillaume O Berger, Raphael Jungers

Abstract: This paper introduces a new PAC framework for scenario decision-making problems.
Scenario decision making consists in making a decision that satisfies a probabilistic constraint (also called a chance constraint) from finitely many sampled realizations (called scenarios) of the constraint.
PAC bounds are sufficient conditions on the number of samples to guarantee with high confidence that the sample-based decision satisfies the true constraint with a prescribed probability.
Existing PAC bounds rely on intrinsic properties of the problem, such as convexity (Calafiore and Campi, 2005), finite VC dimension (Alamo et al., 2009) or existence of a compression scheme (Margellos et al., 2014).
While powerful in some applications, these PAC bounds can be vacuous (or infinite) when the properties are not satisfied.
In this paper, we propose a new PAC framework, leading to PAC bounds that are not vacuous for a strictly larger class of scenario decision-making problems.
This bound is based on the novel notion of "internal growth", which adapts the notion of the "growth function" from classical machine learning (Vapnik and Chervonenkis, 1968) to scenario decision making.
We also relate this notion to other novel properties of the system, such as the $k$-VC dimension.
Furthermore, we show a partial converse result: namely, that for the family of stable monotone scenario decision algorithms, the algorithm is PAC if \emph{and only if} it satisfies our criterion.
Finally, we demonstrate the usefulness of our framework, and compare with existing approaches, on practical problems.

URL: https://openreview.net/forum?id=HqPKJSAkrp

---

Title: Bootstrapping Task Spaces for Self-Improvement

Authors: Minqi Jiang, Andrei Lupu, Yoram Bachrach

Abstract: Progress in many task domains emerges from repeated revisions to previous solution attempts. Training agents that can reliably self-improve over such sequences at inference-time is a natural target for reinforcement learning (RL), yet the naive approach assumes a fixed maximum iteration depth, which can be both costly and arbitrary. We present Exploratory Iteration (ExIt), a family of autocurriculum RL methods that directly exploits the recurrent structure of self-improvement tasks to train LLMs to perform multi-step self-improvement at inference-time while only training on the most informative single-step iterations. ExIt grows a task space by selectively sampling the most informative intermediate, partial histories encountered during an episode for continued iteration, treating these starting points as new self-iteration task instances to train a self-improvement policy. ExIt can further pair with explicit exploration mechanisms to sustain greater task diversity. Across several domains, encompassing competition math, multi-turn tool-use, and machine learning engineering, we demonstrate that ExIt strategies, starting from either a single or many task instances, can produce policies exhibiting strong inference-time self-improvement on held-out task instances, and the ability to iterate towards higher performance over a step budget extending beyond the average iteration depth encountered during training.

URL: https://openreview.net/forum?id=k2VsgUxC6X

---

Title: Segmentation From Attention: Training-Free Layer Selection and One-Shot Tuning for Segmentation in VLMs

Authors: Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal, James J. Little

Abstract: Large-scale vision-language models (VLMs), trained on extensive datasets of image-text pairs, exhibit strong multimodal understanding capabilities by implicitly learning associations between textual descriptions and image regions. This emergent ability enables zero-shot object detection and segmentation, using techniques that rely on text-image attention maps, without necessarily training on abundant labeled segmentation datasets. However, the performance of such methods depends heavily on prompt engineering and manually selected attention layers or heads. In this work, we propose a training-free entropy-based measure, InfoScore, to identify the best image-text attention layers for segmentation, providing a more flexible and scalable solution for training-free open-vocabulary segmentation and reducing the additional burden of hyperparameter search. We empirically show that our training-free selection strategy is superior to naive selection strategies. Additionally, we demonstrate that instead of solely relying on text prompts, fine-tuning the image-text attention layer with a single visual example of each class significantly improves segmentation without the need for additional parameters or decoders. Moreover, we show that our methods and findings are general and can be applied across various vision-language models (VLMs).
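An entropy-style layer score can be illustrated on toy attention maps (the exact InfoScore formula is not reproduced here; this only shows why low-entropy, spatially localized attention is the natural candidate for segmentation):

```python
import numpy as np

def attention_entropy(attn):
    """Shannon entropy of text-to-image attention, one value per layer:
    attn has shape (layers, patches); each row is normalized to a
    distribution over image patches before computing entropy."""
    p = attn / attn.sum(axis=-1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

# A peaked (localized) layer vs. a uniform (uninformative) one,
# over a toy image of 4 patches.
peaked  = np.array([0.90, 0.05, 0.03, 0.02])
uniform = np.full(4, 0.25)
scores = attention_entropy(np.stack([peaked, uniform]))
best_layer = int(np.argmin(scores))   # lower entropy -> more localized
```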

URL: https://openreview.net/forum?id=a5lAwubXro

---

Title: Improving Detection of Rare Nodes in Hierarchical Multi-Label Learning

Authors: Isaac Xu, Martin Gillis, Ayushi Sharma, Benjamin Misiuk, Craig J. Brown, Thomas Trappenberg

Abstract: In hierarchical multi-label classification, a persistent challenge is enabling model predictions to reach deeper levels of the hierarchy for more detailed or fine-grained classifications. This difficulty partly arises from the natural rarity of certain classes (or hierarchical nodes) and the hierarchical constraint that ensures child nodes are almost always less frequent than their parents. To address this, we propose a weighted loss objective for neural networks that combines node-wise imbalance weighting with focal weighting components, the latter leveraging modern quantification of ensemble uncertainties. By emphasizing rare nodes rather than rare observations (data points), and focusing on uncertain nodes for each model output distribution during training, we observe improvements in recall by up to a factor of five on benchmark datasets, along with statistically significant gains in $F_{1}$ score. We also show our approach aids convolutional networks on challenging tasks, as in situations with suboptimal encoders or limited data.
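The two weighting components, inverse node frequency and a focal term, can be combined in a short numpy sketch (illustrative only: the paper's weighting also uses ensemble-uncertainty estimates, which are omitted here, and its exact formula may differ):

```python
import numpy as np

def node_weighted_focal_bce(p, y, node_freq, gamma=2.0):
    """Per-node focal binary cross-entropy with inverse-frequency weights:
      w_j ~ 1 / freq_j      emphasizes rare hierarchy nodes,
      (1 - p_t)^gamma       down-weights nodes the model already gets right."""
    w = 1.0 / np.asarray(node_freq)
    w = w / w.sum()                           # normalized node weights
    p_t = np.where(y == 1, p, 1 - p)          # probability of the true label
    focal = (1 - p_t) ** gamma
    bce = -np.log(np.clip(p_t, 1e-12, 1.0))
    return float((w * focal * bce).sum())

# Two hierarchy nodes: a common parent (freq 0.5) and a rare child (freq 0.01).
# The rare, less-confident node dominates the loss, as intended.
loss = node_weighted_focal_bce(p=np.array([0.9, 0.6]),
                               y=np.array([1, 1]),
                               node_freq=[0.5, 0.01])
```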

URL: https://openreview.net/forum?id=hf4zEWWIvE

---

Title: An Efficient Subset Selection Strategy Using Text-Guided Data Attribution to Mitigate Simplicity Bias

Authors: Kumar Shubham, Pranav Sastry, Prathosh AP

Abstract: The effectiveness of deep learning models heavily relies on the quality and diversity of their training data. However, datasets collected from different sources often introduce simplicity biases, where models rely on easily learnable but non-predictive (spurious) features for their predictions. While existing debiasing techniques focus on model robustness, they leave the data untouched. However, as data becomes increasingly valuable, identifying and mitigating bias directly at the data level has grown in importance. Recently, data attribution has emerged as a promising tool for uncovering issues in training data, yet its vulnerability to simplicity bias has received limited attention. In this work, we propose a novel data deletion framework that combines Neural Tangent Kernel (NTK)-based data attribution with textual descriptions of bias to identify and remove training samples that do not significantly affect model performance. We first demonstrate that NTK-based data attribution methods can themselves be influenced by spurious features. Subsequently, to mitigate this, we use available metadata or, when unavailable, a vision-language model to annotate a small validation set and extract a textual description of the bias. Based on this description and the attribution score, we identify the subset of training data that is semantically aligned with the spurious feature and affects the generalization of the model. Removing these samples from the training dataset and training the model on the new subset improves the average and worst-group accuracy of the model, outperforming existing attribution-based baselines.

URL: https://openreview.net/forum?id=zZ5YundT95

---

Title: QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design

Authors: Benjamin Schneider, Dongfu Jiang, Chao Du, Tianyu Pang, Wenhu Chen

Abstract: Long video understanding has emerged as a crucial capability in real-world applications such as meeting summarization, video surveillance, educational lecture analysis, and content moderation. However, it remains computationally prohibitive for VideoLLMs, primarily due to two bottlenecks: 1) sequential video decoding, in which converting the raw bitstream to RGB frames can take up to a minute for hour-long video inputs, and 2) costly prefilling of up to several million tokens for LLM inference, resulting in high latency and memory use. To address these challenges, we propose QuickVideo, a system-algorithm co-design that substantially accelerates long video understanding to support real-time downstream applications. It comprises three key innovations: QuickCodec, a parallelized CPU-based video decoder that achieves 2–3× speedup by splitting videos into keyframe-aligned intervals processed concurrently; QuickPrefill, a memory-efficient prefilling method using KV-cache pruning to support more frames with less GPU memory; and an overlapping scheme that runs CPU video decoding concurrently with GPU inference. Together, these components reduce the time required to process a long video input by a minute, enabling fast, efficient video understanding even on limited hardware. Experiments show that QuickVideo generalizes across durations and sampling rates, making long video processing feasible in practice.
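The overlapping scheme is, structurally, a classic producer-consumer pipeline; a toy stand-in with a bounded queue (strings play the role of decoded frames and generated tokens; this is not the paper's implementation):

```python
import queue, threading

def decode_chunks(video_chunks, q):
    """Producer: stand-in for CPU-side decoding of keyframe-aligned intervals."""
    for chunk in video_chunks:
        q.put(f"frames({chunk})")   # stand-in for decoded RGB frames
    q.put(None)                      # sentinel: decoding finished

def infer(q, results):
    """Consumer: stand-in for GPU prefill/inference on decoded frames."""
    while (frames := q.get()) is not None:
        results.append(f"tokens({frames})")

# Decoding and inference run concurrently, connected by a bounded queue,
# so neither stage idles while the other works.
q, results = queue.Queue(maxsize=4), []
producer = threading.Thread(target=decode_chunks, args=(["c0", "c1", "c2"], q))
consumer = threading.Thread(target=infer, args=(q, results))
producer.start(); consumer.start()
producer.join(); consumer.join()
```

The bounded queue caps how far decoding can run ahead of inference, which is what keeps memory use flat on long inputs.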

URL: https://openreview.net/forum?id=Rpcxgzcsuc

---

Title: Large Language Model-based Data Science Agent: A Survey

Authors: Ke Chen, Peiran Wang, Yaoning Yu, Xianyang Zhan, Haohan Wang

Abstract: The rapid advancement of Large Language Models (LLMs) has driven novel applications across diverse domains, with LLM-based agents emerging as a crucial area of exploration. This survey presents a comprehensive analysis of LLM-based agents designed for data science tasks, summarizing insights from recent studies. From the agent perspective, we discuss the key design principles, covering agent roles, execution, knowledge, and reflection methods. From the data science perspective, we identify key processes for LLM-based agents, including data preprocessing, model development, evaluation, visualization, etc. Our work offers two key contributions: (1) a comprehensive review of recent developments in applying LLM-based agents to data science tasks; (2) a dual-perspective framework that connects general agent design principles with the practical workflows in data science.

URL: https://openreview.net/forum?id=ZT5SJQN0CS

---

Title: Model-Free Learning with Heterogeneous Dynamical Systems: A Federated LQR Approach

Authors: Han Wang, Leonardo Felipe Toso, Aritra Mitra, James Anderson

Abstract: We study a model-free federated linear quadratic regulator (LQR) problem where M agents with unknown, distinct yet similar dynamics collaboratively learn an optimal policy to minimize an average quadratic cost while keeping their data private. To exploit the similarity of the agents' dynamics, we propose to use federated learning (FL) to allow the agents to periodically communicate with a central server to train policies by leveraging a larger dataset from all the agents. With this setup, we seek to understand the following questions: (i) Is the learned common policy stabilizing for all agents? (ii) How close is the learned common policy to each agent's own optimal policy? (iii) Can each agent learn its own optimal policy faster by leveraging data from all agents? To answer these questions, we propose the federated and model-free algorithm FedLQR. Our analysis overcomes numerous technical challenges, such as heterogeneity in the agents’ dynamics, multiple local updates, and stability concerns. We show that FedLQR produces a common policy that, at each iteration, is stabilizing for all agents. Moreover, we prove that when learning each agent's optimal policy, FedLQR achieves a sample complexity reduction proportional to the number of agents M in a low-heterogeneity regime, compared to the single-agent setting.

URL: https://openreview.net/forum?id=WSRQeCUc3g

---

Title: Unlocking [CLS] Features for Continual Post-Training

Authors: Murat Onur Yildirim, Elif Ceren Gok Yildirim, Joaquin Vanschoren

Abstract: Continual learning requires models to integrate new classes or domains over time while preserving previously acquired knowledge. Within this paradigm, foundation models often achieve strong performance, but they remain subject to the stability–plasticity trade-off, where excessive plasticity leads to forgetting of prior knowledge and excessive stability constrains adaptation. This necessitates an effective post-training strategy that introduces minimal yet functional modifications. To address this challenge, we first introduce a new parameter-efficient fine-tuning module, ‘Learn and Calibrate’ or LuCA, designed to acquire task-specific knowledge through an adapter–calibrator couple, enabling well-refined feature representations. Then, for each task, we deploy a sparse LuCA module on top of the last classification token [CLS] just before the classifier, which we refer to as ‘Token-level Sparse Calibration and Adaptation’, or TOSCA. By leaving the generalization capabilities of the foundation model intact and adapting exclusively via the last token, our approach achieves a harmonious balance between stability and plasticity while reducing both training and inference complexity. We demonstrate that TOSCA yields state-of-the-art performance while introducing 8 times fewer parameters than prior methods.
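The structural idea, a small trainable module applied only to the final [CLS] feature with the backbone frozen, can be sketched as follows. The adapter/calibrator internals below (a low-rank residual branch gated by a per-dimension sigmoid) are our assumption from the abstract's wording, not the paper's published design; sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 768, 16   # [CLS] feature dim and a low adapter rank (hypothetical sizes)

# Sketch of an adapter-calibrator couple on the last [CLS] token only:
# the frozen backbone and classifier are untouched, so plasticity lives
# entirely in these few parameters.
W_down = rng.standard_normal((d, r)) * 0.02   # adapter: down-projection
W_up   = rng.standard_normal((r, d)) * 0.02   # adapter: up-projection
w_cal  = rng.standard_normal(d) * 0.02        # calibrator: per-dim gate weights

def luca(cls_feat):
    adapted = np.maximum(cls_feat @ W_down, 0.0) @ W_up       # task-specific shift
    gate = 1.0 / (1.0 + np.exp(-(cls_feat * w_cal)))          # calibration gate
    return cls_feat + gate * adapted                          # residual keeps stability

cls_token = rng.standard_normal((4, d))   # a batch of frozen-backbone [CLS] features
out = luca(cls_token)
print(out.shape)   # (4, 768): same interface, so the classifier is reused as-is
```

Note the parameter count, 2dr + d, is far below a single dense d-by-d layer, which is the sense in which such token-level modules are parameter-efficient.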

URL: https://openreview.net/forum?id=OWfWyj6krc

---

Title: Relative Geometry of Neural Forecasters: Linking Accuracy and Alignment in Learned Latent Geometry

Authors: Deniz Kucukahmetler, Maximilian Jean Hemmann, Julian Mosig von Aehrenfeld, Maximilian Amthor, Christian Deubel, Nico Scherf, Diaaeldin Taha

Abstract: Neural networks can accurately forecast complex dynamical systems, yet how they internally represent underlying latent geometry remains poorly understood. We study neural forecasters through the lens of representational alignment, introducing anchor-based, geometry-agnostic relative embeddings that remove rotational and scaling ambiguities in latent spaces. Applying this framework across seven canonical dynamical systems—ranging from periodic to chaotic—we reveal reproducible family-level structure: multilayer perceptrons align with other MLPs, recurrent networks with RNNs, while transformers and echo-state networks achieve strong forecasts despite weaker alignment. Alignment generally correlates with forecasting accuracy, yet high accuracy can coexist with low alignment. Relative geometry thus provides a simple, reproducible foundation for comparing how model families internalize and represent dynamical structure.
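The key property of anchor-based relative embeddings, invariance to rotation and uniform rescaling of the latent space, can be verified in a few lines. This is a generic sketch of the relative-representation idea as we read it from the abstract (cosine similarities to a fixed anchor set), not the paper's exact pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_anchors = 100, 32, 10

def relative_embedding(Z, anchors):
    # represent each latent by its cosine similarities to a set of anchors;
    # any orthogonal map and uniform scale cancel out of these similarities
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    An = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return Zn @ An.T

Z = rng.standard_normal((n, d))           # latents from "model 1"
idx = rng.choice(n, n_anchors, replace=False)

# "model 2": the same geometry up to a random rotation Q and a scale s
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
s = 3.7
Z2 = s * (Z @ Q)

R1 = relative_embedding(Z, Z[idx])
R2 = relative_embedding(Z2, Z2[idx])
print(np.allclose(R1, R2))   # True: relative views coincide despite the transform
```

Because the two relative views coincide exactly, comparing models in this space measures shared geometry rather than arbitrary coordinate choices, which is what makes cross-family alignment comparisons meaningful.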

URL: https://openreview.net/forum?id=t4stf5Gafz

---


New submissions
===============


Title: Efficient Text-Attributed Graph Learning through Selective Annotation and Graph Alignment

Abstract: In the realm of Text-attributed Graphs (TAGs), traditional graph neural networks (GNNs) often fall short due to the complex textual information associated with each node. Recent methods have improved node representations by leveraging large language models (LLMs) to enhance node text features, but these approaches typically require extensive annotations or fine-tuning across all nodes, which is both time-consuming and costly. To overcome these challenges, we introduce GAGA, an efficient framework for TAG representation learning. GAGA reduces annotation time and cost by focusing on annotating only representative nodes and edges. It constructs an annotation graph that captures the topological relationships among these annotations. Furthermore, GAGA employs a two-level alignment module to effectively integrate the annotation graph with the TAG, aligning their underlying structures. Experiments show that GAGA achieves classification accuracy on par with or surpassing state-of-the-art methods while requiring only 1% of the data to be annotated, demonstrating its high efficiency.

URL: https://openreview.net/forum?id=UBIPauyTYp

---

Title: Joint Encoding of KV-Cache Blocks for Scalable LLM Serving

Abstract: Modern large language models (LLMs) drive interactive AI systems but are bottlenecked by the memory-heavy growth of key-value (KV) caches, which limits real-time throughput under concurrent loads. Existing KV-cache compression methods rely on rigid heuristics, disrupt tensor layouts, or require specialized compute, hindering scalability and deployment.

We propose joint encoding of KV-cache blocks, which fuses similar blocks across requests and input chunks into shared representations while preserving standard cache structure. This alleviates the KV-cache memory bottleneck, supporting high-concurrency serving without specialized hardware. Theoretically, we analyze the rate-distortion tradeoff of fused cache blocks under a Poisson process model. Empirically, our method achieves up to 4.38$\times$ KV-cache compression with negligible accuracy loss across diverse LLMs and benchmarks, outperforming recent structured and adaptive compression baselines. In real LLM serving, joint encoding improves the token throughput by $\sim$40\% on a single-machine vLLM benchmark, demonstrating substantial gains in inference throughput. Code is available at
\href{https://anonymous.4open.science/r/kv_joint_encoding-55B0/}{\nolinkurl{kv_joint_encoding-55B0}}.
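The fusion idea, storing one shared copy for near-duplicate KV blocks across requests, can be sketched with a greedy clusterer. This is our own toy reading of "fuses similar blocks into shared representations", not the paper's algorithm or its rate-distortion machinery; the similarity threshold and block shapes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def fuse_blocks(blocks, tau=0.98):
    # Greedy joint encoding: each incoming KV block either reuses an existing
    # representative (if cosine similarity >= tau) or becomes a new one.
    reps, assign = [], []
    for blk in blocks:
        v = blk.ravel()
        v = v / np.linalg.norm(v)
        hit = next((j for j, r in enumerate(reps) if v @ r >= tau), None)
        if hit is None:
            reps.append(v)
            hit = len(reps) - 1
        assign.append(hit)
    return reps, assign

# simulate concurrent requests whose caches share near-duplicate blocks
# (e.g. common prompt prefixes): 8 distinct contents, 6 copies each
base = rng.standard_normal((8, 4, 16))
blocks = [b + 0.001 * rng.standard_normal(b.shape) for b in base for _ in range(6)]

reps, assign = fuse_blocks(blocks)
ratio = len(blocks) / len(reps)
print(ratio)   # stored blocks shrink by ~6x when duplication is this high
```

The cache layout is untouched in this sketch, only the backing storage is deduplicated, which mirrors the abstract's claim of preserving the standard cache structure.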

URL: https://openreview.net/forum?id=xh7IfhHtDW

---

Title: Scaling Large Language Models with Fully Sparse Activations

Abstract: Activation sparsity can reduce the inference cost of large language models (LLMs) by lowering both compute and memory traffic. Yet most existing approaches sparsify only FFN intermediate states, leaving substantial portions of inference effectively dense. We study how to scale fully sparsely activated LLMs, in which every activation participating in linear transformations is sparse. We focus on two questions: how to train such models effectively, and how activation sparsity affects model quality as scale increases. We develop a pre-training recipe that enables effective training of fully sparsely activated LLMs from scratch, including using squared ReLU as the activation function, top-K sparsification, and a straight-through estimator for the remaining linear layers. Extensive experiments spanning model sizes, training-token budgets, and target sparsity levels reveal that the performance gap to dense baselines narrows with model scale, grows nonlinearly with sparsity, and remains largely insensitive to the training-token budget. Finally, we investigate post-training activation sparsification of pre-trained dense models via both training-free techniques and supervised fine-tuning, and observe a trend similar to the pre-training experiments: larger models are more robust to sparsification and exhibit increasingly sparse activation patterns. Overall, our results provide practical training recipes and empirical guidance for building and scaling LLMs with fully sparse activations.
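The forward pass of the recipe's core ingredients, squared ReLU followed by per-row top-K sparsification, is easy to sketch. This is a generic forward-only illustration of those two named operations; in training, the straight-through estimator the abstract mentions would let gradients pass through the binary top-K mask unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

def sq_relu_topk(x, k):
    """Squared ReLU, then keep only the k largest activations per row.
    Forward pass only; a straight-through estimator would treat the
    mask as identity in the backward pass."""
    h = np.maximum(x, 0.0) ** 2                       # squared ReLU
    thresh = np.sort(h, axis=-1)[..., -k][..., None]  # k-th largest per row
    mask = h >= thresh
    return h * mask, mask

x = rng.standard_normal((4, 512))
y, mask = sq_relu_topk(x, k=32)
print(mask.sum(axis=-1))   # 32 active units per row: 93.75% activation sparsity
```

With k = 32 out of 512, only 6.25% of activations enter the next linear layer, which is where the compute and memory-traffic savings come from.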

URL: https://openreview.net/forum?id=MntjMCroiE

---

Title: Latent learning: episodic memory complements parametric learning by enabling flexible reuse of experiences

Abstract: When do machine learning systems fail to generalize, and what mechanisms could improve their generalization? Here, we draw inspiration from cognitive science to argue that one weakness of parametric machine learning systems is their failure to exhibit \emph{latent learning}---learning information that is not relevant to the task at hand, but that might be useful in a future task. Using controlled, synthetic benchmarks, we show how this perspective links failures ranging from the reversal curse in language modeling to new findings on agent-based navigation. We then highlight how cognitive science points to episodic memory as a potential part of the solution to these issues. Correspondingly, we show that a system with an oracle retrieval mechanism can use learning experiences more flexibly to generalize better across many of these challenges---thus motivating episodic memory as an important direction for research in AI. We also identify some of the essential components for effectively using retrieval, including the importance of \emph{within-experience} in-context learning for acquiring the ability to use information \emph{across} retrieved experiences. In summary, our results illustrate one possible contributor to the relative data inefficiency of current machine learning systems compared to natural intelligence, and help to understand how retrieval methods might complement parametric learning to improve generalization. We close by discussing some of the links between our work and findings in cognitive science and neuroscience---including a possible perspective on hippocampal contributions to generalization---and the broader implications.

URL: https://openreview.net/forum?id=RuWGeX5ZiB

---

Title: Robust Answers, Fragile Logic: Probing the Decoupling Hypothesis in LLM Reasoning

Abstract: While Chain-of-Thought (CoT) prompting has become a cornerstone for complex reasoning in Large Language Models (LLMs), the faithfulness of the generated reasoning remains an open question. We investigate the Decoupling Hypothesis: that correct answers often mask fragile, post-hoc rationalizations that are not causally tied to the model's prediction. To systematically verify this, we introduce MATCHA, a novel Answer-Conditioned Probing framework. Unlike standard evaluations that focus on final output accuracy, MATCHA isolates the reasoning phase by conditioning generation on the model's predicted answer, allowing us to stress-test the stability of the rationale itself. Our experiments reveal a critical vulnerability: under imperceptible input perturbations, LLMs frequently maintain the correct answer while generating inconsistent or nonsensical reasoning - effectively being ``Right for the Wrong Reasons''. Using LLM judges to quantify this robustness gap, we find that multi-step and commonsense tasks are significantly more susceptible to this decoupling than logical tasks. Furthermore, we demonstrate that adversarial examples generated by MATCHA transfer non-trivially to black-box models. Our findings expose the illusion of CoT robustness and underscore the need for future architectures that enforce genuine answer-reasoning consistency rather than mere surface-level accuracy.

URL: https://openreview.net/forum?id=pMhTFUdM4G

---

Title: FreeEyeglass: Training-free and Target-mask-free Eyeglass Transfer for Facial Videos

Abstract: The rise of e-commerce and short-video platforms has fueled demand for realistic video-based virtual try-on. Unlike virtual try-on of clothing, which has been actively studied to date, virtual try-on of eyeglasses is uniquely challenging: they physically interact with facial geometry and strongly affect facial identity, making the faithful preservation of unedited regions especially important. Existing generative editing approaches, such as GAN- and diffusion-based methods, lack reconstruction objectives and often rely on inpainting, which fails to ensure identity consistency. We argue that semantic editing requires not only plausible generation but also faithful reconstruction, making autoencoder-based latent spaces particularly suitable. We introduce a training-free, reference-guided framework for video eyeglass transfer built on Diffusion Autoencoders (DiffAE). By blending semantic features in the encoder and incorporating spatial-temporal self-attention, our method achieves realistic, identity-preserving, and temporally consistent results, and points to the potential of autoencoder-based latent spaces for local video editing. Our implementations and datasets will be released upon acceptance.

URL: https://openreview.net/forum?id=6aFRoQcm3H

---

Title: Test-Time Adaptation of Vision-Language Models with Low-Rank Pseudo-Consistency

Abstract: While test-time adaptation (TTA) methods enable vision-language models (VLMs) to adapt under distribution shifts, they typically rely on simple feature transformations following frozen encoders while learning from potentially noisy pseudo-labels. This approach may limit adaptation under significant domain shifts. In this paper, we propose PseudoAdapter, a novel TTA framework for VLMs that introduces low-rank adapters into early layers of the encoder to enable domain-specific feature adaptation while maintaining generalization. To ensure effective learning from noisy and low-confidence predictions, PseudoAdapter combines confidence-calibrated pseudo-labelling with unsupervised consistency learning across augmented views. We further extend our approach with PseudoAdapter+, which integrates selective teacher supervision to improve adaptation with minimal overhead. Extensive evaluations on four out-of-distribution and ten cross-domain benchmarks demonstrate that our method outperforms prior state-of-the-art TTA approaches by an average of 6.84\% and 3.25\%, respectively. Ablation studies confirm the effectiveness of each proposed component.

URL: https://openreview.net/forum?id=GDw4pvX9aG

---

Title: Unifying Understanding and Generation in Vision-Language Models: Advances, Challenges, and Opportunities

Abstract: Significant advancements in vision-language models have predominantly followed two divergent trajectories: autoregressive architectures optimized for visual understanding and diffusion-based frameworks designed for high-fidelity generation. However, this separation hinders the development of truly versatile multimodal agents. Unifying these capabilities is a critical step toward Artificial General Intelligence, as recent findings suggest that effective understanding and generation can mutually reinforce each other. This survey provides a comprehensive overview of the emerging field of unified vision-language models and proposes a systematic taxonomy based on the core visual representation mechanism: \textit{continuous} versus \textit{discrete} visual tokens. For continuous visual tokens, we analyze how models bridge the semantic-visual gap by categorizing integration strategies into Serial Coupling, where LLMs act as planners, and Parallel Coupling, which enables bidirectional interaction. Regarding discrete visual tokens, we contrast Autoregressive approaches that treat images as a foreign language against emerging Discrete Diffusion paradigms known for their global consistency and parallel decoding. Beyond architectural analysis, we provide a curated compilation of datasets and benchmarks essential for training and evaluation. Finally, we critically discuss open challenges such as tokenization trade-offs, training stability, and scalability, while outlining future directions for building seamless, omni-capable multimodal systems.

URL: https://openreview.net/forum?id=AIMmeOrVFL

---

Title: COMPLEXITY-DEEP: A Language Model Architecture with Mu-Guided Attention and Token-Routed MLP

Abstract: We present COMPLEXITY-DEEP, a large language model (LLM) architecture developed from scratch, introducing three original contributions: (1) Token-Routed MLP, a dynamic per-token routing mechanism inspired by Mixture of Experts but without requiring an auxiliary load balancing loss, (2) Mu-Guided Attention, where a latent state μ from the previous layer guides the K, Q, and V projections, creating a bidirectional information flow between attention and dynamics, and (3) a PID-style adaptive controller that stabilizes training through dynamic scaling. We provide formal theoretical analysis proving perfect load balance, capacity equivalence with dense models at 1/n compute cost, and gradient-driven expert orthogonalization, and we establish connections between Mu-Guidance and predictive coding theory. Our 1.5B parameter implementation, trained on 33B tokens from FineWeb-Edu, demonstrates the viability of this architecture with stable convergence (loss 3.78, perplexity 43.7). Evaluation on standard benchmarks shows performance consistent with model size, with supervised fine-tuning achieving 30% on MMLU (+5% above random) and 23% on ARC-Challenge.

URL: https://openreview.net/forum?id=jZq6EVboC6

---

Title: Robust Cross-Domain Alignment

Abstract: The Gromov-Wasserstein (GW) distance is an effective measure of alignment between distributions supported on distinct ambient spaces. Essentially quantifying the mutual departure from isometry, it has found wide use in domain translation and network analysis. However, it has long been known to be vulnerable to contamination in the underlying measures. Efforts to introduce robustness into GW have so far been inspired by similar optimal transport (OT) techniques, which predominantly advocate partial mass transport or unbalancing. In contrast, the cross-domain alignment problem, being fundamentally different from OT, demands specific solutions to tackle diverse applications and contamination regimes. Drawing from robust statistics, we discuss three contextually novel techniques to robustify GW and its variants. For each method, we explore metric properties and robustness guarantees, along with their co-dependencies and individual relations to the GW distance. For a comprehensive view, we empirically validate their superior resilience to contamination on real machine learning tasks against state-of-the-art methods.
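For reference, the object being robustified is standardly defined as follows (the usual squared GW distance from the OT literature between metric-measure spaces $(X, d_X, \mu)$ and $(Y, d_Y, \nu)$; the submission's exact notation may differ):

```latex
GW_2^2(\mu, \nu) \;=\; \min_{\pi \in \Pi(\mu, \nu)}
\iint \bigl| d_X(x, x') - d_Y(y, y') \bigr|^2
\, \mathrm{d}\pi(x, y) \, \mathrm{d}\pi(x', y')
```

where $\Pi(\mu, \nu)$ is the set of couplings of $\mu$ and $\nu$. Since the objective compares all pairwise distances, a few contaminated points can distort many terms at once, which is the vulnerability the abstract refers to.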

URL: https://openreview.net/forum?id=0mchjaZZi4

---

Title: POPS: Recovering Unlearned Multi-Modality Knowledge in MLLMs with Prompt-Optimized Parameter Shaking

Abstract: Multimodal Large Language Models (MLLMs) have demonstrated impressive performance on cross-modal tasks by jointly training on large-scale textual and visual data, where privacy-sensitive examples could be unintentionally encoded, raising concerns about privacy or copyright violation. To this end, Multi-modality Machine Unlearning (MMU) was proposed as a mitigation that can effectively force MLLMs to forget private information. However, the robustness of such unlearning methods has not been fully examined in the setting where the model is published and accessible to malicious users. In this paper, we propose a novel adversarial strategy, Prompt-Optimized Parameter Shaking (POPS), that aims to recover supposedly unlearned multi-modality knowledge from MLLMs. Our method elicits the victim MLLMs to generate potential private examples via prompt-suffix optimization, and then exploits these synthesized outputs to fine-tune the models so that they disclose the true private information. Experiments on different MMU benchmarks reveal substantial weaknesses in existing MMU algorithms. POPS can even achieve near-complete recovery of supposedly erased sensitive information from unlearned MLLMs, exposing fundamental vulnerabilities that challenge the robustness of representative MMU-based privacy protections.

URL: https://openreview.net/forum?id=wMiEcH84l9

---

Title: Learning Where It Matters: Responsible and Interpretable Text-to-Image Generation with Background Consistency

Abstract: Text-to-image diffusion models have achieved remarkable progress, yet they still struggle to produce unbiased and responsible outputs. A promising direction is to manipulate the bottleneck space of the U-Net (the $h$-space), which provides \textit{interpretability} and \textit{controllability}. However, existing methods rely on learning attributes from the entire image, entangling them with spurious features and offering no corrective mechanisms at inference. This uniform reliance leads to poor subject alignment, fairness issues, reduced photorealism, and incoherent backgrounds in scene-specific prompts. To address these challenges, we propose two complementary innovations for training and inference. First, we introduce a spatially focused concept learning framework that disentangles target attributes into concept vectors by suppressing target attribute features within the multi-head cross-attention (MCA) modules and attenuating the encoder output (i.e., $h$-vector) to ensure the concept vector exclusively captures target attribute features. In addition, we introduce a spatially weighted reconstruction loss to emphasize regions relevant to the target attribute. Second, we design an inference-time strategy that improves background consistency by enhancing low-frequency components in the $h$-space. Experiments demonstrate that our approach improves fairness, subject fidelity, and background coherence while preserving visual quality and prompt alignment, outperforming state-of-the-art $h$-space methods. The code is included in the supplementary material.

URL: https://openreview.net/forum?id=sCOJGbJwAJ

---

Title: Expected Free Energy-based Planning as Variational Inference

Abstract: Planning under uncertainty requires agents to balance goal achievement with information gathering. Active inference addresses this through the Expected Free Energy (EFE), a cost function that unifies instrumental and epistemic objectives. However, existing EFE-based methods typically employ specialized optimization procedures that are difficult to extend or analyze. In this paper, we show that EFE-based planning can be formulated as standard variational inference on a generative model augmented with epistemic priors. Our main result demonstrates that minimizing a Variational Free Energy functional with appropriately chosen priors yields a decomposition into expected plan costs (the EFE) plus a complexity term. This formulation reinforces theoretical consistency with the Free Energy Principle by casting planning as the same inferential process that governs perception and learning. We validate our approach on a T-maze task, demonstrating that the epistemic priors are sufficient for inducing information-seeking behavior.
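As background for the abstract's "instrumental and epistemic objectives", the Expected Free Energy of a plan $\pi$ is standardly written as follows (textbook active-inference form, with preferences encoded in $P(o_\tau)$; the paper's exact notation and epistemic priors may differ):

```latex
G(\pi) \;=\; \sum_{\tau} \mathbb{E}_{Q(o_\tau, s_\tau \mid \pi)}
\bigl[ \ln Q(s_\tau \mid \pi) - \ln P(o_\tau, s_\tau \mid \pi) \bigr]
```

which, per timestep, decomposes into the two objectives the abstract unifies:

```latex
-\underbrace{\mathbb{E}_{Q}\bigl[ D_{\mathrm{KL}}\bigl[\, Q(s_\tau \mid o_\tau, \pi)
\,\Vert\, Q(s_\tau \mid \pi) \,\bigr] \bigr]}_{\text{epistemic value}}
\;-\;
\underbrace{\mathbb{E}_{Q}\bigl[ \ln P(o_\tau) \bigr]}_{\text{instrumental value}}
```

The paper's contribution, as described, is to recover this quantity as an ordinary Variational Free Energy under suitably chosen epistemic priors rather than as a bespoke planning objective.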

URL: https://openreview.net/forum?id=Kzm8I1oS1s

---

Title: CE-LoRA: Computation-Efficient LoRA Fine-Tuning for Large Language Models

Abstract: Large Language Models (LLMs) demonstrate exceptional performance across various tasks but demand substantial computational resources even for fine-tuning. Although Low-Rank Adaptation (LoRA) significantly alleviates memory consumption during fine-tuning, its impact on computational cost reduction is limited. This paper identifies the computation of activation gradients as the primary bottleneck in LoRA's backward propagation and introduces the Computation-Efficient LoRA (CE-LoRA) algorithm, which enhances computational efficiency while preserving memory efficiency. CE-LoRA leverages two key techniques: Approximated Matrix Multiplication, which replaces dense multiplications of large and complete matrices with sparse multiplications involving only critical rows and columns, and the Double-LoRA technique, which reduces error propagation in activation gradients. Theoretically, CE-LoRA converges at the same rate as LoRA, $\mathcal{O}(1/\sqrt{T})$, where $T$ is the number of iterations. Empirical evaluations confirm that CE-LoRA significantly reduces computational costs compared to LoRA without notable performance degradation.
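The flavor of "sparse multiplications involving only critical rows and columns" can be sketched with a norm-sampled matrix product. The selection rule below (keep the k column/row pairs with the largest contribution norms) is our own stand-in for illustration; CE-LoRA's actual criterion and the Double-LoRA correction are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def approx_matmul(A, B, k):
    # Keep only the k inner dimensions whose column-of-A / row-of-B pair
    # contributes the most mass; drop the rest of the sum entirely.
    scores = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=1)
    idx = np.argsort(scores)[-k:]
    return A[:, idx] @ B[idx, :]

m, n, p, k = 64, 256, 64, 32
A = rng.standard_normal((m, n))
# make a few inner dimensions dominate, as with spiky activations
B = rng.standard_normal((n, p)) * 0.01
B[:k, :] = rng.standard_normal((k, p))

exact = A @ B
approx = approx_matmul(A, B, k)
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print(rel_err)   # small: the dropped 224 pairs carry little of the product
```

Here the approximate product costs k/n = 1/8 of the dense multiply, which is the kind of backward-pass saving the abstract targets, at the price of an error that Double-LoRA is designed to keep from propagating.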

URL: https://openreview.net/forum?id=kwE16U73HH

---

Title: Gradual Binary Search and Dimension Expansion: A general method for activation quantization in LLMs

Abstract: Large language models (LLMs) have become pivotal in artificial intelligence, demonstrating strong capabilities in reasoning, understanding, and generating data. However, their deployment on edge devices is hindered by their substantial size, often reaching several billion parameters. Quantization is a widely used method to reduce memory usage and inference time; however, LLMs present unique challenges due to the prevalence of outliers in their activations. In this work, we leverage the theoretical advantages of Hadamard matrices over random rotation matrices to push the boundaries of quantization in LLMs. We demonstrate that Hadamard matrices are more effective in reducing outliers, which are a significant obstacle to achieving low-bit quantization. Our method, based on a gradual binary search, enables 3-bit quantization for weights, activations, and key-value (KV) caches, resulting in a 40% increase in accuracy on common benchmarks compared to SoTA methods. We extend the use of rotation matrices to support non-power-of-2 embedding dimensions, as in the Qwen architecture, by employing Paley's construction. Our experimental results on multiple model families, including Mistral, LLaMA, and Qwen, demonstrate the effectiveness of our approach, outperforming existing methods and enabling practical 3-bit quantization.
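Why a Hadamard rotation helps with outliers is easy to see directly: rotating by an orthonormal Hadamard matrix spreads a single large activation across all coordinates while preserving the vector's norm, shrinking the dynamic range the quantizer must cover. The sketch below uses the Sylvester construction (power-of-two sizes only; the abstract's Paley construction is what handles other dimensions, and is not shown here).

```python
import numpy as np

def hadamard(n):
    # Sylvester construction: valid when n is a power of two
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

n = 64
H = hadamard(n) / np.sqrt(n)      # orthonormal rotation, entries +-1/sqrt(n)

x = np.full(n, 0.1)               # typical activations...
x[0] = 10.0                       # ...plus one large outlier

y = H @ x
print(x.max(), np.abs(y).max())   # the rotation flattens the outlier's peak
```

After rotating, the maximum magnitude drops by roughly a factor of sqrt(n) while the norm is unchanged, so a low-bit uniform quantizer wastes far fewer levels on a single extreme value.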

URL: https://openreview.net/forum?id=fno6W7qwhT

---

Title: Rethinking Memory Mechanisms of Foundation Agents in the Second Half: A Survey

Abstract: The research of artificial intelligence is undergoing a paradigm shift from prioritizing model innovations and benchmark scores toward emphasizing problem definition and rigorous real-world evaluation. As the field enters the "second half," the central challenge becomes real utility in long-horizon, dynamic, and user-dependent environments, where agents face context explosion and must continuously accumulate, manage, and selectively reuse large volumes of information across extended interactions. Memory, with hundreds of papers released this year, therefore emerges as the critical solution to fill the utility gap. In this survey, we provide a unified view of foundation agent memory along three dimensions: memory substrate (internal and external), cognitive mechanism (episodic, semantic, sensory, working, and procedural), and memory subject (agent- and user-centric). We then analyze how memory is instantiated and operated under different agent topologies and highlight learning policies over memory operations. Finally, we review evaluation benchmarks and metrics for assessing memory utility, and outline various open challenges and future directions.

URL: https://openreview.net/forum?id=XycbogUAeJ

---

Title: Improving LLM Unlearning Robustness via Random Perturbations

Abstract: Here, we show that current LLM unlearning methods inherently reduce models' robustness, causing them to misbehave even when a single non-adversarial forget-token is present in the retain-query. Toward understanding the underlying causes, we propose a novel theoretical framework that reframes the unlearning process as a backdoor attack and defense problem: we formulate how the forgetting process inadvertently learns to align forget-tokens (backdoor triggers) with the target representations (target labels). As a result, forget-tokens act as backdoor triggers that, when activated in retain-queries, disrupt the unlearned models' behaviors, much like successful backdoor attacks. In this sense, LLM unlearning methods themselves poison the model, making it more vulnerable to forget-tokens and hiding rather than erasing the target knowledge; this describes their true mechanism. To mitigate the vulnerability caused by the forgetting process, we reinterpret the retaining process as a backdoor defense and propose Random Noise Augmentation (RNA), a lightweight, model- and method-agnostic approach with theoretical guarantees for improving the robustness of unlearned models. Extensive experiments demonstrate that RNA significantly improves the robustness of unlearned models while preserving forget and retain performance. This backdoor attack-defense framework offers insights into the mechanism of unlearning that can shed light on future research directions for improving unlearning robustness.

URL: https://openreview.net/forum?id=QYw192hTdH

---

Title: LibMoE: A Library for Comprehensive Research on Mixture of Experts in Large Language Models

Abstract: Mixture of experts (MoE) architectures have become a cornerstone for scaling up and are a key component in most large language models such as GPT-OSS, DeepSeek-V3, Llama-4, and Gemini-2.5. However, systematic research on MoE remains severely constrained by the prohibitive computational costs of training and evaluation, putting large-scale studies out of reach for most researchers. We introduce LibMoE, a unified framework for reproducible, efficient, and extensible MoE research that supports both pretraining and sparse-upcycling regimes. Beyond unified implementations, the framework provides transparent analytical tools for probing routing and expert dynamics. Leveraging this foundation, we conduct a comprehensive analysis along three dimensions: (i) routing dynamics, covering expert selection patterns, routing stability and optimality, and how routing entropy reveals task specialization and expert diversity; (ii) the effect of lightweight initialization on load balancing, demonstrating how subtle changes in router initialization shape early expert utilization; and (iii) training regime differences, revealing how sparse upcycling and full pretraining exhibit distinct routing patterns and stability profiles. By lowering the barrier to entry and standardizing evaluation, along with our comprehensive analysis, LibMoE broadens access to MoE research and establishes a reliable benchmark to guide future innovations.

URL: https://openreview.net/forum?id=PB2ju8tq0n

---

Title: What Drives Success in Physical Planning with Joint-Embedding Predictive World Models?

Abstract: A long-standing challenge in AI is to develop agents capable of solving a wide range of physical tasks and generalizing to new, unseen tasks and environments. A popular recent approach involves training a world model from state-action trajectories and subsequently using it with a planning algorithm to solve new tasks. Planning is commonly performed in the input space, but a recent family of methods has introduced planning algorithms that optimize in the learned representation space of the world model, with the promise that abstracting irrelevant details yields more efficient planning. In this work, we characterize models from this family as JEPA-WMs and investigate the technical choices that make algorithms from this class work. We propose a comprehensive study of several key components with the objective of finding the optimal approach within the family. We conducted experiments using both simulated environments and real-world robotic data, and studied how the model architecture, the training objective, and the planning algorithm affect planning success. We combine our findings to propose a model that outperforms two established baselines, DINO-WM and V-JEPA-2-AC, in both navigation and manipulation tasks.
Code, data and checkpoints are available in supplementary material.

URL: https://openreview.net/forum?id=cHZn5Gdh8e

---

Title: Flow Matching for Probabilistic Monocular 3D Human Pose Estimation

Abstract: Recovering 3D human poses from a monocular camera view is a highly ill-posed problem due to depth ambiguity. Earlier studies on lifting 3D human poses from 2D often produce incorrect-yet-overconfident 3D estimates. To mitigate this problem, emerging probabilistic approaches treat the 3D estimates as a distribution, taking the uncertainty of the poses into account. In a similar vein, we propose FMPose, a probabilistic 3D human pose estimation method based on the flow matching generative approach. Conditioned on the 2D cues, the flow matching scheme learns the optimal transport from a simple source distribution to the plausible 3D human pose distribution via continuous normalizing flows. The 2D lifting condition is modeled via graph convolutional networks, leveraging learnable connections between human body joints as the graph structure for feature aggregation. Compared to diffusion-based methods, FMPose with optimal transport generates 3D poses faster and more accurately. Experimental results show major improvements of FMPose over current state-of-the-art methods on three common benchmarks for 3D human pose estimation: Human3.6M, MPI-INF-3DHP, and 3DPW.
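The flow matching recipe referenced above can be summarized in a few lines. This is the generic (conditional) flow matching construction, not FMPose itself: pair a source sample x0 with a data sample x1, move along the straight interpolation x_t = (1 - t) x0 + t x1, and regress a velocity network onto the constant target x1 - x0; at sampling time the learned ODE is integrated. The dimensions below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

d = 51                                   # e.g. 17 joints x 3 coordinates
x0 = rng.standard_normal(d)              # source sample from a simple prior
x1 = rng.standard_normal(d) * 0.3        # stand-in for a ground-truth 3D pose

def true_velocity(x, t):
    # the straight-line (optimal-transport) field a trained network regresses to;
    # here we use the exact target velocity instead of a learned model
    return x1 - x0

# sampling: Euler integration of dx/dt = v(x, t) from t=0 to t=1
x, steps = x0.copy(), 100
for s in range(steps):
    x = x + true_velocity(x, s / steps) / steps

print(np.allclose(x, x1))   # True: the straight flow transports x0 onto x1
```

Because the target field is straight, few integration steps suffice, which is the intuition behind flow matching's speed advantage over diffusion samplers that the abstract invokes.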

URL: https://openreview.net/forum?id=UlpH4XBLR4

---

Title: AlignSAE: Concept-Aligned Sparse Autoencoders

Abstract: Large Language Models (LLMs) encode factual knowledge within hidden parametric spaces that are difficult to inspect or control. While Sparse Autoencoders (SAEs) can decompose hidden activations into more fine-grained, interpretable features, they often struggle to reliably align these features with human-defined concepts, resulting in entangled and distributed feature representations. To address this, we introduce AlignSAE, a method that aligns SAE features with a predefined ontology through a "pre-train, then post-train" curriculum. After an initial unsupervised training phase, we apply supervised post-training to bind specific concepts to dedicated latent slots while preserving the remaining capacity for general reconstruction. This separation creates an interpretable interface where specific concepts can be inspected and controlled without interference from unrelated features. Empirical results demonstrate that AlignSAE enables precise causal interventions, such as reliable "concept swaps", by targeting single, semantically aligned slots, and further supports multi-hop reasoning and a mechanistic probe of grokking-like generalization dynamics.

URL: https://openreview.net/forum?id=I9UjKxW4nq

---

Title: Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights

Abstract: We examine the reasoning and planning capabilities of large language models (LLMs) in solving complex tasks. Recent advances in inference-time techniques demonstrate the potential to enhance LLM reasoning without additional training by exploring intermediate steps during inference. Notably, OpenAI's latest reasoning models show promising performance through use of multi-step reasoning and verification. Here, we explore how scaling inference-time techniques can improve reasoning and planning, focusing on understanding the tradeoff between computational cost and performance. To this end, we construct a comprehensive benchmark, known as *Sys2Bench*, and perform extensive experiments evaluating existing inference-time techniques on eleven diverse tasks across five categories, including arithmetic reasoning, logical reasoning, common sense reasoning, algorithmic reasoning, and planning.
*Sys2Bench* provides a unified framework for revealing the strengths and limitations of current inference-time methods, setting the stage for more principled and scalable approaches to LLM reasoning.

URL: https://openreview.net/forum?id=budZJyCK8G

---

Title: TOAST: Transformer Optimization using Adaptive and Simple Transformations

Abstract: Foundation models achieve state-of-the-art (SOTA) performance across different tasks, but their size and computational demands raise concerns about accessibility and sustainability. Existing efficiency methods often require additional retraining or fine-tuning, limiting their practicality. Recent findings suggest that deep neural networks exhibit internal representation similarities. While such similarities across different models have been exploited for enabling techniques such as model stitching and merging, intra-network redundancy remains underexplored as a source for efficiency gains. In this paper, we introduce TOAST, a framework that exploits these redundancies to approximate entire transformer blocks with lightweight closed-form mappings, such as a linear transformation or even the identity, without any additional training. Across SOTA pretrained vision models (e.g., ViT, DINOv2, DeiT) and datasets ranging from MNIST to ImageNet-1k, TOAST reduces parameters and computation while preserving, and in some cases improving, downstream performance. These results show that large portions of transformer depth can be replaced by trivial functions, opening a new perspective on efficient foundation models.
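
As a rough illustration of the kind of closed-form block approximation described above, one can fit a linear surrogate to a block's input/output activations by least squares. The activations and noise level below are synthetic stand-ins, not TOAST's actual procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_block_surrogate(X_in, X_out):
    """Closed-form least-squares linear surrogate for one transformer block.

    X_in, X_out: (tokens, dim) activations entering/leaving the block.
    Returns W such that X_in @ W approximates X_out (no training loop).
    """
    W, *_ = np.linalg.lstsq(X_in, X_out, rcond=None)
    return W

# Toy activations: a block that is nearly linear plus small residual noise.
d = 16
X_in = rng.standard_normal((256, d))
true_W = np.eye(d) + 0.05 * rng.standard_normal((d, d))
X_out = X_in @ true_W + 0.01 * rng.standard_normal((256, d))

W = fit_block_surrogate(X_in, X_out)
rel_err = np.linalg.norm(X_in @ W - X_out) / np.linalg.norm(X_out)
assert rel_err < 0.05  # the surrogate reproduces the block closely
```

When the fitted W is close to the identity, the block can be dropped entirely, which is the cheapest mapping the abstract mentions.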

URL: https://openreview.net/forum?id=fSwMCsBtTG

---

Title: Minimisation of Quasar-Convex Functions Using Random Zeroth-Order Oracles

Abstract: This paper explores the performance of a random Gaussian smoothing zeroth-order (ZO) scheme for minimising quasar-convex (QC) and strongly quasar-convex (SQC) functions in both unconstrained and constrained settings. For the unconstrained problem, we establish the ZO algorithm's convergence to a global minimum along with its complexity when applied to both QC and SQC functions. For the constrained problem, we introduce the new notion of proximal-quasar-convexity and prove analogous results to the unconstrained case. Specifically, we derive complexity bounds and prove convergence of the algorithm to a neighbourhood of a global minimum whose size can be controlled under a variance reduction scheme. Beyond the theoretical guarantees, we demonstrate the practical implications of our results on several machine learning problems where quasar-convexity naturally arises, including linear dynamical system identification and generalised linear models.
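
The random Gaussian smoothing oracle underlying such ZO schemes is standard (a two-point Nesterov-Spokoiny-style estimator); a minimal sketch follows, with illustrative sample counts rather than the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def zo_gradient(f, x, mu=1e-4, n_samples=64):
    """Two-point Gaussian-smoothing zeroth-order gradient estimate.

    Averages ((f(x + mu*u) - f(x)) / mu) * u over u ~ N(0, I): the classical
    random-oracle scheme this family of methods builds on (a generic sketch,
    not the paper's exact algorithm).
    """
    g = np.zeros_like(x)
    fx = f(x)
    for _ in range(n_samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u
    return g / n_samples

# Smooth quadratic: the true gradient is 2*x, so the estimate should be close.
f = lambda x: float(np.dot(x, x))
x = np.array([1.0, -2.0, 0.5])
g = zo_gradient(f, x, n_samples=4000)
assert np.linalg.norm(g - 2 * x) < 0.6
```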

URL: https://openreview.net/forum?id=rRp9zZBKkZ

---

Title: VidHal: Benchmarking Hallucinations in Vision LLMs

Abstract: Vision Large Language Models (VLLMs) are widely acknowledged to be prone to hallucinations. Existing research addressing this problem has primarily been confined to image inputs, with sparse exploration of their video-based counterparts. Furthermore, current evaluation methods fail to capture nuanced errors in generated responses, which are often exacerbated by the rich spatiotemporal dynamics of videos. To address these two limitations, we introduce VidHal, a benchmark specially designed to evaluate video-based hallucinations in VLLMs. VidHal is constructed by bootstrapping video instances across a wide range of common temporal aspects. A defining feature of our benchmark lies in the careful creation of captions representing varying levels of hallucination associated with each video. To enable fine-grained evaluation, we propose a novel caption ordering task requiring VLLMs to rank captions by hallucinatory extent. We conduct extensive experiments on VidHal, comprehensively evaluating a broad selection of models, including both open-source and proprietary ones. Our results uncover significant limitations in existing VLLMs with respect to video-based hallucination. Through our benchmark, we aim to inspire further research on i) holistic understanding of VLLM capabilities, particularly regarding hallucination, and ii) advancing VLLMs to alleviate this problem.

URL: https://openreview.net/forum?id=7ccWCDbdM1

---

Title: ViscoReg: Neural Signed Distance Functions via Viscosity Solutions

Abstract: Implicit Neural Representations (INRs) that learn Signed Distance Functions (SDFs) from point cloud data represent the state-of-the-art for geometrically accurate 3D scene reconstruction. However, training these Neural SDFs often involves enforcing the Eikonal equation, an ill-posed equation that also leads to unstable gradient flows. Numerical Eikonal solvers have relied on viscosity approaches for regularization and stability. Motivated by this well-established theory, we introduce ViscoReg, a novel regularizer for Neural SDF methods that provably stabilizes training. Empirically, ViscoReg outperforms state-of-the-art approaches such as SIREN, DiGS, StEik, and HotSpot across most metrics on ShapeNet, Surface Reconstruction Benchmark, 3D scene reconstruction and reconstruction from real scans. We also establish novel generalization error estimates for Neural SDFs in terms of the training error, using the theory of viscosity solutions. Our empirical and theoretical results provide confidence in the general applicability of our method.

URL: https://openreview.net/forum?id=DWnMkBU4sF

---

Title: Scientific Theory of a Black-Box: A Life Cycle-Scale XAI Framework Based on Constructive Empiricism

Abstract: Explainable AI (XAI) offers a growing number of algorithms that aim to answer specific questions about black-box models. What is missing is a principled way to consolidate explanatory information about a fixed black-box model into a persistent, auditable artefact that accompanies the black box throughout its life cycle. We address this gap by introducing the notion of a scientific theory of a black box (SToBB). Grounded in Constructive Empiricism, a SToBB fulfils three obligations: (i) empirical adequacy with respect to all available observations of black-box behaviour, (ii) adaptability via explicit update commitments that restore adequacy when new observations arrive, and (iii) auditability through transparent documentation of assumptions, construction choices, and update behaviour. We operationalise these obligations as a general framework that specifies an extensible observation base, a traceable hypothesis class, algorithmic components for construction and revision, and documentation sufficient for third-party assessment. Explanations for concrete stakeholder needs are then obtained by querying the maintained record through interfaces, rather than by producing isolated method outputs. As a proof of concept, we instantiate a complete SToBB for a neural-network classifier on a tabular task and introduce the Constructive Box Theoriser (CoBoT) algorithm, an online procedure that constructs and maintains an empirically adequate rule-based surrogate as observations accumulate. Together, these contributions position SToBBs as a life cycle-scale, inspectable point of reference that supports consistent, reusable analyses and systematic external scrutiny.

URL: https://openreview.net/forum?id=kAjPN8pSTK

---

Title: VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has demonstrated success in enhancing LLM reasoning capabilities, but remains limited to single-turn interactions without tool integration. While recent **A**gentic **R**einforcement **L**earning with **T**ool use (ARLT) approaches have emerged to address multi-turn tool interactions, existing works develop task-specific codebases that suffer from fragmentation, synchronous execution bottlenecks, and limited extensibility across domains. These inefficiencies hinder broader community adoption and algorithmic innovation. We introduce **VerlTool**, a unified and modular framework that addresses these limitations through systematic design principles. VerlTool provides four key contributions: **(1)** upstream alignment with VeRL ensuring compatibility and simplified maintenance, **(2)** unified tool management via standardized APIs supporting diverse modalities including code execution, search, SQL databases, and vision processing, **(3)** asynchronous rollout execution achieving a near-2$\times$ speedup by eliminating synchronization bottlenecks, and **(4)** a comprehensive evaluation demonstrating competitive performance across 6 ARLT domains. Our framework formalizes ARLT as multi-turn trajectories with multi-modal observation tokens (text/image/video), extending beyond single-turn RLVR paradigms. We train and evaluate models on mathematical reasoning, knowledge QA, SQL generation, visual reasoning, web search, and software engineering tasks, achieving results comparable to specialized systems while providing a unified training infrastructure. The modular plugin architecture enables rapid tool integration requiring only lightweight Python definitions, significantly reducing development overhead and providing a scalable foundation for tool-augmented RL research.

URL: https://openreview.net/forum?id=g2LCOW43Md

---

Title: Learning-Augmented Robust Algorithmic Recourse

Abstract: Algorithmic recourse provides individuals who receive undesirable outcomes from machine learning systems with minimum-cost improvements to achieve a desirable outcome. However, machine learning models often get updated, so the recourse may not lead to the desired outcome. The robust recourse framework chooses recourses that are less sensitive to adversarial model changes, but this comes at a higher cost. To address this, we initiate the study of learning-augmented algorithmic recourse and evaluate the extent to which a designer equipped with a prediction of the future model can reduce the cost of recourse when the prediction is accurate (consistency) while also limiting the cost even when the prediction is inaccurate (robustness). We propose a novel algorithm, study the robustness-consistency trade-off, and analyze how prediction accuracy affects performance.

URL: https://openreview.net/forum?id=IFssttzxnP

---

Title: Cluster-Dags as Powerful Background Knowledge For Causal Discovery

Abstract: Finding cause-effect relationships is of key importance in science. Causal discovery aims to recover a graph from data that succinctly describes these cause-effect relationships. However, current methods face several challenges, especially when dealing with high-dimensional data and complex dependencies. Incorporating prior knowledge about the system can aid causal discovery. In this work, we leverage Cluster-DAGs as a prior knowledge framework to warm-start causal discovery. We show that Cluster-DAGs offer greater flexibility than existing approaches based on tiered background knowledge and introduce two modified constraint-based algorithms, Cluster-PC and Cluster-FCI, for causal discovery in the fully and partially observed setting, respectively. Empirical evaluation on simulated data demonstrates that Cluster-PC and Cluster-FCI outperform their respective baselines without prior knowledge.

URL: https://openreview.net/forum?id=gSSmvVDKxB

---

Title: Existing Adversarial Large Language Model Unlearning Evaluations Are Inconclusive

Abstract: Unlearning seeks to remove sensitive knowledge from large language models, with success often judged through adversarial evaluations. In this work, we critically examine these evaluation practices and reveal key limitations that undermine their reliability. First, we show that adversarial evaluations introduce new information into the model, potentially masking true unlearning performance by re-teaching the model during evaluation. Second, we show that evaluation outcomes vary significantly across tasks, undermining the generalizability of current evaluation methods. Collectively, these issues suggest that existing evaluations risk mischaracterizing unlearning success (or failure). To address this, based on our empirical findings, we propose two principles—*minimal information injection* and *downstream task awareness*—for future evaluations.

URL: https://openreview.net/forum?id=Zxx1I4aJlm

---

Title: Augmented Mixup Procedure for Privacy-Preserving Collaborative Training

Abstract: Mixup, introduced by Zhang et al., is a regularization technique for training neural networks that generates convex combinations of input samples and their corresponding labels. Motivated by this approach, Huang et al. proposed InstaHide, an image encryption method designed to preserve the discriminative properties of data while protecting original information during collaborative training across multiple parties. However, recent studies by Carlini et al., Luo et al., and Chen et al. have demonstrated that attacks exploiting the linear system generated by the mixup procedure can compromise the security guarantees of InstaHide. To address this vulnerability, we propose a modified mixing procedure that introduces perturbations into samples before forming convex combinations, making the associated linear inverse problem ill-conditioned for adversaries. We present a theoretical worst-case security analysis and empirically evaluate the performance of our method in mitigating such attacks. Our results indicate that robust attack mitigation can be achieved by increasing the perturbation level, without causing a significant reduction in classification accuracy. Furthermore, we compare the performance of our approach with that of InstaHide on standard benchmark datasets, including MNIST, CIFAR-10, CIFAR-100, and Tiny-ImageNet.
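
A minimal sketch of the contrast between vanilla mixup and the perturb-then-mix idea described above; the noise model, noise level, and function names are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

def mix(xs, ys, lams, sigma=0.0):
    """Convex combination of samples and labels (vanilla mixup when sigma=0).

    With sigma > 0 this sketches the paper's modification: perturb each sample
    *before* mixing, so the linear system an adversary would invert becomes
    ill-conditioned. Noise model and level are illustrative assumptions.
    """
    noisy = [x + sigma * rng.standard_normal(x.shape) for x in xs]
    x_mix = sum(l * x for l, x in zip(lams, noisy))
    y_mix = sum(l * y for l, y in zip(lams, ys))
    return x_mix, y_mix

x1, x2 = rng.standard_normal(8), rng.standard_normal(8)
y1, y2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])

plain, _ = mix([x1, x2], [y1, y2], lams=[0.6, 0.4])                 # vanilla
hidden, y_mix = mix([x1, x2], [y1, y2], lams=[0.6, 0.4], sigma=0.1)  # perturbed
assert np.isclose(y_mix.sum(), 1.0)    # label weights stay convex
assert not np.allclose(plain, hidden)  # perturbation changes the mixture
```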

URL: https://openreview.net/forum?id=1SrZyNgmpY

---

Title: Reranker Optimization via Geodesic Distances on k-NN Manifolds

Abstract: Current neural reranking approaches for retrieval-augmented generation (RAG) rely on cross-encoders or large language models (LLMs), requiring substantial computational resources and exhibiting latencies of 3-5 seconds per query. We propose Maniscope, a geometric reranking method that computes geodesic distances on k-nearest neighbor (k-NN) manifolds constructed over retrieved document candidates. This approach combines global cosine similarity with local manifold geometry to capture semantic structure that flat Euclidean metrics miss. Evaluated on eight BEIR benchmark datasets (1,233 queries), this method outperforms an HNSW graph-based baseline on the three hardest datasets (NFCorpus: +7.0%, TREC-COVID: +1.6%, AorB: +2.8% NDCG@3) while being 3.2x faster (4.7 ms vs 14.8 ms average). Compared to cross-encoder rerankers, it achieves accuracy within 2% at 10-45x lower latency. On TREC-COVID, LLM-Reranker provides only a +0.5% NDCG@3 improvement over our method at 840x higher latency, positioning our approach as a practical alternative for real-time RAG deployment. The approach has O(ND + M^2 D + Mk log k) complexity where M << N, enabling sub-10 ms latency. We plan to release code and data in an open-source repository.
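
The core computation the abstract describes, geodesic distances over a k-NN graph of candidate embeddings, can be sketched with plain Dijkstra. This is a generic illustration of the idea, not the released Maniscope code.

```python
import heapq
import numpy as np

def knn_geodesic_rerank(query, docs, k=3):
    """Rerank docs by shortest-path (geodesic) distance from the query on a
    k-NN graph with cosine-distance edge weights (a sketch of the idea)."""
    X = np.vstack([query, docs])
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
    D = 1.0 - X @ X.T                              # pairwise cosine distances
    n = len(X)
    # Adjacency: each node keeps its k nearest neighbours (excluding itself).
    nbrs = [np.argsort(D[i])[1:k + 1] for i in range(n)]
    # Dijkstra from the query node (index 0).
    dist = np.full(n, np.inf)
    dist[0] = 0.0
    heap = [(0.0, 0)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue
        for j in nbrs[i]:
            nd = d + D[i, j]
            if nd < dist[j]:
                dist[j] = nd
                heapq.heappush(heap, (nd, j))
    return np.argsort(dist[1:])                    # doc indices, nearest first

rng = np.random.default_rng(0)
docs = rng.standard_normal((10, 32))
query = docs[4] + 0.01 * rng.standard_normal(32)   # doc 4 is a near-duplicate
order = knn_geodesic_rerank(query, docs)
assert order[0] == 4                               # near-duplicate ranks first
```

Because only the M retrieved candidates are embedded in the graph, the cost stays small even when the full corpus of N documents is large.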

URL: https://openreview.net/forum?id=HvzgEt51f2

---

Title: Random features for Grassmannian kernel approximation with bounded rank-one projections

Abstract: We propose a family of random feature maps for scalable kernel machines defined over low-dimensional subspaces in high dimensions, i.e., over the Grassmannian manifold. This is typically useful in a machine learning context when data classes or clusters are well represented by the span of a few data points. Classical Grassmannian kernels such as the projection or Binet–Cauchy kernels require constructing full Gram matrices for practical applications, leading to prohibitive computational and memory costs for large subspace datasets in high dimensions. We address this limitation by computing specific random features of subspaces. These combine random rank-one projections of the subspace projection matrices with bounded non-linear transforms (periodic or binary) to tame the resulting heavy-tailed distribution.
We show that, in the random feature space, inner products approximate well-defined, rotation-invariant Grassmannian kernels, i.e., kernels depending only on the principal angles of the considered subspaces. Provided the number of random features is large compared to the subspace intrinsic dimension, we show that this approximation holds uniformly over all subspaces of fixed dimensions with high probability.
When the non-linear transform is periodic, the approximated kernel admits a closed-form expression with a tunable behaviour bridging inverse Binet–Cauchy and Gaussian-type regimes, while the binarised feature has no known closed-form kernel but yields even more compact one-bit subspace representations. Moreover, we show how structured rank-one projections, leveraging randomised fast Fourier transforms, further reduce the random feature computational complexity without sacrificing accuracy in practical experiments.
We demonstrate the practicality of these techniques with synthetic experiments and classification tasks on the ETH-80 dataset representing visual object images from different viewpoints. The proposed random features recover Grassmannian geometry with high accuracy while reducing computation, memory, and storage requirements. This demonstrates that rank-one embeddings offer a practical and scalable alternative to classical Grassmannian kernels.

URL: https://openreview.net/forum?id=wq18dZJ2pA

---

Title: From Representation to Causation: A Three-Tier Framework and Open-Source Benchmark for Mechanistic Interpretability

Abstract: Interpretability research often conflates whether information is merely encoded within a model or whether it causally drives behavior. We introduce MechInterp3, a failure-aware framework that disentangles these properties into a three-tier hierarchy: (Tier-1) linear encoding, (Tier-2) probe accessibility, and (Tier-3) causal responsibility. By applying this framework to six transformer architectures across four tasks, we reveal that standard causal interventions "silently fail" in approximately 50% of model-task combinations due to weak behavioral contrast. This produces mathematically ill-conditioned estimates that undermine causal claims. Our systematic evaluation reveals three critical findings. First, we identify a pervasive tier dissociation where models with near-perfect probe accuracy often show zero or negative causal recovery, most notably in GPT-2 sentiment processing (−0.34 recovery). Second, we demonstrate that observational methods, such as attention weights and gradient attribution, are uninformative of causal structure, showing near-zero correlation ($\rho$ < 0.1) with intervention effects. Third, we discover that tasks requiring relational reasoning, such as NLI, induce more stable and localized causal circuits than surface-level tasks, despite having weaker linear representations. We release MechInterp3 as an open-source library to establish a rigorous statistical foundation for the study of machine intelligence.

URL: https://openreview.net/forum?id=thmHvIG4Xv

---

Title: ProJo4D: Progressive Joint Optimization for Sparse-View Inverse Physics Estimation

Abstract: Neural rendering has advanced significantly in 3D reconstruction and novel view synthesis, and integrating physics into these frameworks opens new applications such as physically accurate digital twins for robotics and XR.
However, the inverse problem of estimating physical parameters from visual observations remains challenging.
Existing physics-aware neural rendering methods typically require dense multi-view videos, making them impractical for scalable, real-world deployment.
Under sparse-view settings, the sequential optimization strategies employed by current approaches suffer from severe error accumulation: inaccuracies in initial 3D reconstruction propagate to subsequent stages, degrading physical state and material parameter estimates.
On the other hand, simultaneous optimization of all parameters fails due to the highly non-convex and often non-differentiable nature of the problem.
We propose ProJo4D, a progressive joint optimization framework that gradually expands the set of jointly optimized parameters. This design enables physics-informed gradients to refine geometry while avoiding the instability of direct joint optimization over all parameters.
Evaluations on synthetic and real-world datasets demonstrate that ProJo4D substantially outperforms prior work in 4D future state prediction and physical parameter estimation, achieving up to 10$\times$ improvement in geometric accuracy while maintaining computational efficiency.

URL: https://openreview.net/forum?id=pqvVrqlXCZ

---

Title: Retrieval as a Decision: Training-Free Adaptive Gating for Efficient RAG

Abstract: Retrieval-Augmented Generation (RAG) improves factuality, but retrieving for every query often hurts quality while inflating tokens and latency. We propose Training-free Adaptive Retrieval Gating (TARG), a single-shot policy that decides when to retrieve using only a short, no-context draft from the base model. From the draft’s prefix logits, TARG computes lightweight uncertainty scores (mean token entropy, a margin signal derived from the top-1/top-2 logit gap via a monotone link, or small-$N$ variance across a handful of stochastic prefixes) and triggers retrieval only when the score exceeds a threshold. The gate is model-agnostic, adds only tens to hundreds of draft tokens, and requires no additional training or auxiliary heads. On NQ-Open, TriviaQA, and PopQA, TARG consistently pushes the accuracy–efficiency frontier: compared with Always-RAG (retrieve for every query), TARG matches or improves EM/F1 while reducing retrieval by 70–90% and cutting end-to-end latency, and it remains close to Never-RAG (never retrieve) in overhead. A central empirical finding is that under modern instruction-tuned LLMs the margin signal is a robust default (entropy compresses as backbones sharpen), with small-$N$ variance offering a conservative, budget-first alternative. We provide ablations over gate type and prefix length and use a $\Delta$-latency view to make budget trade-offs explicit.
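
The entropy and margin gates described above can be sketched directly from draft logits; the thresholds below are illustrative placeholders, not the paper's tuned values.

```python
import numpy as np

rng = np.random.default_rng(0)

def targ_gate(prefix_logits, tau_entropy=1.5, tau_margin=0.7, mode="margin"):
    """Single-shot retrieval gate from a no-context draft's prefix logits.

    prefix_logits: (T, V) array. Returns True when the draft looks
    uncertain, i.e. retrieval should fire. Thresholds are illustrative.
    """
    if mode == "entropy":
        z = prefix_logits - prefix_logits.max(axis=-1, keepdims=True)
        p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
        score = -(p * np.log(p + 1e-12)).sum(axis=-1).mean()  # mean entropy
        return bool(score > tau_entropy)
    # Margin signal: a small top-1/top-2 logit gap means an uncertain draft.
    top2 = np.sort(prefix_logits, axis=-1)[:, -2:]
    margin = (top2[:, 1] - top2[:, 0]).mean()
    return bool(margin < tau_margin)

confident = rng.standard_normal((8, 100))
confident[:, 0] += 10.0                      # sharply peaked draft logits
uncertain = rng.standard_normal((8, 100))    # flat draft logits
assert not targ_gate(confident)              # skip retrieval
assert targ_gate(uncertain)                  # trigger retrieval
```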

URL: https://openreview.net/forum?id=L8gYtUZfVU

---

Title: ImageNot: A contrast with ImageNet preserves model rankings

Abstract: We introduce ImageNot, a dataset constructed explicitly to be drastically different from ImageNet while matching its scale. ImageNot is designed to test the external validity of deep learning progress on ImageNet. We show that key model architectures developed for ImageNet over the years rank identically to how they rank on ImageNet when trained from scratch and evaluated on ImageNot. Moreover, the relative improvements of each model over earlier models strongly correlate in both datasets. Our work demonstrates a surprising degree of external validity in the relative performance of image classification models when trained and evaluated on an entirely different dataset. This stands in contrast with absolute accuracy numbers that typically drop sharply even under small changes to a dataset.

URL: https://openreview.net/forum?id=YVbhMerXv9

---

Title: The Weakest Link: A Nodal Tension Model for Local Network Resilience

Abstract: The resilience of networked systems, defined by their ability to withstand targeted disruptions between a source and a target, is a critical concern in fields from ecology to infrastructure management. While spectral methods offer global insights, characterising the specific vulnerability of targeted pathways requires a more direct approach. In this paper, we frame this problem of local resilience through the powerful lens of linear duality, adopting the classic dual of the maximum $s-t$ flow problem and interpreting it through a novel physical analogy of "Nodal Tension". Our main theoretical results establish that (1) the model's optimal value is exactly equal to the capacity of the minimum $s-t$ cut, and (2) an optimal vertex solution exists where all node potentials are integer-valued ($\{0,1\}$), thus revealing the precise combinatorial structure of the cut. We validate these theorems computationally against standard algorithms. We then apply our model to a real-world conservation problem: assessing the connectivity of a grizzly bear corridor in the Canadian Rocky Mountains. The analysis reveals a novel ecological insight: the corridor's weakest link is not a remote bottleneck, but the local perimeter of the source protected area itself. This "null signal" for a classic choke point challenges conventional conservation paradigms and demonstrates our model's utility in generating non-obvious, actionable scientific discoveries. Our work provides a complete polyhedral characterisation of local network resilience, offering a computationally efficient and scientifically interpretable tool. Code to reproduce all results is available at https://anonymous.4open.science/r/tmlr-ldnr
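
The duality the abstract exploits (an optimal value equal to the min s-t cut, with integral 0/1 node potentials) can be checked on a toy graph with a plain Edmonds-Karp max-flow routine; this is a generic illustration, not the authors' LP formulation.

```python
from collections import deque

def max_flow_min_cut(cap, s, t):
    """Edmonds-Karp max flow plus the min s-t cut from the residual graph.

    cap: dict of dicts of edge capacities. Returns (max flow value, node
    potentials): 1 on the s-side of the cut, 0 on the t-side, so the cut
    edges are exactly those from potential-1 nodes to potential-0 nodes.
    """
    flow = 0
    res = {u: dict(vs) for u, vs in cap.items()}
    for u in list(res):
        for v in res[u]:
            res.setdefault(v, {}).setdefault(u, 0)  # reverse residual edges
    while True:
        # BFS for an augmenting path in the residual graph.
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            break
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
        flow += bottleneck
    s_side = set(parent)  # nodes still reachable in the final residual graph
    return flow, {u: int(u in s_side) for u in res}

cap = {"s": {"a": 3, "b": 2}, "a": {"t": 2}, "b": {"t": 3}, "t": {}}
flow, pot = max_flow_min_cut(cap, "s", "t")
assert flow == 4  # min cut: edges a->t (2) and s->b (2)
assert pot == {"s": 1, "a": 1, "b": 0, "t": 0}
```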

URL: https://openreview.net/forum?id=mNAyZmizQi

---

Title: Graph State Networks (GSNs): Persistent Nodewise Selective State Space Models

Abstract: Temporal graphs are often observed as streams of timestamped interactions, where accurate prediction requires retaining and selectively using historical node information. Existing temporal graph models either (i) recompute representations from a sliding neighborhood/history at query time, or (ii) maintain a memory module but offer limited control and limited theory for what is retained over long horizons. We propose Graph State Networks (GSNs), a bucketed temporal-graph framework that maintains a persistent hidden state per node and updates it online using a content- and time-dependent selective state space update. Concretely, GSNs store node states in an explicit id-indexed state table and, for each bucket, read the current state, update it with a time-aware Mamba-like mechanism, and commit the state back via an exponential moving average controlled by the commit rate $\alpha$. This commit mechanism provides an explicit "retention dial" and enables a clean analysis of forgetting. We develop a capacity/recall theory for persistent node memory and show that, under an affine approximation of blank-bucket dynamics, the influence of a single past event decays geometrically at a rate governed by $\alpha$ and the induced linearized update. Empirically, GSNs are competitive on standard dynamic link prediction benchmarks. We validate the theory with controlled synthetic write-wait-read probes: measured influence is close to exponential in delay, and fitting short-delay dynamics predicts long-horizon recall across commit rates.
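
The EMA commit and its geometric forgetting rate can be illustrated in a few lines; this toy uses zero "blank-bucket" inputs in place of the model's Mamba-style update, matching the affine approximation the abstract analyzes.

```python
import numpy as np

def commit(state, update, alpha=0.25):
    """EMA commit of a new nodewise update into the persistent state table."""
    return (1.0 - alpha) * state + alpha * update

# Write one event into a node's state, then let blank buckets (zero updates)
# pass: the event's influence decays geometrically at rate (1 - alpha).
alpha = 0.25
s = commit(np.zeros(4), np.ones(4), alpha)   # write event
influence = [float(s[0])]
for _ in range(5):                           # blank buckets
    s = commit(s, np.zeros(4), alpha)
    influence.append(float(s[0]))
ratios = [b / a for a, b in zip(influence, influence[1:])]
assert all(abs(r - (1 - alpha)) < 1e-12 for r in ratios)
```

Raising alpha writes new events in faster but also forgets old ones faster, which is the "retention dial" trade-off.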

URL: https://openreview.net/forum?id=zMEuBQfeT6

---

Title: Concept Realization Manifolds for Multi-Concept Activation and its (Dis)Entanglement in Large Language Models

Abstract: This work extends the Bias-CAV framework by introducing Concept Realization Manifolds (CRMs) as a geometric foundation for analyzing multi-concept activations and their entanglement in large language models. A theoretical framework is presented that reframes concepts as operational geometric regularities rather than latent variables. Multi-Concept Activation Subspaces (MCAS) are introduced to jointly model multiple bias-related concepts, addressing limitations of single-concept approaches identified in prior work. The operational limits of disentanglement are formally characterized through the Irreducible Measure Entanglement Theorem, which establishes that while directional entanglement can be reduced or removed, measure entanglement (activation distribution overlap) may persist due to data correlations and model optimization objectives. Conditional disentanglement methods are developed to operationalize partial concept separation. A comprehensive terminology hierarchy is established, including Concept Entanglement Fields, Conditional Concept Manifolds, and Intersectional Concept Regions. The framework is applied to bias analysis through multi-concept intervention mechanisms with formal fidelity guarantees. Examination of layer-wise entanglement patterns reveals structured relationships between concepts across transformer layers. Multi-axis evaluation demonstrates that MCAS reduces cross-dimension spillover effects by 2.4--3.6× compared to baseline methods in the evaluated settings, addressing concerns about unintended consequences in targeted bias mitigation. For practitioners, the framework provides operational methods for analyzing intersectional bias patterns (e.g., gender $\times$ profession interactions) and improving model interpretability through conditional disentanglement in the tested scenarios, even when perfect concept separation is theoretically impossible.

URL: https://openreview.net/forum?id=U8YU8dvm4A

---

Title: Discrete Diffusion in Large Language and Multimodal Models: A Survey

Abstract: In this work, we provide a systematic survey of Discrete Diffusion Language Models (dLLMs) and Discrete Diffusion Multimodal Language Models (dMLLMs). Unlike autoregressive (AR) models, dLLMs and dMLLMs adopt a multi-token, parallel decoding paradigm using full attention and a denoising-based generation strategy. This paradigm naturally enables parallel generation, fine-grained output control, and dynamic perception. These capabilities were previously difficult to achieve with AR models. A growing number of industrial-scale proprietary d(M)LLMs, as well as a large number of open-source academic d(M)LLMs, have demonstrated performance comparable to their autoregressive counterparts, while achieving up to 10$\times$ acceleration in inference speed. These developments position discrete diffusion models as a promising alternative to the traditional autoregressive paradigm. In this work, we present a comprehensive overview of the research in the dLLM and dMLLM domains. We trace the historical development of dLLMs and dMLLMs, formalize the underlying mathematical frameworks, list commonly-used modeling methods, and categorize representative models. We further analyze key techniques for training, inference, and quantization. We also discuss trustworthiness issues and summarize emerging applications across language, vision-language, and biological domains. We conclude by discussing future directions for research and deployment.

URL: https://openreview.net/forum?id=0DsqnkP8Cp

---

Title: MixtureVitae: Open Web-Scale Pretraining Dataset With High Quality Instruction and Reasoning Data Built from Permissive-First Text Sources

Abstract: We present MixtureVitae, an open‑access pretraining corpus built to minimize legal risk while providing strong downstream performance. MixtureVitae follows a permissive‑first, risk‑mitigated sourcing strategy that combines public‑domain and permissively licensed text (e.g., CC‑BY/Apache) with carefully justified low‑risk additions (e.g., government works and EU TDM‑eligible sources). MixtureVitae adopts a simple, single-stage pretraining recipe that integrates a large proportion of permissive synthetic instruction and reasoning data—signals typically introduced during post-training and generally scarce in permissive web corpora. We categorize all sources into a three-tier scheme that reflects varying risk levels and provide shard-level provenance metadata to enable risk-aware usage. In controlled experiments using the open‑sci‑ref training protocol (fixed architectures and hyperparameters; 50B and 300B token budgets across 130M–1.7B parameters), models trained on MixtureVitae consistently outperform other permissive datasets across a suite of standard benchmarks, and at the 1.7B-parameters/300B-tokens setting, they surpass FineWeb‑Edu and approach DCLM late in training. Performance is particularly strong on MMLU and on math and code benchmarks: a 1.7B model pretrained on 300B MixtureVitae tokens matches or exceeds a strong 1.7B instruction‑tuned baseline on GSM8K, HumanEval, and MBPP, despite using over 36$\times$ fewer tokens (300B vs. $\approx$11T). Supported by a thorough decontamination analysis, these results show that permissive‑first data with high instruction and reasoning density, tiered by licensing and provenance-related risk, can provide a practical and risk-mitigated foundation for training capable LLMs, reducing reliance on broad web scrapes without sacrificing competitiveness.

URL: https://openreview.net/forum?id=SyCcUNUUMf

---

Title: Knowing When Not to Answer: Mitigating Social Bias in LLMs via Epistemic Abstention

Abstract: The growing application of Large Language Models (LLMs) to social contexts has led to an increase in unjustifiable social-group attributions through stereotype-based responses, especially for questions with little supporting evidence or ambiguous context. The lack of sufficient evidence often leads models to hallucinate socially grounded inferences, undermining fairness and trust. In this work, we mitigate social bias under ambiguity via epistemic uncertainty. We introduce BHARATBBQ-R, a rationale-augmented extension of BHARATBBQ that explicitly annotates evidential sufficiency or absence, and we propose \textbf{EPIK} (\textbf{E}pistemic \textbf{P}runing under \textbf{I}mplicit \textbf{K}nowledge), an epistemic calibration framework that detects contextual insufficiency and enforces principled abstention in cases of inadequate evidence, while maintaining performance on unambiguous cases. Prior bias mitigation techniques focus on suppressing stereotypes or debiasing representations; our framework instead reframes biased behavior as a failure of epistemic humility. Experiments across five open-source LLMs show that EPIK substantially reduces the bias score for ambiguous contexts (from 1.41–1.52 to 0.86–0.98), while maintaining accuracy on unambiguous instances. These results establish that epistemic calibration enables selective suppression of stereotype-driven inference without indiscriminately refusing valid social reasoning.

URL: https://openreview.net/forum?id=UT5E31pYob

---

Title: Learning Structured Set Utility Functions with Contrastive Element Representations

Abstract: Learning utility functions over sets of elements is central to many machine learning and decision-making tasks such as feature selection, sensor placement, and content recommendation, where the goal is to evaluate and select an optimal subset of elements that provide the largest utility. These utility functions often exhibit desirable properties like monotonicity and submodularity over sets, but are typically expensive to evaluate and may lack an explicit analytical form. Moreover, the utility of a set can vary depending on certain contextual variables, further complicating the learning task. In this work, we propose a unified framework for modeling and learning contextual set functions with monotone submodular structure from data using deep networks equipped with structural regularization. Our key insight is to decompose the set function into two learnable components: (i) a context-conditioned contrastive embedding network that maps elements to a shared latent space based on performance and contextual similarity, and (ii) an aggregation network that predicts set-level utility from the sum of embeddings with a submodular norm-based regularization term encouraging the learned function to exhibit diminishing returns. This combination improves utility prediction for unseen sets and contexts and enables greedy subset selection, which admits near-optimality guarantees. We evaluate our framework on a wide variety of real-world contextual subset selection tasks such as content recommendation, document summarization, and sensor selection, demonstrating consistent improvements in utility prediction compared to baselines and stronger subset selection performance under context shifts.
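As a rough illustration of the greedy subset selection this abstract relies on (a sketch with a toy coverage utility, not the paper's learned networks; the "sensor" map below is hypothetical), greedy maximization of a monotone submodular set function attains at least a (1 - 1/e) fraction of the best achievable value:

```python
def coverage_utility(subset, element_items):
    # Toy monotone submodular utility: number of distinct items covered.
    covered = set()
    for e in subset:
        covered |= element_items[e]
    return len(covered)

def greedy_select(elements, utility, k):
    # Greedy maximization; for monotone submodular utilities this attains
    # at least (1 - 1/e) of the optimal size-k value.
    chosen = []
    for _ in range(k):
        best = max((e for e in elements if e not in chosen),
                   key=lambda e: utility(chosen + [e]))
        chosen.append(best)
    return chosen

# Hypothetical "sensor -> covered regions" map.
items = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 5, 6}}
util = lambda s: coverage_utility(s, items)
picked = greedy_select(list(items), util, k=2)  # -> ["a", "d"], covering 5 items
```

In the paper's setting, the utility would be the learned aggregation network's prediction (conditioned on context) rather than this toy coverage count.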

URL: https://openreview.net/forum?id=SZ8mOziJBx

---

Title: Forcing and Diagnosing Failure Modes of Fourier Neural Operators Across Diverse PDE Families

Abstract: Fourier Neural Operators (FNOs) have shown strong performance in learning solution maps of partial differential equations (PDEs). Still, their robustness under distribution shifts, long-horizon rollouts, and structural perturbations remains poorly understood. We present a systematic stress-testing framework that probes failure modes of FNOs across five qualitatively different PDE families: dispersive, elliptic, multi-scale fluid, financial, and chaotic systems. Rather than optimizing in-distribution accuracy, we design controlled stress tests — including parameter shifts, boundary or terminal-condition changes, resolution extrapolation with spectral analysis, and iterative rollouts — to expose vulnerabilities such as spectral bias, compounding integration errors, and overfitting to restricted boundary regimes. Our large-scale evaluation (1,000 trained models) reveals that distribution shifts in parameters or boundary conditions can inflate errors by more than an order of magnitude, while resolution changes primarily concentrate error in high-frequency modes. Input perturbations generally do not amplify error, though worst-case scenarios (e.g., localized Poisson perturbations) remain challenging. These findings provide a comparative failure-mode atlas and actionable insights for improving robustness in operator learning.
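The spectral part of the stress-testing described above can be mimicked with a small Fourier-space diagnostic (our own sketch on synthetic 1-D signals, not the paper's pipeline): by Parseval's theorem, comparing band energies of the error reveals whether a surrogate's mistakes concentrate in high-frequency modes.

```python
import numpy as np

def band_error_share(pred, target, cutoff):
    # Fraction of squared-error energy at or above a frequency cutoff (1-D rFFT).
    err_hat = np.fft.rfft(pred - target)
    energy = np.abs(err_hat) ** 2
    return energy[cutoff:].sum() / energy.sum()

# Synthetic stand-in for spectral bias: the "model" reproduces the low
# frequency exactly but underpredicts a high-frequency mode.
n = 256
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
target = np.sin(2 * x) + 0.3 * np.sin(40 * x)
pred = np.sin(2 * x) + 0.1 * np.sin(40 * x)
share = band_error_share(pred, target, cutoff=20)  # ~1.0: error is high-frequency
```

Applied to rollouts at a finer resolution than training, a share near 1 would indicate exactly the high-frequency error concentration the abstract reports.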

URL: https://openreview.net/forum?id=0S1LWZHQYn

---

Title: ABCDE: Agentic-Based Controlled Dynamic Erasure for Intent-Aware Safety Reasoning

Abstract: Concept erasure has emerged as a central mechanism for safety alignment in text-conditioned generative models, yet most existing approaches implicitly adopt an unconditional suppression paradigm in which target concepts are removed whenever they appear, regardless of contextual intent.
This formulation conflates benign and harmful concept usage, leading to systematic over-suppression that unnecessarily censors policy-compliant content and degrades model utility.
We argue that safety intervention should instead be framed as a decision problem grounded in contextual language understanding, rather than as a purely mechanistic removal operation.
Based on this perspective, we introduce {Intent-Aware Concept Erasure} (ICE), a decision-centric formulation that explicitly separates the question of {whether} a concept should be suppressed from {how} suppression is realized, enabling context-sensitive intervention policies that preserve benign usage while maintaining safety guarantees.
To operationalize this formulation, we present ABCDE, an agentic framework that infers stable intervention decisions from semantic context and realizes them through minimal prompt rewriting with closed-loop output feedback.
Experiments on a paired benchmark designed to isolate contextual intent demonstrate that ABCDE substantially reduces unnecessary interventions while preserving strong safety effectiveness, outperforming unconditional concept erasure baselines.

URL: https://openreview.net/forum?id=IFjPhMcXJB

---

Title: Probing the Impact of Scale on Data-Efficient, Generalist Transformer World Models for Atari

Abstract: Developing generalist systems that retain human-like data efficiency is a central challenge. While world models (WMs) offer a promising path, existing research often conflates architectural mechanisms with the independent impact of model \emph{scale}. In this work, we use a minimalist transformer world model to analyze scaling behaviors on the Atari 100k benchmark, using fixed offline datasets derived from a presupposed expert policy. Our results reveal that environments fundamentally fall into distinct scaling regimes, even when constrained by identical offline data budgets and model capacities. For individual tasks, some environments naturally allow models to pass the interpolation threshold, yielding monotonic improvements in the overparameterized regime, while others remain trapped in the classical regime, where larger world models degrade fidelity. Conversely, in the unified setting, i.e., a single transformer trained on a suite of 26 Atari environments, we uncover a novel phenomenon that we term \emph{positive regularization}: joint training stabilizes scaling dynamics, ensuring monotonic gains across all environments, regardless of their distinct inherent scaling regimes. Finally, we demonstrate that improved fidelity translates directly to downstream control, with policies learned entirely within the simulated dynamics achieving a median expert-random-normalized score of 0.770. Our findings suggest that future progress lies as much in precise scaling strategies as in architectural innovation.

URL: https://openreview.net/forum?id=wVcvqtKaMY

---

Title: Scalable Equilibrium Propagation via Intermediate Error Signals for Deep Convolutional CRNNs

Abstract: Equilibrium Propagation (EP) is a biologically inspired local learning rule first proposed for convergent recurrent neural networks (CRNNs), in which synaptic updates depend only on neuron states from two distinct phases. EP estimates gradients that closely align with those computed by Backpropagation Through Time (BPTT) while significantly reducing computational demands, positioning it as a potential candidate for on-chip training in neuromorphic architectures. However, prior studies on EP have been constrained to shallow architectures, as deeper networks suffer from the vanishing gradient problem, leading to convergence difficulties in both energy minimization and gradient computation. To address the vanishing gradient problem in deep EP networks, we propose a novel EP framework that incorporates intermediate error signals to enhance information flow and convergence of neuron dynamics. This is the first work to integrate knowledge distillation and local error signals into EP, enabling the training of significantly deeper architectures. Our proposed approach achieves state-of-the-art performance on the CIFAR-10 and CIFAR-100 datasets, showcasing its scalability on deep VGG architectures. These results represent a significant advancement in the scalability of EP, paving the way for its application in real-world systems.

URL: https://openreview.net/forum?id=iXFmzKpPNA

---

Title: Enhancing Interpretability: A Versatile Clue-Based Framework for Faithful In-Depth Interpretations

Abstract: Despite the state-of-the-art performance of deep neural networks, they are susceptible to bias and malfunction in unforeseen situations. Moreover, the complex computations underlying their reasoning are not human-understandable, hindering the development of trust and the validation of decisions. Local interpretation methods seek to provide explanations for individual model decisions with two key goals: faithfulness to the model and human-understandability. However, existing approaches often suffer from performance loss, limited applicability to pre-trained models, and unfaithful explanations. Seeking more faithful interpretations, we introduce a novel definition, the Distinguishing Clue: a set of input regions that uniquely promotes a specific network decision, detected through our Local Attention Perception (LAP) module. Our innovative training scheme allows LAP to learn these clues without relying on expert annotations. It also provides means for both general and expert knowledge injection. The system is usable for training networks from scratch, enhancing their interpretability, and interpreting networks that have already been trained. We demonstrate the superiority of the proposed method by evaluating it on different architectures across two datasets, including ImageNet. The proposed framework offers more valid and faithful-to-the-model interpretations than commonly used explainer methods.

URL: https://openreview.net/forum?id=70STejAuwx

---

Title: Zero-Shot Model Search via Text-to-Logit Matching

Abstract: With the increasing number of publicly available models, pre-trained, online models exist for many tasks that users require. In practice, users often cannot find the relevant models, as current search methods are text-based and rely on documentation that most models lack. This paper presents ProbeLog, a method for retrieving classification models that can recognize a target concept, such as "Dog", without access to model metadata or training data. Specifically, ProbeLog computes a descriptor for each output dimension (logit) of each model by observing its responses to a fixed set of inputs (probes). Similarly, we compute how the target concept is related to each probe. By measuring the distance between the probe responses of logits and concepts, we can identify logits that recognize the target concept. This enables zero-shot, text-based model retrieval ("find all logits corresponding to dogs"). To prevent hubness, we calibrate the distances of each logit according to other closely related concepts. We demonstrate that ProbeLog achieves high retrieval accuracy, both on ImageNet and on real-world fine-grained search tasks, while being scalable to full-size repositories. Importantly, further analysis reveals that the retrieval order is highly correlated with model and logit accuracies, allowing ProbeLog to find suitable and accurate models for users' tasks in a zero-shot manner.
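The descriptor-matching step can be sketched as follows (a toy sketch with random vectors standing in for probe responses; cosine similarity is our assumption for the distance, and the calibration step is omitted):

```python
import numpy as np

def normalize(v):
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-9)

# Hypothetical setup: each of 10 logits is described by its responses to
# 32 fixed probes; the target concept gets a descriptor in the same space.
rng = np.random.default_rng(0)
n_probes, n_logits = 32, 10
logit_descriptors = normalize(rng.normal(size=(n_logits, n_probes)))
# Pretend logit 3 is the "dog" logit: the concept descriptor is a noisy copy.
concept_descriptor = normalize(logit_descriptors[3] + 0.1 * rng.normal(size=n_probes))

# Rank logits by cosine similarity to the concept descriptor.
scores = logit_descriptors @ concept_descriptor
best_logit = int(np.argmax(scores))  # recovers logit 3
```

With real models, the descriptors would come from actual probe responses, and the ranking would be calibrated against related concepts as the abstract describes.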

URL: https://openreview.net/forum?id=m4Qmst7iH7

---

Title: EMG-JEPA: Towards Scalable and Generalizable sEMG-Based Hand Pose Estimation via Self-Supervised Learning

Abstract: This work introduces EMG-JEPA, a Joint Embedding Predictive Architecture (JEPA) designed to improve generalization for hand pose estimation from surface electromyography (sEMG) signals. Collecting labeled sEMG data for hand pose estimation is costly, as it requires synchronizing the sEMG recordings with motion capture systems to obtain precise joint-angle annotations. To mitigate the dependency on such expensive labels, EMG-JEPA uses self-supervised learning to derive transferable representations from unlabeled sEMG signals, which can then be fine-tuned for downstream hand pose estimation. We analyze the effectiveness of EMG-JEPA on data collected from three wrist-worn devices, providing signals with 8, 16, and 110 channels. Our results show that EMG-JEPA can improve cross-user hand pose estimation, particularly in high-channel-density settings, reducing joint-angle error by up to 3.55% and 5.13% for the 16- and 110-channel setups, respectively. Further, results from the 8-channel setup suggest a channel-density threshold (≈16 channels), below which JEPA-based pretraining offers limited gains. Overall, our study identifies key design choices for developing a JEPA for sEMG, offering a scalable approach to reduce labeled data requirements.

URL: https://openreview.net/forum?id=H4PM2SsSor

---

Title: Large Language Models as Interfaces to Structured Data: A Survey

Abstract: Structured data, including tables, relational databases, and knowledge graphs, underpins a wide range of scientific, industrial, and decision-making workflows. Although large language models (LLMs) are primarily trained on unstructured text, recent work has demonstrated their effectiveness in tasks involving structured data, such as table reasoning, natural language to SQL translation, data transformation, and automated analytics. These developments indicate that LLMs can function as a general interface between natural language inputs, structured representations, and executable operations.
This survey presents a theory-oriented overview of LLMs for structured data. We introduce an abstract formulation that characterizes structured data tasks by the structured state, the query or control input, the output space, and the execution environment. Based on this formulation, which we revisit throughout the taxonomy and evaluation sections, we propose a taxonomy that organizes existing methods according to the functional role of the LLM, including encoding, reasoning, translation, planning, and agent-based execution, as well as by representation strategies and learning signals. This taxonomy highlights shared design principles across different task settings and clarifies methodological trade-offs.
We examine evaluation protocols, generalization properties, and failure modes specific to structured data tasks, with an emphasis on faithfulness, schema robustness, and execution correctness. Finally, we outline open research directions for LLM-based structured data systems, including challenges related to scalability, symbolic and neural integration, and learning with execution-based supervision. The survey aims to provide a unified conceptual framework and a reference point for future research on large language models applied to structured data.

URL: https://openreview.net/forum?id=2z8fcjrN5Q

---

Title: A Mechanistic View of Catastrophic Overfitting

Abstract: Adversarial Training (AT) suffers from a critical failure mode known as Catastrophic Overfitting (CO), where robustness to weak single-step adversaries does not translate to strong multi-step adversaries. Despite progress in mitigating CO, its underlying mechanisms remain poorly understood. In this work, we address two central questions: (1) Why does CO appear? and (2) What role do the number of Projected Gradient Descent (PGD) steps and PGD initialization play in CO? Using mathematically tractable models, we reveal a phase transition in the adversarial budget $\epsilon$, above which non-robust solutions become optimal. Furthermore, we show that CO exists for any well separated dataset, any number of PGD steps $S$, $\epsilon$ as small as desired, and randomized initialization. Our insights align with empirical observations in the community and help explain the difficulties in avoiding CO at larger scales. We believe our results deepen the understanding of CO and provide a foundation for developing future-proof solutions.

URL: https://openreview.net/forum?id=BQEZ3ZZBt3

---

Title: LLMs Can Leverage Graph Structural Information in Text-Attributed Graphs

Abstract: A recurring claim in recent LLM-as-predictor work on text-attributed graphs (TAGs) is that in-context learning (ICL) benefits mainly from the textual attributes of neighboring nodes (often via homophily), while general-purpose LLMs cannot reliably exploit graph structure—especially edge direction and local topology. This paper re-evaluates that claim by asking a focused question: can general-purpose LLMs genuinely leverage graph structural information in TAGs via ICL, once we remove confounding factors and provide an architecture explicitly designed for structural reasoning? We first introduce controlled neighborhood rewiring tests that keep node texts and label distributions fixed while perturbing structure. Across seven LLMs and four low-homophily WebKB graphs, both first-order flipping and two-hop extreme rewiring consistently degrade accuracy (2.06%–23.15% average relative drop), demonstrating genuine structural sensitivity. After flipping, structural sensitivity strongly increases with model capability, and the performance advantage of stronger models arises primarily from correct structure rather than better text-only processing. We further show that apparent ``structure misuse'' in weaker models can be corrected by adding explicit step-by-step instructions. The previous claims stem from confounding factors: the traditional ICL framework lacks a dedicated mechanism for reasoning over graph structure and for handling lengthy multi-hop neighborhood contexts; they do not reflect an inherent limitation of LLMs themselves. Motivated by these findings, we propose the Text Attributes Passing Thoughts Network (TAPTN), an edge-aware, MPNN-like ICL framework that iteratively summarizes multi-hop neighborhoods using a structure-aware template and self-generated instructions. TAPTN substantially outperforms zero-shot CoT and GraphICL-style baselines on five TAG datasets by at least +13.98%, especially on challenging heterophilic graphs (+15–25% gain), and, when used to produce structurally enriched texts for downstream fine-tuning, achieves performance competitive with state-of-the-art GNN pipelines. Collectively, the results establish that LLMs can exploit structural information in TAGs as effectively as SOTA GNNs through ICL, once an appropriate architecture mitigates the confounding factors.

URL: https://openreview.net/forum?id=WhaVqEkkMY

---

Title: The LLM Data Auditor: A Metric-oriented Survey on Quality and Trustworthiness in Evaluating Synthetic Data

Abstract: Large Language Models (LLMs) have emerged as powerful tools for generating data across various modalities. By transforming data from a scarce resource into a controllable asset, LLMs mitigate the bottlenecks imposed by the acquisition costs of real-world data for model training, evaluation, and system iteration. However, ensuring the high quality of LLM-generated synthetic data remains a critical challenge. Existing research primarily focuses on generation methodologies, with limited direct attention to the quality of the resulting data. Furthermore, most studies are restricted to single modalities, lacking a unified perspective across different data types. To bridge this gap, we propose the LLM Data Auditor framework. In this framework, we first describe how LLMs are utilized to generate data across six distinct modalities. More importantly, we systematically categorize intrinsic metrics for evaluating synthetic data from two dimensions: quality and trustworthiness. This approach shifts the focus from extrinsic evaluation, which relies on downstream task performance, to the inherent properties of the data itself. Using this evaluation system, we analyze the experimental evaluations of representative generation methods for each modality and identify substantial deficiencies in current evaluation practices. Based on these findings, we offer concrete recommendations for the community to improve the evaluation of data generation. Finally, the framework outlines methodologies for the practical application of synthetic data across different modalities.
Our repository has been released: \href{https://anonymous.4open.science/r/Awesome-LLM-Data-Generation-6457/README.md}{https://anonymous.4open.science/r/Awesome-LLM-Data-Generation-6457}.

URL: https://openreview.net/forum?id=f2gS9Ly6tA

---

Title: TextTeacher: What Can Language Teach About Images?

Abstract: The platonic representation hypothesis suggests that sufficiently large models converge to a shared representation geometry, even across modalities.
Motivated by this, we ask:
Can the semantic knowledge of a language model efficiently improve a vision model?
As an answer, we introduce TextTeacher, a simple auxiliary objective that injects text embeddings as additional information into image classification training.
TextTeacher uses readily available image captions, a pre-trained and frozen text encoder, and a lightweight projection to produce semantic anchors that efficiently guide representations during training while leaving the inference-time model unchanged.
On ImageNet with standard ViT backbones, TextTeacher improves accuracy by up to $+2.7$ percentage points (p.p.) and yields consistent transfer gains (on average $+1.0$ p.p.) under the same recipe and compute.
It outperforms vision knowledge distillation, yielding higher accuracy at a constant compute budget, or similar accuracy while being $33\%$ faster.
Our analysis indicates that TextTeacher acts as a feature‑space preconditioner, shaping deeper layers in the first stages of training, and aiding generalization by supplying complementary semantic cues.
TextTeacher adds negligible overhead, requires no costly multimodal pretraining and preserves the simplicity and latency of pure vision models.
We release our code at \texttt{<URL upon acceptance>}.

URL: https://openreview.net/forum?id=Xwb0aEUwKh

---

Title: Learning Materials Interatomic Potentials via Hybrid Invariant-Equivariant Architectures

Abstract: Machine learning interatomic potentials (MLIPs) can predict energy, force, and stress of materials and enable a wide range of downstream discovery tasks. A key design choice in MLIPs involves the trade-off between invariant and equivariant architectures. Invariant models offer computational efficiency but may not perform as well, especially when predicting high-order outputs. In contrast, equivariant models can capture high-order symmetries, but are computationally expensive. In this work, we propose HIENet, a \underline{h}ybrid \underline{i}nvariant-\underline{e}quivariant materials interatomic potential model that integrates both invariant and equivariant message passing layers. Furthermore, we show that HIENet provably satisfies key physical constraints. HIENet achieves superior performance with considerable computational speedups over prior models. Experimental results on both common benchmarks and downstream materials discovery tasks demonstrate the efficiency and effectiveness of HIENet. Finally, additional ablations further demonstrate that our hybrid invariant-equivariant approach scales well across model sizes and works with different equivariant model architectures, providing powerful insights into future MLIP designs.

URL: https://openreview.net/forum?id=fq3nrVqNmL

---

Title: Twin: Tuning Learning Rate and Weight Decay of Deep Homogeneous Classifiers without Validation

Abstract: We introduce \textbf{T}une \textbf{w}ithout Validat\textbf{i}o\textbf{n} (Twin), a simple and effective pipeline for tuning learning rate and weight decay of homogeneous classifiers without validation sets, eliminating the need to hold out data and avoiding the two-step process.
Twin leverages the margin-maximization dynamics of homogeneous networks and an empirical bias–variance scaling law that links training and test losses across hyper-parameter configurations.
This mathematical modeling yields a regime-dependent, validation-free selection rule: in the \emph{non-separable} regime, training loss is monotonic in test loss and therefore predictive of generalization, whereas in the \emph{separable} regime, the parameter norm becomes a reliable indicator of generalization due to margin maximization.
Across 37 dataset-architecture configurations for image classification, we demonstrate that Twin achieves a mean absolute error of 1.28\% compared to an \textit{Oracle} baseline that selects hyper-parameters using test accuracy.
We demonstrate Twin’s benefits in scenarios where validation data may be scarce, such as small-data regimes, or difficult and costly to collect, as in medical imaging tasks.
We plan to release our code.
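Our schematic reading of the regime-dependent selection rule, as a sketch (the field names `train_loss`, `param_norm`, and `separable` are placeholders for quantities logged per training run, not the authors' API):

```python
def twin_select(runs):
    # Validation-free choice of (learning rate, weight decay):
    # - if no run separates (fits) the training data, the lowest training
    #   loss is predictive of generalization (non-separable regime);
    # - among runs that reach ~zero training error, margin maximization
    #   makes the smallest parameter norm the better indicator.
    separable = [r for r in runs if r["separable"]]
    if separable:
        return min(separable, key=lambda r: r["param_norm"])["hp"]
    return min(runs, key=lambda r: r["train_loss"])["hp"]

runs = [
    {"hp": (1e-3, 1e-4), "train_loss": 0.02, "param_norm": 35.0, "separable": True},
    {"hp": (1e-2, 1e-4), "train_loss": 0.01, "param_norm": 52.0, "separable": True},
    {"hp": (1e-4, 1e-2), "train_loss": 0.40, "param_norm": 12.0, "separable": False},
]
chosen = twin_select(runs)  # (1e-3, 1e-4): smallest norm among separable runs
```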

URL: https://openreview.net/forum?id=1SIP2M2HJa

---

Title: Improving OOD Robustness via Background-Aware Test-Time Augmentation in Black-Box and Resource-Constrained Settings

Abstract: Deep learning models for text classification typically achieve strong performance on in-distribution (ID) data but often fail to generalize to out-of-distribution (OOD) inputs. This degradation frequently arises because models rely on spurious background cues (e.g., specific syntax or register) learned during training, which become unreliable when the domain changes. While recent Test-Time Augmentation (TTA) approaches have enabled robustness in black-box settings, they often rely on unconstrained rewriting strategies. For instance, standard In-Context Rewriting (ICR) instructs Large Language Models (LLMs) to modify input details to match ID exemplars, creating a high risk of semantic drift and label flipping, particularly when using smaller, resource-constrained LLMs. In this work, we propose a Background-Aware TTA framework that strictly disentangles style from semantics. Unlike prior methods that encourage broad paraphrasing, we utilize a semantic-constrained alignment strategy that enables small, efficient LLMs to transform specific background attributes, such as tone and sentence structure, to match in-distribution priors while explicitly enforcing the preservation of original meaning. This approach mitigates OOD degradation by neutralizing spurious background shifts, allowing frozen black-box models to process inputs in their native distribution without risking semantic corruption. Empirical evaluations across multiple text classification benchmarks demonstrate that our targeted alignment strategy outperforms unconstrained augmentation baselines. By generating higher-fidelity augmentations, our method achieves superior OOD robustness with reduced computational overhead, establishing a viable path for robust deployment in resource-limited, black-box environments.

URL: https://openreview.net/forum?id=xptPQVCy5X

---

Title: Randomized PCA Forest for Unsupervised Outlier Detection

Abstract: We propose a novel unsupervised outlier detection method based on the Randomized Principal Component Analysis (RPCA) Forest. Motivated by the performance of the RPCA Forest in approximate K-Nearest Neighbor (KNN) search, we derive an outlier score from the forest's intrinsic properties. Experimental results showcase the superiority of the proposed approach over classical and state-of-the-art methods on several datasets, while it performs competitively on the rest. An extensive analysis reflects the method's robustness and computational efficiency, highlighting it as a good choice for unsupervised outlier detection.

URL: https://openreview.net/forum?id=hHJWe6Qcfe

---

Title: One Rank at a Time: Cascading Error Dynamics in Sequential Learning

Abstract: Sequential learning --where complex tasks are broken down into simpler, hierarchical components-- has emerged as a paradigm in AI. This paper views sequential learning through the lens of low-rank linear regression, focusing specifically on how errors propagate when learning rank-1 subspaces sequentially. We present an analysis framework that decomposes the learning process into a series of rank-1 estimation problems, where each subsequent estimation depends on the accuracy of previous steps. Our aim is explanatory rather than comparative: we analyze error propagation and derive compute allocation guidance without claiming superiority over joint or one-shot training. Our contribution is a characterization of the error propagation in this sequential process, establishing bounds on how errors --e.g., due to limited computational budgets and finite precision-- affect the overall model accuracy. We prove that these errors compound in predictable ways, with implications for both algorithmic design and stability guarantees.
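The sequential rank-1 scheme this abstract analyzes can be sketched with greedy deflation (our own illustration; power iteration and the synthetic matrix are our choices, and each step's inaccuracy visibly feeds into the next residual):

```python
import numpy as np

def top_rank1(M, iters=100):
    # Power iteration for the leading singular triplet of M (a sketch).
    rng = np.random.default_rng(1)
    v = rng.normal(size=M.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        u = M @ v
        u /= np.linalg.norm(u)
        v = M.T @ u
        v /= np.linalg.norm(v)
    return u @ M @ v, u, v

rng = np.random.default_rng(0)
A = rng.normal(size=(30, 20))
approx = np.zeros_like(A)
for _ in range(3):                   # learn rank-1 pieces sequentially;
    s, u, v = top_rank1(A - approx)  # each step depends on earlier estimates
    approx += s * np.outer(u, v)
seq_err = np.linalg.norm(A - approx)
```

Any error in an early triplet perturbs every later subproblem, which is the compounding effect the paper bounds; with exact rank-1 steps, greedy deflation recovers the best rank-3 truncation.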

URL: https://openreview.net/forum?id=EG7XJANxhX

---

Title: Analysis of Natural Actor-Critic with Randomized Low-Discrepancy Sampling

Abstract: Natural gradient methods are appealing in policy optimization due to their invariance to smooth reparameterization and their ability to account for the local geometry of the policy manifold. These properties often lead to improved conditioning of the optimization problem compared to Euclidean policy gradients. However, their reliance on Monte Carlo estimation introduces high variance and sensitivity to hyperparameters. In this paper, we address these limitations by integrating Randomized Quasi-Monte Carlo (RQMC) sampling into the natural actor-critic (NAC) framework. We revisit the NAC linear system and show that, under imperfect value approximation, the NAC update decomposes exactly into the true natural gradient plus a Fisher-metric projection of the Bellman residual onto the score-feature span. We further develop RQMC-based NAC estimators that replace IID sampling with randomized low-discrepancy trajectories. We provide a variance analysis showing that these RQMC-based estimators strictly reduce estimator variance under mild regularity conditions, thereby reducing the propagation of Bellman-residual error into the natural-gradient update. Empirical results on certain reinforcement learning benchmarks demonstrate that our RQMC-enhanced algorithms consistently match or improve upon the performance and stability of their vanilla counterparts.
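The variance mechanism can be illustrated outside RL with a plain integration example (my sketch; it uses a randomly shifted Fibonacci lattice as the low-discrepancy point set, which may differ from the construction the paper uses):

```python
import numpy as np

def mc_mean(f, n, rng):
    # Plain IID Monte Carlo estimate of E[f(U)] on the unit square.
    return f(rng.random((n, 2))).mean()

def rqmc_mean(f, n, z, rng):
    # Rank-1 lattice with a random shift (Cranley-Patterson rotation):
    # low-discrepancy points, yet the estimator remains unbiased.
    pts = (np.arange(n)[:, None] * np.array(z) / n + rng.random(2)) % 1.0
    return f(pts).mean()

f = lambda u: np.prod(u, axis=1)  # smooth integrand with E[f] = 1/4
rng = np.random.default_rng(0)
mc = [mc_mean(f, 377, rng) for _ in range(64)]
# Fibonacci lattice: n = 377 points, generating vector (1, 233)
rq = [rqmc_mean(f, 377, (1, 233), rng) for _ in range(64)]
```

For a smooth integrand, the variance of the randomized low-discrepancy estimator is typically orders of magnitude below the IID Monte Carlo variance at the same sample size, which is the effect the paper exploits inside the NAC update.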

URL: https://openreview.net/forum?id=kOSx9v6dfb

---

Title: Injecting Falsehoods: Adversarial Man-in-the-Middle Attacks Undermining Factual Recall in LLMs

Abstract: LLMs are now an integral part of information retrieval. As such, their role as question answering chatbots raises significant concerns due to their demonstrated vulnerability to adversarial man-in-the-middle (MitM) attacks. Here, we propose the first principled attack evaluation on LLM factual memory under prompt injection via Xmera, our novel, theory-grounded MitM framework. By perturbing the input given to "victim" LLMs in three closed-book and fact-based QA settings, we undermine the correctness of the responses and assess the uncertainty of their generation process. Surprisingly, trivial instruction-based attacks achieve the highest success rate (up to ~85.3%) while simultaneously having a high uncertainty for incorrectly answered questions. To provide a simple defense mechanism against Xmera, we train Random Forest classifiers on the response uncertainty levels to distinguish between attacked and unattacked queries (average AUC of up to ~96%). We believe that signaling users to be cautious about the answers they receive from black-box and potentially corrupt LLMs is a first checkpoint toward user cyberspace safety.
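The defense side reduces to ordinary supervised learning once per-response uncertainty features are extracted. A minimal sketch with synthetic features (the feature names, separations, and all numbers here are invented for illustration; the paper's actual features come from the victim LLM's generation process):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Hypothetical per-response uncertainty features: mean token entropy,
# max token entropy, length-normalized negative log-likelihood.
clean = rng.normal([1.0, 2.0, 0.5], 0.3, size=(500, 3))
attacked = rng.normal([1.8, 3.0, 1.1], 0.3, size=(500, 3))
X = np.vstack([clean, attacked])
y = np.r_[np.zeros(500), np.ones(500)]

idx = rng.permutation(1000)
train, test = idx[:700], idx[700:]
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X[train], y[train])
auc = roc_auc_score(y[test], clf.predict_proba(X[test])[:, 1])
```

When attacked queries shift the uncertainty distribution, even a small Random Forest separates them well, which is the mechanism behind the reported AUC.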

URL: https://openreview.net/forum?id=DWxrPA4ZBY

---

Title: Extracting Probabilistic Knowledge from Large Language Models for Bayesian Network Parameterization

Abstract: In this work, we evaluate the potential of Large Language Models (LLMs) in building Bayesian Networks (BNs) by approximating domain expert priors. LLMs have demonstrated potential as factual knowledge bases; however, their capability to generate probabilistic knowledge about real-world events remains understudied. We explore utilizing the probabilistic knowledge inherent in LLMs to derive probability estimates for statements regarding events and their relationships within a BN. Using LLMs in this context allows for the parameterization of BNs, enabling probabilistic modeling within specific domains. Our experiments on eighty publicly available Bayesian Networks, from healthcare to finance, demonstrate that querying LLMs about the conditional probabilities of events provides meaningful results when compared to baselines, including random and uniform distributions, as well as approaches based on next-token generation probabilities. We explore how these LLM-derived distributions can serve as expert priors to refine distributions extracted from data, especially when data is scarce. Overall, this work introduces a promising strategy for automatically constructing Bayesian Networks by combining probabilistic knowledge extracted from LLMs with real-world data. Additionally, we establish the first comprehensive baseline for assessing LLM performance in extracting probabilistic knowledge.
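The refinement step the abstract describes fits a standard conjugate pattern: treat the LLM-elicited distribution as a Dirichlet prior and update it with observed counts. A minimal sketch (the `strength` pseudo-count and the specific numbers are illustrative assumptions, not values from the paper):

```python
import numpy as np

def refine_cpt(llm_prior, counts, strength=10.0):
    # Treat the LLM-elicited distribution as a Dirichlet prior with
    # `strength` pseudo-counts, then return the posterior mean given data.
    alpha = strength * np.asarray(llm_prior, dtype=float)
    post = alpha + np.asarray(counts, dtype=float)
    return post / post.sum()

# Scarce data (5 observations) pulls the LLM prior P = [0.7, 0.3] only
# part of the way toward the empirical frequencies [0.2, 0.8].
p = refine_cpt([0.7, 0.3], [1, 4])
```

With more data the counts dominate; with none, the table falls back to the elicited prior, which is exactly the behavior wanted in scarce-data regimes.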

URL: https://openreview.net/forum?id=Fy3Byg3CVo

---

Title: Explicit Second-Order Min-Max Optimization: Practical Algorithms and Complexity Analysis

Abstract: We propose and analyze several inexact regularized Newton-type methods for finding a global saddle point of \emph{convex-concave} unconstrained min-max optimization problems. Compared to first-order methods, our understanding of second-order methods for min-max optimization is relatively limited, as obtaining global rates of convergence with second-order information can be much more involved. In this paper, we examine how second-order information is used to speed up extra-gradient methods, even under inexactness. In particular, we show that the proposed methods generate iterates that remain within a bounded set and that the averaged iterates converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function. We also provide a simple routine for solving the subproblem at each iteration, requiring a single Schur decomposition and $O(\log\log(1/\epsilon))$ calls to a linear system solver in a quasi-upper-triangular system. Thus, our method improves the existing line-search-based second-order min-max optimization methods by shaving off an $O(\log\log(1/\epsilon))$ factor in the required number of Schur decompositions. Finally, we conduct experiments on synthetic and real data to demonstrate the efficiency of the proposed methods.
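For readers unfamiliar with the baseline being accelerated, the first-order extra-gradient template on a bilinear saddle looks as follows (a background sketch only; the paper's contribution is the inexact second-order version with Schur-decomposition subproblems, which is not reproduced here):

```python
import numpy as np

def extragradient(A, x, y, eta=0.1, steps=2000):
    # min_x max_y f(x, y) = x^T A y: take a half step to a midpoint,
    # then update the real iterate using the midpoint's gradients.
    for _ in range(steps):
        xm, ym = x - eta * (A @ y), y + eta * (A.T @ x)
        x, y = x - eta * (A @ ym), y + eta * (A.T @ xm)
    return x, y

x, y = extragradient(np.eye(2), np.ones(2), np.ones(2))
```

Plain gradient descent-ascent diverges on this problem; the midpoint evaluation is what makes the iterates spiral into the saddle point at the origin.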

URL: https://openreview.net/forum?id=Hyk1GhEXGa

---

Title: Iterative Preference Optimization with Proximal Policy Regularization for Large Language Model Alignment

Abstract: Aligning large language models (LLMs) with human preferences is commonly achieved via supervised fine-tuning followed by preference optimization. While direct preference optimization (DPO) offers a simple and efficient alternative to RLHF, its offline and off-policy nature can induce a distribution shift between the policy used to sample preference pairs and the continually updated policy being optimized, reducing data efficiency and limiting alignment gains. We propose \emph{Iterative Proximal Policy Regularized Preference Optimization} (Iterative PRPO), which introduces a proximal regularization that explicitly constrains the optimized policy to remain close to the sampling policy within each iteration, thereby mitigating distribution shift while preserving the efficiency of DPO-style updates. Starting from an RLHF objective with a KL constraint to the sampling policy, we derive an equivalent direct preference optimization formulation that requires offline preference pairs under the sampling policy. Across summarization and dialogue alignment benchmarks, Iterative PRPO consistently improves win rates over offline DPO and iterative DPO baselines under both reward-model and GPT-4o evaluations, with comparable computational cost. Moreover, the same proximal regularization principle generalizes to advanced preference optimization objectives, including Identity Preference Optimization (IPO), self-play preference optimization (SPPO), and efficient exact optimization (EXO), yielding Iterative PR-IPO, PR-SPPO, and PR-EXO variants that further strengthen alignment across model scales.
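For reference, the vanilla DPO objective these variants start from can be written for a single preference pair as below (a background sketch only; the proximal regularization term that defines Iterative PRPO is the paper's contribution and is not reproduced here):

```python
import numpy as np

def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    # -log sigmoid of the beta-scaled margin between the policy-vs-
    # reference log-ratios of the chosen (w) and rejected (l) responses.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return float(np.log1p(np.exp(-margin)))
```

A policy that prefers the chosen response relative to the reference gets a lower loss than one that is indifferent; the PRPO idea constrains how far the policy producing `logp_*` may drift from the policy that sampled the pairs within each iteration.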

URL: https://openreview.net/forum?id=xoxO5Tr4Vh

---

Title: Adversarially Robust Latent Bandits in Multiplayer Asymmetric Settings

Abstract: We examine a novel multiplayer extension of the latent multi-armed bandit problem as formulated in \cite{maillard2014latent}, with broad applications such as recommendation systems and cognitive radio. Following \cite{chang2022online}, we examine three information-asymmetric scenarios: Problem A, in which players receive identical rewards but cannot observe each other's actions; Problem B, in which players receive private i.i.d. rewards but can observe others' actions; and Problem C, in which players receive private i.i.d. rewards and cannot observe others' actions. For Problems A and B, we provide nearly optimal gap-independent regret bounds. When reduced to the single-agent setting, our results improve on \cite{maillard2014latent} by allowing for adversarial nature's actions. For Problem C, we use the knowledge of the reward means to improve on the results in \cite{chang2022online}.

URL: https://openreview.net/forum?id=v5tLAfd2Ke

---

Title: Sliding Window Recurrences for Sequence Models

Abstract: Multi-hybrid architectures are poised to take over language modeling due to better quality and performance. We introduce a hierarchical decomposition framework for linear recurrences that allows us to develop algorithms aligned with GPU memory hierarchies, yielding Sliding Window Recurrences (SWR). We focus specifically on truncating recurrences to hardware-aligned windows, which are naturally jagged, limiting costly inter-warp communication. Using SWR, we develop Phalanx layers that serve as drop-in replacements for windowed attention or linear recurrences. In 1B-parameter multi-hybrid models, Phalanx achieves a 10-40% speedup over optimized Transformers across 4K to 32K context lengths while matching perplexity.
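The truncation idea can be illustrated on a scalar gated recurrence (my sketch; the actual Phalanx layers operate on hardware-aligned blocks of vector states, which this toy version ignores):

```python
import numpy as np

def full_recurrence(a, b):
    # h_t = a_t * h_{t-1} + b_t, scanned over the whole sequence.
    h, prev = np.zeros(len(b)), 0.0
    for t in range(len(b)):
        prev = a[t] * prev + b[t]
        h[t] = prev
    return h

def windowed_recurrence(a, b, W):
    # Keep only the last W inputs per output:
    #   h_t = sum over s in window of b_s * prod_{r=s+1..t} a_r.
    # No output looks past its window boundary, so windows can be
    # computed independently (and in parallel across hardware units).
    h = np.zeros(len(b))
    for t in range(len(b)):
        acc, decay = 0.0, 1.0
        for s in range(t, max(t - W, -1), -1):  # newest to oldest
            acc += decay * b[s]
            decay *= a[s]
        h[t] = acc
    return h
```

With decaying gates (|a_t| < 1), contributions older than the window are exponentially small, so the truncated output closely tracks the full scan while removing the sequential dependency between windows.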

URL: https://openreview.net/forum?id=V09uO70ouz

---

Title: SAPIENT: Continual Test-time Adaptation via Lightweight plug-and-play Adapters

Abstract: Continual test-time adaptation (TTA) is the problem of adapting a pre-trained source model at inference-time to handle test samples from a non-stationary distribution, while not forgetting the knowledge acquired from earlier domains. Existing continual TTA methods either make unsupervised test-time updates to the entire model, which can be expensive and prone to forgetting, or do so by keeping the base model frozen and adding a small number of learnable adapter modules for better time/memory efficiency and mitigating forgetting. We present SAPIENT (continual teSt-time adaPtation vIa lightwEight plug-aNd-play adapTers), a parameter-efficient adapter-based approach which not only offers the usual benefits of adapter-based continual TTA methods, but also offers additional key benefits, such as (1) its simple plug-and-play design seamlessly integrates with various continual TTA losses, making our approach complementary to existing continual TTA methods, improving their time/memory efficiency and knowledge retention, (2) it does not require access to the source domain data, unlike recent adapter-based continual TTA methods, and (3) its parameter-efficiency also makes it computationally feasible to design its Bayesian extensions which can help in estimating the uncertainty in adapter weights, which in turn yields more robust predictions. Through extensive experiments on a segmentation task and four classification tasks for continual TTA, we demonstrate that, with substantially ($\sim$90\%) fewer trainable parameters, our method achieves better/similar performance compared to existing SOTA continual TTA methods, resulting in efficient and robust adaptation and inference at test-time.

URL: https://openreview.net/forum?id=zhS2NbPR7q

---

Title: Re-evaluating Minimum Bayes Risk Decoding for Automated Speech Recognition Tasks

Abstract: While sample-based Minimum Bayes Risk (MBR) decoding has been shown to outperform beam search in many text-to-text generation tasks with modern LLMs, beam search remains the dominant approach for Automatic Speech Recognition (ASR) and Speech Translation (ST). To date, the efficacy of MBR decoding within modern speech systems lacks comprehensive evaluation.
Given that MBR decoding is effective in text-to-text generation tasks, it is reasonable to expect it to also be effective for speech-to-text tasks.
In this paper, we evaluate MBR decoding for ASR and ST tasks on English and Japanese using Whisper and its derivative models.
We observe that the accuracy of MBR decoding outperforms that of beam search in most of the experimental settings we have evaluated.
The results show that MBR decoding is a promising method for ASR and ST tasks that require high accuracy.
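Sample-based MBR itself is compact: draw candidate hypotheses, then pick the one with the lowest expected risk against the rest. A sketch with word-level edit distance as the risk (the candidate strings are invented for illustration; the paper samples from Whisper and its derivative models):

```python
def edit_distance(a, b):
    # Levenshtein distance over token lists, with a rolling 1-D DP row.
    d = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        prev, d[0] = d[0], i
        for j in range(1, len(b) + 1):
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1,
                                   prev + (a[i - 1] != b[j - 1]))
    return d[-1]

def mbr_decode(candidates):
    # Pick the hypothesis with minimum total distance (risk) to the
    # other sampled hypotheses -- the "consensus" transcript.
    def risk(c):
        return sum(edit_distance(c.split(), o.split()) for o in candidates)
    return min(candidates, key=risk)
```

Unlike beam search, which trusts the single highest-probability path, MBR rewards the hypothesis most of the samples agree with, which is why it can correct idiosyncratic high-probability errors.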

URL: https://openreview.net/forum?id=I6iLWhRIsf

---

Title: BandAid: A Plug-in Patch for Backdoor Defenses against Clean-Label Attacks in NLP

Abstract: Recent state-of-the-art defenses against backdoor attacks on text classifiers have shown strong performance. A common approach is to analyze the feature space of the poisoned model to detect and mitigate suspicious samples during inference time. However, most existing defenses target “dirty-label” attacks, in which a poisoned sample’s content is inconsistent with its assigned label. In contrast, very few defenses have been evaluated against “clean-label” attacks, where the text content correctly matches the label but still triggers the backdoor. Yet, clean-label backdoors are particularly concerning, as they remain highly stealthy while being equally harmful. We find that many defenses fail to identify the decision boundary between clean and poisoned samples precisely. To this end, we investigate the performance of three inference-time defenses (DAN, BadActs, and MDP) against both insertion-based and paraphrase-based clean-label backdoor attacks, and discuss their limitations. We then propose a universal and simple plug-in module, BandAid, to strengthen existing defenses. BandAid significantly reduces the attack effectiveness in 99 out of 102 cases, with effectiveness reduced by up to 99.8%, while improving clean data accuracy by 7.0% on average. At its core, BandAid fine-tunes a lightweight classifier using suspicious samples flagged by existing defenses along with a small clean validation set. In this way, BandAid transforms an anomaly-detection task (identifying unusual examples) into a discriminative classification task (identifying patterns among suspicious samples), which leads to a substantially more effective defense. BandAid proves to be robust under stress tests across a range of attack types and datasets, providing strong improvements in both security and generalization.

URL: https://openreview.net/forum?id=F2cvY4xZmE

---

Title: VECO: VEctor COnformity Based OOD Detection in Text and Multimodal Models

Abstract: Out-of-distribution (OOD) detection is critical for the reliable deployment of natural language processing and multimodal document understanding systems, where domain and semantic shifts are unavoidable. While many post-hoc OOD detection methods were developed for vision models, their direct transfer to textual and multimodal Transformer architectures remains poorly understood. We show that, unlike in vision benchmarks, the feature space provides the dominant OOD signal for text and document models, with feature-based scores consistently outperforming logit-based and hybrid detectors.
Building on this observation, we introduce \textbf{VECO} (\emph{VEctor COnformity}), a geometry-aware, purely feature-based OOD scoring framework that implements a stable soft contrast between in-distribution conformity and residual-space deviation.
We instantiate VECO using principal-subspace conformity for multimodal document models and Mahalanobis distance conformity for text classifiers, reflecting modality-aligned representation structure.
VECO achieves state-of-the-art and consistent performance improvements on multimodal document and text classification benchmarks. These results highlight the modality-dependent nature of OOD detection and the importance of adapting score design to representation cues.
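The Mahalanobis-conformity ingredient used for text classifiers is standard and easy to sketch (my illustration with synthetic features; VECO's actual contribution, the soft contrast between conformity and residual-space deviation, is not reproduced here):

```python
import numpy as np

def fit_gaussians(feats, labels):
    # Class-conditional Gaussians with a shared (tied) covariance, the
    # usual setup behind Mahalanobis-distance OOD scoring.
    classes = np.unique(labels)
    means = {c: feats[labels == c].mean(axis=0) for c in classes}
    centered = np.vstack([feats[labels == c] - means[c] for c in classes])
    cov = centered.T @ centered / len(feats)
    prec = np.linalg.inv(cov + 1e-6 * np.eye(cov.shape[0]))
    return means, prec

def ood_score(x, means, prec):
    # Distance to the *closest* class mean; large = nonconforming = OOD.
    return min(float((x - m) @ prec @ (x - m)) for m in means.values())
```

An in-distribution feature vector sits near some class mean and scores low; a shifted input is far from every class mean in the learned metric and scores high.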

URL: https://openreview.net/forum?id=sMbGqh7Zvt

---

Title: Exploration-Driven Optimization for Test-Time Large Language Model Reasoning

Abstract: Post-training techniques combined with inference-time scaling significantly enhance the reasoning and alignment capabilities of large language models (LLMs). However, a fundamental tension arises: inference-time methods benefit from diverse sampling from a relatively flattened probability distribution, whereas reinforcement learning (RL)-based post-training inherently sharpens these distributions. To address this, we propose Exploration-Driven Optimization (EDO) that integrates reward-biasing into standard RL objectives, encouraging greater diversity in sampled solutions while facilitating a more effective inference-time computation. We seamlessly incorporate EDO into established RL frameworks, specifically iterative Direct Preference Optimization (iDPO) and Group Relative Policy Optimization (GRPO), resulting in two variants: ED-iDPO and ED-GRPO. Extensive experiments demonstrate that both ED-iDPO and ED-GRPO exhibit greater solution diversity and improved reasoning abilities, particularly when combined with test-time computation techniques like self-consistency. Across three in-distribution reasoning benchmarks, EDO achieves a 1.0-1.3% improvement over the strongest baselines, and delivers an additional 1.5% average gain on five out-of-distribution tasks. Beyond accuracy, EDO preserves model entropy and stabilizes RL training dynamics, highlighting its effectiveness in preventing over-optimization collapse. Taken together, these results establish EDO as a principled framework for balancing exploration and exploitation in LLM reasoning and advancing the state of the art.

URL: https://openreview.net/forum?id=NiINDlzvNj

---

Title: GAMformer: Bridging Tabular Foundation Models and Interpretable Machine Learning

Abstract: While interpretability is crucial for machine learning applications in safety-critical domains and for regulatory compliance, existing tabular foundation models like TabPFN lack transparency. Generalized Additive Models (GAMs) provide the needed interpretability through their additive structure, but traditional GAM methods rely on iterative learning algorithms (such as splines, boosted trees, or neural networks) that are fundamentally incompatible with the in-context learning paradigm of foundation models. In this paper, we introduce GAMformer, the first tabular foundation model for GAMs that bridges the gap between the power of foundation models and the interpretability requirements of critical real-world applications. GAMformer estimates GAM shape functions in a single forward pass using in-context learning, representing a significant departure from conventional iterative approaches. Building on previous research on tabular foundation models, we train GAMformer exclusively on synthetically generated tables to prevent data leakage. Our experiments demonstrate that GAMformer performs comparably to other leading GAMs across various classification benchmarks.

URL: https://openreview.net/forum?id=647gba3osV

---

Title: Efficient Adaptation of Large Vision-Language Models: Transfer Learning Methods and Applications

Abstract: Pre-trained large vision-language models (VLMs) have become the dominant choice for handling vision-language tasks, ranging from multimodal reasoning to text-image generation. However, these models heavily depend on large-scale training datasets, primarily composed of image-text pairs sourced from web data, which are typically confined to general domains rather than specific downstream tasks. Given the scarcity of data in such specialized domains, transfer learning emerges as a remedy, enabling the adaptation of a model's preexisting knowledge to new tasks with limited data, thereby mitigating the reliance on extensive datasets. Following the current trend of applying transfer learning to vision-language tasks, we provide a systematic study of existing transfer learning techniques adopted for vision-language models, including: (1) a summary of the existing state-of-the-art VLMs, (2) a comprehensive taxonomy of transfer learning approaches for VLMs, (3) a discussion of real-world applications of transfer learning methods for VLMs, and (4) a summary of commonly used vision-language datasets and benchmarks across various vision-language tasks.

URL: https://openreview.net/forum?id=Xu9Oq5RwCa

---

Title: Coherence–Diffusion Dynamics: A Continuous-Semantic Interpretation of Transformer Language Models

Abstract: Large language models (LLMs) exhibit coherent reasoning, long-range contextual integration, and abrupt failures such as hallucination, yet the internal principles governing these behaviors remain poorly understood. Existing interpretability approaches typically focus on isolated components, including attention patterns, neuron circuits, or probing signals, and therefore provide limited insight into how semantic meaning evolves over the course of inference. This work proposes that Transformer-based language models can be productively interpreted through a continuous semantic perspective, in which internal representations evolve along structured trajectories in a latent space. We articulate this interpretation through the Coherence–Diffusion Dynamics (CDD) framework, which models semantic evolution as the interaction of coherence-restoring tendencies and stochastic variability. Within this framework, we introduce an effective instability potential serving as an interpretive proxy for semantic coherence, a coherence operator governing stabilizing dynamics, a diffusion term capturing stochastic variability, and an interpretation of dynamic sparsity capturing the apparent contraction of effective semantic degrees of freedom along inference trajectories. These constructs suggest qualitative, empirically testable implications regarding stabilization, regime shifts associated with hallucination, and the functional irrelevance of low-impact components. We evaluate these implications through controlled experiments on Transformer language models, showing broad alignment between observed behavior and the qualitative predictions of the CDD interpretation. Taken together, this work provides a coherent and dynamically grounded account of semantic evolution in LLMs, offering a principled lens for interpreting coherence, variability, sparsity, and instability without departing from the discrete computational structure of Transformer architectures.

URL: https://openreview.net/forum?id=Q3cbXoUgFF

---
