Daily TMLR digest for Sep 02, 2025

0 views
Skip to first unread message

TMLR

unread,
Sep 2, 2025, 12:06:06 AM (5 days ago) Sep 2
to tmlr-anno...@googlegroups.com


New certifications
==================

Expert Certification: Learning Equivalence Classes of Bayesian Network Structures with GFlowNet

Michelle Liu, Zhaocheng Zhu, Olexa Bilaniuk, Emmanuel Bengio

https://openreview.net/forum?id=FAcc7oAdaa

---


Accepted papers
===============


Title: Communication Cost Reduction for Subgraph Counting under Local Differential Privacy via Hash Functions

Authors: Quentin Hillebrand, Vorapong Suppakitpaisarn, Tetsuo Shibuya

Abstract: We suggest the use of hash functions to cut down the communication costs when counting subgraphs under edge local differential privacy. While various algorithms exist for computing graph statistics --- including the count of subgraphs --- under the edge local differential privacy, many suffer with high communication costs, making them less efficient for large graphs. Though data compression is a typical approach in differential privacy, its application in local differential privacy requires a form of compression that every node can reproduce. In our study, we introduce linear congruence hashing. Leveraging amplification by sub-sampling, with a sampling size of $s$, our method can cut communication costs by a factor of $s^2$, albeit at the cost of increasing variance in the published graph statistic by a factor of $s$. The experimental results indicate that, when matched for communication costs, our method achieves a reduction in the $\ell_2$-error by up to 1000 times for triangle counts and by up to $10^3$ times for 4-cycles counts compared to the performance of leading algorithms.

URL: https://openreview.net/forum?id=N1J236mepp

---

Title: Leveraging Fully-Observable Solutions for Improved Partially-Observable Offline Reinforcement Learning

Authors: Chulabhaya Wijesundara, Andrea Baisero, Gregory David Castanon, Alan S Carlin, Robert Platt, Christopher Amato

Abstract: Offline reinforcement learning (RL) is a popular learning framework for control problems where online interactions with the environment are expensive, risky, or otherwise impractical.
Existing offline RL methods commonly assume full observability of the state, and therefore there is a lack of offline RL methods that are specialized for the more general case of partially-observable control.
To address this gap, we propose Cross-Observability Conservative Q-Learning (CO-CQL), an offline RL algorithm for partially-observable control that leverages fully-observable expert policies in an asymmetric learning setting.
To motivate the use of fully-observable experts for partially-observable control, we formalize Cross-Observability Optimality Ratio (COOR), a theoretical measure of cross-observability that quantifies the benefit of learning asymmetrically from a fully-observable expert, and Cross-Observability Approximation Ratio (COAR), an estimation of COOR computable from trained policies.
Our empirical evaluation on a wide variety of partially-observable challenges demonstrates that CO-CQL is able to exploit the guidance of fully-observable experts to outperform other state-of-the-art offline algorithms.

URL: https://openreview.net/forum?id=e9p4TDPy6A

---

Title: Learning Equivalence Classes of Bayesian Network Structures with GFlowNet

Authors: Michelle Liu, Zhaocheng Zhu, Olexa Bilaniuk, Emmanuel Bengio

Abstract: Understanding the causal graph underlying a system is essential for enabling causal inference, particularly in fields such as medicine and genetics. Identifying a causal Directed Acyclic Graph (DAG) from observational data alone is challenging because multiple DAGs can encode the same set of conditional independencies. These equivalent DAGs form a Markov Equivalence Class (MEC), which is represented by a Completed Partially Directed Acyclic Graph (CPDAG). Effectively approximating the CPDAG is crucial because it facilitates narrowing down the set of possible causal graphs underlying the data. We introduce CPDAG-GFN, a novel approach that uses a Generative Flow Network (GFlowNet) to learn a posterior distribution over CPDAGs. From this distribution, we sample high-reward CPDAG candidates that approximate the ground truth, with rewards determined by a score function that quantifies how well each graph fits the data. Additionally, CPDAG-GFN incorporates a sparsity-preferring filter to enhance the set of CPDAG candidates and improve their alignment with the ground truth. Experimental results on both simulated and real-world datasets demonstrate that CPDAG-GFN performs competitively with established methods for learning CPDAG candidates from observational data.

URL: https://openreview.net/forum?id=FAcc7oAdaa

---


New submissions
===============


Title: On Foundation Models for Dynamical Systems from Purely Synthetic Data

Abstract: Foundation models have demonstrated remarkable generalization, data efficiency, and robustness properties across various domains.
The success of these models is enabled by large-scale pretraining on Internet-scale datasets. These are available in fields like natural language processing and computer vision, but do not exist for dynamical systems. In this paper, we explore whether these properties can be achieved in the control domain when pretraining is performed entirely on synthetic data. We address this challenge by pretraining a transformer-based foundation model exclusively on synthetic data and propose to sample dynamics functions from a reproducing kernel Hilbert space. Our model, pretrained on purely synthetic data, generalizes to prediction tasks across different dynamical systems, which we validate in simulation and hardware experiments, including cart-pole and Furuta pendulum setups. Additionally, our model can be fine-tuned effectively to new systems increasing its performance even further. Our results demonstrate that even when pretrained solely on appropriately designed synthetic data, it is feasible to obtain foundation models for dynamical systems that outperform specialist models in terms of generalization, data efficiency, and robustness.

URL: https://openreview.net/forum?id=6UfPCYWEvd

---

Title: Rel-HNN: Split Parallel Hypergraph Neural Network for Learning on Relational Databases

Abstract: Relational databases (RDBs) are ubiquitous in enterprise and real-world applications. Flattening the database poses challenges for deep learning models that rely on fixed-size input representations to capture relational semantics from the structured nature of relational data. Graph neural networks (GNNs) have been proposed to address this, but they often oversimplify relational structures by modeling all the tuples as monolithic nodes and ignoring intra-tuple associations. In this work, we propose a novel hypergraph-based framework, that we call rel-HNN, which models each unique attribute-value pair as a node and each tuple as a hyperedge, enabling the capture of fine-grained intra-tuple relationships. Our approach learns explicit multi-level representations across attribute-value, tuple, and table levels. To address the scalability challenges posed by large RDBs, we further introduce a split-parallel training algorithm that leverages multi-GPU execution for efficient hypergraph learning. Extensive experiments on real-world and benchmark datasets demonstrate that rel-HNN significantly outperforms existing methods in both classification and regression tasks. Moreover, our split-parallel training achieves substantial speedups—up to 3.18x for learning on relational data and up to 2.94x for hypergraph learning—compared to conventional single-GPU execution.

URL: https://openreview.net/forum?id=L7VP7gxpVG

---

Title: Probability Consistency in Large Language Models: Theoretical Foundations Meet Empirical Discrepancies

Abstract: Can autoregressive large language models (LLMs) learn consistent probability distributions when trained on sequences in different token orders? We prove formally that for any well-defined probability distribution, sequence perplexity is invariant under any factorization, including forward, backward, or arbitrary permutations. This result establishes a rigorous theoretical foundation for studying how LLMs learn from data and defines principled protocols for empirical evaluation. Applying these protocols, we show that prior studies examining ordering effects suffer from critical methodological flaws. We retrain GPT-2 models across forward, backward, and arbitrary permuted orders on scientific text. We find systematic deviations from theoretical invariance across all orderings with arbitrary permutations strongly deviating from both forward and backward models, which largely (but not completely) agreed with one another. Deviations were traceable to differences in self-attention, reflecting positional and locality biases in processing. Our theoretical and empirical results provide novel avenues for understanding positional biases in LLMs and suggest methods for detecting when LLMs' probability distributions are inconsistent and therefore untrustworthy.

URL: https://openreview.net/forum?id=AV3fX3jB0H

---

Title: I Want to Break Free! Persuasion and Anti-Social Behavior of LLMs in Multi-Agent Settings with Social Hierarchy

Abstract: As LLM-based agents become increasingly autonomous and will more freely interact with each other, studying the interplay among them becomes crucial to anticipate emergent phenomena and potential risks. In this work, we provide an in-depth analysis of the interactions among agents within a simulated hierarchical social environment, drawing inspiration from the Stanford Prison Experiment. Leveraging 2,400 conversations across six LLMs (i.e., \texttt{LLama3}, \texttt{Orca2}, \texttt{Command-r}, \texttt{Mixtral}, \texttt{Mistral2}, and \texttt{gpt4.1}) and 240 experimental scenarios, we analyze persuasion and anti-social behavior between a guard and a prisoner agent with differing objectives. We first document model-specific conversational failures in this multi-agent power dynamic context.
Among models demonstrating successful interaction, we find that goal setting significantly influences persuasiveness but not anti-social behavior. Moreover, agent personas, especially the guard's, substantially impact both successful persuasion by the prisoner and the manifestation of anti-social actions. Notably, we observe the emergence of anti-social conduct even in absence of explicit negative personality prompts. These results have important implications for the development of interactive LLM agents and the ongoing discussion of their societal impact.

URL: https://openreview.net/forum?id=FR76oM8eGD

---

Title: A simple connection from loss flatness to compressed neural representations

Abstract: Sharpness, a geometric measure in the parameter space that reflects the flatness of the loss landscape, has long been studied for its potential connections to neural network behavior. While sharpness is often associated with generalization, recent work highlights inconsistencies in this relationship, leaving its true significance unclear. In this paper, we investigate how sharpness influences the local geometric features of neural representations in feature space, offering a new perspective on its role. We introduce this problem and study three measures for compression: the Local Volumetric Ratio (LVR), based on volume compression, the Maximum Local Sensitivity (MLS), based on sensitivity to input changes, and the Local Dimensionality, based on how uniform the sensitivity is on different directions. We show that LVR and MLS correlate with the flatness of the loss around the local minima; and that this correlation is predicted by a relatively simple mathematical relationship: a flatter loss corresponds to a lower upper bound on the compression metrics of neural representations. Our work builds upon the linear stability insight by Ma and Ying, deriving inequalities between various compression metrics and quantities involving sharpness. Our inequalities readily extend to reparametrization-invariant sharpness as well. Through empirical experiments on various feedforward, convolutional, and transformer architectures, we find that our inequalities predict a consistently positive correlation between local representation compression and sharpness.

URL: https://openreview.net/forum?id=GgpQbU9bFR

---

Title: Continual Learning on CLIP via Incremental Prompt Tuning with Intrinsic Textual Anchors

Abstract: Continual learning (CL) enables deep neural networks to acquire new knowledge over time while mitigating catastrophic forgetting of previously learned information. The powerful generalization ability of pre-trained models (PTMs), such as the Contrastive Language-Image Pre-training (CLIP) model, has inspired a range of CL methods targeting new and specialized tasks, further bridging the gap between PTMs and continual adaptation. Leveraging its multi-modal visual and textual representations, CLIP offers a natural paradigm for CL, where new tasks can be accommodated by incrementally learning lightweight parameters, particularly prompts. However, existing prompt-based CL methods for PTMs often rely on complex designs built upon specific assumptions, such as intricate regularization schemes for prompt pools, specialized routing mechanisms, or multi-stage incrementation processes. While these approaches improve performance, they frequently introduce additional-and possibly unnecessary-complexity, underutilizing CLIP's intrinsic capabilities. In this paper, we propose a concise CL approach for CLIP based on incremental prompt tuning that fully exploits its multi-modal structure and the stability of textual representations. Our method, Textual Prototype-guided Prompt Tuning (TPPT), introduces textual prototypes not merely as static classifiers, as in existing methods, but as stable anchors to guide the learning of visual prompts, thereby shaping the embedding space (i.e., TPPT-V). We show that our bidirectional supervision strategy enables more effective learning of new knowledge while reducing forgetting. To further close the vision-language gap during CL, we activate the language branch and extend our approach to jointly optimize both visual and textual prompts (i.e., TPPT-VT). We also introduce a relational diversity regularization on the textual anchors to prevent embedding space collapse and mitigate correlated forgetting. Extensive experiments and analyses demonstrate the effectiveness of our proposed approach, highlighting the benefits of leveraging CLIP's intrinsic guidance for continual adaptation.

URL: https://openreview.net/forum?id=YJnjkzKq5Y

---

Title: TextRegion: Text-Aligned Region Tokens from Frozen Image-Text Models

Abstract: Image-text models excel at image-level tasks but struggle with detailed visual understanding. While these models provide strong visual-language alignment, segmentation models like SAM2 offer precise spatial boundaries for objects. To this end, we propose TextRegion, a simple, effective, and training-free framework that combines the strengths of image-text models and SAM2 to generate powerful text-aligned region tokens. These tokens enable detailed visual understanding while preserving open-vocabulary capabilities. They can be directly applied to various downstream tasks, including open-world semantic segmentation, referring expression comprehension, and grounding. We conduct extensive evaluations and consistently achieve superior or competitive performance compared to state-of-the-art training-free methods. Additionally, our framework is compatible with many image-text models, making it highly practical and easily extensible as stronger models emerge. We will release the code to support open research and promote reproducibility.

URL: https://openreview.net/forum?id=KZLmkL62M4

---

Title: Rank-1 Approximation of Inverse Fisher for Natural Policy Gradients in Deep Reinforcement Learning

Abstract: Natural gradients have been long studied in deep reinforcement learning due to its fast convergence properties and covariant weight updates.
However, computing natural gradients requires inversion of Fisher Information Matrix (FIM) at each iteration, which is computationally prohibitive in nature.
In this paper, we present an efficient and scalable natural policy optimization technique which leverages a rank-1 approximation to full inverse-FIM.
We theoretically show that under certain conditions, rank-1 approximation to inverse-FIM converges faster than policy gradients and under some condition, enjoys the same sample complexity as stochastic policy gradient methods.
We benchmark our method on a diverse set of environments and show that our methods achieve superior performance than standard trust-region and actor-critic baselines.

URL: https://openreview.net/forum?id=ko8Kn7TS6m

---

Title: Demystifying amortized causal discovery with transformers

Abstract: Supervised learning for causal discovery from observational data often achieves competitive performance despite seemingly avoiding the explicit assumptions that traditional methods require for identifiability. In this work, we analyze CSIvA (Ke et al., 2023) on bivariate causal models, a transformer architecture for amortized inference promising to train on synthetic data and transfer to real ones. First, we bridge the gap with identifiability theory, showing that the training distribution implicitly defines a prior on the causal model of the test observations: consistent with classical approaches, good performance is achieved when we have a good prior on the test data, and the underlying model is identifiable. Second, we find that CSIvA can not generalize to classes of causal models unseen during training: to overcome this limitation, we theoretically and empirically analyze \textit{when} training CSIvA on datasets generated by multiple identifiable causal models with different structural assumptions improves its generalization at test time. Overall, we find that amortized causal discovery still adheres to identifiability theory, violating the previous hypothesis from Lopez-Paz et al. (2015) that supervised learning methods could overcome its restrictions.

URL: https://openreview.net/forum?id=9Lgy7IGSfp

---

Title: Density-Aware Farthest Point Sampling

Abstract: We focus on training machine learning regression models in scenarios where the availability labeled training data is limited due to computational constraints or high labeling costs. Thus, selecting suitable training sets from unlabeled data is essential for balancing performance and efficiency. For the selection of the training data, we focus on passive and model-agnostic sampling methods that only consider the data feature representations. We derive an upper bound for the expected prediction error of Lipschitz continuous regression models that linearly depends on the weighted fill distance of the training set—a quantity we can estimate simply by considering the data features. We introduce "Density-Aware Farthest Point Sampling" (DA-FPS), a novel sampling method. We prove that DA-FPS provides approximate minimizers for a data-driven estimation of the weighted fill distance, thereby aiming at minimizing our derived bound. We conduct experiments using two regression models across three datasets. The results demonstrate that DA-FPS significantly reduces the mean absolute prediction error compared to other sampling strategies.

URL: https://openreview.net/forum?id=vI47lgIfYc

---

Title: Understanding Transformer-based Vision Models through Inversion

Abstract: Understanding the mechanisms underlying deep neural networks remains a fundamental challenge in machine learning and computer vision. One promising, yet only preliminarily explored approach, is feature inversion, which attempts to reconstruct images from intermediate representations using trained inverse neural networks. In this study, we revisit feature inversion, introducing a novel, modular variation that enables significantly more efficient application of the technique. We demonstrate how our method can be systematically applied to the large-scale transformer-based vision models, Detection Transformer and Vision Transformer, and how reconstructed images can be qualitatively interpreted in a meaningful way. We further quantitatively evaluate our method, thereby uncovering underlying mechanisms of representing image features that emerge in the two transformer architectures. Our analysis reveals key insights into how these models encode contextual shape and image details, how their layers correlate, and their robustness against color perturbations. These findings contribute to a deeper understanding of transformer-based vision models and their internal representations.

URL: https://openreview.net/forum?id=qcgWlzKiCl

---

Reply all
Reply to author
Forward
0 new messages