Daily TMLR digest for Aug 06, 2025


TMLR

Aug 6, 2025, 12:06:05 AM
to tmlr-anno...@googlegroups.com


New certifications
==================

Featured Certification: Understanding In-Context Learning of Linear Models in Transformers Through an Adversarial Lens

Usman Anwar, Johannes von Oswald, Louis Kirsch, David Krueger, Spencer Frei

https://openreview.net/forum?id=CtMXJxO7SJ

---


Accepted papers
===============


Title: Potential Score Matching: Debiasing Molecular Structure Sampling with Potential Energy Guidance

Authors: Liya Guo, Zun Wang, Chang Liu, Junzhe Li, Pipi Hu, Yi Zhu, Tao Qin

Abstract: The ensemble average of physical properties of molecules is closely related to the distribution of molecular conformations, and sampling such distributions is a fundamental challenge in physics and chemistry. Traditional methods like molecular dynamics (MD) simulations and Markov chain Monte Carlo (MCMC) sampling are commonly used but can be time-consuming and costly. Recently, diffusion models have emerged as efficient alternatives by learning the distribution of training data. However, obtaining an unbiased target distribution is still an expensive task, primarily because it requires satisfying ergodicity. To tackle these challenges, we propose Potential Score Matching (PSM), an approach that utilizes the potential energy gradient to guide generative models. PSM does not require exact energy functions and can debias sample distributions even when trained on limited and biased data. Our method outperforms existing state-of-the-art (SOTA) models on the Lennard-Jones (LJ) potential, a commonly used toy model. Furthermore, we extend the evaluation of PSM to high-dimensional problems using the MD17 and MD22 datasets. The results demonstrate that molecular distributions generated by PSM more closely approximate the Boltzmann distribution compared to traditional diffusion models.
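
For a concrete picture of what "potential energy guidance" can mean in code, here is a minimal, hypothetical sketch: the score network is regressed onto the Boltzmann score -grad U(x)/kT derived from a known potential, rather than onto the empirical data score, so biased training configurations still yield an unbiased target. The toy two-particle Lennard-Jones energy, network size, and plain regression loss are illustrative assumptions, not the paper's actual PSM objective.

```python
# Hypothetical sketch: regress a score network onto Boltzmann scores derived
# from a known potential (a Lennard-Jones-like toy energy), instead of onto
# the empirical data score. Loss form and hyperparameters are illustrative.
import torch
import torch.nn as nn

def lj_energy(x, eps=1.0, sigma=1.0):
    """Lennard-Jones energy of a batch of two-particle 1-D toy systems."""
    r = (x[:, 0] - x[:, 1]).abs().clamp_min(1e-3)
    return 4 * eps * ((sigma / r) ** 12 - (sigma / r) ** 6)

def boltzmann_score(x, kT=1.0):
    """Score of the Boltzmann density: -grad U(x) / kT."""
    x = x.detach().requires_grad_(True)
    energy = lj_energy(x).sum()
    (grad,) = torch.autograd.grad(energy, x)
    return -grad / kT

score_net = nn.Sequential(nn.Linear(2, 64), nn.SiLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(score_net.parameters(), lr=1e-3)

# Biased "training" configurations (e.g., from a short, non-ergodic MD run).
x_train = torch.randn(512, 2) * 0.1 + torch.tensor([0.0, 1.2])

for step in range(200):
    x = x_train[torch.randint(0, len(x_train), (64,))]
    target = boltzmann_score(x)          # potential-energy guidance
    loss = ((score_net(x) - target) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```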

URL: https://openreview.net/forum?id=tTdzbnvTno

---

Title: Understanding In-Context Learning of Linear Models in Transformers Through an Adversarial Lens

Authors: Usman Anwar, Johannes von Oswald, Louis Kirsch, David Krueger, Spencer Frei

Abstract: In this work, we make two contributions towards understanding in-context learning of linear models by transformers. First, we investigate the adversarial robustness of in-context learning in transformers to hijacking attacks, a type of adversarial attack in which the adversary’s goal is to manipulate the prompt to force the transformer to generate a specific output. We show that both linear transformers and transformers with GPT-2 architectures are vulnerable to such hijacking attacks. However, adversarial robustness to such attacks can be significantly improved through adversarial training, done either at the pretraining or finetuning stage, and can generalize to stronger attack models. Our second main contribution is a comparative analysis of adversarial vulnerabilities across transformer models and other algorithms for learning linear models. This reveals two novel findings. First, adversarial attacks transfer poorly between larger transformer models trained from different seeds despite achieving similar in-distribution performance. This suggests that transformers of the same architecture trained according to the same recipe may implement different in-context learning algorithms for the same task. Second, we observe that attacks do not transfer well between classical learning algorithms for linear models (single-step gradient descent and ordinary least squares) and transformers. This suggests that there could be qualitative differences between the in-context learning algorithms that transformers implement and these traditional algorithms.
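
As a concrete illustration of the hijacking threat model described above, the sketch below perturbs a single in-context (x, y) example so the model's prediction on the query is pushed toward an adversary-chosen value. The tiny untrained linear-attention model, dimensions, and optimisation settings are placeholders; only the attack loop reflects the setting in the abstract.

```python
# Illustrative prompt-hijacking attack on in-context linear regression.
# The model is an untrained stand-in; the attack loop is the point.
import torch
import torch.nn as nn

class TinyLinearAttention(nn.Module):
    """Single linear-attention layer over (x, y) tokens; last token is the query."""
    def __init__(self, d):
        super().__init__()
        self.q = nn.Linear(d + 1, d + 1, bias=False)
        self.k = nn.Linear(d + 1, d + 1, bias=False)
        self.v = nn.Linear(d + 1, 1, bias=False)

    def forward(self, tokens):                           # tokens: (n_ctx + 1, d + 1)
        attn = self.q(tokens[-1:]) @ self.k(tokens).T    # (1, n_ctx + 1)
        return (attn @ self.v(tokens)).squeeze()         # scalar prediction

d, n_ctx = 4, 16
torch.manual_seed(0)
model = TinyLinearAttention(d)
w = torch.randn(d)                                   # ground-truth linear task
xs, x_query = torch.randn(n_ctx, d), torch.randn(d)
ys = xs @ w
tokens = torch.cat([torch.cat([xs, ys[:, None]], 1),
                    torch.cat([x_query, torch.zeros(1)])[None]], 0)

y_target = torch.tensor(10.0)                        # adversary's desired output
delta = torch.zeros(d, requires_grad=True)           # perturbation of one example
opt = torch.optim.Adam([delta], lr=0.05)
for _ in range(300):
    hijacked = tokens.clone()
    hijacked[0, :d] = hijacked[0, :d] + delta        # poison the first example
    loss = (model(hijacked) - y_target) ** 2
    opt.zero_grad(); loss.backward(); opt.step()
```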

URL: https://openreview.net/forum?id=CtMXJxO7SJ

---

Title: Set-Based Training for Neural Network Verification

Authors: Lukas Koller, Tobias Ladner, Matthias Althoff

Abstract: Neural networks are vulnerable to adversarial attacks, i.e., small input perturbations can significantly affect the outputs of a neural network. Therefore, to ensure the safety of neural networks in safety-critical environments, the robustness of a neural network must be formally verified against input perturbations, e.g., from noisy sensors. To improve the robustness of neural networks and thus simplify their formal verification, we present a novel set-based training procedure in which we compute the set of possible outputs given the set of possible inputs and, for the first time, compute a gradient set, i.e., a different gradient for each possible output. Therefore, we can directly reduce the size of the output enclosure by choosing gradients toward its center. Small output enclosures increase the robustness of a neural network and, at the same time, simplify its formal verification, because larger propagated sets increase the conservatism of most verification methods. Our extensive evaluation demonstrates that set-based training produces robust neural networks with competitive performance, which can be verified using fast (polynomial-time) verification algorithms due to the reduced output set.
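
The sketch below conveys the flavour of training on output enclosures using simple interval bound propagation: propagate an input box through the network and penalise the width of the output interval alongside the task loss. The paper's actual set representation and gradient-set computation are more sophisticated; this is only an assumed, simplified analogue with arbitrary toy data and weights on the loss terms.

```python
# Interval-bound-propagation flavoured sketch of training on output enclosures.
import torch
import torch.nn as nn

def interval_forward(layers, lo, hi):
    """Propagate elementwise lower/upper bounds through Linear/ReLU layers."""
    for layer in layers:
        if isinstance(layer, nn.Linear):
            w_pos, w_neg = layer.weight.clamp(min=0), layer.weight.clamp(max=0)
            new_lo = lo @ w_pos.T + hi @ w_neg.T + layer.bias
            new_hi = hi @ w_pos.T + lo @ w_neg.T + layer.bias
            lo, hi = new_lo, new_hi
        else:                                     # ReLU is monotone
            lo, hi = lo.clamp(min=0), hi.clamp(min=0)
    return lo, hi

net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
eps = 0.1                                         # input perturbation radius

x = torch.randn(256, 2)
y = (x[:, 0] > x[:, 1]).long()                    # toy binary classification task

for step in range(200):
    logits = net(x)
    lo, hi = interval_forward(list(net), x - eps, x + eps)
    task_loss = nn.functional.cross_entropy(logits, y)
    enclosure_loss = (hi - lo).mean()             # shrink the output enclosure
    loss = task_loss + 0.1 * enclosure_loss
    opt.zero_grad(); loss.backward(); opt.step()
```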

URL: https://openreview.net/forum?id=n0lzHrAWIA

---


New submissions
===============


Title: Networked Communication for Decentralised Agents in Mean-Field Games

Abstract: We introduce networked communication to the mean-field game framework, in particular to oracle-free settings where $N$ decentralised agents learn along a single, non-episodic run of the empirical system. We prove that our architecture has sample guarantees bounded between those of the centralised- and independent-learning cases. We provide the order of the difference in these bounds in terms of network structure and number of communication rounds, and also contribute a policy-update stability guarantee. We discuss how the sample guarantees of the three theoretical algorithms do not actually result in practical convergence times. We thus contribute practical enhancements to all three algorithms allowing us to present their first empirical demonstrations, where we do not need to enforce several of the theoretically required assumptions. We then show that in practical settings where the theoretical hyperparameters are not observed (leading to poor estimation of the Q-function), our communication scheme considerably accelerates learning over the independent case, which hardly seems to learn at all. Indeed networked agents often perform similarly to the centralised case, while removing the restrictive assumption of the latter. We provide ablations and additional studies showing that our networked approach also has advantages over both alternatives in terms of robustness to update failures and to changes in population size.
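
Purely as an illustration of the networked-communication ingredient, and not of the paper's mean-field learning algorithm, the sketch below has decentralised agents gossip-average their local Q-estimates with neighbours on a fixed ring graph each communication round. The graph, agent counts, and random initial estimates are assumptions made for the example.

```python
# Illustrative gossip averaging of per-agent Q-estimates over a ring graph.
import numpy as np

n_agents, n_states, n_actions = 8, 5, 3
rng = np.random.default_rng(0)

# Ring communication graph: each agent talks to its two neighbours.
neighbours = {i: [(i - 1) % n_agents, (i + 1) % n_agents] for i in range(n_agents)}
Q = rng.normal(size=(n_agents, n_states, n_actions))   # noisy local estimates

def communication_round(Q, neighbours):
    """One round of averaging Q-estimates with graph neighbours."""
    new_Q = Q.copy()
    for i, nbrs in neighbours.items():
        new_Q[i] = (Q[i] + Q[nbrs].sum(axis=0)) / (1 + len(nbrs))
    return new_Q

for round_ in range(10):
    # (each agent's local Q-learning update from its own trajectory would go here)
    Q = communication_round(Q, neighbours)

print("spread across agents:", np.abs(Q - Q.mean(axis=0)).max())
```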

URL: https://openreview.net/forum?id=J9WGHU78gb

---

Title: Whisper Smarter, not Harder: Adversarial Attack on Partial Suppression

Abstract: Automatic Speech Recognition (ASR) models are deployed in an extensive range of applications. However, recent studies have demonstrated the possibility of adversarial attacks on these models which could potentially suppress or disrupt model output. We investigate and verify the robustness of these attacks and explore whether it is possible to increase their imperceptibility. We additionally find that by relaxing the optimisation objective from complete suppression to partial suppression, we can further increase the imperceptibility of the attack. We also explore possible defences against these attacks and show that a low-pass filter could potentially serve as an effective defence.
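
A minimal sketch of the kind of low-pass filter defence mentioned above is given below, assuming a 16 kHz waveform and an arbitrary 4 kHz cutoff; the actual filter parameters studied in the paper may differ, and the final ASR call is left as a hypothetical placeholder.

```python
# Minimal low-pass filter defence: attenuate high-frequency content (where
# adversarial perturbations often live) before handing audio to an ASR model.
import numpy as np
from scipy.signal import butter, sosfiltfilt

def low_pass_defence(audio, sample_rate=16_000, cutoff_hz=4_000, order=5):
    """Zero-phase Butterworth low-pass filter applied to a 1-D waveform."""
    sos = butter(order, cutoff_hz, btype="low", fs=sample_rate, output="sos")
    return sosfiltfilt(sos, audio)

# Example: filter one second of (placeholder) audio before transcription.
waveform = np.random.randn(16_000).astype(np.float32)
defended = low_pass_defence(waveform)
# transcription = asr_model.transcribe(defended)   # hypothetical ASR call
```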

URL: https://openreview.net/forum?id=ND0kU1NQWG

---

Title: Rethinking Prompt Optimization: Reinforcement, Diversification, and Migration in Blackbox LLMs

Abstract: An increasing number of NLP applications interact with large language models (LLMs) through black-box APIs, making prompt engineering critical for controlling model outputs. While recent Automatic Prompt Optimization (APO) methods iteratively refine prompts using model-generated feedback, known as "textual gradients", they primarily focus on error correction and neglect valuable insights from correct predictions. This limits both their effectiveness and efficiency. In this paper, we propose a novel APO framework centered on enhancing the feedback mechanism. We reinterpret the textual gradient as a form of negative reinforcement and introduce a complementary positive reinforcement that explicitly preserves beneficial prompt components identified through successful predictions. To mitigate the noise inherent in LLM-generated feedback, we introduce a technique called feedback diversification, which aggregates multiple feedback signals, emphasizing consistent, actionable advice while filtering out outliers. Motivated by the rapid evolution and diversity of available LLMs, we also formalize Continual Prompt Optimization (CPO), addressing the practical challenge of efficiently migrating optimized prompts between different model versions or API providers. Our experiments reveal that naive prompt migration often degrades performance due to the loss of critical instructions. In contrast, our approach consistently outperforms strong baselines, achieving significant accuracy improvements, faster convergence, and lower computational costs in both standard and migration scenarios.
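
The sketch below is a schematic reading of the described feedback loop: sample several critiques ("textual gradients") from a black-box model, keep the advice that recurs across samples (feedback diversification), collect positive reinforcement from solved cases, and ask the model to rewrite the prompt. The `call_llm` callable and the prompt templates are hypothetical stand-ins, not the paper's implementation.

```python
# Schematic prompt-optimization step with diversified positive/negative feedback.
from collections import Counter
from typing import Callable, List

def diversified_feedback(call_llm: Callable[[str], str],
                         critique_prompt: str,
                         n_samples: int = 5) -> str:
    """Sample feedback several times and keep only the most consistent advice lines."""
    lines: List[str] = []
    for _ in range(n_samples):
        lines += [l.strip() for l in call_llm(critique_prompt).splitlines() if l.strip()]
    counts = Counter(lines)
    # Keep advice appearing in a majority of samples; drop outliers.
    return "\n".join(l for l, c in counts.items() if c >= n_samples // 2 + 1)

def optimization_step(call_llm, prompt, failures, successes):
    negative = diversified_feedback(
        call_llm, f"Prompt:\n{prompt}\nFailed cases:\n{failures}\nWhat should change?")
    positive = diversified_feedback(
        call_llm, f"Prompt:\n{prompt}\nSolved cases:\n{successes}\nWhat must be kept?")
    edit_request = (f"Rewrite the prompt.\nKeep: {positive}\nFix: {negative}\n"
                    f"Current prompt:\n{prompt}")
    return call_llm(edit_request)
```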

URL: https://openreview.net/forum?id=1IgBOgImqE

---

Title: Growing Transformers: Modular Composition and Layer-wise Expansion on a Frozen Substrate

Abstract: The prevailing paradigm for scaling large language models (LLMs) involves monolithic, end-to-end training, a resource-intensive process that lacks flexibility. This paper explores an alternative, constructive approach to model development, built upon the foundation of non-trainable, deterministic input embeddings. Building upon the recent finding that high-level semantic reasoning can emerge in Transformers using frozen embeddings derived from the visual structure of Unicode glyphs, we demonstrate that this fixed representational substrate acts as a universal "docking port," enabling two powerful and efficient scaling paradigms: seamless modular composition and progressive layer-wise growth. First, we show that specialist models trained on disparate datasets (e.g., Russian and Chinese text) can be merged into a single, more capable Mixture-of-Experts (MoE) model, post-training, with zero architectural modification. This is achieved by simply averaging their output logits. The resulting MoE model exhibits immediate performance improvements on reasoning benchmarks like MMLU, surpassing its constituent experts without catastrophic forgetting. Second, we introduce a layer-wise constructive training methodology, where a deep Transformer is "grown" by progressively stacking and training one layer at a time. This method demonstrates stable convergence and a clear correlation between model depth and the emergence of complex reasoning abilities, such as those required for SQuADv2. Our findings suggest a paradigm shift from monolithic optimization towards a more biological or constructive model of AI development, where complexity is built incrementally and modules can be composed freely. This opens new avenues for resource-efficient scaling, continual learning, and a more democratized ecosystem for building powerful AI systems. We release all code and models to facilitate further research.
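
The logit-averaging composition described above can be pictured with the short sketch below: two specialist heads share a frozen embedding and are merged post-training by simply averaging their output logits. The tiny untrained modules, `make_specialist` helper, vocabulary size, and dimensions are illustrative assumptions; only the composition mechanism is taken from the abstract.

```python
# Sketch of post-training composition by output-logit averaging over a shared,
# frozen embedding. Models here are untrained stand-ins.
import torch
import torch.nn as nn

vocab, d_model = 1000, 64
frozen_embedding = nn.Embedding(vocab, d_model)
frozen_embedding.weight.requires_grad_(False)        # shared, non-trainable substrate

def make_specialist():
    return nn.Sequential(nn.Linear(d_model, d_model), nn.GELU(),
                         nn.Linear(d_model, vocab))

expert_a, expert_b = make_specialist(), make_specialist()   # e.g. trained on different corpora

class LogitAveragingMoE(nn.Module):
    def __init__(self, embedding, experts):
        super().__init__()
        self.embedding = embedding
        self.experts = nn.ModuleList(experts)

    def forward(self, token_ids):
        h = self.embedding(token_ids)
        logits = torch.stack([e(h) for e in self.experts])
        return logits.mean(dim=0)                    # merge with zero architectural change

moe = LogitAveragingMoE(frozen_embedding, [expert_a, expert_b])
out = moe(torch.randint(0, vocab, (2, 16)))
print(out.shape)                                     # (batch, seq, vocab)
```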

URL: https://openreview.net/forum?id=gSdftmJelp

---

Title: Frame-wise Conditioning Adaptation for Fine-Tuning Diffusion Models in Text-to-Video Prediction

Abstract: Text-to-video prediction (TVP) is a downstream video generation task that requires a model to produce subsequent video frames given a series of initial video frames and text describing the required motion. In practice, TVP methods focus on a particular category of videos depicting manipulations of objects carried out by human beings or robot arms. Previous methods adapt models pre-trained on text-to-image tasks and thus tend to generate video that lacks the required continuity. A natural progression is to leverage more recent pre-trained text-to-video (T2V) models, but this approach is rendered more challenging by the fact that the most common fine-tuning technique, low-rank adaptation (LoRA), yields undesirable results. In this work, we propose an adaptation-based strategy we label Frame-wise Conditioning Adaptation (FCA). Within the module, we devise a sub-module that produces frame-wise text embeddings from the input text, which act as an additional text condition to aid generation. We use FCA to fine-tune the T2V model, which incorporates the initial frame(s) as an extra condition. We compare and discuss the most effective strategy for injecting such embeddings into the T2V model. We conduct extensive ablation studies on our design choices with quantitative and qualitative performance analysis. Our approach establishes a new baseline for the task of TVP.
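
Since the abstract does not spell out the module's interface, the sketch below is only a guess at what a frame-wise conditioning adapter could look like: learned per-frame queries cross-attend to the prompt's token embeddings to produce one text embedding per frame, which would then serve as the extra condition for the T2V model. Dimensions, the attention design, and the `FrameWiseTextConditioner` name are assumptions for illustration.

```python
# Hypothetical frame-wise text conditioning module (not the paper's architecture).
import torch
import torch.nn as nn

class FrameWiseTextConditioner(nn.Module):
    def __init__(self, n_frames: int, d_text: int, n_heads: int = 8):
        super().__init__()
        self.frame_queries = nn.Parameter(torch.randn(n_frames, d_text) * 0.02)
        self.cross_attn = nn.MultiheadAttention(d_text, n_heads, batch_first=True)

    def forward(self, text_tokens):                  # (batch, n_tokens, d_text)
        batch = text_tokens.shape[0]
        queries = self.frame_queries.unsqueeze(0).expand(batch, -1, -1)
        frame_emb, _ = self.cross_attn(queries, text_tokens, text_tokens)
        return frame_emb                             # (batch, n_frames, d_text)

conditioner = FrameWiseTextConditioner(n_frames=16, d_text=512)
text_tokens = torch.randn(2, 77, 512)                # e.g. CLIP-like token embeddings
per_frame_condition = conditioner(text_tokens)       # one text condition per frame
print(per_frame_condition.shape)                     # torch.Size([2, 16, 512])
```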

URL: https://openreview.net/forum?id=HSAjl4LUHK

---

Title: How to Upscale Neural Networks with Scaling Law?

Abstract: Neural scaling laws have revolutionized the design and optimization of large-scale AI models by revealing predictable relationships between model size, dataset volume, and computational resources. Early research established power-law relationships in model performance, leading to compute-optimal scaling strategies. However, recent studies highlighted their limitations across architectures, modalities, and deployment contexts. Sparse models, mixture-of-experts, retrieval-augmented learning, and multimodal models often deviate from traditional scaling patterns. Moreover, scaling behaviors vary across domains such as vision, reinforcement learning, and fine-tuning, underscoring the need for more nuanced approaches. In this survey, we synthesize insights from current studies, examining the theoretical foundations, empirical findings, and practical implications of scaling laws. We also explore key challenges, including data efficiency, inference scaling, and architecture-specific constraints, advocating for adaptive scaling strategies tailored to real-world applications. We suggest that while scaling laws provide a useful guide, they do not always generalize across all architectures and training strategies.
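
As a small worked example of the power-law relationships the survey revisits, the sketch below fits L(N) = a * N^(-alpha) + L_inf to synthetic loss-versus-parameter-count data; the constants and the data are made up purely for illustration and are not drawn from any cited scaling study.

```python
# Fitting a saturating power law to synthetic loss-vs-model-size data.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n_params, a, alpha, l_inf):
    return a * n_params ** (-alpha) + l_inf

# Synthetic "observations": loss shrinks as a power law in model size.
n_params = np.array([1e6, 1e7, 1e8, 1e9, 1e10])
loss = power_law(n_params, a=400.0, alpha=0.34, l_inf=1.7)
loss += np.random.default_rng(0).normal(0, 0.01, size=loss.shape)

(a, alpha, l_inf), _ = curve_fit(power_law, n_params, loss, p0=(100.0, 0.3, 1.0))
print(f"fitted exponent alpha ~ {alpha:.3f}, irreducible loss ~ {l_inf:.2f}")
print("extrapolated loss at 1e11 params:", power_law(1e11, a, alpha, l_inf))
```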

URL: https://openreview.net/forum?id=AL7N0UOfgI

---

Title: Script: Graph-Structured and Query-Conditioned Semantic Token Pruning for Multimodal Large Language Models

Abstract: The rapid growth of visual tokens in multimodal large language models (MLLMs) leads to excessive memory consumption and inference latency, especially when handling high-resolution images and videos. Token pruning is a technique used to mitigate this issue by removing redundancy, but existing methods often ignore relevance to the user query or suffer from the limitations of attention mechanisms, reducing their adaptability and effectiveness. To address these challenges, we propose Script, a plug-and-play pruning method that requires no retraining and generalizes across diverse MLLMs. Script comprises two modules: a graph-structured pruning module that removes visually redundant tokens, and a query-conditioned semantic pruning module that preserves query-relevant visual information. Together, they enhance performance on multimodal tasks. Experiments on fourteen benchmarks across image and video understanding tasks show that Script consistently achieves higher model efficiency and predictive accuracy compared to existing pruning methods. On LLaVA-NeXT-7B, it achieves up to $6.8\times$ prefill speedup and $10\times$ FLOP reduction, while retaining 96.88\% of the original performance. Code will be made publicly available upon acceptance.
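
The sketch below is a toy analogue of the two pruning signals described above (redundancy between visual tokens and relevance to the query), not the paper's graph-structured algorithm; the thresholds, keep ratio, and token dimensions are arbitrary assumptions.

```python
# Toy query-conditioned visual token pruning with a greedy redundancy filter.
import torch
import torch.nn.functional as F

def prune_visual_tokens(vis_tokens, query_emb, redundancy_thresh=0.95, keep_ratio=0.25):
    """vis_tokens: (n_tokens, d); query_emb: (d,). Returns indices of kept tokens."""
    v = F.normalize(vis_tokens, dim=-1)
    kept = []
    for i in range(v.shape[0]):                       # drop near-duplicate tokens
        if all(torch.dot(v[i], v[j]) < redundancy_thresh for j in kept):
            kept.append(i)
    kept = torch.tensor(kept)
    relevance = v[kept] @ F.normalize(query_emb, dim=-1)   # query-conditioned score
    n_keep = max(1, int(keep_ratio * vis_tokens.shape[0]))
    return kept[relevance.topk(min(n_keep, len(kept))).indices]

vis_tokens = torch.randn(576, 1024)                   # e.g. ViT patch tokens
query_emb = torch.randn(1024)                         # pooled question embedding
keep_idx = prune_visual_tokens(vis_tokens, query_emb)
print(f"kept {len(keep_idx)} of {vis_tokens.shape[0]} visual tokens")
```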

URL: https://openreview.net/forum?id=F6xKzbgcHq

---

Title: Inverting Gradient Attacks Makes Powerful Data Poisoning

Abstract: Gradient attacks and data poisoning tamper with the training of machine learning algorithms to maliciously alter them, and the two have been proven to be equivalent in convex settings. The extent of harm these attacks can produce in non-convex settings is still to be determined. Gradient attacks are practical for fewer systems than data poisoning, but have been argued to be more harmful since they can be arbitrary, whereas data poisoning restricts the attacker to injecting data points into training sets, e.g., via legitimate participation in a collaborative dataset. This raises the question of whether the harm caused by gradient attacks can be matched by data poisoning in non-convex settings. In this work, we provide a positive answer and show how data poisoning can mimic gradient attacks to perform an availability attack on (non-convex) neural networks. Through gradient inversion, commonly used to reconstruct data points from actual gradients, we show that reconstructing data points from malicious gradients can be sufficient to perform a range of attacks. This allows us to show, for the first time, a worst-case availability attack on neural networks through data poisoning, degrading the model's performance to random level with only a minority (as low as 1%) of poisoned points.
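
The sketch below illustrates the gradient-inversion ingredient in isolation: optimise a handful of poison points so that the gradient they induce on a model approximates an adversary-chosen target gradient. The tiny linear model and the random target gradient are placeholders; crafting a genuinely harmful target gradient and scaling the attack are the parts the paper addresses.

```python
# Illustrative gradient inversion: fit poison points to a target gradient.
import torch
import torch.nn as nn

model = nn.Linear(10, 1)
loss_fn = nn.MSELoss()

# Adversary's desired (malicious) gradient; here just a random placeholder.
target_grad = [torch.randn_like(p) for p in model.parameters()]

n_poison = 8
x_poison = torch.randn(n_poison, 10, requires_grad=True)
y_poison = torch.randn(n_poison, 1, requires_grad=True)
opt = torch.optim.Adam([x_poison, y_poison], lr=0.05)

for step in range(500):
    inner_loss = loss_fn(model(x_poison), y_poison)
    grads = torch.autograd.grad(inner_loss, list(model.parameters()),
                                create_graph=True)
    # Match the gradient induced by the poison to the malicious target gradient.
    match_loss = sum(((g - t) ** 2).sum() for g, t in zip(grads, target_grad))
    opt.zero_grad(); match_loss.backward(); opt.step()
```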

URL: https://openreview.net/forum?id=Lvy5MjyTh3

---
