Daily TMLR digest for Nov 15, 2025

TMLR

Nov 15, 2025, 12:30:08 AM
to tmlr-anno...@googlegroups.com

New submissions
===============


Title: BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions

Abstract: Efficiently solving real-world problems with LLMs increasingly hinges on their ability to interact with dynamic web environments and autonomously acquire external information. While recent systems such as Search-R1 and WebDancer demonstrate strong performance on web tasks, they rely heavily on additional tools to convert the interactive web environment into static text content. This contrasts with human browsing behavior, which involves diverse interactions with the browser, such as scrolling, clicking, and typing. In this paper, we propose BrowserAgent, a more interactive agent that solves complex tasks through human-inspired browser actions. BrowserAgent operates directly on raw web pages via Playwright through a set of predefined browser actions. We adopt a two-stage training pipeline, Supervised Fine-Tuning (SFT) followed by Rejection Fine-Tuning (RFT), to improve the model's generalization abilities. Despite using significantly less training data than Search-R1, BrowserAgent achieves more competitive results across different Open-QA tasks. Additionally, we introduce an explicit memory mechanism to store key conclusions across steps, further enhancing the model's reasoning capabilities for long-horizon tasks. Notably, BrowserAgent-7B achieves around a 20% improvement over Search-R1 on multi-hop QA tasks such as HotpotQA, 2Wiki, and Bamboogle. These results indicate that BrowserAgent can serve as a more advanced framework for more interactive and scalable web agents.
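
The abstract does not enumerate the action set, but the idea of operating on raw pages through predefined Playwright actions can be sketched as follows. This is a minimal, hypothetical Python sketch; the action names and the run_action helper are illustrative assumptions, not the authors' API.

    # Hypothetical sketch of a predefined browser action space on raw web
    # pages via Playwright's sync API. Action names are illustrative.
    from playwright.sync_api import sync_playwright

    def run_action(page, action, **kwargs):
        """Dispatch one human-inspired browser action onto a live page."""
        if action == "goto":
            page.goto(kwargs["url"])
        elif action == "click":
            page.click(kwargs["selector"])
        elif action == "type":
            page.fill(kwargs["selector"], kwargs["text"])
        elif action == "scroll":
            page.mouse.wheel(0, kwargs.get("dy", 600))  # positive dy scrolls down
        else:
            raise ValueError(f"unknown action: {action}")
        return page.content()  # raw HTML observation for the agent

    with sync_playwright() as p:
        page = p.chromium.launch(headless=True).new_page()
        html = run_action(page, "goto", url="https://example.com")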

URL: https://openreview.net/forum?id=X4CfZPSEHE

---

Title: PRISM: Diversifying Dataset Distillation by Decoupling Architectural Priors

Abstract: Dataset distillation (DD) promises compact yet faithful synthetic data, but existing approaches often inherit the inductive bias of a single teacher model. As dataset size increases, this bias drives generation toward overly smooth, homogeneous samples, reducing intra-class diversity and limiting generalization. We present PRISM (PRIors from diverse Source Models), a framework that disentangles architectural priors during synthesis. PRISM decouples the logit-matching and regularization objectives, supervising them with different teacher architectures: a primary model for logits and a stochastic subset for batch-normalization (BN) alignment. On ImageNet-1K, PRISM consistently and reproducibly outperforms single-teacher methods (e.g., SRe2L) and recent multi-teacher variants (e.g., G-VBSM) in low- and mid-IPC regimes. The generated data also show significantly richer intra-class diversity, as reflected by a notable drop in cosine similarity between features. We further analyze teacher selection strategies (pre- vs. intra-distillation) and introduce a scalable cross-class batch formation scheme for fast parallel synthesis. Code will be released after the review period.
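
As a rough illustration of the decoupling described above, here is a hedged PyTorch sketch: the synthesis loss takes logits from a primary teacher while batch-normalization statistics are aligned against a randomly drawn subset of teachers. The loss form, hook details, and weights are assumptions, not the paper's exact objective.

    # Sketch of PRISM-style decoupled supervision (assumed loss form).
    import random
    import torch
    import torch.nn.functional as F

    def bn_alignment_loss(teacher, x):
        """Match batch statistics of x's activations to the teacher's
        running BN statistics, collected via forward hooks."""
        losses, handles = [], []

        def hook(module, inputs, output):
            feat = inputs[0]
            mu = feat.mean(dim=(0, 2, 3))
            var = feat.var(dim=(0, 2, 3), unbiased=False)
            losses.append(F.mse_loss(mu, module.running_mean)
                          + F.mse_loss(var, module.running_var))

        for m in teacher.modules():
            if isinstance(m, torch.nn.BatchNorm2d):
                handles.append(m.register_forward_hook(hook))
        teacher.eval()
        teacher(x)
        for h in handles:
            h.remove()
        return torch.stack(losses).sum()

    def prism_step(x_syn, y, primary, teacher_pool, k=2, lam=0.01):
        """x_syn: synthetic images being optimized; y: assigned labels."""
        logit_loss = F.cross_entropy(primary(x_syn), y)   # primary teacher
        bn_loss = sum(bn_alignment_loss(t, x_syn)         # stochastic subset
                      for t in random.sample(teacher_pool, k))
        return logit_loss + lam * bn_loss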

URL: https://openreview.net/forum?id=xN58FtB1Gq

---

Title: Topological Inductive Bias fosters Multiple Instance Learning in Data-Scarce Scenarios

Abstract: Multiple instance learning (MIL) is a framework for weakly supervised classification, where labels are assigned to sets of instances, i.e., bags, rather than to individual data points. This paradigm has proven effective in tasks where fine-grained annotations are unavailable or costly to obtain. However, the effectiveness of MIL drops sharply when training data are scarce, such as in rare disease classification. To address this challenge, we propose incorporating topological inductive biases into the data representation space within the MIL framework. This bias introduces a topology-preserving constraint that encourages the instance encoder to maintain the topological structure of the instance distribution within each bag when mapping instances to the MIL latent space. As a result, our Topology Guided MIL (TG-MIL) method enhances the performance and generalizability of MIL classifiers across different aggregation functions, especially under scarce-data regimes. Our evaluations show average performance improvements of 15.3% on synthetic MIL datasets, 2.8% on MIL benchmarks, and 5.5% on rare anemia datasets over current state-of-the-art MIL models, in settings where only 17–120 samples per class are available. We make our code publicly available at https://anonymous.4open.science/r/TGMIL-59B6.
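
The abstract does not spell out the topology-preserving constraint, but a common proxy for "maintain the structure of instances within a bag" is to align pairwise-distance matrices before and after encoding. A hedged PyTorch sketch with illustrative names (the paper's actual constraint may differ):

    # Hypothetical topology-preservation penalty for a MIL instance encoder.
    import torch
    import torch.nn.functional as F

    def topo_penalty(instances, embeddings):
        """instances: (n, d_in); embeddings: (n, d_latent), one bag."""
        d_in = torch.cdist(instances, instances)
        d_out = torch.cdist(embeddings, embeddings)
        # Normalize scales so the penalty compares shape, not magnitude.
        d_in = d_in / (d_in.max() + 1e-8)
        d_out = d_out / (d_out.max() + 1e-8)
        return F.mse_loss(d_out, d_in)

    # total_loss = bag_loss + beta * topo_penalty(x_bag, encoder(x_bag))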

URL: https://openreview.net/forum?id=1hZy9ZjjCc

---

Title: Toward bi-Lipschitz geometric models

Abstract: Many neural networks for point clouds are, by design, invariant to the symmetries of this datatype: permutations and rigid motions. The purpose of this paper is to examine whether such networks preserve natural symmetry-aware distances on point cloud spaces, through the notion of bi-Lipschitz equivalence. This inquiry is motivated by recent work in the equivariant learning literature which highlights the advantages of bi-Lipschitz models in other scenarios.

We consider two symmetry-aware metrics on point clouds: (a) the Procrustes Matching (PM) metric and (b) the hard Gromov-Wasserstein distances. We show that these two distances are not bi-Lipschitz equivalent, and as a corollary deduce that popular invariant networks for point clouds are not bi-Lipschitz with respect to the PM metric. We then show how these networks can be modified so that they obtain bi-Lipschitz guarantees. Finally, we provide initial experiments showing the advantage of the proposed bi-Lipschitz model over standard invariant models on the task of finding correspondences between 3D point clouds.
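
For reference, the central notion is standard: a map f from (M, d_1) to (N, d_2) is bi-Lipschitz if there exist constants 0 < c <= C with

    c * d_1(X, Y) <= d_2(f(X), f(Y)) <= C * d_1(X, Y)   for all X, Y in M,

so f distorts distances by at most bounded factors in both directions; two metrics on the same space are bi-Lipschitz equivalent when the identity map satisfies this. The first result above says no such constants exist between the PM metric and the hard Gromov-Wasserstein distance.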

URL: https://openreview.net/forum?id=UeLoPZPjBu

---

Title: Diversity Sampling Regularization for Multi-Domain Generalization

Abstract: Domain Generalization (DG) seeks to create models that can successfully generalize to new, unseen target domains without the need for target domain data during training. Traditional approaches often rely on data augmentation or feature mixing techniques, such as MixUp; however, these methods may fall short in capturing the essential diversity within the feature space, resulting in limited robustness against domain shifts. In this research, we revisit the importance of diversity in DG tasks and propose a simple yet effective method to improve DG performance through diversity-sampling regularization. Specifically, we calculate entropy values for input data to assess their prediction uncertainty, and use these values to guide sampling through a Determinantal Point Process (DPP), which prioritizes selecting data subsets with high diversity. By incorporating DPP-based diversity sampling as a regularization strategy, our framework enhances the standard Empirical Risk Minimization (ERM) objective, promoting the learning of domain-agnostic features without relying on explicit data augmentation. We empirically validate the effectiveness of our method on standard DG benchmarks, including PACS, VLCS, OfficeHome, TerraIncognita, and DomainNet, and show through extensive experiments that it consistently improves generalization to unseen domains and outperforms widely used baselines and state-of-the-art methods without relying on any task-specific heuristics.
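
To make the mechanism concrete, here is a hedged Python sketch of entropy-guided DPP selection: entropy scores act as per-item quality, combined with feature similarity into a kernel L = diag(q) S diag(q), from which a diverse subset is picked by greedy MAP inference. The kernel form and selection rule are assumptions; the paper's exact construction may differ.

    # Hypothetical entropy-weighted DPP subset selection (greedy MAP).
    import numpy as np

    def entropy(probs):
        """Prediction entropy of softmax outputs as an uncertainty score."""
        return -(probs * np.log(probs + 1e-12)).sum(axis=1)

    def greedy_dpp(features, quality, k):
        """Greedily add the item that most increases log det of the kernel."""
        feats = features / np.linalg.norm(features, axis=1, keepdims=True)
        L = np.outer(quality, quality) * (feats @ feats.T)
        selected = []
        for _ in range(k):
            best, best_gain = None, -np.inf
            for i in range(len(L)):
                if i in selected:
                    continue
                idx = selected + [i]
                _, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
                if logdet > best_gain:
                    best, best_gain = i, logdet
            selected.append(best)
        return selected

    # subset = greedy_dpp(penultimate_feats, entropy(softmax_outputs), k=32)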

URL: https://openreview.net/forum?id=nXqMt7X2RX

---

Title: Plan2Cleanse: Test-Time Backdoor Defense via Monte-Carlo Planning in Deep Reinforcement Learning

Abstract: Ensuring the security of reinforcement learning (RL) models is critical, particularly when they are trained by third parties and deployed in real-world systems. Attackers can implant backdoors into these models, causing them to behave normally under typical conditions but execute malicious behaviors when specific triggers are activated. In this work, we propose Plan2Cleanse, a test-time detection and mitigation framework that adapts Monte Carlo Tree Search to efficiently identify and neutralize RL backdoor attacks without requiring model retraining. Our approach recasts backdoor detection as a planning problem, enabling systematic exploration of temporally extended trigger sequences while maintaining black-box access to the target policy. By leveraging the detection results, Plan2Cleanse can further achieve efficient mitigation through tree-search preventive replanning. We evaluate our method across competitive MuJoCo environments, simulated O-RAN wireless networks, and Atari games. Plan2Cleanse achieves substantial improvements, increasing trigger detection success rates by over 61.4 percentage points in stealthy O-RAN scenarios and improving win rates from 35% to 53% in competitive Humanoid environments. These results demonstrate the effectiveness of our test-time defense approach and highlight the importance of proactive defenses against backdoor threats in RL deployments.
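
The abstract's key move, recasting detection as planning over a black-box policy, can be illustrated with a deliberately simplified Python sketch: search over short candidate trigger sequences and score each by how far it drives the policy's return below a clean baseline. This flat Monte-Carlo loop is a stand-in for the paper's full MCTS, and all names (the Gymnasium-style env, the scoring rule, the horizon) are assumptions.

    # Simplified stand-in for planning-based trigger detection.
    import random

    def rollout_return(env, policy, prefix_actions, max_steps=200):
        obs, _ = env.reset(seed=0)          # Gymnasium-style API assumed
        total = 0.0
        for t in range(max_steps):
            # Play the candidate trigger prefix, then let the policy act.
            action = prefix_actions[t] if t < len(prefix_actions) else policy(obs)
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            if terminated or truncated:
                break
        return total

    def search_trigger(env, policy, action_space, horizon=5, budget=500):
        baseline = rollout_return(env, policy, [])
        best_seq, best_drop = [], 0.0
        for _ in range(budget):             # flat Monte-Carlo, not full MCTS
            seq = [random.choice(action_space) for _ in range(horizon)]
            drop = baseline - rollout_return(env, policy, seq)
            if drop > best_drop:
                best_seq, best_drop = seq, drop
        return best_seq, best_drop          # most suspicious sequence found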

URL: https://openreview.net/forum?id=ZKhKxqwuPu

---

Title: A Bayesian Approach to Segmentation with Noisy Labels via Spatially Correlated Distributions

Abstract: In semantic segmentation, model accuracy depends heavily on high-quality annotations. However, in many practical scenarios, such as medical imaging and remote sensing, obtaining true annotations is not straightforward and usually requires significant human labor. Relying on human labor often introduces annotation errors, including mislabeling, omissions, and inconsistency between annotators. In the case of remote sensing, differences in acquisition time can lead to misaligned ground-truth annotations. These label errors are not independently distributed; instead, they usually appear in spatially connected regions, where adjacent pixels are likely to share the same errors. To address these issues, we propose an approximate Bayesian estimation method based on a probabilistic model that assumes the training data include label errors and incorporates the tendency of these errors to be spatially correlated across adjacent pixels. However, Bayesian inference for such spatially correlated discrete variables is notoriously intractable. To overcome this fundamental challenge, we introduce a novel class of probabilistic models, which we term the ELBO-Computable Correlated Discrete Distribution (ECCD). By representing the discrete dependencies through a continuous latent Gaussian field with a Kac-Murdock-Szegő (KMS) structured covariance, our framework enables scalable and efficient variational inference for problems previously considered computationally prohibitive. Through experiments on multiple segmentation tasks, we confirm that leveraging the spatial correlation of label errors significantly improves performance. Notably, in specific tasks such as lung segmentation, the proposed method achieves performance comparable to training with clean labels under moderate noise levels. Code is included in the supplementary materials.
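
The KMS structure is what makes the latent field tractable: its covariance decays geometrically with pixel distance, Sigma[i, j] = rho^|i - j|. A small NumPy sketch of drawing a spatially clustered error field this way (the thresholding choice is illustrative, not the paper's inference procedure):

    # Correlated label-error field from a KMS-covariance Gaussian.
    import numpy as np

    def kms_covariance(n, rho):
        """Sigma[i, j] = rho ** |i - j|: correlation decays with distance."""
        idx = np.arange(n)
        return rho ** np.abs(idx[:, None] - idx[None, :])

    rng = np.random.default_rng(0)
    cov = kms_covariance(64, rho=0.9)
    z = rng.multivariate_normal(np.zeros(64), cov)  # latent Gaussian field
    flips = z > 1.0  # thresholding yields spatially clustered errors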

URL: https://openreview.net/forum?id=oMgfr8Kk2x

---

Title: BrightDreamer: Generic 3D Gaussian Generative Framework for Fast Text-to-3D Synthesis

Abstract: Text-to-3D synthesis has recently seen intriguing advances by combining text-to-image priors with 3D representation methods, e.g., 3D Gaussian Splatting (3D GS), via Score Distillation Sampling (SDS). However, a hurdle of existing methods is their low efficiency: per-prompt optimization for a single 3D object. A paradigm shift from per-prompt optimization to feed-forward generation for arbitrary unseen text prompts is therefore imperative, yet it remains challenging. One obstacle is how to directly generate the set of millions of 3D Gaussians needed to represent a 3D object. This paper presents BrightDreamer, an end-to-end feed-forward approach that achieves generalizable and fast (77 ms) text-to-3D generation. Our key idea is to formulate the generation process as estimating a 3D deformation from an anchor shape with predefined positions. To this end, we first propose a Text-guided Shape Deformation (TSD) network to predict the deformed shape and its new positions, which are used as the centers (one attribute) of the 3D Gaussians. To estimate the other four attributes (i.e., scaling, rotation, opacity, and SH), we then design a novel Text-guided Triplane Generator (TTG) to produce a triplane representation of the 3D object. The center of each Gaussian lets us transform its spatial feature into the four attributes. The generated 3D Gaussians can be rendered at 705 frames per second. Extensive experiments demonstrate the superiority of our method over existing methods. BrightDreamer also possesses strong semantic understanding, even for complex text prompts. The project code is available in the supplementary materials.
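
The two networks factorize the problem neatly: TSD outputs the Gaussian centers, and the remaining attributes are read off a triplane at those centers. Here is a hedged PyTorch sketch of the triplane sampling step only; the tensor shapes, plane axes, and downstream heads are assumptions rather than the paper's design.

    # Sketch: sample a triplane at (deformed) Gaussian centers.
    import torch
    import torch.nn.functional as F

    def sample_triplane(planes, xyz):
        """planes: (3, C, H, W); xyz: (N, 3) in [-1, 1]."""
        coords = [xyz[:, [0, 1]], xyz[:, [0, 2]], xyz[:, [1, 2]]]  # XY, XZ, YZ
        feats = []
        for plane, uv in zip(planes, coords):
            grid = uv.view(1, -1, 1, 2)                  # (1, N, 1, 2)
            f = F.grid_sample(plane[None], grid, align_corners=True)
            feats.append(f.view(plane.shape[0], -1).T)   # (N, C)
        return torch.cat(feats, dim=1)                   # (N, 3C)

    anchors = torch.rand(4096, 3) * 2 - 1     # predefined anchor positions
    offsets = torch.zeros_like(anchors)       # would come from the TSD network
    centers = anchors + offsets               # Gaussian centers (attribute 1)
    planes = torch.randn(3, 32, 64, 64)       # would come from the TTG network
    feats = sample_triplane(planes, centers)  # -> heads for scale/rot/opacity/SH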

URL: https://openreview.net/forum?id=Rb19CQCwbi

---

Title: Lifelong Open-Ended Probability Predictors

Abstract: We advance probabilistic multiclass prediction on lifelong streams of items. A (learner) predictor must provide item probabilities, adapting to significant non-stationarity, including new item appearances and frequency changes. The predictor is not given the set of items it needs to predict beforehand; moreover, this set can grow without bound, so the space-limited predictor need only track the currently salient items and their probabilities. We develop Sparse Moving Average techniques (SMAs), including adaptations of sparse EMA as well as novel queue-based methods with dynamic per-item histories. For performance evaluation, we develop a bounded version of log-loss to handle new items. Our findings, on a range of synthetic and real data streams, show that dynamic predictand-specific (per-connection) parameters, such as learning rates, enhance both adaptation speed and stability. Code is provided.
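
A minimal sketch of the sparse-EMA idea, assuming details the abstract leaves open (a pruning threshold and a fixed learning rate): track probabilities only for currently salient items, decay all tracked mass on each observation, and drop tiny entries to stay space-bounded.

    # Minimal sparse EMA over an open-ended item stream.
    class SparseEMA:
        def __init__(self, lr=0.01, prune_below=1e-4):
            self.lr, self.prune_below = lr, prune_below
            self.probs = {}  # item -> estimated probability

        def update(self, item):
            for k in list(self.probs):
                self.probs[k] *= 1 - self.lr          # decay all tracked items
                if self.probs[k] < self.prune_below:
                    del self.probs[k]                 # stay space-bounded
            self.probs[item] = self.probs.get(item, 0.0) + self.lr

        def predict(self, item):
            return self.probs.get(item, 0.0)          # 0 for unseen items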

URL: https://openreview.net/forum?id=rojnGCcMaK

---

Title: Incorporating New Knowledge into Federated Learning: Advances, Insights, and Future Directions

Abstract: Federated Learning (FL) is a distributed learning approach that allows participants to collaboratively train machine learning models without sharing raw data. It is developing rapidly in an era where privacy protection is increasingly valued. This rapid development, along with the continuous emergence of new real-world demands on FL, prompts us to focus on an important problem: how to incorporate new knowledge into federated learning. The primary challenge is to incorporate various kinds of new knowledge into existing FL systems effectively and promptly, evolving these systems to reduce costs, extend their lifespan, and facilitate sustainable development. Meanwhile, established FL systems should preserve existing functionality while new knowledge is incorporated. In this paper, we systematically define the main sources of new knowledge in FL, including new features, tasks, models, and algorithms. For each source, we thoroughly analyze and discuss technical approaches for incorporating the new knowledge into existing FL systems and examine how the form and timing of its arrival affect the incorporation process. Furthermore, we comprehensively discuss potential future directions for incorporating new knowledge into FL, considering a variety of factors, including scenario setups, security and privacy threats, and incentives.

URL: https://openreview.net/forum?id=BWBfK3B3b7

---

Title: Topology-Guided Graph Pre-training and Prompt Learning on Directed Graphs

Abstract: In recent years, graph neural networks (GNNs) have been the dominant approach to graph representation learning, leading to new state-of-the-art results on many classification and prediction tasks. However, they cannot effectively learn expressive node representations without the guidance of labels, and thus suffer from the labeled-data scarcity problem. To address labeling costs and improve robustness in few-shot scenarios, pre-training on self-supervised tasks has garnered significant attention. Additionally, numerous prompting methods have been proposed as effective ways to bridge the gap between pretext tasks and downstream applications. Although graph pre-training and prompt tuning methods have explored various downstream tasks on undirected graphs, directed graphs remain largely under-explored, and these models have limited capacity to capture directional and topological information in directed graphs. In this paper, we propose a novel topology-guided directed graph pre-training and prompt tuning model, named TopoDIG, that can effectively capture intrinsic directional structural and local topological features in directed graphs. These features play essential roles in transferring knowledge from a pre-trained model to downstream tasks. Architecturally, TopoDIG consists of an encoder based on the magnetic Laplacian matrix, a topological encoder, and a graph prompt learning function. Experimental results on both real-world and synthetic directed graphs demonstrate the superior performance of TopoDIG compared to prominent baseline methods.
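
The magnetic Laplacian mentioned above is a standard construction for encoding edge direction: symmetrized weights carry magnitude while a complex phase carries direction. A short NumPy sketch (the charge parameter q and TopoDIG's encoder details are not specified in the abstract):

    # Magnetic Laplacian of a directed graph.
    import numpy as np

    def magnetic_laplacian(A, q=0.25):
        """A: (n, n) directed adjacency matrix; q: charge parameter."""
        A_sym = (A + A.T) / 2.0              # symmetrized edge weights
        theta = 2 * np.pi * q * (A - A.T)    # antisymmetric phase = direction
        H = A_sym * np.exp(1j * theta)       # Hermitian adjacency
        D = np.diag(A_sym.sum(axis=1))
        return D - H                         # Hermitian and PSD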

URL: https://openreview.net/forum?id=kMIdkLTys8

---

Title: Towards A More Transparent Understanding of Weight-Averaged Model Merging

Abstract: Model merging, particularly through weight averaging, has shown surprising effectiveness in saving computation and improving model performance without any additional training. However, why and how this technique works remains unclear. In this work, we reinterpret weight-averaged model merging through the lens of interpretability and provide empirical insights into the underlying mechanisms that govern its behavior. We approach the problem from three perspectives: (1) we analyze the learned weight structures and demonstrate that model weights encode structured representations that help explain the compatibility of weight averaging; (2) we compare averaging in weight space and in feature space across diverse model architectures (CNNs and ViTs) and datasets, aiming to identify the circumstances under which each combination paradigm works more effectively; (3) we study the effect of parameter scaling on prediction stability, highlighting how weight averaging acts as a form of regularization that contributes to robustness. By framing these analyses in an interpretability context, our work contributes to a more transparent and systematic understanding of model merging for stakeholders interested in the safety and reliability of untrained model combination methods. The code is available at https://anonymous.4open.science/r/Rethink-Merge-E9BE.
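
For readers new to the setting, weight-averaged merging itself is simple; here is a minimal PyTorch sketch of uniform averaging over same-architecture models (uniform weights are an assumption; many variants reweight per task or per layer).

    # Uniform weight averaging of same-architecture models.
    import copy
    import torch

    def merge_weights(models):
        sds = [m.state_dict() for m in models]
        merged = {k: torch.stack([sd[k].float() for sd in sds])
                        .mean(dim=0).to(sds[0][k].dtype)
                  for k in sds[0]}
        out = copy.deepcopy(models[0])  # reuse architecture and buffers
        out.load_state_dict(merged)
        return out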

URL: https://openreview.net/forum?id=DF7YplmcYx

---
