New submissions
===============
Title: You Only Train Once: Differentiable Subset Selection for Omics Data
Abstract: Selecting compact and informative gene subsets from single-cell transcriptomic data is essential for biomarker discovery, improving interpretability, and cost-effective profiling. However, most existing feature selection approaches either operate as multi-stage pipelines or rely on post hoc feature attribution, making selection and prediction weakly coupled. In this work, we present YOTO (you only train once), an end-to-end framework that jointly identifies discrete gene subsets and performs prediction within a single differentiable architecture. In our model, the prediction task directly guides which genes are selected, while the learned subsets, in turn, shape the predictive representation. This closed feedback loop enables the model to iteratively refine both what it selects and how it predicts during training. Unlike existing approaches, YOTO enforces sparsity so that only the selected genes contribute to inference, eliminating the need to train additional downstream classifiers. Through a multi-task learning design, the model learns shared representations across related objectives, allowing partially labeled datasets to inform one another and yielding gene subsets that generalize across tasks without additional training steps.
We evaluate YOTO on two representative single-cell RNA-seq datasets, showing that it consistently outperforms state-of-the-art baselines. These results demonstrate that sparse, end-to-end, multi-task gene subset selection improves predictive performance and yields compact and meaningful gene subsets, advancing biomarker discovery and single-cell analysis.
URL: https://openreview.net/forum?id=xQiXlADW5v
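The abstract does not spell out how discrete subset selection is made differentiable; a common relaxation for this kind of end-to-end selection is the Gumbel-softmax (concrete) trick. A minimal NumPy sketch under that assumption, with per-gene logits and independent relaxed draws (all names here are illustrative, not from the paper):

```python
import numpy as np

def gumbel_softmax_selection(logits, temperature, rng):
    """One differentiable 'pick one gene' draw via the Gumbel-softmax trick.

    Returns a soft one-hot vector over genes; as temperature -> 0 it
    approaches a discrete selection while remaining differentiable in logits.
    """
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    y = y - y.max()                     # numerical stability before exp
    expy = np.exp(y)
    return expy / expy.sum()

rng = np.random.default_rng(0)
logits = np.array([2.0, 0.1, -1.0, 0.5])   # learned per-gene scores
# k independent relaxed draws approximate a size-k gene subset
weights = np.stack([gumbel_softmax_selection(logits, 0.5, rng) for _ in range(3)])
subset = weights.max(axis=0)               # soft membership per gene
```

Only genes with non-negligible soft membership would contribute downstream, which is the sparsity property the abstract emphasizes.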
---
Title: SHEP: Spatial Heterogeneity–Driven Experience Prioritization in Scalable Multi-Agent Reinforcement Learning
Abstract: Scalable Multi-Agent Reinforcement Learning (MARL) faces severe challenges regarding the exponential explosion of joint state-action space dimensionality and the difficulty of global coordination as the number of agents increases. Traditional methods optimize fine-grained individual strategies within an exponentially vast state space, leading to low sample efficiency and training bottlenecks in large-scale scenarios. To address these issues, this paper proposes \textbf{SHEP} (Spatial Heterogeneity–Driven Experience Prioritization), a mesoscopic guidance framework designed for large-scale group coordination. SHEP utilizes Occupancy Entropy, Action Diversity Entropy, and Moran's I to construct a set of topological feature descriptors, mapping the high-dimensional individual state space into a low-dimensional, interpretable group feature space. Building on this, we design heterogeneity-driven prioritized experience replay and Group Hindsight Experience Replay (Group-HER). By identifying critical moments of abrupt spatial heterogeneity changes or highly structured clustering, these mechanisms accurately screen for high-value samples and perform ``dimensionality reduction pruning'' on the ineffective exploration space, significantly improving sample efficiency. Due to the universality of its experience screening mechanism, SHEP can be seamlessly integrated as a ``plug-in'' into mainstream centralized training algorithms like MAPPO without altering their underlying policy optimization objectives. In MAgent environments and SMAC benchmarks, SHEP demonstrates superior performance, with convergence speed and final win rates significantly outperforming baseline methods such as QMIX and Mean-Field approaches. These results robustly validate that introducing explicit spatial heterogeneity features to guide experience prioritization is an effective paradigm for resolving the curse of dimensionality in scalable MARL.
URL: https://openreview.net/forum?id=b6WUL2GH1w
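Moran's I, one of the three descriptors named above, is a standard spatial autocorrelation statistic. A minimal sketch of computing it for agent occupancy on a grid, assuming rook (edge-sharing) contiguity weights, which the abstract does not specify:

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I: (N / W) * sum_ij w_ij z_i z_j / sum_i z_i^2, with z centered."""
    z = values - values.mean()
    num = (weights * np.outer(z, z)).sum()
    den = (z ** 2).sum()
    return len(values) / weights.sum() * num / den

def rook_adjacency(rows, cols):
    """0/1 weight matrix for a grid; neighbors share an edge (rook contiguity)."""
    n = rows * cols
    w = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    w[r * cols + c, rr * cols + cc] = 1.0
    return w

w = rook_adjacency(4, 4)
clustered = np.array([[1, 1, 0, 0]] * 2 + [[0, 0, 1, 1]] * 2, float).ravel()
checkerboard = (np.indices((4, 4)).sum(axis=0) % 2).astype(float).ravel()
i_clustered = morans_i(clustered, w)      # positive: structured clustering
i_checker = morans_i(checkerboard, w)     # negative: dispersed pattern
```

A clustered occupancy pattern yields positive I and a checkerboard yields negative I, which is the kind of "highly structured clustering" signal SHEP screens for.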
---
Title: Seeing is Simulating: Differentiable Physics for Interaction-Aware Material Estimation
Abstract: Modeling human-object interactions is crucial for creating immersive virtual experiences. However, synthesizing 3D object dynamics conditioned on actions remains a challenging problem. Existing approaches equip static 3D objects with motion priors distilled from video diffusion models. However, this methodology has two drawbacks: (i) video diffusion models are not physically grounded. Thus, the generated videos may contain physical inaccuracies; (ii) video diffusion models cannot generate complex dynamics where multiple objects interact under actions with long durations and large spatial extent. We present $\textbf{PhysInteract}$, a physics-based framework that (i) models interactions with a representation that captures their duration and contact information; (ii) estimates object material properties (e.g., Young's modulus) from objects' deformation caused by interactions; (iii) uses physics simulation to reproduce realistic object dynamics based on estimated interactions and material properties. We highlight that PhysInteract is fully differentiable, enabling joint optimization of interaction representations and object material properties. PhysInteract achieves better performance than existing methods. We demonstrate its superiority by quantitatively testing PhysInteract on a curated dataset. In conjunction with an additional user study, our method shows a step towards more realistic and immersive virtual experiences.
URL: https://openreview.net/forum?id=lwuaTI4ISa
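PhysInteract estimates material properties by differentiating through a full physics simulator; as a stand-in, the core idea of fitting a material parameter by gradient descent through a differentiable forward model can be sketched with a one-parameter static spring (a toy surrogate, not the paper's simulator):

```python
import numpy as np

# Toy "simulator": static spring, displacement d = F / k for applied force F.
def simulate(k, force):
    return force / k

# Observed deformation from a synthetic interaction with true stiffness 50.
force = np.array([10.0, 20.0, 30.0])
observed = simulate(50.0, force)

# Gradient descent on log-stiffness (keeps k positive), with the analytic
# gradient of the squared error through the simulator -- the
# "differentiable physics" idea in one dimension.
log_k = np.log(5.0)   # deliberately poor initial guess
lr = 0.01
for _ in range(2000):
    k = np.exp(log_k)
    residual = simulate(k, force) - observed          # d(k) - d_obs
    # d/dk (F/k) = -F/k^2; chain rule through k = exp(log_k) multiplies by k
    grad = (2 * residual * (-force / k ** 2)).sum() * k
    log_k -= lr * grad
k_est = float(np.exp(log_k))   # converges toward the true stiffness, 50
```

In the full method the same gradient flow runs through interaction representations and simulated dynamics jointly, rather than a closed-form spring.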
---
Title: LARP: Learner-Agnostic Robust Data Prefiltering
Abstract: Public datasets, crucial for modern machine learning and statistical inference, often contain low-quality or contaminated data that harms model performance. This motivates the development of principled prefiltering procedures that facilitate accurate downstream learning. In this work, we formalize the problem of **L**earner-**A**gnostic **R**obust data **P**refiltering (LARP), which aims at finding prefiltering procedures that minimize a worst-case loss over a pre-specified set of learners. We instantiate this framework in two theoretical settings, providing a hardness result and upper bounds. Our theoretical results indicate that performing LARP on heterogeneous learner sets causes some performance loss compared to individual, learner-specific prefiltering; we term this gap the price of LARP. To assess whether LARP remains worthwhile, we (i) empirically measure the price of LARP across image and tabular tasks and (ii) introduce a game-theoretic cost model that trades off the price of LARP against the cost of learner-specific prefiltering. The model yields sufficient conditions under which LARP is provably beneficial.
URL: https://openreview.net/forum?id=gI6VOV3jfO
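A toy instantiation of the worst-case objective above: choose a trimming threshold that minimizes the maximum estimation error over a fixed learner set (here, mean and median location estimators on contaminated data; the setting and names are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)
clean = rng.normal(0.0, 1.0, size=200)
data = np.concatenate([clean, np.full(20, 25.0)])   # ~10% gross contamination

# Two "learners" estimating the location parameter (true value 0).
learners = {"mean": np.mean, "median": np.median}

def prefilter(x, tau):
    """Learner-agnostic prefiltering: drop points farther than tau from the median."""
    return x[np.abs(x - np.median(x)) <= tau]

def worst_case_loss(tau):
    """Max error over the learner set after prefiltering -- LARP's objective."""
    kept = prefilter(data, tau)
    return max(abs(fit(kept) - 0.0) for fit in learners.values())

taus = np.linspace(0.5, 30.0, 60)
best_tau = min(taus, key=worst_case_loss)   # thresholds that drop the
# contamination give a far smaller worst-case loss than keeping everything
```

The mean is fragile to the contamination while the median barely moves; a single prefiltering rule serving both is exactly the learner-agnostic setting, and its gap to per-learner tuning is the "price of LARP".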
---
Title: FairSpace: Search Space Pruning of AutoML for Fairness-Accuracy Trade-off
Abstract: A major challenge in responsible Machine Learning (ML) engineering is ensuring fairness across multiple protected attributes and their intersections. Existing bias mitigation techniques and Automated Machine Learning (AutoML) systems often fail to address this due to the combinatorial explosion of configurations during hyperparameter optimization (HPO). We propose \textsc{FairSpace}, a fairness-aware framework that jointly performs HPO and dataset-specific feature engineering while strategically pruning the configuration space. \textsc{FairSpace} integrates LLM-assisted feature engineering methods with a bi-objective cost function to balance fairness and accuracy. Experimental results on five widely-used datasets demonstrate that \textsc{FairSpace} achieves win–win outcomes—simultaneously improving fairness and accuracy for 63\% of the cases, outperforming state-of-the-art (SOTA) baselines that achieve up to 60\%. Moreover, \textsc{FairSpace} achieves these results with approximately 25\% less computation time than SOTA AutoML baselines such as FairAutoML, owing to its targeted pruning strategy. By explicitly tackling intersectional fairness, \textsc{FairSpace} reaches 94\% of its outcomes in the \emph{win–win} and \emph{good trade-off} regions, providing a consistent and generalizable foundation for fairness-aware AutoML.
URL: https://openreview.net/forum?id=dkO4IwfwJe
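The bi-objective cost function is not specified in the abstract; a minimal version that trades accuracy against a demographic-parity gap, usable to rank candidate configurations during HPO, might look like the following (the weighting scheme and fairness metric are assumptions):

```python
import numpy as np

def demographic_parity_gap(pred, group):
    """Absolute difference in positive-prediction rates across two groups."""
    return abs(pred[group == 0].mean() - pred[group == 1].mean())

def bi_objective_cost(acc, fairness_gap, alpha=0.5):
    """Lower is better: trade off error (1 - acc) against unfairness."""
    return alpha * (1.0 - acc) + (1.0 - alpha) * fairness_gap

# Rank two hypothetical candidate configurations.
group = np.array([0, 0, 0, 1, 1, 1])
y     = np.array([1, 0, 1, 1, 0, 1])
cfg_a = np.array([1, 0, 1, 0, 0, 0])   # accurate on group 0 only
cfg_b = np.array([1, 0, 1, 1, 0, 1])   # accurate and balanced
costs = {
    name: bi_objective_cost((pred == y).mean(),
                            demographic_parity_gap(pred, group))
    for name, pred in [("a", cfg_a), ("b", cfg_b)]
}
```

A pruning strategy would then discard configurations whose cost is dominated early, avoiding the combinatorial blow-up the abstract describes.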
---
Title: Evaluating LLM Understanding via Structured Tabular Decision Simulations
Abstract: Large language models (LLMs) often achieve impressive predictive accuracy, yet correctness alone does not imply genuine understanding. True LLM understanding, analogous to human expertise, requires making consistent, well-founded decisions across multiple instances and diverse domains, relying on relevant and domain-grounded decision factors. We introduce Structured Tabular Decision Simulations (STaDS), a suite of expert-like decision settings that evaluate LLMs as if they were professionals undertaking structured decision "exams". In this context, understanding is defined as the ability to identify and rely on the correct decision factors, i.e., the features that determine outcomes within a domain. STaDS jointly assesses understanding through: (i) question and instruction comprehension, (ii) knowledge-based prediction, and (iii) reliance on relevant decision factors. By analyzing 9 frontier LLMs across 15 diverse decision settings, we find that (a) most models struggle to achieve consistently strong accuracy across diverse domains; (b) models can be accurate yet globally unfaithful, and there are frequent mismatches between stated rationales and factors driving predictions. Our findings highlight the need for global-level understanding evaluation protocols and advocate for novel frameworks that go beyond accuracy to enhance LLMs' understanding ability.
URL: https://openreview.net/forum?id=R4NninzmGb
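The mismatch between stated rationales and the factors actually driving predictions can be probed with permutation-style reliance scores. A sketch on a stand-in predictor (the paper's protocol for LLMs is richer; everything here is illustrative):

```python
import numpy as np

def permutation_reliance(predict, X, rng, n_repeats=10):
    """How much predictions change when one feature column is shuffled --
    a proxy for how much the model actually relies on that factor."""
    base = predict(X)
    scores = []
    for j in range(X.shape[1]):
        deltas = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])
            deltas.append(np.mean(np.abs(predict(Xp) - base)))
        scores.append(np.mean(deltas))
    return np.array(scores)

# Stand-in "model" that claims to use factor 1 but actually uses factor 0.
predict = lambda X: (X[:, 0] > 0).astype(float)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
reliance = permutation_reliance(predict, X, rng)
stated_factor, driving_factor = 1, int(reliance.argmax())
# stated_factor != driving_factor: accurate yet globally unfaithful
```

Comparing a model's stated factor against the factor with the highest reliance score is one concrete way to operationalize criterion (iii) above.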
---
Title: Domain-Oriented Time Series Inference Agents for Reasoning and Automated Analysis
Abstract: Time series analysis is crucial in real-world applications, yet traditional methods focus on isolated tasks only, and recent studies on time series reasoning remain limited to either single-step inference or natural language answers. In this work, we introduce TS-Reasoner, a domain-specialized agent designed for multi-step time series inference. By integrating large language model (LLM) reasoning with domain-specific computational tools and an error feedback loop, TS-Reasoner enables domain-informed, constraint-aware analytical workflows that combine symbolic reasoning with precise numerical analysis. We assess the system’s capabilities along two axes: 1) fundamental time series understanding, assessed by TimeSeriesExam, and 2) complex, multi-step inference, evaluated by a newly proposed dataset designed to test both compositional reasoning and computational precision in time series analysis. Experiments show that our approach outperforms standalone general-purpose LLMs in both basic time series concept understanding and multi-step time series inference, highlighting the promise of domain-specialized agents for automating real-world time series reasoning and analysis.
URL: https://openreview.net/forum?id=yhy7Vigjcf
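The error feedback loop can be sketched as a retry loop in which a failing tool call's exception text is fed back into the step's context before the next attempt; here the revision step is mocked rather than produced by an LLM, and all names are illustrative:

```python
def run_with_feedback(plan_step, tools, max_attempts=3):
    """Execute one analysis step; on failure, feed the error back and retry."""
    context = dict(plan_step)
    for attempt in range(max_attempts):
        try:
            tool = tools[context["tool"]]
            return tool(context["args"])
        except Exception as exc:
            # In the full agent, the LLM would revise the step given this
            # error text; here we apply a canned revision (a mock).
            context.setdefault("errors", []).append(str(exc))
            context.update(context.get("on_error", {}))
    raise RuntimeError(f"step failed after {max_attempts} attempts: {context['errors']}")

# Toy tool: moving average that rejects windows longer than the series.
def moving_average(args):
    series, w = args["series"], args["window"]
    if w > len(series):
        raise ValueError(f"window {w} exceeds series length {len(series)}")
    return [sum(series[i:i + w]) / w for i in range(len(series) - w + 1)]

step = {
    "tool": "ma",
    "args": {"series": [1.0, 2.0, 3.0, 4.0], "window": 8},   # invalid first try
    "on_error": {"args": {"series": [1.0, 2.0, 3.0, 4.0], "window": 2}},
}
result = run_with_feedback(step, {"ma": moving_average})
# first attempt fails on the invalid window; the corrected step succeeds
```

The same loop structure generalizes to chains of tool calls, which is what makes the workflow multi-step rather than single-shot.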
---
Title: Meta-Learning and Meta-Reinforcement Learning - Tracing the Path towards DeepMind's Adaptive Agent
Abstract: Humans are highly effective at utilizing prior knowledge to adapt to novel tasks, a capability that standard machine learning models struggle to replicate due to their reliance on task-specific training.
Meta-learning overcomes this limitation by allowing models to acquire transferable knowledge from various tasks, enabling rapid adaptation to new challenges with minimal data.
This survey provides a rigorous, task-based formalization of meta-learning and meta-reinforcement learning and uses that paradigm to chronicle the landmark algorithms that paved the way for DeepMind’s Adaptive Agent, consolidating the essential concepts needed to understand the Adaptive Agent and other generalist approaches.
URL: https://openreview.net/forum?id=NZp1UVstvt
---
Title: Characterizing the ability of LLMs to recapitulate Americans’ distributional responses to public opinion polling questions across political issues
Abstract: Traditional survey-based political issue polling is becoming less tractable due to increasing costs and risk of bias associated with growing non-response rates and declining coverage of key demographic groups. With researchers and pollsters seeking alternatives, Large Language Models (LLMs) have drawn attention for their potential to augment human population studies in polling contexts. We propose and implement a new framework for anticipating human responses on multiple-choice political issue polling questions by directly prompting an LLM to predict a distribution of responses. By comparison to a large, high-quality issue poll of the US population, the Cooperative Election Study, we evaluate how the accuracy of this framework varies across a range of demographics and questions on a variety of topics, as well as how this framework compares to previously proposed frameworks where LLMs are repeatedly queried to simulate individual respondents. We find the proposed framework consistently exhibits more accurate predictions than individual querying at significantly lower cost. In addition, we find the performance of the proposed framework varies much more systematically and predictably across demographics and questions, making it possible for those performing AI polling to better anticipate model performance using only information available before a query is issued.
URL: https://openreview.net/forum?id=TR84HetGOH
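Agreement between a predicted response distribution and the reference poll can be scored with, for example, total variation distance; the shares below are made-up illustrations, not results from the paper:

```python
import numpy as np

def total_variation(p, q):
    """TV distance between two discrete response distributions (0 = identical)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return 0.5 * np.abs(p / p.sum() - q / q.sum()).sum()

# Hypothetical shares for a 4-option question (support/lean/oppose/unsure).
reference      = [0.46, 0.12, 0.34, 0.08]   # benchmark poll shares
distributional = [0.50, 0.10, 0.32, 0.08]   # one prompt asking for the full distribution
simulated      = [0.62, 0.05, 0.28, 0.05]   # aggregated individual-persona queries
tv_dist = total_variation(reference, distributional)
tv_sim = total_variation(reference, simulated)
```

A lower TV distance for the distributional prompt than for aggregated individual queries, at the cost of a single query, is the pattern the abstract reports.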
---