Daily TMLR digest for Aug 29, 2025

TMLR

Aug 29, 2025, 12:06:07 AM
to tmlr-anno...@googlegroups.com

Accepted papers
===============


Title: Cardinality Sparsity: Applications in Matrix-Matrix Multiplications and Machine Learning

Authors: Ali Mohaddes, Johannes Lederer

Abstract: High-dimensional data has become ubiquitous across the sciences but presents computational and statistical challenges. A common approach to addressing these challenges is through sparsity. In this paper, we introduce a new concept of sparsity, called cardinality sparsity. Broadly speaking, we define a tensor as sparse if it contains only a small number of unique values. We demonstrate that cardinality sparsity can improve deep learning and tensor regression both statistically and computationally. Along the way, we generalize recent statistical theories in these fields. Most importantly, we show that cardinality sparsity has a strikingly powerful application beyond high-dimensional data analysis: it can significantly speed up matrix-matrix multiplications. For instance, we demonstrate that cardinality sparsity leads to algorithms for binary-matrix multiplication that outperform state-of-the-art algorithms by a substantial margin. Another crucial aspect of this sparsity is that it minimizes memory usage: by executing matrix multiplication in the compressed domain, we can significantly lower the amount of memory needed to store the input data.

URL: https://openreview.net/forum?id=zoSRSpGu9C
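
As a toy illustration of the compressed-domain idea (not the paper's actual algorithm), a matrix with only a few unique values can be decomposed into one binary mask per value, reducing the product to a handful of binary-matrix multiplications:

import numpy as np

def cardinality_sparse_matmul(A, B):
    """Multiply A @ B by decomposing A over its few unique values.

    A is assumed cardinality-sparse: it contains only k << m*n unique
    values. Writing A = sum_v v * M_v with binary masks M_v reduces the
    product to k binary matmuls. This is a toy sketch of the
    compressed-domain idea, not the paper's algorithm.
    """
    out = np.zeros((A.shape[0], B.shape[1]))
    for v in np.unique(A):
        if v == 0:
            continue  # zero entries contribute nothing
        mask = (A == v).astype(B.dtype)  # binary matrix M_v
        out += v * (mask @ B)            # one binary matmul per unique value
    return out

A = np.random.choice([0.0, 1.0, 2.5], size=(64, 64))  # only 3 unique values
B = np.random.randn(64, 64)
assert np.allclose(cardinality_sparse_matmul(A, B), A @ B)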

---

Title: On Time Series Clustering with Graph Neural Networks

Authors: Jonas Berg Hansen, Andrea Cini, Filippo Maria Bianchi

Abstract: Graph clustering and pooling operators have been adopted in graph-based architectures to capture meaningful patterns in time series data by leveraging both temporal and relational structures. However, the contribution of each design choice and the behavior of different operators remain underexplored. This work introduces a streamlined deep learning framework based on a spatio-temporal graph neural network (STGNN) for clustering time series, which can leverage prior knowledge on the spatial structure of the data. The STGNN-based model flexibly identifies clusters in various data settings through an encoder-decoder architecture with a bottleneck, showing that a spatio-temporal approach can identify meaningful clusters even in datasets that do not explicitly include spatial relations. We validate the framework’s qualitative performance through experiments on synthetic and real-world data, showing its effectiveness in different scenarios. We also provide a heuristic for model selection in unsupervised settings via a self-supervised forecasting loss. Code is available at https://github.com/NGMLGroup/Time-Series-Clustering-with-GNNs

URL: https://openreview.net/forum?id=MHQXfiXsr3
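
A minimal PyTorch sketch of the encoder-bottleneck-decoder design the abstract describes; the GRU encoder, the single message-passing step, and all layer sizes are our own assumptions rather than the paper's architecture:

import torch
import torch.nn as nn

class STGNNClusterer(nn.Module):
    """Sketch: temporal encoding per node, one graph-message pass over a
    prior adjacency, a soft cluster-assignment bottleneck, and a
    forecasting head for the self-supervised selection loss."""

    def __init__(self, in_dim, hidden, n_clusters):
        super().__init__()
        self.temporal = nn.GRU(in_dim, hidden, batch_first=True)
        self.spatial = nn.Linear(hidden, hidden)     # mixes A @ h
        self.assign = nn.Linear(hidden, n_clusters)  # soft cluster bottleneck
        self.decode = nn.Linear(n_clusters, hidden)
        self.forecast = nn.Linear(hidden, in_dim)    # one-step-ahead head

    def forward(self, x, adj):
        # x: (nodes, time, features), adj: (nodes, nodes) prior graph
        h, _ = self.temporal(x)                 # per-node temporal encoding
        h = h[:, -1]                            # last hidden state
        h = torch.relu(self.spatial(adj @ h))   # one graph-message pass
        s = torch.softmax(self.assign(h), dim=-1)  # soft assignments
        z = torch.relu(self.decode(s))          # decode through bottleneck
        return s, self.forecast(z)

Per the abstract's heuristic, the forecasting head's self-supervised loss can be used to rank candidate models when no cluster labels are available.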

---

Title: Variance Reduced Smoothed Functional REINFORCE Policy Gradient Algorithms

Authors: Shalabh Bhatnagar, Deepak H R

Abstract: We revisit the REINFORCE policy gradient algorithm from the literature, which works with reward (or cost) returns obtained over episodes or trajectories. We propose a major enhancement to the basic algorithm where we estimate the policy gradient using a smoothed functional (random perturbation) gradient estimator obtained from direct function measurements. To handle the issue of high variance that is typical of REINFORCE, we propose two independent enhancements to the basic scheme: (i) use the sign of the increment instead of the original (full) increment, which results in smoother convergence, and (ii) use clipped gradient estimates as proposed in the Proximal Policy Optimization (PPO) based scheme. We prove the asymptotic convergence of all algorithms and show the results of several experiments on various MuJoCo locomotion tasks wherein we compare the performance of our algorithms with the recently proposed ARS algorithms as well as other well-known algorithms, namely A2C, PPO, and TRPO. Our algorithms are seen to be competitive against all algorithms and in fact show the best results on a majority of experiments.

URL: https://openreview.net/forum?id=yagxqSJbiY
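
A hedged sketch of one smoothed-functional update with the sign-based variance reduction (enhancement (i) above); the two-sided estimator, the step sizes, and the episode_return rollout callable are illustrative assumptions:

import numpy as np

def sf_reinforce_step(theta, episode_return, rng, delta=0.1, lr=0.01,
                      use_sign=True):
    """One smoothed-functional (SF) gradient step: perturb the policy
    parameters with Gaussian noise, estimate the gradient from two
    episodic-return measurements, and optionally keep only the sign of
    the increment. episode_return is a hypothetical callable that rolls
    out the policy at the given parameters and returns its return.
    """
    eta = rng.standard_normal(theta.shape)          # random perturbation
    j_plus = episode_return(theta + delta * eta)    # return at theta + d*eta
    j_minus = episode_return(theta - delta * eta)   # return at theta - d*eta
    g_hat = (j_plus - j_minus) / (2 * delta) * eta  # SF gradient estimate
    step = np.sign(g_hat) if use_sign else g_hat    # variance reduction (i)
    return theta + lr * step                        # ascend the return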

---

Title: Streaming Heteroscedastic Probabilistic PCA with Missing Data

Authors: Kyle Gilman, David Hong, Jeffrey A Fessler, Laura Balzano

Abstract: Streaming principal component analysis (PCA) is an integral tool in large-scale machine learning for rapidly estimating low-dimensional subspaces from very high-dimensional data arriving at a high rate. However, modern datasets increasingly combine data from a variety of sources, and thus may exhibit heterogeneous quality across samples. Standard streaming PCA algorithms do not account for non-uniform noise, so their subspace estimates can quickly degrade. While the recently proposed Heteroscedastic Probabilistic PCA Technique (HePPCAT) addresses this heterogeneity, it was not designed to handle streaming data that may exhibit non-stationary behavior. Moreover, HePPCAT does not allow for missing entries in the data, which can be common in streaming data. This paper proposes the Streaming HeteroscedASTic Algorithm for PCA (SHASTA-PCA) to bridge this divide. SHASTA-PCA employs a stochastic alternating expectation-maximization approach that jointly learns the low-rank latent factors and the unknown noise variances from streaming data that may have missing entries and heteroscedastic noise, all while maintaining a low memory and computational footprint. Numerical experiments demonstrate the superior subspace estimation of our method compared to state-of-the-art streaming PCA algorithms in the heteroscedastic setting. Finally, we illustrate SHASTA-PCA applied to highly heterogeneous real data from astronomy.

URL: https://openreview.net/forum?id=lb2rPLuP9X
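
To make the setting concrete, here is a small numpy sketch of the data model the abstract targets: samples from groups with very different noise variances, plus missing entries. The group structure, variance values, and masking rate are invented for illustration:

import numpy as np

rng = np.random.default_rng(0)
d, k, n = 100, 5, 2000
F = rng.standard_normal((d, k))            # ground-truth factor matrix

# Two data groups with very different noise levels (heteroscedastic),
# e.g. samples coming from two instruments of unequal quality.
variances = np.array([0.1, 10.0])
group = rng.integers(0, 2, size=n)
Z = rng.standard_normal((n, k))
Y = Z @ F.T + rng.standard_normal((n, d)) * np.sqrt(variances[group])[:, None]

# Missing entries, as is common in streaming data: hide ~20% of values.
mask = rng.random((n, d)) < 0.8
Y_obs = np.where(mask, Y, np.nan)

Standard streaming PCA weights all rows equally, so the noisy group dominates the subspace estimate; SHASTA-PCA instead jointly learns the per-group variances and downweights the noisy samples accordingly.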

---

Title: Fast and Cost-effective Speculative Edge-Cloud Decoding with Early Exits

Authors: Yeshwanth Venkatesha, Souvik Kundu, Priyadarshini Panda

Abstract: Large Language Models (LLMs) enable various applications on edge devices such as smartphones, wearables, and embodied robots. However, their deployment often depends on expensive cloud-based APIs, creating high operational costs, which limit access for smaller organizations and raise sustainability concerns. Certain LLMs can be deployed on-device, offering a cost-effective solution with reduced latency and improved privacy. Yet, limited computing resources constrain the size and accuracy of models that can be deployed, necessitating a collaborative design between edge and cloud. We propose a fast and cost-effective speculative edge-cloud decoding framework with a large target model on the server and a small draft model on the device. By introducing early exits in the target model, tokens are generated mid-verification, allowing the client to preemptively draft subsequent tokens before final verification, thus utilizing idle time and enhancing parallelism between edge and cloud. Using an NVIDIA Jetson Nano (client) and an A100 GPU (server) with Vicuna-68M (draft) and Llama2-7B (target) models, our method achieves up to a 35% reduction in latency compared to cloud-based autoregressive decoding, with an additional 11% improvement from preemptive drafting. To demonstrate real-world applicability, we deploy our method on the Unitree Go2 quadruped robot using Vision-Language Model (VLM) based control, achieving a 21% speedup over traditional cloud-based autoregressive decoding. These results demonstrate the potential of our framework for real-time LLM and VLM applications on resource-constrained edge devices.

URL: https://openreview.net/forum?id=PTIUjARnbc
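
For orientation, a greedy-verification speculative decoding loop is sketched below; it omits the paper's early exits and preemptive drafting (which overlap device and server work to hide latency), and both model callables are stand-ins:

from typing import Callable, List

def speculative_decode(draft_next: Callable[[List[int]], int],
                       target_next: Callable[[List[int]], int],
                       prompt: List[int], gamma: int = 4,
                       max_tokens: int = 32) -> List[int]:
    """Greedy speculative decoding: the small on-device draft model
    proposes gamma tokens, the large server-side target model verifies
    them and accepts the longest matching prefix.
    """
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_tokens:
        proposal, ctx = [], list(tokens)
        for _ in range(gamma):              # device: draft gamma tokens
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        for t in proposal:                  # server: verify the proposal
            expected = target_next(tokens)  # target's own next token
            if expected == t:
                tokens.append(t)            # draft token accepted
            else:
                tokens.append(expected)     # fall back to target's token
                break
    return tokens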

---

Title: TFAR: A Training-Free Framework for Autonomous Reliable Reasoning in Visual Question Answering

Authors: Zhuo Zhi, Chen Feng, Adam Daneshmend, Mine Orlu, Andreas Demosthenous, Lu Yin, Da Li, Ziquan Liu, Miguel R. D. Rodrigues

Abstract: Recent approaches introduce chain-of-thought (CoT) reasoning to mitigate challenges such as hallucination and reasoning deficits in multimodal large language models (MLLMs) and to enhance performance. However, existing CoT-based methods often rely on extensive data annotation and training. To overcome these limitations, we propose a training-free framework for autonomous and reliable reasoning (TFAR), which uses only common lightweight vision tools to improve the reasoning ability of MLLMs. TFAR enables an MLLM to autonomously and accurately identify relevant regions of interest (RoIs) and support CoT reasoning, without requiring additional training or annotations, and with low computational overhead during inference. However, external tools introduce noise and uncertainty of their own. To mitigate this uncertainty and select the optimal pathway, we propose a conformal prediction-based uncertainty quantification method that calibrates the outputs from external tools and dynamically selects the most appropriate tool based on the MLLM’s output uncertainty. Experiments across five datasets demonstrate that TFAR improves performance over the base MLLM by an average of 4.6%, in some cases even outperforming fine-tuned baselines, while maintaining low inference cost. These results offer new insights into training-free CoT guidance for MLLMs and underscore the value of reliable visual tools.

URL: https://openreview.net/forum?id=cBAKeZN3jy
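
A generic split-conformal sketch of the calibrate-then-select step the abstract describes; the tool names, score distributions, and margin-based tie-break below are our assumptions, not TFAR's exact procedure:

import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split-conformal quantile: with probability >= 1 - alpha, a fresh
    nonconformity score (higher = worse) from the same distribution
    falls below this threshold, computed on a held-out calibration set."""
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(q, 1.0), method="higher")

# Hypothetical calibration data: nonconformity scores for two vision
# tools (e.g. 1 - detector confidence on examples the tool got right).
rng = np.random.default_rng(0)
tool_scores = {"detector": rng.beta(2, 8, 500), "ocr": rng.beta(4, 6, 500)}
thresholds = {k: conformal_threshold(v) for k, v in tool_scores.items()}

def pick_tool(test_scores):
    """Keep only tools whose current score passes their calibrated
    threshold; among those, pick the one with the largest margin."""
    ok = {k: thresholds[k] - s for k, s in test_scores.items()
          if s <= thresholds[k]}
    return max(ok, key=ok.get) if ok else None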

---

Title: nnActive: A Framework for Evaluation of Active Learning in 3D Biomedical Segmentation

Authors: Carsten T. Lüth, Jeremias Traub, Kim-Celine Kahl, Till J. Bungert, Lukas Klein, Lars Krämer, Paul F Jaeger, Fabian Isensee, Klaus Maier-Hein

Abstract: Semantic segmentation is crucial for various biomedical applications, yet its reliance on large annotated datasets presents a significant bottleneck due to the high cost and specialized expertise required for manual labeling. Active Learning (AL) aims to mitigate this challenge by selectively querying the most informative samples, thereby reducing annotation effort.
However, in the domain of 3D biomedical imaging, there remains no consensus on whether AL consistently outperforms Random sampling strategies. Current methodological assessment is hindered by the widespread occurrence of four pitfalls in AL method evaluation: (1) restriction to too few datasets and annotation budgets, (2) training 2D models on 3D images without incorporating partial annotations, (3) a Random baseline that is not adapted to the task, and (4) measuring annotation cost only in voxels.
In this work, we introduce nnActive, an open-source AL framework that systematically overcomes these pitfalls by (1) conducting a large-scale study evaluating 8 query methods on four biomedical imaging datasets and three label regimes, accompanied by four large-scale ablation studies, (2) extending the state-of-the-art 3D medical segmentation method nnU-Net to use partial annotations for training with 3D patch-based query selection, (3) proposing Foreground Aware Random sampling strategies that tackle the foreground-background class imbalance commonly encountered in 3D medical images, and (4) proposing the foreground efficiency metric, which captures that annotating background regions costs far less than annotating foreground regions. We reveal the following key findings: (A) while all AL methods outperform standard Random sampling, none reliably surpasses an improved Foreground Aware Random sampling; (B) the benefits of AL depend on task-specific parameters such as the number of classes and their locations; (C) Predictive Entropy is overall the best-performing AL method, but likely requires the most annotation effort; (D) AL performance can be improved with more compute-intensive design choices such as longer training and smaller query sizes. As a holistic, open-source framework, nnActive has the potential to act as a catalyst for research and application of AL in 3D biomedical imaging. Code is at: https://github.com/MIC-DKFZ/nnActive

URL: https://openreview.net/forum?id=AJAnmRLJjJ
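
A sketch of what a Foreground Aware Random baseline might look like for 3D patch queries; the 90/10 budget split and the per-patch foreground-fraction input are assumptions for illustration, not nnActive's implementation:

import numpy as np

def foreground_aware_random_query(patch_fg_fraction, n_query,
                                  fg_bias=0.9, rng=None):
    """Spend most of the query budget on patches containing any
    foreground and the rest uniformly at random. patch_fg_fraction is
    an array of per-patch foreground fractions (e.g. from a cheap
    heuristic or a previous model's predictions).
    """
    rng = rng or np.random.default_rng()
    idx = np.arange(len(patch_fg_fraction))
    fg = idx[np.asarray(patch_fg_fraction) > 0]
    n_fg = min(int(fg_bias * n_query), len(fg))
    chosen = set(rng.choice(fg, size=n_fg, replace=False).tolist())
    rest = np.setdiff1d(idx, sorted(chosen))
    chosen |= set(rng.choice(rest, size=n_query - n_fg,
                             replace=False).tolist())
    return sorted(chosen)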

---

Title: On the Convergence Rates of Federated Q-Learning across Heterogeneous Environments

Authors: Muxing Wang, Pengkun Yang, Lili Su

Abstract: Large-scale multi-agent systems are often deployed across wide geographic areas, where agents interact with heterogeneous environments. There is an emerging interest in understanding the role of heterogeneity in the performance of the federated versions of classic reinforcement learning algorithms. In this paper, we study synchronous federated Q-learning, which aims to learn an optimal Q-function by having $K$ agents average their local Q-estimates every $E$ iterations. We provide a fine-grained characterization of the error evolution, which decays to zero as the number of iterations $T$ increases. When $K(E-1)$ is below a certain threshold, similar to the homogeneous environment settings, there is a linear speed-up in $K$. The slow convergence when $E>1$ turns out to be fundamental rather than an artifact of our analysis. We prove that, for a wide range of stepsizes, the $\ell_{\infty}$ norm of the error cannot decay faster than $\Theta_R (\frac{E}{(1-\gamma)T})$, where $\Theta_R$ hides only numerical constants and the specific choice of reward values. In addition, our experiments demonstrate that the convergence exhibits an interesting two-phase phenomenon. For any given stepsize, there is a sharp phase transition in the convergence: the error decays rapidly in the beginning yet later bounces up and stabilizes.

URL: https://openreview.net/forum?id=EkLAG3gt3g
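
The paper's setting is easy to state in code: $K$ agents run synchronous Q-learning on their own environments and average their Q-tables every $E$ iterations. A toy sketch, assuming tabular Q-functions and full transition/reward models consistent with the synchronous setting:

import numpy as np

def federated_q_learning(envs, n_states, n_actions, T, E, alpha=0.1,
                         gamma=0.9):
    """Toy synchronous federated Q-learning. envs is a list of (P, R)
    pairs per agent: transition tensor P of shape (S, A, S) and reward
    table R of shape (S, A), one heterogeneous MDP per agent.
    """
    K = len(envs)
    Q = [np.zeros((n_states, n_actions)) for _ in range(K)]
    for t in range(1, T + 1):
        for k, (P, R) in enumerate(envs):
            # Synchronous update: back up every (s, a) pair at once.
            target = R + gamma * P @ Q[k].max(axis=1)   # shape (S, A)
            Q[k] = (1 - alpha) * Q[k] + alpha * target
        if t % E == 0:                                  # periodic averaging
            avg = sum(Q) / K
            Q = [avg.copy() for _ in range(K)]
    return sum(Q) / K

With E = 1 this reduces to averaging every iteration; increasing E models less frequent communication, whose fundamental convergence cost is what the paper quantifies.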

---


New submissions
===============


Title: Self-Improving LLMs with Synthetic Data Through Dynamic Noise Preference Optimization

Abstract: Although LLMs have achieved significant success, their reliance on large volumes of human-annotated data has limited their potential for further scaling. In this situation, utilizing self-generated synthetic data has become crucial for fine-tuning LLMs without extensive human annotation. However, current methods often fail to ensure consistent improvements across iterations, with performance stagnating after only minimal updates. To overcome these challenges, we introduce Dynamic Noise Preference Optimization (DNPO), which combines dynamic sample labeling for constructing preference pairs with controlled, trainable noise injection during preference optimization. Our approach effectively prevents stagnation and enables continuous improvement. In experiments with Llama-3.2-3B and Zephyr-7B, DNPO consistently outperforms existing methods across multiple benchmarks. Additionally, with Zephyr-7B, DNPO shows a significant improvement in model-generated data quality, with a 29.4% win-loss rate gap compared to the baseline in GPT-4 evaluations.

URL: https://openreview.net/forum?id=qDexGLXpef
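
A hedged sketch of the "controlled, trainable noise injection" idea on top of a standard DPO-style loss; where DNPO actually injects noise is not specified in the abstract, so the margin-level perturbation below is our own guess:

import torch
import torch.nn.functional as F

def dnpo_style_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected,
                    noise_scale, beta=0.1):
    """DPO-style preference loss with a trainable noise term. The
    placement of the noise (on the preference margin) is an assumption,
    not DNPO's published formulation.
    """
    margin = beta * ((logp_chosen - ref_chosen)
                     - (logp_rejected - ref_rejected))
    noise = noise_scale * torch.randn_like(margin)  # trainable magnitude
    return -F.logsigmoid(margin + noise).mean()

Here noise_scale would be a learnable parameter, e.g. torch.nn.Parameter(torch.tensor(0.1)), updated jointly with the policy.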

---

Title: Real-Time Deepfake Detection in the Real World

Abstract: Recent improvements in generative AI have made synthesizing fake images easy; as these images can be used to cause harm, it is crucial to develop accurate techniques to identify them. This paper introduces the "Locally Aware Deepfake Detection Algorithm" (LaDeDa), which accepts a single 9x9 image patch and outputs its deepfake score. The image deepfake score is the pooled score of its patches. With merely patch-level information, LaDeDa significantly improves over the state-of-the-art, achieving around 99% mAP on current benchmarks. Owing to the patch-level structure of LaDeDa, we hypothesize that the generation artifacts can be detected by a simple model. We therefore distill LaDeDa into Tiny-LaDeDa, a highly efficient model consisting of only 4 convolutional layers. Remarkably, Tiny-LaDeDa has 375x fewer FLOPs and is 10,000x more parameter-efficient than LaDeDa, allowing it to run efficiently on edge devices with a minor decrease in accuracy. These almost-perfect scores raise the question: is the task of deepfake detection close to being solved? Perhaps surprisingly, our investigation reveals that current training protocols prevent methods from generalizing to real-world deepfakes extracted from social media. To address this issue, we introduce WildRF, a new deepfake detection dataset curated from several popular social networks. Our method achieves the top performance of 93.7% mAP on WildRF; however, the large gap from perfect accuracy shows that reliable real-world deepfake detection is still unsolved.

URL: https://openreview.net/forum?id=ibmmuUJTCx
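
The patch-level design is straightforward to sketch: a small fully convolutional net whose receptive field is exactly 9x9 produces a per-patch logit map, which is then pooled into one image score. Only the 9x9 patch size and the 4-layer depth come from the abstract; channel widths and mean pooling are assumptions:

import torch
import torch.nn as nn

class PatchPoolDetector(nn.Module):
    """Sketch of the patch-score-then-pool structure the abstract
    describes. Four stacked 3x3 convolutions give each output logit a
    9x9 receptive field, so every score depends on one local patch."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3), nn.ReLU(),
            nn.Conv2d(16, 32, 3), nn.ReLU(),
            nn.Conv2d(32, 32, 3), nn.ReLU(),
            nn.Conv2d(32, 1, 3),              # per-patch logit map
        )

    def forward(self, x):
        patch_scores = self.net(x)            # (B, 1, H', W')
        return patch_scores.mean(dim=(2, 3))  # pooled image-level score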

---

Title: The Geometry of Stability: A Cohomological View on Preference Cycles and Algorithmic Robustness

Abstract: Algorithmic stability—the robustness of predictions to training data perturbations—is fundamental to reliable machine learning. While methods like bagging, regularization, and inflated operators improve stability, they appear as disconnected techniques. We propose a unified mathematical framework demonstrating that algorithmic instability often arises from fundamental inconsistencies in local data preferences, mathematically analogous to Condorcet cycles in social choice theory. We formalize these inconsistencies as cohomological obstructions ($H^1 \neq 0$), leveraging established connections between social choice theory and algebraic topology. This framework reveals bagging as a strategy for obstruction prevention (smoothing the preference landscape) and inflated operators as a strategy for obstruction resolution (target space enlargement). Furthermore, we derive a novel technique from this framework, obstruction-aware regularization, which directly enforces mathematical consistency. We provide direct empirical validation for our claims. First, we demonstrate that engineered Condorcet cycles induce high instability in standard methods, which is resolved by inflated operators. Second, using Hodge decomposition, we confirm that bagging significantly reduces the magnitude of cohomological obstructions. Third, we show that our proposed obstruction-aware regularization successfully reduces mathematical inconsistencies and yields substantial improvements across multiple metrics of algorithmic stability.

URL: https://openreview.net/forum?id=rFqsgVXZYO
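
The Condorcet-cycle analogy at the heart of the paper can be made concrete with three invented "ballots" (local preferences over candidate predictions A, B, C) that majority voting cannot aggregate consistently:

# Each ballot ranks candidates best-to-worst; these three form a cycle.
ballots = [("A", "B", "C"), ("B", "C", "A"), ("C", "A", "B")]

def pairwise_wins(ballots, x, y):
    """Count ballots that rank x above y."""
    return sum(b.index(x) < b.index(y) for b in ballots)

for x, y in [("A", "B"), ("B", "C"), ("C", "A")]:
    print(f"{x} beats {y}: {pairwise_wins(ballots, x, y)} of {len(ballots)}")
# Each pair prints "2 of 3": A > B > C > A, so no Condorcet winner
# exists. This is the kind of local inconsistency the paper formalizes
# as a cohomological obstruction (H^1 != 0).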

---

Title: Incentivizing High-quality Participation From Federated Learning Agents

Abstract: Federated learning (FL) provides a promising paradigm for facilitating collaboration between multiple clients that jointly learn a global model without directly sharing their local data. However, existing research suffers from two caveats: 1) From the perspective of agents, voluntary and unselfish participation is often assumed. But self-interested agents may opt out of the system or provide low-quality contributions without proper incentives; 2) From the mechanism designer's perspective, the aggregated models can be unsatisfactory, as existing game-theoretical federated learning approaches for data collection ignore the potentially heterogeneous effort behind contributed data.
To alleviate the above challenges, we propose an incentive-aware framework for agent participation that considers data heterogeneity to accelerate the convergence process. Specifically, we first introduce the notion of Wasserstein distance to explicitly characterize the heterogeneous effort and reformulate the existing upper bound on convergence. To induce truthful reporting from agents, we analyze and measure the generalization error gap of any two agents by leveraging the peer prediction mechanism to develop score functions. We further present a two-stage Stackelberg game model that formalizes the process and examines the existence of equilibrium. Extensive experiments on real-world datasets demonstrate the effectiveness of our proposed mechanism.

URL: https://openreview.net/forum?id=PeaEnCWAQa
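
As a concrete instance of the heterogeneity measure the abstract introduces, the Wasserstein distance between two agents' data distributions can be computed with scipy; reducing the comparison to 1-D label histograms is our simplification for illustration:

import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
# Agent A draws labels uniformly; agent B has a skewed label mix.
labels_a = rng.choice(10, size=1000, p=np.full(10, 0.1))
skew = np.array([0.3, 0.3, 0.1, 0.1, 0.05, 0.05,
                 0.025, 0.025, 0.025, 0.025])
labels_b = rng.choice(10, size=1000, p=skew)

# Larger distance = more heterogeneous data/effort between the agents.
print(wasserstein_distance(labels_a, labels_b))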

---
