Survey Certification: Class Incremental Learning from First Principles: A Review
Neil Ashtekar, Jingxi Zhu, Vasant G Honavar
https://openreview.net/forum?id=sZdtTJInUg
---
Survey Certification: Evaluating Interpretable Methods via Geometric Alignment of Functional Distortions
Anna Hedström, Philine Lou Bommer, Thomas F Burns, Sebastian Lapuschkin, Wojciech Samek, Marina MC Höhne
https://openreview.net/forum?id=ukLxqA8zXj
---
Expert Certification: The RealHumanEval: Evaluating Large Language Models’ Abilities to Support Programmers
Hussein Mozannar, Valerie Chen, Mohammed Alsobay, Subhro Das, Sebastian Zhao, Dennis Wei, Manish Nagireddy, Prasanna Sattigeri, Ameet Talwalkar, David Sontag
https://openreview.net/forum?id=hGaWq5Buj7
---
Accepted papers
===============
Title: Class Incremental Learning from First Principles: A Review
Authors: Neil Ashtekar, Jingxi Zhu, Vasant G Honavar
Abstract: Continual learning systems attempt to efficiently learn over time without forgetting previously acquired knowledge. In recent years, there has been an explosion of work on continual learning, mainly focused on the class-incremental learning (CIL) setting. In this review, we take a step back and reconsider the CIL problem. We reexamine the problem definition and describe its unique challenges, contextualize existing solutions by analyzing non-continual approaches, and investigate the implications of various problem configurations. Our goal is to provide an alternative perspective to existing work on CIL and direct attention toward unexplored aspects of the problem.
URL: https://openreview.net/forum?id=sZdtTJInUg
---
Title: Neural Lattice Reduction: A Self-Supervised Geometric Deep Learning Approach
Authors: Giovanni Luca Marchetti, Gabriele Cesa, Kumar Pratik, Arash Behboodi
Abstract: Lattice reduction is a combinatorial optimization problem aimed at finding the most orthogonal basis in a given lattice. The Lenstra–Lenstra–Lovász (LLL) algorithm is the best-known algorithm in the literature for solving this problem. In light of recent research on algorithm discovery, in this work we ask: is it possible to parametrize the algorithm space for the lattice reduction problem with neural networks and find an algorithm without supervised data? Our strategy is to use equivariant and invariant parametrizations and train in a self-supervised way. We design a deep neural model that outputs factorized unimodular matrices and train it in a self-supervised manner by penalizing non-orthogonal lattice bases. We incorporate the symmetries of lattice reduction into the model by making it invariant to isometries and scaling of the ambient space and equivariant with respect to the hyperoctahedral group, which permutes and flips the lattice basis elements.
We show that this approach yields an algorithm with comparable complexity and performance to the LLL algorithm on a set of benchmarks. Additionally, motivated by certain applications for wireless communication, we extend our method to a convolutional architecture which performs joint reduction of spatially-correlated lattices arranged in a grid, thereby amortizing its cost over multiple lattices.
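For context on the classical baseline, here is a compact textbook LLL implementation with the usual delta = 0.75 (it naively recomputes Gram-Schmidt after every update, trading speed for clarity); the paper learns a symmetry-aware alternative to this hand-designed procedure, and this sketch is not the authors' code.

```python
import numpy as np

def gram_schmidt(B):
    """Return the Gram-Schmidt orthogonalization B* of the rows of B
    and the projection coefficients mu."""
    n = B.shape[0]
    Bs = np.zeros_like(B, dtype=float)
    mu = np.zeros((n, n))
    for i in range(n):
        Bs[i] = B[i]
        for j in range(i):
            mu[i, j] = np.dot(B[i], Bs[j]) / np.dot(Bs[j], Bs[j])
            Bs[i] -= mu[i, j] * Bs[j]
    return Bs, mu

def lll_reduce(B, delta=0.75):
    """Textbook LLL reduction of the rows of the basis matrix B."""
    B = B.astype(float).copy()
    n = B.shape[0]
    Bs, mu = gram_schmidt(B)
    k = 1
    while k < n:
        # Size reduction: enforce |mu[k, j]| <= 1/2 for all j < k.
        for j in range(k - 1, -1, -1):
            q = round(mu[k, j])
            if q != 0:
                B[k] -= q * B[j]
                Bs, mu = gram_schmidt(B)
        # Lovász condition: advance if satisfied, otherwise swap and backtrack.
        if np.dot(Bs[k], Bs[k]) >= (delta - mu[k, k - 1] ** 2) * np.dot(Bs[k - 1], Bs[k - 1]):
            k += 1
        else:
            B[[k, k - 1]] = B[[k - 1, k]]
            Bs, mu = gram_schmidt(B)
            k = max(k - 1, 1)
    return B

print(lll_reduce(np.array([[1, 1, 1], [-1, 0, 2], [3, 5, 6]])))
```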
URL: https://openreview.net/forum?id=YxXyRSlZ4b
---
Title: Evaluating Interpretable Methods via Geometric Alignment of Functional Distortions
Authors: Anna Hedström, Philine Lou Bommer, Thomas F Burns, Sebastian Lapuschkin, Wojciech Samek, Marina MC Höhne
Abstract: Interpretability researchers face a universal question: without access to ground truth labels, how can the faithfulness of an explanation to its model be determined? Despite immense efforts to develop new evaluation methods, current approaches remain in a pre-paradigmatic state: fragmented, difficult to calibrate, and lacking cohesive theoretical grounding. Observing the lack of a unifying theory, we propose a novel evaluative criterion entitled Generalised Explanation Faithfulness (GEF), which is centered on explanation-to-model alignment and integrates existing perturbation-based evaluations to eliminate the need for singular, task-specific evaluations. Complementing this unifying perspective, from a geometric point of view, we reveal a prevalent yet critical oversight in current evaluation practice: the failure to account for the learned geometry and non-linear mappings present in the model and explanation spaces. To solve this, we propose GEF as a general-purpose, threshold-free faithfulness evaluator that incorporates principles from differential geometry and facilitates evaluation agnostically across tasks and interpretability approaches. Through extensive cross-domain benchmarks on natural language processing, vision, and tabular tasks, we provide first-of-its-kind insights into the comparative performance of various interpretable methods, including local linear approximators, global feature visualisation methods, large language models as post-hoc explainers, and sparse autoencoders. Our contributions are important to the interpretability and AI safety communities, offering a principled, unified approach for evaluation.
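To make the class of perturbation-based evaluations GEF unifies concrete, below is a generic faithfulness check of that kind, sketched in Python; it is a simple stand-in, not the paper's geometric estimator, and `model` and `attribution` are assumed placeholders.

```python
import numpy as np

def perturbation_faithfulness(model, x, attribution, n=100, eps=0.1, seed=0):
    """Correlate attribution-predicted output changes with observed ones
    under small random perturbations. `model` is any callable mapping a
    1-D feature vector to a scalar; `attribution` is its feature-importance
    vector at x."""
    rng = np.random.default_rng(seed)
    base = model(x)
    predicted, observed = [], []
    for _ in range(n):
        delta = rng.normal(scale=eps, size=x.shape)
        predicted.append(float(attribution @ delta))   # local linear prediction
        observed.append(float(model(x + delta) - base))
    return np.corrcoef(predicted, observed)[0, 1]
```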
URL: https://openreview.net/forum?id=ukLxqA8zXj
---
Title: The RealHumanEval: Evaluating Large Language Models’ Abilities to Support Programmers
Authors: Hussein Mozannar, Valerie Chen, Mohammed Alsobay, Subhro Das, Sebastian Zhao, Dennis Wei, Manish Nagireddy, Prasanna Sattigeri, Ameet Talwalkar, David Sontag
Abstract: Evaluation of large language models for code has primarily relied on static benchmarks, including HumanEval (Chen et al., 2021), or, more recently, on human preferences over LLM responses. As LLMs are increasingly used as programmer assistants, we study whether gains on existing benchmarks or more preferred LLM responses translate to programmer productivity when coding with LLMs, including time spent coding. We introduce RealHumanEval, a web interface to measure the ability of LLMs to assist programmers, through either autocomplete or chat support. We conducted a user study (N=243) using RealHumanEval in which users interacted with seven LLMs of varying base model performance. Despite static benchmarks not incorporating humans-in-the-loop, we find that improvements in benchmark performance lead to increased programmer productivity; however, gaps in benchmark versus human performance are not proportional, a trend that holds across both forms of LLM support. In contrast, we find that programmer preferences do not correlate with their actual performance, motivating the need for better proxy signals. We open-source RealHumanEval to enable human-centric evaluation of new models and the study data to facilitate efforts to improve code models.
URL: https://openreview.net/forum?id=hGaWq5Buj7
---
Title: Enhancing Remaining Useful Life Prediction with Ensemble Multi-Term Fourier Graph Neural Networks
Authors: Ya Song, Laurens Bliek, Yaoxin Wu, Yingqian Zhang
Abstract: Remaining useful life (RUL) prediction is crucial in predictive maintenance. Recently, deep learning forecasting methods, especially Spatio-Temporal Graph Neural Networks (ST-GNNs), have achieved remarkable performance in RUL prediction. Most existing ST-GNNs require searching for the graph structure before utilizing GNNs to learn spatial graph representations, and they necessitate a temporal model such as LSTM to leverage the temporal dependencies in a fixed lookback window. However, such an approach has several limitations. First, it demands substantial computational resources to learn graph structures for the time series data. Second, learning spatial and temporal information independently disregards their inherent correlation. Third, capturing information within a fixed lookback window ignores long-term dependencies across the entire time series. To mitigate the issues above, instead of treating the data within the lookback window as a sequence of graphs as in ST-GNN methods, we regard it as a complete graph and employ a Fourier Graph Neural Network (FGN) to learn the spatiotemporal information within this graph in the frequency space. Additionally, we create training and test graphs with varying sizes of lookback windows, enabling the model to learn both short-term and long-term dependencies and provide multiple predictions for ensemble averaging. We also consider scenarios where sensor signals exhibit multiple operation conditions and design a sequence decomposition plugin to denoise input signals, aiming to enhance the performance of FGN. We evaluate the proposed model on two benchmark datasets, demonstrating its superior performance on the RUL prediction task compared to state-of-the-art approaches.
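To illustrate the frequency-space idea, here is a minimal Fourier graph operator in PyTorch: all (sensor, time-step) pairs in the lookback window are treated as nodes of one complete graph, and mixing happens through a learnable complex filter after an FFT over the node dimension. Shapes and naming are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn

class FourierGraphOp(nn.Module):
    """Minimal frequency-space graph operator in the spirit of FGN."""
    def __init__(self, d):
        super().__init__()
        # Shared complex-valued filter applied in the frequency domain.
        self.w = nn.Parameter(torch.randn(d, d, dtype=torch.cfloat) * 0.02)

    def forward(self, x):                                  # x: (batch, nodes, d)
        xf = torch.fft.fft(x.to(torch.cfloat), dim=1)      # FFT over graph nodes
        yf = xf @ self.w                                   # mix features per frequency
        return torch.fft.ifft(yf, dim=1).real              # back to node space

# Example: 8 sensors x 30 time steps flattened into 240 graph nodes.
op = FourierGraphOp(d=16)
out = op(torch.randn(4, 240, 16))
```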
URL: https://openreview.net/forum?id=tzFjcVqmxw
---
Title: Data Augmentation Policy Search for Long-Term Forecasting
Authors: Liran Nochumsohn, Omri Azencot
Abstract: Data augmentation serves as a popular regularization technique to combat overfitting challenges in neural networks. While automatic augmentation has demonstrated success in image classification tasks, its application to time-series problems, particularly in long-term forecasting, has received comparatively less attention. To address this gap, we introduce a time-series automatic augmentation approach named TSAA, which is both efficient and easy to implement. The solution involves tackling the associated bilevel optimization problem through a two-step process: initially training a non-augmented model for a limited number of epochs, followed by an iterative split procedure. During this iterative process, we alternate between identifying a robust augmentation policy through Bayesian optimization and refining the model while discarding suboptimal runs. Extensive evaluations on challenging univariate and multivariate forecasting benchmark problems demonstrate that TSAA consistently outperforms several robust baselines, suggesting its potential integration into prediction pipelines. Code is available at this repository: https://github.com/azencot-group/TSAA.
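As a rough illustration of the two-step scheme described above, the sketch below uses random policy sampling as a stand-in for the paper's Bayesian optimization; `train_fn`, `val_fn`, and the augmentation pool are hypothetical placeholders, not TSAA's actual API (see the repository above for the real implementation).

```python
import copy
import random

# Hypothetical pool of time-series transforms (placeholders, not TSAA's).
AUGMENTATIONS = ["jitter", "scaling", "window_warp", "freq_mask", "identity"]

def tsaa_sketch(model, train_fn, val_fn, warmup_epochs=5, rounds=10, trials=4):
    """Two-step search: (1) warm-up without augmentation; (2) alternate
    policy search and refinement, keeping only the best candidate per round."""
    train_fn(model, policy=None, epochs=warmup_epochs)        # step 1
    best_score, best_policy = val_fn(model), None
    for _ in range(rounds):                                   # step 2: iterative split
        candidates = []
        for _ in range(trials):
            # Random sampling stands in for the paper's Bayesian optimization.
            policy = random.sample(AUGMENTATIONS, k=2)
            candidate = copy.deepcopy(model)
            train_fn(candidate, policy=policy, epochs=1)
            candidates.append((val_fn(candidate), policy, candidate))
        score, policy, candidate = min(candidates, key=lambda t: t[0])
        if score < best_score:                                # discard suboptimal runs
            best_score, best_policy, model = score, policy, candidate
    return model, best_policy
```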
URL: https://openreview.net/forum?id=Wnd0XY0twh
---
Title: Adaptive Multi-step Refinement Network for Robust Point Cloud Registration
Authors: Zhi Chen, Yufan Ren, Tong Zhang, Zheng Dang, Wenbing Tao, Sabine Susstrunk, Mathieu Salzmann
Abstract: Point Cloud Registration (PCR) estimates the relative rigid transformation between two point clouds of the same scene. Despite significant progress with learning-based approaches, existing methods still face challenges when the overlapping region between the two point clouds is small. In this paper, we propose an adaptive multi-step refinement network that refines the registration quality at each step by leveraging the information from the preceding step. To achieve this, we introduce a training procedure and a refinement network. First, to adapt the network to the current step, we utilize a generalized one-way attention mechanism, which prioritizes the last step's estimated overlapping region, and we condition the network on step indices. Second, instead of training the network to map either random transformations or a fixed pre-trained model's estimations to the ground truth, we train it on transformations with varying registration qualities, ranging from accurate to inaccurate, thereby enhancing the network's adaptiveness and robustness. Despite its conceptual simplicity, our method achieves state-of-the-art performance on both the 3DMatch/3DLoMatch and KITTI benchmarks. Notably, on 3DLoMatch, our method reaches an 80.4% recall rate, an absolute improvement of 1.2%.
URL: https://openreview.net/forum?id=M3SkSMfWcP
---
New submissions
===============
Title: Learning Energy-Based Generative Models via Potential Flow: A Variational Principle Approach to Probability Density Homotopy Matching
Abstract: Energy-based models (EBMs) are a powerful class of probabilistic generative models due to their flexibility and interpretability. However, relationships between potential flows and explicit EBMs remain underexplored, while contrastive divergence training via implicit Markov chain Monte Carlo (MCMC) sampling is often unstable and expensive in high-dimensional settings. In this paper, we propose Variational Potential Flow Bayes (VPFB), a new energy-based generative framework that eliminates the need for implicit MCMC sampling and does not rely on auxiliary networks or cooperative training. VPFB learns an energy-parameterized potential flow by constructing a flow-driven density homotopy that is matched to the data distribution through a variational loss minimizing the Kullback-Leibler divergence between the flow-driven and marginal homotopies. This principled formulation enables robust and efficient generative modeling while preserving the interpretability of EBMs. Experimental results on image generation, interpolation, out-of-distribution detection, and compositional generation confirm the effectiveness of VPFB, showing that our method performs competitively with existing approaches in terms of sample quality and versatility across diverse generative modeling tasks.
URL: https://openreview.net/forum?id=vc7poEYOFK
---
Title: Dynamic Pricing in the Linear Valuation Model using Shape Constraints
Abstract: We propose a shape-constrained approach to dynamic pricing for censored data in the linear valuation model that eliminates the need for tuning parameters commonly required in existing methods. Previous works have addressed the challenge of the unknown market noise distribution $F_0$ using strategies ranging from kernel methods to reinforcement learning algorithms, such as bandit techniques and upper confidence bounds (UCB), under Lipschitz (and stronger) assumptions on $F_0$. In contrast, our method relies on isotonic regression under the weaker assumption that $F_0$ is $\alpha$-H\"older continuous for some $\alpha \in (0,1]$. We obtain an upper bound on the asymptotic expected regret that matches existing bounds in the literature for $\alpha = 1$ (the Lipschitz case). Simulations and experiments with real-world data obtained from Welltower Inc (a major healthcare Real Estate Investment Trust) consistently demonstrate that our method attains better empirical regret than several existing methods in the literature while being completely tuning-parameter free.
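The following self-contained sketch illustrates the estimation step with scikit-learn's isotonic regression on synthetic data; the valuation model, the assumption that the linear coefficient is already estimated, and the greedy pricing rule are simplifications for illustration, not the paper's full algorithm.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Linear valuation model: v_t = <theta, x_t> + z_t, z_t ~ F_0 (unknown).
# A sale occurs iff v_t >= posted price p_t, so P(sale) = 1 - F_0(p_t - <theta, x_t>).
rng = np.random.default_rng(0)
T, d = 5000, 3
theta = np.array([1.0, 0.5, -0.25])
X = rng.normal(size=(T, d))
noise = rng.logistic(scale=0.5, size=T)           # stands in for the unknown F_0
prices = rng.uniform(-2, 2, size=T)
sales = (X @ theta + noise >= prices).astype(float)

theta_hat = theta                                  # assume theta estimated separately
residual = prices - X @ theta_hat
iso = IsotonicRegression(y_min=0.0, y_max=1.0, increasing=True,
                         out_of_bounds="clip")
F0_hat = iso.fit(residual, 1.0 - sales)            # monotone estimate of F_0

# Price a new customer by maximizing expected revenue p * (1 - F0_hat(p - <theta, x>)).
x_new = rng.normal(size=d)
grid = np.linspace(-2, 2, 201) + x_new @ theta_hat
revenue = grid * (1.0 - F0_hat.predict(grid - x_new @ theta_hat))
print("greedy price:", grid[np.argmax(revenue)])
```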
URL: https://openreview.net/forum?id=uKZ0R4IQaO
---
Title: Towards Optimal LLM Selection
Abstract: Generative AI, and LLMs in particular, is now heavily used for document processing tasks such as question answering and document summarization. Enterprises incur substantial costs when operating or using LLMs for their respective use cases.
In this work, we propose optimizing the usage costs of LLMs in a quality-aware manner for document summarization tasks. Specifically, we propose to exploit the variability of LLM performance across different types and formats of data to maximize output quality while keeping expected costs under a budget and latency within a threshold. This presents two challenges: 1) estimating the output quality of LLMs at runtime without invoking each LLM, and 2) optimally allocating queries to LLMs such that the objectives are optimized and the constraints are satisfied. We propose a model to predict the output quality of LLMs on text summarization, followed by an LP rounding algorithm to optimize the selection of LLMs. We study these problems both theoretically and empirically. Our methods reduce costs by $40\%$-$90\%$ while improving quality by $4\%$-$7\%$. In addition to the quantitative results, we further show through a user study that our quality estimates largely align with human preferences.
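The allocation step can be illustrated with a toy LP relaxation plus naive rounding using SciPy; all numbers are invented, and the paper's rounding scheme is more careful than the argmax used here.

```python
import numpy as np
from scipy.optimize import linprog

# Toy instance (numbers invented): q[i, j] = predicted quality of LLM j on
# query i; cost[j] = per-query cost of LLM j.
q = np.array([[0.8, 0.6, 0.4],
              [0.5, 0.7, 0.3],
              [0.9, 0.4, 0.2],
              [0.6, 0.6, 0.5]])
cost = np.array([1.0, 0.3, 0.05])
budget = 1.2
n, m = q.shape

# LP relaxation: maximize total quality subject to one LLM per query and a
# total-cost budget; variables x[i, j] in [0, 1].
A_eq = np.zeros((n, n * m))
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0           # each query fully assigned
A_ub = np.tile(cost, n)[None, :]                # total cost constraint
res = linprog(-q.ravel(), A_ub=A_ub, b_ub=[budget],
              A_eq=A_eq, b_eq=np.ones(n), bounds=(0, 1))
x = res.x.reshape(n, m)
assignment = x.argmax(axis=1)                   # naive rounding; may exceed budget
print("assignment:", assignment, "cost:", cost[assignment].sum())
```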
URL: https://openreview.net/forum?id=0tkcWwVtaK
---
Title: Reinforcement Learning from Bagged Reward
Abstract: In Reinforcement Learning (RL), it is commonly assumed that an immediate reward signal is generated for each action taken by the agent, helping the agent maximize cumulative rewards to obtain the optimal policy. However, in many real-world scenarios, designing immediate reward signals is difficult; instead, agents receive a single reward that is contingent upon a partial sequence or a complete trajectory. In this work, we define this challenging problem as RL from Bagged Reward (RLBR), where sequences of data are treated as bags with non-Markovian bagged rewards, leading to the formulation of Bagged Reward Markov Decision Processes (BRMDPs). Theoretically, we demonstrate that RLBR can be addressed by solving a standard MDP with properly redistributed bagged rewards allocated to each instance within a bag. Empirically, we find that reward redistribution becomes more challenging as the bag length increases, due to reduced informational granularity. Existing reward redistribution methods are insufficient to address these challenges. Therefore, we propose a novel reward redistribution method equipped with a bidirectional attention mechanism, enabling the accurate interpretation of contextual nuances and temporal dependencies within each bag. We experimentally demonstrate that our proposed method consistently outperforms existing approaches. The code is available at an anonymous link: https://anonymous.4open.science/r/RLBR-F66E/.
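A minimal PyTorch sketch of the core training signal, under assumed dimensions: a bidirectional Transformer encoder scores each step in a bag, trained so that the predicted step rewards sum to the observed bagged reward. This illustrates the idea, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class RewardRedistributor(nn.Module):
    """Attention-based reward redistribution over a bag of transitions."""
    def __init__(self, obs_dim, act_dim, d_model=64):
        super().__init__()
        self.embed = nn.Linear(obs_dim + act_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)  # bidirectional
        self.head = nn.Linear(d_model, 1)

    def forward(self, obs, act):
        h = self.encoder(self.embed(torch.cat([obs, act], dim=-1)))
        return self.head(h).squeeze(-1)          # per-step reward estimates

# Training signal: predicted step rewards within each bag must sum to the
# observed bagged reward (synthetic data shown for illustration).
model = RewardRedistributor(obs_dim=8, act_dim=2)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
obs = torch.randn(16, 20, 8)                     # batch of 16 bags of length 20
act = torch.randn(16, 20, 2)
bag_reward = torch.randn(16)
loss = ((model(obs, act).sum(dim=1) - bag_reward) ** 2).mean()
opt.zero_grad(); loss.backward(); opt.step()
```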
URL: https://openreview.net/forum?id=bXUipBbZDA
---
Title: Fast and Provable Low-Rank High-Order Tensor Completion via Scaled Gradient Descent
Abstract: This work studies the low-rank high-order tensor completion problem, which aims to exactly recover a low-rank order-$d$ ($d \geq 4$) tensor from partially observed entries. Recently, tensor Singular Value Decomposition (t-SVD)-based low-rank tensor completion has gained considerable attention due to its ability to capture the low-rank structure of multidimensional data. However, existing approaches often rely on the computationally expensive tensor nuclear norm (TNN), thereby limiting their scalability for real-world tensors. Leveraging the low-rank structure under the t-SVD decomposition, we propose an efficient algorithm that directly estimates the high-order tensor factors---starting from a spectral initialization---via scaled gradient descent (ScaledGD). Theoretically, we rigorously establish the recovery guarantees of the proposed algorithm under mild assumptions, demonstrating that it achieves linear convergence to the true low-rank tensor at a constant rate that is independent of the condition number. Numerical experiments on both synthetic and real-world data verify our results and demonstrate the superiority of our method.
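For intuition, here is ScaledGD in the simplest order-2 (matrix completion) setting; the paper's algorithm operates on the t-SVD factorization of order-d tensors, which this sketch does not implement. The preconditioners (R^T R)^{-1} and (L^T L)^{-1} are what make the convergence rate independent of the condition number.

```python
import numpy as np

def scaled_gd_completion(Y, mask, r, eta=0.5, iters=200):
    """ScaledGD for matrix completion: Y holds observed entries (zeros
    elsewhere), mask is 1 where observed, r is the target rank."""
    p = mask.mean()                                # observation rate
    # Spectral initialization from the zero-filled observations.
    U, s, Vt = np.linalg.svd(Y / p, full_matrices=False)
    L = U[:, :r] * np.sqrt(s[:r])
    R = Vt[:r].T * np.sqrt(s[:r])
    for _ in range(iters):
        E = mask * (L @ R.T - Y) / p               # gradient of the fit term
        # Scaled (preconditioned) updates remove condition-number dependence.
        L_new = L - eta * E @ R @ np.linalg.inv(R.T @ R)
        R_new = R - eta * E.T @ L @ np.linalg.inv(L.T @ L)
        L, R = L_new, R_new
    return L @ R.T

rng = np.random.default_rng(0)
M = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 40))   # rank-2 ground truth
mask = (rng.random(M.shape) < 0.5).astype(float)
print(np.linalg.norm(scaled_gd_completion(mask * M, mask, r=2) - M))
```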
URL: https://openreview.net/forum?id=uoleRRKse4
---
Title: Tighter sparse variational Gaussian processes
Abstract: Sparse variational Gaussian process (GP) approximations based on inducing points have become the de facto standard for scaling GPs to large datasets, owing to their theoretical elegance, computational efficiency, and ease of implementation. This paper introduces a provably tighter variational approximation by relaxing the standard assumption that the conditional approximate posterior given the inducing points must match that in the prior. The key innovation is to modify the conditional posterior to have smaller variances than those of the prior at the training points. We derive the collapsed bound for the regression case, describe how to use the proposed approximation in large data settings, and discuss its application to orthogonally structured inducing points and GP latent variable models. Extensive experiments on regression benchmarks, classification, and latent variable models demonstrate that the proposed approximation consistently matches or outperforms standard sparse variational GPs while maintaining the same computational cost. An implementation will be made available in all popular GP packages.
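For reference, the standard collapsed bound that this paper provably tightens (the Titsias-style bound for sparse variational GP regression with n training points, m inducing points, and noise variance \sigma^2) is:

```latex
\log p(\mathbf{y}) \;\ge\; \mathcal{L}
  = \log \mathcal{N}\!\left(\mathbf{y} \,\middle|\, \mathbf{0},\; Q_{nn} + \sigma^2 I\right)
    - \frac{1}{2\sigma^2}\,\operatorname{tr}\!\left(K_{nn} - Q_{nn}\right),
\qquad
Q_{nn} = K_{nm} K_{mm}^{-1} K_{mn}.
```

The proposed approximation relaxes the assumption that the conditional posterior matches the prior conditional, shrinking its variances at the training points and thereby raising this bound.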
URL: https://openreview.net/forum?id=L33DSu3zvq
---
Title: Decentralized Transformers with Centralized Aggregation are Sample-Efficient Multi-Agent World Models
Abstract: Learning a world model for model-free Reinforcement Learning (RL) agents can significantly improve the sample efficiency by learning policies in imagination. However, building a world model for Multi-Agent RL (MARL) can be particularly challenging due to the scalability issue in a centralized architecture arising from a large number of agents, and also the non-stationarity issue in a decentralized architecture stemming from the inter-dependency among agents. To address both challenges, we propose a novel world model for MARL that learns decentralized local dynamics for scalability, combined with a centralized representation aggregation from all agents. We cast the dynamics learning as an auto-regressive sequence modeling problem over discrete tokens by leveraging the expressive Transformer architecture, in order to model complex local dynamics across different agents and provide accurate and consistent long-term imaginations. As the first Transformer-based world model for multi-agent systems, we introduce a Perceiver Transformer as an effective solution for centralized representation aggregation in this context. Main results on the StarCraft Multi-Agent Challenge (SMAC) and additional results on MAMuJoCo show that our method outperforms strong model-free approaches and existing model-based methods in both sample efficiency and overall performance.
URL: https://openreview.net/forum?id=xT8BEgXmVc
---
Title: Generative Models for Long Time Series: Approximately Equivariant Recurrent Network Structures for an Adjusted Training Scheme
Abstract: We apply a novel training scheme to a specific implementation of a Variational Autoencoder (VAE), which, in combination, we refer to as the Recurrent Variational Autoencoder Subsequent Train (RVAE-ST). This method progressively increases the sequence length during training, leveraging the sequence-length-independent parameterization of the model to address the challenge recurrent layers face when handling long sequences, particularly for datasets exhibiting approximate stationarity. Our experiments demonstrate that this approach significantly improves the model’s performance, especially for datasets with periodic behavior. Compared to other recurrent and convolutional-based generative models, our method excels in generating synthetic data for long sequences of length l = 1000, with notable improvements in both sample quality and the distribution of the generated datasets. We evaluate the effectiveness of our approach using multiple metrics, including the discriminative score, evidence lower bound (ELBO), and visualizations of embeddings generated by t-SNE and PCA.
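A minimal sketch of the subsequent-train schedule, assuming a hypothetical `rvae` model and `elbo_loss` function (neither is the paper's code): the same model, whose parameterization does not depend on sequence length, is trained on progressively longer non-overlapping windows.

```python
import torch

def subsequent_train(rvae, series, elbo_loss,
                     lengths=(100, 250, 500, 1000),
                     epochs_per_stage=50, lr=1e-3):
    """Train the same recurrent VAE on windows of increasing length.
    `series` is a (T, channels) tensor; `elbo_loss` returns the negative ELBO."""
    opt = torch.optim.Adam(rvae.parameters(), lr=lr)
    for L in lengths:                                 # grow the training window
        for _ in range(epochs_per_stage):
            for start in range(0, series.shape[0] - L + 1, L):
                x = series[start:start + L].unsqueeze(0)   # (1, L, channels)
                loss = elbo_loss(rvae, x)
                opt.zero_grad(); loss.backward(); opt.step()
    return rvae
```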
URL: https://openreview.net/forum?id=HQ9C9xcrWZ
---
Title: Examining Post-Training Quantization for Mixture-of-Experts: A Benchmark
Abstract: Large Language Models (LLMs) have become foundational in the realm of natural language processing, demonstrating performance improvements as model sizes increase. The Mixture-of-Experts (MoE) approach offers a promising way to scale LLMs more efficiently by using fewer FLOPs through sparse activation. However, it suffers from significant memory overheads, necessitating model compression techniques. Post-training quantization, a popular method for model compression, proves less effective when directly applied to MoE models because it overlooks MoE's inherent sparsity. This paper explores several MoE structure-aware quantization heuristics, ranging from coarse to fine granularity, from the MoE block down to individual linear weights. Our investigations reveal critical principles: different MoE structures (i.e., blocks, experts, linear layers) require varying numbers of weight bits for effective and efficient quantization. Conclusions are supported by extensive benchmarking across two representative MoE models and six tasks. We further introduce novel enhancements to more accurately identify the most critical weights in MoE quantization that necessitate higher bit allocations, including the linear weight outlier scorer and MoE block scorer. Additionally, subsequent experiments validate our findings in the context of both weight and activation quantization. Our code for reproducing all our experiments is provided as supplemental material.
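A simplified sketch of structure-aware mixed-bit quantization: shared weights keep 8 bits while experts receive bits according to routing frequency. The usage-based rule and the parameter-name parsing are assumptions in the spirit of the heuristics described above, not the paper's actual scorers.

```python
import torch

def uniform_quantize(w, bits):
    """Symmetric per-tensor uniform quantization (illustrative)."""
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max() / qmax + 1e-12
    return torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale

def quantize_moe_block(block, expert_usage, high=4, low=2):
    """Assign more bits to frequently routed experts, fewer to rare ones;
    assumes parameters are named "experts.{idx}....". `expert_usage` is a
    tensor of per-expert routing frequencies."""
    for name, p in block.named_parameters():
        if "expert" in name:
            idx = int(name.split(".")[1])          # assumed naming convention
            bits = high if expert_usage[idx] > expert_usage.mean() else low
        else:
            bits = 8                                # shared/attention weights stay 8-bit
        p.data = uniform_quantize(p.data, bits)
```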
URL: https://openreview.net/forum?id=VVty3mELRN
---
Title: Discovering group dynamics in coordinated time series via hierarchical recurrent switching-state models
Abstract: We seek a computationally efficient model for a collection of time series arising from multiple interacting entities (a.k.a. "agents"). Recent models of spatiotemporal patterns across individuals fail to incorporate explicit system-level collective behavior that can influence the trajectories of individual entities. To address this gap in the literature, we present a new hierarchical switching-state model that can be trained in an unsupervised fashion to simultaneously learn both system-level and individual-level dynamics. We employ a latent system-level discrete state Markov chain that provides top-down influence on latent entity-level chains which in turn govern the emission of each observed time series. Recurrent feedback from the observations to the latent chains at both entity and system levels allows recent situational context to inform how dynamics unfold at all levels in bottom-up fashion. We hypothesize that including both top-down and bottom-up influences on group dynamics will improve interpretability of the learned dynamics and reduce error when forecasting. Our hierarchical switching recurrent dynamical model can be learned via closed-form variational coordinate ascent updates to all latent chains that scale linearly in the number of entities. This is asymptotically no more costly than fitting a separate model for each entity. Analysis of both synthetic data and real basketball team movements suggests our lean parametric model can achieve competitive forecasts compared to larger neural network models that require far more computational resources. Further experiments on soldier data as well as a synthetic task with 64 cooperating entities show how our approach can yield interpretable insights about team dynamics over time.
URL: https://openreview.net/forum?id=LHchZthcOf
---
Title: Lie Symmetry Net: Preserving Conservation Laws in Modelling Financial Market Dynamics via Differential Equations
Abstract: This paper employs a novel Lie symmetries-based framework to model the intrinsic symmetries within financial markets. Specifically, we introduce the Lie symmetry net (LSN), which characterises the Lie symmetries of the differential equations (DEs) describing financial market dynamics, such as the Black-Scholes equation. To simulate these differential equations in a symmetry-aware manner, LSN incorporates a Lie symmetry risk derived from the conservation laws associated with the Lie symmetry operators of the target differential equations. This risk measures how well the Lie symmetries are realised and guides the training of LSN under the structural risk minimisation framework. Extensive numerical experiments demonstrate that LSN effectively realises the Lie symmetries and achieves an error reduction of more than one order of magnitude compared to state-of-the-art methods. The code is available at https://anonymous.4open.science/r/LSN_code-5608/README.md.
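For concreteness, the canonical target DE named above is the Black-Scholes equation for an option price V(S, t) with volatility \sigma and risk-free rate r:

```latex
\frac{\partial V}{\partial t}
  + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2}
  + r S \frac{\partial V}{\partial S} - r V = 0 .
```

LSN augments the usual PDE-residual loss with a risk term measuring violation of the conservation laws induced by the equation's Lie symmetry operators; the exact form of that term is given in the paper.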
URL: https://openreview.net/forum?id=rkfop9GyxB
---
Title: ViT-EBoT: Vision Transformer for Encrypted Botnet Detection in Resource-Constrained Edge Devices
Abstract: With the advent of lightweight cryptography in edge devices, attackers can hide malicious code under encrypted network communications to perform malware attacks. This makes IoT botnet attacks extremely challenging to detect by means of traditional signature-based techniques. In this paper, we propose a novel IoT botnet detection framework that uses vision transformers to detect malicious communications captured in encrypted network flow images. Our approach achieves ∼98% accuracy and reduces inference latency by around 94% compared to state-of-the-art approaches. Further, we validated the practicality of our approach by deploying it on a Jetson Orin Nano acting as an edge gateway, achieving an inference latency of 25.16 ms with a memory overhead of 88.13 MB.
URL: https://openreview.net/forum?id=P3vxtPoq8c
---
Title: Learning to Guide Human Decision Makers with Vision-Language Models
Abstract: There is increasing interest in developing AIs for assisting human decision making in high-stakes tasks, such as medical diagnosis, for the purpose of improving decision quality and reducing cognitive strain. Mainstream approaches team up an expert with a machine learning model to which safer decisions are offloaded, thus letting the former focus on cases that demand their attention. This separation-of-responsibilities setup, however, is inadequate for high-stakes scenarios. On the one hand, the expert may end up over-relying on the machine’s decisions due to anchoring bias, thus losing the human oversight that regulatory agencies increasingly require to ensure trustworthy AI. On the other hand, the expert is left entirely unassisted on the (typically hardest) decisions on which the model abstained. As a remedy, we introduce learning to guide (LTG), an alternative framework in which, rather than taking control from the human expert, the machine provides guidance useful for decision making, and the human remains entirely responsible for the final decision. To ensure guidance is interpretable and task-specific, we develop SLOG, an approach for turning any vision-language model into a capable generator of textual guidance by leveraging a modicum of human feedback. Our empirical evaluation highlights the promise of SLOG on a challenging, real-world medical diagnosis task.
URL: https://openreview.net/forum?id=JAW1C8RNth
---