Daily TMLR digest for Mar 03, 2025

5 views

Skip to first unread message

TMLR

unread,

Mar 3, 2025, 12:06:07 AM3/3/25

to tmlr-anno...@googlegroups.com

New certifications
==================

Reproducibility Certification: Shedding Light on Problems with Hyperbolic Graph Learning

Isay Katsman, Anna Gilbert

https://openreview.net/forum?id=rKAkp1f3R7

---

Accepted papers
===============

Title: Long Short-Term Imputer: Handling Consecutive Missing Values in Time Series

Authors: Jiacheng You, Xinyang Chen, Yu Sun, Weili Guan, Liqiang Nie

Abstract: Encountered frequently in time series data, missing values can significantly impede time-series analysis. With the progression of deep learning, advanced imputation models delve into the temporal dependencies inherent in time series data, showcasing remarkable performance. This positions them as intuitive selections for time series imputation tasks which assume ``Miss Completely at Random''. Nonetheless, long-interval consecutive missing values may obstruct the model's ability to grasp long-term temporal dependencies, consequently hampering the efficacy of imputation performance. To tackle this challenge, we propose Long Short-term Imputer (LSTI) to impute consecutive missing values with different length of intervals. Long-term Imputer is designed using the idea of bi-directional autoregression. A forward prediction model and a backward prediction model are trained with a consistency regularization, which is designed to capture long-time dependency and can adapt to long-interval consecutive missing values. Short-term Imputer is designed to capture short-time dependency and can impute the short-interval consecutive missing values effectively. A meta-weighting network is then proposed to take advantage of the strengths of two imputers. As a result, LSTI can impute consecutive missing values with different intervals effectively. Experiments demonstrate that our approach, on average, reduces the error by 57.4% compared to state-of-the-art deep models across five datasets.

URL: https://openreview.net/forum?id=9NVJ0ZgEfT

---

Title: Evolution of Discriminator and Generator Gradients in GAN Training: From Fitting to Collapse

Authors: Weiguo Gao, Ming Li

Abstract: Generative Adversarial Networks (GANs) are powerful generative models but often suffer from mode mixture and mode collapse. We propose a perspective that views GAN training as a two-phase progression from fitting to collapse, where mode mixture and mode collapse are treated as inter-connected. Inspired by the particle model interpretation of GANs, we leverage the discriminator gradient to analyze particle movement and the generator gradient, specifically "steepness," to quantify the severity of mode mixture by measuring the generator's sensitivity to changes in the latent space. Using these theoretical insights into evolution of gradients, we design a specialized metric that integrates both gradients to detect the transition from fitting to collapse. This metric forms the basis of an early stopping algorithm, which stops training at a point that retains sample quality and diversity. Experiments on synthetic and real-world datasets, including MNIST, Fashion MNIST, and CIFAR-10, validate our theoretical findings and demonstrate the effectiveness of the proposed algorithm.

URL: https://openreview.net/forum?id=58gPkcVbFL

---

Title: Shedding Light on Problems with Hyperbolic Graph Learning

Authors: Isay Katsman, Anna Gilbert

Abstract: Recent papers in the graph machine learning literature have introduced a number of approaches for hyperbolic representation learning. The asserted benefits are improved performance on a variety of graph tasks, node classification and link prediction included. Claims have also been made about the geometric suitability of particular hierarchical graph datasets to representation in hyperbolic space. Despite these claims, our work makes a surprising discovery: when simple Euclidean models with comparable numbers of parameters are properly trained in the same environment, in most cases, they perform as well, if not better, than all introduced hyperbolic graph representation learning models, even on graph datasets previously claimed to be the most hyperbolic as measured by Gromov $\delta$-hyperbolicity (i.e., perfect trees). This observation gives rise to a simple question: how can this be? We answer this question by taking a careful look at the field of hyperbolic graph representation learning as it stands today, and find that a number of results do not diligently present baselines, make faulty modelling assumptions when constructing algorithms, and use misleading metrics to quantify geometry of graph datasets. We take a closer look at each of these three problems, elucidate the issues, perform an analysis of methods, and introduce a parametric family of benchmark datasets to ascertain the applicability of (hyperbolic) graph neural networks.

URL: https://openreview.net/forum?id=rKAkp1f3R7

---

New submissions
===============

Title: Reproducibility Study of ’SLICE: Stabilized LIME for Consistent Explanations for Image Classification’

Abstract: This paper presents a reproducibility study of SLICE: Stabilized LIME for Consistent Explanations for Image Classification by Bora et al. (2024). SLICE enhances LIME by incorporating Sign Entropy-based Feature Elimination (SEFE) to remove unstable superpixels and an adaptive perturbation strategy using Gaussian blur to improve consistency in feature importance rankings. The original work claims that SLICE significantly improves explanation stability and fidelity. Our study systematically verifies these claims through extensive experimentation using the Oxford-IIIT Pets, PASCAL VOC, and MS COCO datasets. Our results confirm that SLICE achieves higher consistency than LIME, supporting its ability to reduce instability. However, our fidelity analysis challenges the claim of superior performance, as LIME often achieves higher Ground Truth Overlap (GTO) scores, indicating stronger alignment with object segmentations. To further investigate fidelity, we introduce an alternative AOPC evaluation to ensure a fair comparison across methods. Additionally, we propose GRID-LIME, a structured grid-based alternative to LIME, which improves stability while maintaining computational efficiency. Our findings highlight trade-offs in post-hoc explainability methods and emphasize the need for fairer fidelity evaluations. Our implementation is publicly available at our GitHub repository.

URL: https://openreview.net/forum?id=vKUPXuEzj8

---

Title: A Local Polyak-Łojasiewicz and Descent Lemma of Gradient Descent For Overparametrized Linear Models

Abstract: Most prior work on the convergence of gradient descent (GD) for overparameterized neural networks relies on strong assumptions on the step size (infinitesimal), the hidden-layer width (infinite), or the initialization (large, spectral, balanced). Recent efforts to relax these assumptions focus on two-layer linear networks trained with the squared loss.
In this work, we derive a linear convergence rate for training two-layer linear neural networks with GD for general losses and under relaxed assumptions on the step size, width, and initialization. A key challenge in deriving this result is that classical ingredients for deriving convergence rates for nonconvex problems, such as the Polyak-Łojasiewicz (PL) condition and Descent Lemma, do not hold globally for overparameterized neural networks. Here, we prove that these two conditions hold locally with local constants that depend on the weights. Then, we provide bounds on these local constants, which depend on the initialization of the weights, the current loss, and the global PL and smoothness constants of the non-overparameterized model. Based on these bounds, we derive a linear convergence rate for GD. Our convergence analysis not only improves upon prior results but also suggests a better choice for the step size, as verified through our numerical experiments.

URL: https://openreview.net/forum?id=VPl3T43Hxb

---

Reply all

Reply to author

Forward

0 new messages