Daily TMLR digest for Aug 31, 2025

TMLR

Aug 31, 2025, 12:06:11 AM
to tmlr-anno...@googlegroups.com

Accepted papers
===============


Title: Using Platt’s scaling for calibration after undersampling — limitations and how to address them

Authors: Nathan Phelps, Daniel J Lizotte, Douglas G. Woolford

Abstract: When modelling data where the response is dichotomous and highly imbalanced, response-based sampling in which a subset of the majority class is retained (i.e., undersampling) is often used to create more balanced training datasets prior to modelling. However, the models fit to this undersampled data, which we refer to as base models, generate predictions that are severely biased. Several calibration methods can be used to combat this bias, one of which is Platt’s scaling. In this approach, a logistic regression model is used to model the relationship between the base model’s original predictions and the response. Despite its popularity for calibrating models after undersampling, Platt’s scaling was not designed for this purpose. Our work presents what we believe is the first detailed study of the validity of using Platt’s scaling to calibrate models after undersampling. We show analytically, as well as via a simulation study, that Platt’s scaling should not be used for calibration after undersampling without critical thought. If Platt’s scaling could have successfully calibrated the base model had that model been trained on the entire dataset (i.e., without undersampling), then it may also be appropriate for calibration after undersampling. If this is not the case, we recommend a modified version of Platt’s scaling that fits a logistic generalized additive model to the logit of the base model’s predictions, as this method is theoretically motivated and performed relatively well across the settings considered in our study.

URL: https://openreview.net/forum?id=80b2zaeTUe
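
A minimal sketch of the setup the abstract studies, assuming scikit-learn and the pygam package (names and simulated data are illustrative, not the authors' code): a base model is trained on undersampled data, then calibrated either with classic Platt's scaling on its predictions or with the recommended logistic GAM fit to the logit of those predictions.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from pygam import LogisticGAM, s

    rng = np.random.default_rng(0)

    def logit(p):
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return np.log(p / (1 - p))

    # Simulated imbalanced data: roughly 2% positives.
    X = rng.normal(size=(20000, 5))
    p_true = 1 / (1 + np.exp(-(X @ np.array([2.0, -1.0, 0.5, 0.0, 0.0]) - 4)))
    y = rng.binomial(1, p_true)

    # Undersample the majority class to a 1:1 ratio and fit the base model.
    pos, neg = np.where(y == 1)[0], np.where(y == 0)[0]
    keep = np.concatenate([pos, rng.choice(neg, size=len(pos), replace=False)])
    base = LogisticRegression().fit(X[keep], y[keep])  # predictions are biased

    # Calibrate on data from the original distribution (for brevity we reuse
    # X and y here; in practice use a held-out calibration set).
    p_base = base.predict_proba(X)[:, 1]
    platt = LogisticRegression().fit(p_base.reshape(-1, 1), y)  # Platt's scaling
    gam = LogisticGAM(s(0)).fit(logit(p_base).reshape(-1, 1), y)  # GAM on logit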

---

Title: Spurious Privacy Leakage in Neural Networks

Authors: Chenxiang Zhang, Jun Pang, Sjouke Mauw

Abstract: Neural networks trained on real-world data often exhibit biases while simultaneously being vulnerable to privacy attacks aimed at extracting sensitive information. Despite extensive research on each problem individually, their intersection remains poorly understood. In this work, we investigate the privacy impact of spurious correlation bias. We introduce _spurious privacy leakage_, a phenomenon in which spurious groups are significantly more vulnerable to privacy attacks than non-spurious groups. We observe that the privacy disparity between groups increases in tasks with simpler objectives (e.g., fewer classes) due to spurious features. Counterintuitively, we demonstrate that robust methods designed to reduce spurious bias fail to mitigate privacy disparity. Our analysis reveals that this occurs because robust methods can reduce reliance on spurious features for prediction, but do not prevent their memorization during training. Finally, we systematically compare the privacy of different model architectures trained with spurious data, demonstrating that, contrary to previous work, architectural choice can affect privacy evaluation.

URL: https://openreview.net/forum?id=tRXDCIgvTT
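
The abstract does not spell out which privacy attack is used, so the following is a hypothetical illustration only: a per-group loss-threshold membership-inference probe (in the style of Yeom et al., 2018), showing what "attack success per group" could look like in code.

    import numpy as np

    def attack_accuracy(loss_members, loss_nonmembers, threshold):
        """Predict 'member' when the loss is below the threshold."""
        correct = ((loss_members < threshold).sum()
                   + (loss_nonmembers >= threshold).sum())
        return correct / (len(loss_members) + len(loss_nonmembers))

    def per_group_leakage(train_losses, test_losses, train_groups, test_groups):
        """Attack success per group; spurious groups would score higher."""
        out = {}
        for g in np.unique(train_groups):
            lm = train_losses[train_groups == g]
            ln = test_losses[test_groups == g]
            thr = np.median(np.concatenate([lm, ln]))  # crude threshold choice
            out[g] = attack_accuracy(lm, ln, thr)
        return out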

---

Title: ReHub: Linear Complexity Graph Transformers with Adaptive Hub-Spoke Reassignment

Authors: Tomer Borreda, Daniel Freedman, Or Litany

Abstract: We present ReHub, a novel graph transformer architecture that achieves linear complexity through an efficient reassignment technique between nodes and virtual nodes. Graph transformers have become increasingly important in graph learning for their ability to utilize long-range node communication explicitly, addressing limitations such as oversmoothing and oversquashing found in message-passing graph networks. However, their dense attention mechanism scales quadratically with the number of nodes, limiting their applicability to large-scale graphs. ReHub draws inspiration from the airline industry's hub-and-spoke model, where flights are assigned to optimize operational efficiency. In our approach, graph nodes (spokes) are dynamically reassigned to a fixed number of virtual nodes (hubs) at each model layer. Recent work, Neural Atoms (Li et al., 2024), has demonstrated impressive and consistent improvements over GNN baselines by utilizing such virtual nodes; their findings suggest that the number of hubs strongly influences performance. However, increasing the number of hubs typically raises complexity, requiring a trade-off to maintain linear complexity. Our key insight is that each node only needs to interact with a small subset of hubs to achieve linear complexity, even when the total number of hubs is large. To leverage all hubs without incurring additional computational costs, we propose a simple yet effective adaptive reassignment technique based on hub-hub similarity scores, eliminating the need for expensive node-hub computations. Our experiments on long-range graph benchmarks indicate a consistent improvement over the base method, Neural Atoms, while maintaining linear complexity instead of $O(n^{3/2})$. Remarkably, our sparse model achieves performance on par with its non-sparse counterpart. Furthermore, ReHub outperforms competitive baselines and consistently ranks among the top performers across various benchmarks.

URL: https://openreview.net/forum?id=L4S54TUOQR
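
A back-of-the-envelope sketch of the reassignment idea as the abstract describes it (illustrative NumPy, not the actual ReHub layer, which operates on learned features with attention): each node scores only the k hubs most similar to its current hub, so the per-layer cost is O(m^2 + nk) rather than O(nm).

    import numpy as np

    def reassign(node_feats, hub_feats, assign, k=4):
        """node_feats: (n, d); hub_feats: (m, d); assign: (n,) current hub ids."""
        sim = hub_feats @ hub_feats.T               # (m, m): cheap, since m << n
        cand = np.argsort(-sim, axis=1)[:, :k]      # top-k similar hubs per hub
        new_assign = np.empty_like(assign)
        for i, h in enumerate(assign):
            c = cand[h]                             # k candidate hubs for node i
            scores = node_feats[i] @ hub_feats[c].T # k node-hub scores, not m
            new_assign[i] = c[np.argmax(scores)]
        return new_assign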

---

Title: Graph Personalized Federated Learning via Client Network Learning

Authors: Jiachen Zhou, Han Xie, Carl Yang

Abstract: Graph classification is a widely studied problem for applications such as molecule/protein function prediction and drug discovery. Powerful graph neural networks (GNNs) have demonstrated state-of-the-art performance for the classification of complex graphs, but training such models can require significant amounts of high-quality labeled graphs that are expensive to collect. When individual institutes do not possess sufficient graph data, federated learning (FL) becomes a handy solution for them to collaboratively obtain powerful graph models without directly sharing their own graph data. However, existing FL frameworks for graph data do not consider the realistic setting of personalized FL with heterogeneous data, where each client aims to leverage the data of certain other clients to boost its own model performance. In this work, inspired by graph structure learning, we propose to learn a dynamic client network that tracks the graph data similarity across clients to guide model sharing during FL. Specifically, we rely on the marginal parameters of local GNNs to dynamically learn the client network, and refer to a set of fundamental graph properties to guide its learning. Extensive experiments on three real-world graph datasets demonstrate the consistent effectiveness of our two major proposed modules, which also mutually verify each other's effectiveness.

URL: https://openreview.net/forum?id=pyTTR4pxkU
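
A skeleton of the high-level recipe, under the assumption that client similarity is computed from flattened (e.g., marginal) model parameters; the paper's graph-property guidance and the dynamics of the learned network are not reproduced here.

    import numpy as np

    def client_network(param_vecs, temp=1.0):
        """param_vecs: (c, p) per-client parameter slices -> (c, c) weights."""
        v = param_vecs / np.linalg.norm(param_vecs, axis=1, keepdims=True)
        w = np.exp(v @ v.T / temp)               # cosine similarity, softened
        return w / w.sum(axis=1, keepdims=True)  # row-stochastic client graph

    def personalized_aggregate(param_vecs, weights):
        """Each client receives a similarity-weighted mix of all clients."""
        return weights @ param_vecs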

---


New submissions
===============


Title: Interestingness First Classifiers

Abstract: Most machine learning models are designed to maximize predictive accuracy. In this work, we explore a different goal: building classifiers that are interesting. An ``interesting classifier'' is one that uses unusual or unexpected features, even if its accuracy is lower than the best possible model. For example, predicting room congestion from CO2 levels achieves near-perfect accuracy but is unsurprising. In contrast, predicting room congestion from humidity is less accurate yet more nuanced and intriguing. We introduce EUREKA, a simple framework that selects features according to their perceived interestingness. Our method leverages large language models to rank features by their interestingness and then builds interpretable classifiers using only the selected interesting features. Across several benchmark datasets, EUREKA consistently identifies features that are non-obvious yet still predictive. For example, in the Occupancy Detection dataset, our method favors humidity over CO2 levels and light intensity, producing classifiers that achieve meaningful accuracy while offering insights. In the Twin Papers dataset, our method discovers the rule that papers with a colon in the title are more likely to be cited in the future. We argue that such models can support new ways of knowledge discovery and communication, especially in settings where moderate accuracy is sufficient but novelty and interpretability are valued.

URL: https://openreview.net/forum?id=zHvIY49qp8
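
A toy sketch of the two-stage recipe the abstract describes, assuming a pandas DataFrame of features; `rank_features_with_llm` is a hypothetical stub standing in for the paper's LLM ranking step.

    from sklearn.tree import DecisionTreeClassifier

    def rank_features_with_llm(feature_names, task_description):
        """Hypothetical: prompt an LLM to order features from most to least
        interesting for the task, then parse its ranked list."""
        raise NotImplementedError

    def eureka(X_df, y, task_description, k=2):
        ranked = rank_features_with_llm(list(X_df.columns), task_description)
        chosen = ranked[:k]              # keep only the "interesting" features
        clf = DecisionTreeClassifier(max_depth=3).fit(X_df[chosen], y)
        return chosen, clf               # a small tree stays interpretable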

---

Title: Extracting and Following Paths for Robust Relational Reasoning with Large Language Models

Abstract: Large language models (LLMs) possess vast semantic knowledge but often struggle with complex reasoning tasks, particularly in relational reasoning problems such as kinship or spatial reasoning. In this paper, we present Path-of-Thoughts (PoT), a novel framework designed to tackle relational reasoning by decomposing the task into three key stages: graph extraction, path identification, and reasoning. Unlike previous approaches, PoT efficiently extracts a task-agnostic graph that identifies crucial entities, relations, and attributes within the problem context. Subsequently, PoT identifies relevant reasoning chains within the graph corresponding to the posed question, facilitating inference of potential answers. Experimental evaluations on four benchmark datasets that demand long reasoning chains demonstrate that PoT surpasses state-of-the-art baselines by a significant margin (up to 21.3%) without necessitating fine-tuning or extensive LLM calls. Furthermore, as opposed to prior neuro-symbolic methods, PoT exhibits improved resilience against LLM errors by leveraging the compositional nature of graphs.

URL: https://openreview.net/forum?id=EbELaNKmZK
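
A toy illustration of the three stages on a kinship example, assuming networkx; stage 1 (graph extraction) is an LLM call in the paper and is stubbed out here with fixed facts.

    import networkx as nx

    def extract_graph(context):
        """Stub for stage 1: the paper uses an LLM to extract a task-agnostic
        graph of entities, relations, and attributes from the context."""
        g = nx.DiGraph()
        g.add_edge("Alice", "Bob", rel="mother_of")
        g.add_edge("Bob", "Carol", rel="father_of")
        return g

    def identify_path(g, src, dst):
        """Stage 2: find the reasoning chain linking the queried entities."""
        nodes = nx.shortest_path(g, src, dst)
        return [g[u][v]["rel"] for u, v in zip(nodes, nodes[1:])]

    # Stage 3 composes the relations along the path, e.g.
    # ['mother_of', 'father_of'] -> 'grandmother_of'.
    print(identify_path(extract_graph(""), "Alice", "Carol"))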

---

Title: Training Dynamics of Learning 3D-Rotational Equivariance

Abstract: While data augmentation is widely used to train symmetry-agnostic models, it remains unclear how quickly and effectively they learn to respect symmetries. We investigate this by deriving a principled measure of equivariance error that, for convex losses, calculates the percentage of total loss attributable to imperfections in learned symmetry. We focus our empirical investigation on 3D-rotation equivariance in high-dimensional molecular tasks (flow matching, force field prediction, denoising voxels) and find that models rapidly become nearly equivariant within 1k-10k training steps, a result robust to model and dataset size. This happens because learning 3D-rotational equivariance is an easier learning task, with a smoother and better-conditioned loss landscape, than the main prediction task. We then theoretically characterize the learning dynamics of nearly equivariant models as ``stochastic equivariant learning dynamics'', via analyses that also hold beyond 3D rotations. For 3D rotations, the loss penalty for non-equivariant models is small throughout training, so they may achieve lower test loss than equivariant models per GPU-hour unless the equivariant ``efficiency gap'' is narrowed.

URL: https://openreview.net/forum?id=DLOIAW18W3
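
The paper derives its own loss-based equivariance-error measure; as a generic stand-in, one natural probe (sketched below, assuming SciPy) compares f(Rx) with R f(x) over random rotations for a vector-valued model such as a force-field predictor.

    import numpy as np
    from scipy.spatial.transform import Rotation

    def equivariance_error(f, x, n_samples=32):
        """x: (n, 3) points; f: (n, 3) -> (n, 3), e.g. predicted forces."""
        errs = []
        for _ in range(n_samples):
            R = Rotation.random().as_matrix()
            lhs = f(x @ R.T)        # rotate the input, then predict
            rhs = f(x) @ R.T        # predict, then rotate the output
            errs.append(np.mean((lhs - rhs) ** 2))
        return float(np.mean(errs)) # zero when f is equivariant on x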

---

Title: Node Embeddings via Neighbor Embeddings

Abstract: Node embeddings are a paradigm in non-parametric graph representation learning, where graph nodes are embedded into a given vector space to enable downstream processing. State-of-the-art node-embedding algorithms, such as DeepWalk and node2vec, are based on random-walk notions of node similarity and on contrastive learning. In this work, we introduce the graph neighbor-embedding (graph NE) framework that directly pulls together embedding vectors of adjacent nodes without relying on any random walks. We show that graph NE strongly outperforms state-of-the-art node-embedding algorithms in terms of local structure preservation. Furthermore, we apply graph NE to the 2D node-embedding problem, obtaining graph layouts that also outperform existing graph-layout algorithms.

URL: https://openreview.net/forum?id=8APIU9cauZ
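
A loose NumPy sketch of the core idea, attraction along graph edges plus neighbor-embedding-style repulsion against random negatives; the paper's actual graph NE objective and optimizer may differ.

    import numpy as np

    def graph_ne(edges, n_nodes, dim=2, epochs=200, lr=0.05, n_neg=5, seed=0):
        rng = np.random.default_rng(seed)
        Y = rng.normal(scale=0.1, size=(n_nodes, dim))
        for _ in range(epochs):
            for i, j in edges:
                Y[i] -= lr * (Y[i] - Y[j])          # pull adjacent nodes together
                Y[j] -= lr * (Y[j] - Y[i])
                for k in rng.integers(0, n_nodes, n_neg):
                    d = Y[i] - Y[k]                 # push random nodes apart,
                    Y[i] += lr * d / (1.0 + d @ d)  # with a heavy-tailed kernel
        return Y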

---

Title: Symbolic Quantile Regression for the Interpretable Prediction of Conditional Quantiles

Abstract: Symbolic Regression (SR) is a well-established framework for generating interpretable or white-box predictive models. Although SR has been successfully applied to create interpretable estimates of the average of the outcome, it is currently not well understood how it can be used to estimate the relationship between variables at other points in the distribution of the target variable. Such estimates of, e.g., the median or an extreme value provide a fuller picture of how predictive variables affect the outcome and are necessary in high-stakes, safety-critical application domains. This study introduces Symbolic Quantile Regression (SQR), an approach to predicting conditional quantiles with SR. In an extensive evaluation, we find that SQR outperforms transparent models and performs comparably to a strong black-box baseline without compromising transparency. We also show how SQR can be used to explain differences in the target distribution by comparing models that predict extreme and central outcomes in an airline fuel usage case study. We conclude that SQR is suitable for predicting conditional quantiles and understanding interesting feature influences at varying quantiles.

URL: https://openreview.net/forum?id=x9OYbyPJOG
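
The standard ingredient for estimating a conditional quantile, presumably what SQR optimizes inside the SR loop, is the pinball (quantile) loss; the short check below shows that it is minimized at the tau-quantile, which is why optimizing it yields quantile predictions.

    import numpy as np

    def pinball(y, y_pred, tau):
        r = y - y_pred
        return np.mean(np.maximum(tau * r, (tau - 1) * r))

    # The constant minimizing the pinball loss is the empirical tau-quantile.
    y = np.random.default_rng(0).exponential(size=100_000)
    grid = np.linspace(0, 5, 501)
    best = grid[np.argmin([pinball(y, c, 0.9) for c in grid])]
    print(best, np.quantile(y, 0.9))  # the two values nearly coincide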

---
