Featured Certification, J2C Certification: Retrospective Feature Estimation for Continual Learning
Nghia D. Nguyen, Hieu Trung Nguyen, Ang Li, Hoang Pham, Viet Anh Nguyen, Khoa D Doan
https://openreview.net/forum?id=9NnhVME4Q6
---
Accepted papers
===============
Title: Adapting Language Models to Produce Good Class Probabilities for Classification Tasks
Authors: Lautaro Estienne, Matias Vera, Elizabeth Fons, Elena Kochkina, Pablo Piantanida, Luciana Ferrer
Abstract: Large generative language models (GLMs) provide a versatile tool for solving a wide variety of natural language processing tasks. GLM responses, though, are provided in the form of text, without an indication of the model's confidence in the answer. This limits the usability of these models in high-risk applications, where decisions made based on an incorrect answer can have severe consequences. In this work, we focus on the problem of generating class posterior distributions for text classification tasks like sentiment, news category and intent classification. These posteriors can be used for decision making and as interpretable scores for the user. We show that the naive approach for computing posteriors based on the token posteriors produced by the GLM results in extremely poor posteriors. We then explore different adaptation approaches for improving the quality of posteriors, focusing on low-resource scenarios where a small amount of data is available for adaptation. We show that parameter-efficient supervised fine-tuning (SFT), while providing large gains in terms of decision quality, produces suboptimal posteriors due to overfitting. To address this problem, we propose an approach that combines SFT and post-hoc calibration (PHC) using a three-stage training strategy, improving the quality of both posteriors and categorical decisions.
URL: https://openreview.net/forum?id=VVneIp69GR
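As background for readers unfamiliar with post-hoc calibration, the sketch below illustrates temperature scaling, one standard PHC technique, on simulated overconfident logits whose labels are 30% noisy. This is a minimal illustration of the general idea under assumed synthetic data, not the paper's three-stage method; all names and values are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def nll(logits, labels, T):
    # negative log-likelihood of the labels under temperature-scaled probabilities
    probs = softmax(logits / T)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

def fit_temperature(logits, labels, grid=np.linspace(0.25, 5.0, 200)):
    # grid-search the single temperature that minimizes held-out NLL
    return min(grid, key=lambda T: nll(logits, labels, T))

rng = np.random.default_rng(0)
n, k = 500, 3
true = rng.integers(0, k, size=n)
logits = rng.normal(0.0, 1.0, size=(n, k))
logits[np.arange(n), true] += 6.0        # huge margins: near-certain predictions...
labels = true.copy()
flip = rng.random(n) < 0.3               # ...but a sizable fraction of labels disagree
labels[flip] = rng.integers(0, k, size=flip.sum())
T = fit_temperature(logits, labels)      # T > 1 softens the overconfident posteriors
```

Dividing logits by a fitted T > 1 leaves the argmax decision unchanged while making the reported class probabilities match the observed error rate more closely, which is exactly the gap between decision quality and posterior quality the abstract describes.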
---
Title: Scaling Gaussian Process Regression with Full Derivative Observations
Authors: Daniel Huang
Abstract: We present a scalable Gaussian Process (GP) method called DSoftKI that can fit and predict full derivative observations. It extends SoftKI, a method that approximates a kernel via softmax interpolation, to the setting with derivatives. DSoftKI enhances SoftKI's interpolation scheme by replacing its global temperature vector with local temperature vectors associated with each interpolation point. This modification allows the model to encode local directional sensitivity, enabling the construction of a scalable approximate kernel, including its first- and second-order derivatives, through interpolation. Moreover, the interpolation scheme eliminates the need for kernel derivatives, facilitating extensions such as Deep Kernel Learning (DKL). We evaluate DSoftKI on synthetic benchmarks, a toy n-body physics simulation, standard regression datasets with synthetic gradients, and high-dimensional molecular force field prediction (100-1000 dimensions). Our results demonstrate that DSoftKI is accurate and scales to larger datasets with full derivative observations than was previously possible.
URL: https://openreview.net/forum?id=fbonXp38r9
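The softmax-interpolation idea at the core of SoftKI/DSoftKI can be sketched as follows: interpolation weights are a softmax over negative squared distances to a small set of interpolation points, and the approximate kernel is W K_zz W^T. This is a schematic of the interpolation scheme only, with a hypothetical per-point temperature vector standing in for DSoftKI's local temperatures; it omits the derivative machinery that is the paper's actual contribution.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def rbf(a, b, ell=1.0):
    # squared-exponential base kernel on the interpolation points
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

def softmax_interp_kernel(x, z, temps):
    # W[i, j]: softmax over interpolation points j of -||x_i - z_j||^2 / temps[j]
    d2 = ((x[:, None, :] - z[None, :, :]) ** 2).sum(-1)
    w = softmax(-d2 / temps[None, :], axis=1)   # (n, m) interpolation weights
    k_zz = rbf(z, z)                            # (m, m) exact kernel on z
    return w @ k_zz @ w.T                       # (n, n) approximate kernel

rng = np.random.default_rng(0)
x = rng.normal(size=(50, 2))      # inputs
z = rng.normal(size=(8, 2))       # interpolation points (m << n)
temps = np.full(8, 0.5)           # one local temperature per interpolation point
k_approx = softmax_interp_kernel(x, z, temps)
```

Because the weights are smooth in x, derivatives of the approximate kernel reduce to derivatives of the softmax weights, which is why the scheme needs no derivatives of the base kernel itself.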
---
Title: Retrospective Feature Estimation for Continual Learning
Authors: Nghia D. Nguyen, Hieu Trung Nguyen, Ang Li, Hoang Pham, Viet Anh Nguyen, Khoa D Doan
Abstract: The intrinsic capability to continuously learn a changing data stream is a desideratum of deep neural networks (DNNs). However, current DNNs suffer from catastrophic forgetting, which interferes with remembering past knowledge. To mitigate this issue, existing Continual Learning (CL) approaches often retain exemplars for replay, regularize learning, or allocate dedicated capacity for new tasks. This paper investigates an unexplored direction for CL called Retrospective Feature Estimation (RFE). RFE learns to reverse feature changes by aligning the features from the current trained DNN backward to the feature space of the old task, where performing predictions is easier. This retrospective process utilizes a chain of small feature mapping networks called retrospector modules. Empirical experiments on several CL benchmarks, including CIFAR10, CIFAR100, and Tiny ImageNet, demonstrate the effectiveness and potential of this novel CL direction compared to existing representative CL methods, motivating further research into retrospective mechanisms as a principled alternative for mitigating catastrophic forgetting in CL. Code is available at: https://github.com/mail-research/retrospective-feature-estimation.
URL: https://openreview.net/forum?id=9NnhVME4Q6
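The retrospective idea, mapping the current network's features back to an old task's feature space, can be illustrated with a least-squares linear "retrospector" on synthetic features. The paper chains small learned networks per task; this toy linear version is only meant to show what "reversing feature changes" means, and all quantities below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
# features of the same inputs under the old backbone and a drifted new backbone
f_old = rng.normal(size=(200, d))
drift = np.eye(d) + 0.1 * rng.normal(size=(d, d))   # continual training shifts features
f_new = f_old @ drift + 0.01 * rng.normal(size=(200, d))

# retrospector: least-squares map from current features back to the old space
R, *_ = np.linalg.lstsq(f_new, f_old, rcond=None)
f_back = f_new @ R

err_before = np.linalg.norm(f_new - f_old) / np.linalg.norm(f_old)
err_after = np.linalg.norm(f_back - f_old) / np.linalg.norm(f_old)
```

Once features are mapped back, the old task's frozen classifier head can be applied to f_back, which is the sense in which "performing predictions is easier" in the old feature space.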
---
Title: Theoretically Understanding Data Reconstruction Leakage in Federated Learning
Authors: Binghui Zhang, Zifan Wang, Meng Pang, Yuan Hong, Binghui Wang
Abstract: Federated learning (FL) is a collaborative learning paradigm that aims to protect data privacy. Unfortunately, recent works show that FL algorithms are vulnerable to data reconstruction attacks (DRAs), a serious type of privacy leakage. However, existing works lack a theoretical foundation on the extent to which devices' data can be reconstructed, and the effectiveness of these attacks cannot be compared fairly due to their unstable performance. To address this deficiency, we propose a theoretical framework for understanding DRAs on FL. Our framework bounds the data reconstruction error, and an attack's error bound reflects its inherent effectiveness via its Lipschitz constant. We show that a smaller Lipschitz constant indicates a stronger attack. Under this framework, we theoretically compare the effectiveness of existing attacks (such as DLG and iDLG). We then empirically examine our results on multiple datasets, validating that the iDLG attack inherently outperforms the DLG attack.
URL: https://openreview.net/forum?id=1UfDXeYxwk
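The kind of leakage that DRAs such as DLG and iDLG exploit can be seen in a classic observation (general background, not this paper's framework): for a linear classification layer with bias under cross-entropy loss, the shared gradients reveal the input exactly, because each row of dL/dW is the input scaled by the matching entry of dL/db.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
d, k = 8, 3
W, b = rng.normal(size=(k, d)), rng.normal(size=k)
x, y = rng.normal(size=d), 1          # the client's private sample

# gradients of the cross-entropy loss that a client would share in FL
p = softmax(W @ x + b)
delta = p - np.eye(k)[y]              # dL/d(logits)
gW = np.outer(delta, x)               # dL/dW = delta * x^T
gb = delta                            # dL/db = delta

# exact reconstruction: row i of dL/dW divided by (dL/db)_i recovers x
i = np.argmax(np.abs(gb))
x_rec = gW[i] / gb[i]
```

Deeper models no longer admit this closed form, which is why DLG-style attacks instead optimize dummy inputs to match the shared gradients; the paper's Lipschitz-based bounds quantify how well such optimization can do.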
---
Title: $\texttt{C2-DPO}$: Constrained Controlled Direct Preference Optimization
Authors: Kavosh Asadi, Xingzi Xu, Julien Han, Ege Beyazit, Idan Pipano, Dominique Perrault-Joncas, Shoham Sabach, Mohammad Ghavamzadeh, Karim Bouyarmane
Abstract: Direct preference optimization (\texttt{DPO}) has emerged as a promising approach for solving the alignment problem in AI. In this paper, we make two counter-intuitive observations about \texttt{DPO}. First, we show that the \texttt{DPO} loss could be derived by starting from an alternative optimization problem that only defines the KL guardrail on in-sample responses, unlike the original RLHF problem where guardrails are defined on the entire distribution. Second, we prove a surprising property of this alternative optimization problem, where both the preferred and rejected responses tend to decrease in probability under its optimal policy, a phenomenon typically displayed by \texttt{DPO} in practice. To control this behavior, we propose a set of constraints designed to limit the displacement of probability mass between the preferred and rejected responses in the reference and target policies. The resulting algorithm, which we call Constrained Controlled DPO (\texttt{C2-DPO}), has a meaningful RLHF interpretation. By hedging against the displacement, \texttt{C2-DPO} provides practical improvements over vanilla \texttt{DPO} when aligning several language models using standard preference datasets.
URL: https://openreview.net/forum?id=7h5Ho9t5NL
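The displacement phenomenon the abstract describes can be seen directly from the standard DPO loss, which depends only on the difference of log-ratios between the chosen and rejected responses. The sketch below (vanilla DPO, not the paper's constrained C2-DPO variant; all numbers are hypothetical nats) shows the loss decreasing even though the probabilities of both responses drop, as long as the rejected response drops faster.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    # vanilla DPO: -log sigmoid(beta * (chosen log-ratio - rejected log-ratio))
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -np.log(sigmoid(beta * margin))

# start: policy equals the reference, zero margin
loss_start = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# both responses lose probability mass (chosen by 1 nat, rejected by 3 nats),
# yet the margin grows, so the loss still goes down
loss_drop = dpo_loss(-11.0, -13.0, -10.0, -10.0)
```

Because only the margin matters, nothing in the vanilla loss stops probability mass from leaving both responses; the constraints in \texttt{C2-DPO} are designed to limit exactly this displacement.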
---
Title: Explaining with trees: interpreting CNNs using hierarchies
Authors: Caroline Mazini Rodrigues, Nicolas Boutry, Laurent Najman
Abstract: Challenges remain in providing interpretable explanations for neural network decision-making in explainable AI (xAI). Existing methods like Integrated Gradients produce noisy maps, and LIME, while intuitive, may deviate from the model’s internal logic. We introduce a framework that uses hierarchical segmentation techniques for faithful and interpretable explanations of Convolutional Neural Networks (CNNs). Our method constructs model-based hierarchical segmentations that maintain fidelity to the model’s decision-making process and allow both human-centric and model-centric segmentation. This approach can be combined with various xAI methods and provides multiscale explanations that help identify biases and improve understanding of neural network predictive behavior. Experiments show that our framework, xAiTrees, delivers highly interpretable and faithful model explanations, not only surpassing traditional xAI methods but also shedding new light on a novel approach to enhancing xAI interpretability.
URL: https://openreview.net/forum?id=zjyWZh5IiI
---
New submissions
===============
Title: Evaluating the Reversal Curse in Model Editing
Abstract: Large language models (LLMs) are prone to hallucinating unintended text due to false or outdated knowledge. Since retraining LLMs is resource intensive, there has been growing interest in model editing. Despite the emergence of benchmarks and approaches, these unidirectional editing and evaluation schemes have failed to explore the reversal curse. In this paper, we study bidirectional language model editing, aiming to provide a rigorous evaluation to assess whether edited LLMs can recall the edited knowledge bidirectionally. A metric of reversibility is introduced, and a benchmark dubbed Bidirectional Assessment for Knowledge Editing (BAKE) is constructed to evaluate whether post-edited models can recall the edited knowledge in the reverse direction of editing. Experimental results show that while most editing methods can accurately recall edited facts along the direction of modification, they exhibit substantial systematic deficiencies when evaluated in the reverse direction. Our findings also reveal that in-context learning (ICL) can mitigate the reversal curse to a certain extent.
URL: https://openreview.net/forum?id=jAHwodCUxP
---
Title: Echo-GAT: Debiasing Graph Attention with Echo Nodes and Degree Diversity for Heterophilic Graphs
Abstract: Attention mechanisms have become a de facto standard for enhancing the expressivity of deep learning models, achieving remarkable success on graph data. Recent studies have shown that attention-based graph neural networks (GNNs) often perform poorly on heterophilic graphs and have attributed this degradation primarily to low levels of homophily. In contrast to this prevailing explanation, we find that on heterophilic graphs, under standard graph attention mechanisms, node-level homophily shows only a weak correlation with prediction accuracy, and nodes with lower homophily ratios can even achieve higher accuracy on average. These observations suggest that homophily alone is insufficient to explain the failure of graph attention. In this work, we show that standard graph attention networks exhibit a systematic performance imbalance across nodes with different levels of degree diversity, favoring structurally inhomogeneous nodes (i.e., those whose degrees diverge significantly from their neighbors'). To address this bias, we propose a graph attention optimization framework that integrates augmented feature attention and degree diversity-aware attention scores to mitigate node-level structural bias. Experiments show that the proposed method consistently outperforms strong GAT variants and state-of-the-art heterophily-oriented GNNs. Moreover, it maintains stable performance gains across nodes with varying heterophily levels, demonstrating its effectiveness on diverse graph structures.
URL: https://openreview.net/forum?id=mj1UMx1sAL
---