Daily TMLR digest for Jun 19, 2024

TMLR

Jun 19, 2024, 12:00:06 AM
to tmlr-anno...@googlegroups.com

Accepted papers
===============


Title: Directional Convergence Near Small Initializations and Saddles in Two-Homogeneous Neural Networks

Authors: Akshay Kumar, Jarvis Haupt

Abstract: This paper examines gradient flow dynamics of two-homogeneous neural networks for small initializations, where all weights are initialized near the origin. For both square and logistic losses, it is shown that for sufficiently small initializations, the gradient flow dynamics spend sufficient time in the neighborhood of the origin to allow the weights of the neural network to approximately converge in direction to the Karush-Kuhn-Tucker (KKT) points of a neural correlation function that quantifies the correlation between the output of the neural network and corresponding labels in the training data set. For square loss, it has been observed that neural networks undergo saddle-to-saddle dynamics when initialized close to the origin. Motivated by this, this paper also shows a similar directional convergence among weights of small magnitude in the neighborhood of certain saddle points.
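
For context, a sketch of the quantities involved; the precise normalization and constraint set below are assumptions based on the abstract, not the paper's exact statement. A network $\mathcal{H}(\mathbf{x};\mathbf{w})$ is two-homogeneous if $\mathcal{H}(\mathbf{x};c\mathbf{w}) = c^{2}\,\mathcal{H}(\mathbf{x};\mathbf{w})$ for all $c>0$ (e.g., a two-layer ReLU network without bias terms). For training data $\{(\mathbf{x}_i, y_i)\}_{i=1}^{n}$, a neural correlation function of the kind described can be written as

\[
\mathcal{N}(\mathbf{w}) \;=\; \sum_{i=1}^{n} y_i\,\mathcal{H}(\mathbf{x}_i;\mathbf{w}),
\]

and the directional-convergence statement then says that, for an initialization $\mathbf{w}(0)=\epsilon\,\mathbf{w}_0$ with $\epsilon$ small, the normalized weights $\mathbf{w}(t)/\lVert\mathbf{w}(t)\rVert$ approximately align with a KKT point of maximizing $\mathcal{N}(\mathbf{w})$ subject to $\lVert\mathbf{w}\rVert \le 1$ while the trajectory remains near the origin.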

URL: https://openreview.net/forum?id=hfrPag75Y0

---

Title: Towards Minimal Targeted Updates of Language Models with Targeted Negative Training

Authors: Lily H Zhang, Rajesh Ranganath, Arya Tafvizi

Abstract: Generative models of language exhibit impressive capabilities but still place non-negligible probability mass over undesirable outputs. In this work, we address the task of updating a model to avoid unwanted outputs while minimally changing model behavior otherwise, a challenge we refer to as a minimal targeted update. We first formalize the notion of a minimal targeted update and propose a method to achieve such updates using negative examples from a model's generations. Our proposed Targeted Negative Training (TNT) results in updates that keep the new distribution close to the original, unlike existing losses for negative signal which push down probability but do not control what the updated distribution will be. In experiments, we demonstrate that TNT yields a better trade-off between reducing unwanted behavior and maintaining model generation behavior than baselines, paving the way towards a modeling paradigm based on iterative training updates that constrain models from generating undesirable outputs while preserving their impressive capabilities.
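
As a rough illustration of the kind of objective described (a sketch only, not the paper's exact TNT loss; tensor names and shapes are assumed), the snippet below keeps the updated model close to the original distribution everywhere and, at positions flagged as unwanted, targets the original distribution with the offending token's mass removed and renormalized:

import torch
import torch.nn.functional as F

def targeted_negative_loss(new_logits, old_logits, tokens, neg_mask):
    # new_logits, old_logits: (batch, seq, vocab); tokens: (batch, seq) generated ids;
    # neg_mask: (batch, seq) bool marking tokens the model should no longer produce.
    log_p_new = F.log_softmax(new_logits, dim=-1)
    p_old = F.softmax(old_logits, dim=-1).detach()

    # Target distribution: the original model's distribution, except that at
    # flagged positions the unwanted token's probability is zeroed out and the
    # remaining mass is renormalized.
    drop = F.one_hot(tokens, num_classes=p_old.size(-1)).bool() & neg_mask.unsqueeze(-1)
    target = p_old.masked_fill(drop, 0.0)
    target = target / target.sum(dim=-1, keepdim=True).clamp(min=1e-12)

    # Cross-entropy against this fixed target keeps the update anchored to the
    # original model instead of merely pushing probability down.
    return -(target * log_p_new).sum(dim=-1).mean()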

URL: https://openreview.net/forum?id=lrZ2yiqOS2

---

Title: Towards Understanding Variants of Invariant Risk Minimization through the Lens of Calibration

Authors: Kotaro Yoshida, Hiroki Naganuma

Abstract: Machine learning models traditionally assume that training and test data are independently and identically distributed. However, in real-world applications, the test distribution often differs from the training distribution. This problem, known as out-of-distribution (OOD) generalization, challenges conventional models. Invariant Risk Minimization (IRM) emerges as a solution that aims to identify invariant features across different environments to enhance OOD robustness. However, IRM's complexity, particularly its bi-level optimization, has led to the development of various approximate methods. Our study investigates these approximate IRM techniques, using the consistency and variance of calibration across environments as metrics to measure the invariance aimed for by IRM. Calibration, which measures the reliability of model predictions, serves as an indicator of whether models effectively capture environment-invariant features by showing how uniformly over-confident the model remains across varied environments. Through a comparative analysis of datasets with distributional shifts, we observe that Information Bottleneck-based IRM achieves consistent calibration across different environments. This observation suggests that information compression techniques, such as IB, are potentially effective in achieving model invariance. Furthermore, our empirical evidence indicates that models exhibiting consistent calibration across environments are also well-calibrated. This demonstrates that invariance and cross-environment calibration are empirically equivalent. Additionally, we underscore the necessity for a systematic approach to evaluating OOD generalization. This approach should move beyond traditional metrics, such as accuracy and F1 scores, which fail to account for the model’s degree of over-confidence, and instead focus on the nuanced interplay between accuracy, calibration, and model invariance.
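
The calibration-based evaluation can be made concrete with a standard expected calibration error (ECE) computed per environment; reading the spread of ECE across environments as the consistency metric is an assumption about how the metric might be operationalized, not the paper's exact definition:

import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    # confidences: (N,) top-class confidences; correct: (N,) 0/1 correctness indicators.
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # bin weight times |accuracy - average confidence| within the bin
            ece += in_bin.mean() * abs(correct[in_bin].mean() - confidences[in_bin].mean())
    return ece

def calibration_consistency(per_env_preds):
    # per_env_preds: dict mapping environment name -> (confidences, correct).
    # A low standard deviation across environments is read as consistent
    # (environment-invariant) calibration.
    eces = [expected_calibration_error(c, y) for c, y in per_env_preds.values()]
    return float(np.mean(eces)), float(np.std(eces))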

URL: https://openreview.net/forum?id=9YqacugDER

---


New submissions
===============


Title: An Attentive Approach for Building Partial Reasoning Agents from Pixels

Abstract: We study the problem of building reasoning agents that are able to generalize in an effective manner. Towards this goal, we propose an end-to-end approach for building model-based reinforcement learning agents that dynamically focus their reasoning on the relevant aspects of the environment: after automatically identifying the distinct aspects of the environment, these agents dynamically select the relevant ones and pass them to their simulator to perform partial reasoning. Unlike existing approaches, our approach works with pixel-based inputs and allows for interpreting the focal points of the agent. Our quantitative analyses show that the proposed approach allows for effective generalization in high-dimensional domains with raw observational inputs. We also perform ablation analyses to validate our design choices. Finally, we demonstrate through qualitative analyses that our approach actually allows for building agents that focus their reasoning on the relevant aspects of the environment.
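
Purely as a schematic of the pipeline described (identify distinct aspects, select the relevant ones, simulate only those), the sketch below uses slot features, a learned relevance score with hard top-k selection, and a recurrent simulator over the selected slots; every module choice, shape, and the hard selection rule are illustrative assumptions, not the submission's architecture:

import torch
import torch.nn as nn

class PartialReasoner(nn.Module):
    # Schematic only: pixels -> slot features -> relevance scores -> top-k
    # focal slots -> recurrent simulator over the selected (partial) state.
    def __init__(self, num_slots=8, slot_dim=64, k=3, action_dim=4):
        super().__init__()
        self.num_slots, self.slot_dim, self.k = num_slots, slot_dim, k
        self.encoder = nn.Sequential(                        # pixels -> per-slot features
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, num_slots * slot_dim, 4, stride=2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.scorer = nn.Linear(slot_dim, 1)                 # relevance score per slot
        self.simulator = nn.GRUCell(k * slot_dim + action_dim, k * slot_dim)

    def forward(self, obs, action):
        # obs: (B, 3, H, W) images; action: (B, action_dim) float vector.
        slots = self.encoder(obs).view(-1, self.num_slots, self.slot_dim)
        scores = self.scorer(slots).squeeze(-1)              # (B, num_slots)
        top = scores.topk(self.k, dim=-1).indices            # hard selection, for clarity
        focal = torch.gather(slots, 1, top.unsqueeze(-1).expand(-1, -1, self.slot_dim))
        state = focal.flatten(1)                             # partial state of k focal slots
        next_state = self.simulator(torch.cat([state, action], dim=-1), state)
        return next_state, top                               # predicted partial state + focus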

URL: https://openreview.net/forum?id=S3FUKFMRw8

---

Title: FMU: Fair Machine Unlearning via Distribution Correction

Abstract: Machine unlearning, a technique used to remove the influence of specific data points from a trained model, is often applied in high-stakes scenarios. While most current machine unlearning methods aim to maintain the performance of the model after removing requested data traces, they may inadvertently introduce biases during the unlearning process. This raises the question: Does machine unlearning actually introduce bias? To address this question, we evaluate the fairness of model predictions before and after applying existing machine unlearning approaches. Interestingly, our findings reveal that the model after unlearning can exhibit greater bias. To mitigate the bias induced by unlearning, we develop a novel framework, Fair Machine Unlearning (FMU), which ensures group fairness during the unlearning process. Specifically, for privacy preservation, FMU first withdraws the model updates of the batches containing the unlearning requests. For debiasing, it then deletes the model updates of sampled batches that have reversed sensitive attributes associated with the unlearning requests. To validate the effectiveness of FMU, we compare it with standard machine unlearning baselines and one existing fair machine unlearning approach. FMU demonstrates superior fairness in predictions while maintaining privacy and prediction accuracy comparable to retraining the model. Furthermore, we illustrate the advantages of FMU in scenarios involving diverse unlearning requests, encompassing various data distributions of the original dataset. Our framework is orthogonal to specific machine unlearning approaches and debiasing techniques, making it flexible for various applications. This work represents a pioneering effort, serving as a foundation for more advanced techniques in fair machine unlearning.
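
The two withdrawal steps can be sketched as follows; the per-batch bookkeeping, the set-based membership test, and the reversed-attribute sampling rule are illustrative assumptions rather than FMU's actual implementation:

import random
import torch

def fair_unlearn(model, batch_updates, batch_index, unlearn_ids, request_attr, seed=0):
    # batch_updates: per-batch dicts {param_name: update tensor recorded during training}
    # batch_index:   per-batch tuples (set_of_sample_ids, sensitive_attribute_value)
    # unlearn_ids:   set of sample ids requested for removal
    # request_attr:  sensitive-attribute value associated with the requests

    # Step 1 (privacy): withdraw updates of batches containing requested points.
    forget = [b for b, (ids, _) in enumerate(batch_index) if ids & unlearn_ids]

    # Step 2 (debiasing): also withdraw updates of an equal number of sampled
    # batches whose sensitive attribute is the reverse of the requests'.
    pool = [b for b, (_, attr) in enumerate(batch_index)
            if attr != request_attr and b not in forget]
    counter = random.Random(seed).sample(pool, k=min(len(forget), len(pool)))

    with torch.no_grad():
        for b in forget + counter:
            for name, param in model.named_parameters():
                param -= batch_updates[b][name]
    return model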

URL: https://openreview.net/forum?id=E9BELz8Mod

---
