Daily TMLR digest for Nov 23, 2022


TMLR

Nov 22, 2022, 7:00:08 PM
to tmlr-anno...@googlegroups.com

Accepted papers
===============


Title: Stochastic Douglas-Rachford Splitting for Regularized Empirical Risk Minimization: Convergence, Mini-batch, and Implementation

Authors: Aysegul Bumin, Kejun Huang

Abstract: In this paper, we study the stochastic Douglas-Rachford splitting (SDRS) for general empirical risk minimization (ERM) problems with regularization. Our first contribution is to prove its convergence for both convex and strongly convex problems; the convergence rates are $O(1/\sqrt{t})$ and $O(1/t)$, respectively. Since SDRS reduces to the stochastic proximal point algorithm (SPPA) when there is no regularization, it is pleasing to see that the result matches that of SPPA under the same mild conditions. We also propose a mini-batch version of SDRS that handles multiple samples simultaneously while maintaining the same efficiency as the single-sample case, which is not a straightforward extension in the context of stochastic proximal algorithms. We show that mini-batch SDRS again enjoys the same convergence rate. Furthermore, we demonstrate that, for some of the canonical regularized ERM problems, each iteration of SDRS can be efficiently computed either in closed form or nearly in closed form via bisection; the resulting complexity is identical to, for example, that of the stochastic (sub)gradient method. Experiments on real data demonstrate its effectiveness in terms of convergence compared to SGD and its variants.

URL: https://openreview.net/forum?id=uvDD9rN6Zz
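
As a rough illustration of the splitting scheme described above, here is a minimal sketch of a stochastic Douglas-Rachford update for l1-regularized least squares, where both proximal steps have closed forms. The problem instance, step size gamma, and prox ordering are illustrative assumptions, not details taken from the paper.

import numpy as np

def prox_sq_loss(v, a, b, gamma):
    # prox of gamma * 0.5*(a@x - b)**2 at v; closed form via Sherman-Morrison
    return v - gamma * a * (a @ v - b) / (1.0 + gamma * (a @ a))

def prox_l1(v, t):
    # prox of t*||x||_1 at v: soft-thresholding
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sdrs(A, b, lam=0.1, gamma=1.0, iters=2000, seed=0):
    # stochastic Douglas-Rachford: one sampled-loss prox + one regularizer prox per step
    rng = np.random.default_rng(seed)
    n, d = A.shape
    z = np.zeros(d)
    for _ in range(iters):
        i = rng.integers(n)                          # draw one data point
        x = prox_sq_loss(z, A[i], b[i], gamma)       # prox on the sampled loss
        z = z + prox_l1(2 * x - z, gamma * lam) - x  # reflected prox on the l1 term
    return x

Each iteration touches a single sample and costs O(d), which matches the per-iteration cost of a stochastic (sub)gradient step.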

---


New submissions
===============


Title: Expected Pinball Loss For Inverse CDF Estimation

Abstract: We analyze and improve a recent strategy to train a quantile regression model by minimizing an expected pinball loss over all quantiles. We give an asymptotic convergence rate showing that minimizing the expected pinball loss can be more efficient at estimating single quantiles than training with the standard pinball loss for that quantile, an insight that generalizes the known deficiencies of the sample quantile in the unconditioned setting. Then, to guarantee a legitimate inverse CDF, we propose using flexible deep lattice networks with a monotonicity constraint on the quantile input to guarantee non-crossing quantiles, and show that lattice models can be regularized to the same location-scale family. Our analysis and experiments on simulated and real datasets show that the proposed method produces state-of-the-art legitimate inverse CDF estimates that are likely to be as good as or better for specific target quantiles.

URL: https://openreview.net/forum?id=mbyjtOIizu
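
For readers unfamiliar with the loss being averaged, here is a small sketch of the pinball loss and an expected-pinball objective that samples quantile levels uniformly; the placeholder model q_fn is a crude stand-in, not the paper's deep lattice network.

import numpy as np

def pinball_loss(y, q, tau):
    # standard pinball (quantile) loss for a quantile level tau in (0, 1)
    diff = y - q
    return np.maximum(tau * diff, (tau - 1.0) * diff)

def expected_pinball_loss(y, q_fn, x, rng, n_taus=16):
    # average the pinball loss over uniformly sampled quantile levels, so a
    # single model q_fn(x, tau) is trained for all quantiles at once
    taus = rng.uniform(0.0, 1.0, size=n_taus)
    return np.mean([pinball_loss(y, q_fn(x, t), t).mean() for t in taus])

# toy usage with a stand-in model that shifts by an empirical residual quantile
rng = np.random.default_rng(0)
x = rng.normal(size=100)
y = x + rng.normal(size=100)
q_fn = lambda x, tau: x + np.quantile(y - x, tau)
print(expected_pinball_loss(y, q_fn, x, rng))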

---

Title: Learning Object-Centric Neural Scattering Functions for Free-viewpoint Relighting and Scene Composition

Abstract: Photorealistic object appearance modeling from 2D images is a long-standing topic in vision and graphics. While neural implicit methods (such as Neural Radiance Fields) have shown high-fidelity view-synthesis results, they cannot relight the captured objects. More recent neural inverse rendering approaches have enabled object relighting, but they represent surface properties with simple BRDFs and therefore cannot handle translucent objects. We propose Object-Centric Neural Scattering Functions (OSFs) for learning to reconstruct object appearance from images alone. OSFs not only support free-viewpoint object relighting but can also model both opaque and translucent objects. While accurately modeling subsurface light transport for translucent objects can be highly complex and even intractable for neural methods, OSFs learn to approximate the radiance transfer from a distant light to an outgoing direction at any spatial location. This approximation avoids explicitly modeling complex subsurface scattering, making learning a neural implicit model tractable. Experiments on real and synthetic data show that OSFs accurately reconstruct appearances for both opaque and translucent objects, allowing faithful free-viewpoint relighting as well as scene composition. Our supplementary material includes an overview video.

URL: https://openreview.net/forum?id=NrfSRtTpN5
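
Purely as an illustrative stand-in for the quantity OSFs learn, the sketch below uses a small MLP that maps a 3D location plus incoming and outgoing directions to an RGB radiance-transfer value, and relights a point by summing transfer times light color over sampled distant lights. The architecture, input encodings, and rendering integration in the paper differ; everything here is an assumption.

import torch
import torch.nn as nn

class ScatteringFieldSketch(nn.Module):
    # toy stand-in: (location, light direction, view direction) -> RGB transfer
    def __init__(self, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(9, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Softplus(),  # keep the transfer non-negative
        )

    def forward(self, x, w_in, w_out):
        # x, w_in, w_out: (N, 3) tensors of points and unit directions
        return self.net(torch.cat([x, w_in, w_out], dim=-1))

def relight(model, x, w_out, light_dirs, light_rgb):
    # relit color = sum over distant lights of transfer * light color
    total = 0.0
    for d, c in zip(light_dirs, light_rgb):
        total = total + model(x, d.expand_as(x), w_out) * c
    return total

model = ScatteringFieldSketch()
pts = torch.rand(8, 3)
views = nn.functional.normalize(torch.rand(8, 3), dim=-1)
print(relight(model, pts, views, [torch.tensor([0.0, 0.0, 1.0])],
              [torch.tensor([1.0, 1.0, 1.0])]).shape)  # torch.Size([8, 3])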

---

Title: A Cubic Regularization Approach for Finding Local Minimax Points in Nonconvex Minimax Optimization

Abstract: Gradient descent-ascent (GDA) is a widely used algorithm for minimax optimization. However, GDA has been shown to converge to stationary points for nonconvex minimax optimization, which are suboptimal compared with local minimax points. In this work, we develop cubic regularization (CR) type algorithms that globally converge to local minimax points in nonconvex-strongly-concave minimax optimization. We first show that local minimax points are equivalent to second-order stationary points of a certain envelope function. Then, inspired by the classic cubic regularization algorithm, we propose an algorithm named Cubic-LocalMinimax for finding local minimax points, and provide a comprehensive convergence analysis by leveraging its intrinsic potential function. Specifically, we establish the global convergence of Cubic-LocalMinimax to a local minimax point at a sublinear convergence rate and characterize its iteration complexity. We also propose a GDA-based solver for the cubic subproblem involved in Cubic-LocalMinimax up to a certain predefined accuracy, and analyze the overall gradient and Hessian-vector product computation complexities of the resulting inexact Cubic-LocalMinimax algorithm. Moreover, we propose a stochastic variant of Cubic-LocalMinimax for large-scale minimax optimization, and characterize its sample complexity under stochastic sub-sampling. Experimental results demonstrate faster convergence of our stochastic Cubic-LocalMinimax than the standard stochastic GDA algorithm.

URL: https://openreview.net/forum?id=jVMMdg31De
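
For context on the baseline being improved, here is a minimal sketch of plain gradient descent-ascent on a toy nonconvex-strongly-concave objective f(x, y) = 0.5*x**2*y - y**2; the step sizes and problem are illustrative. Cubic-LocalMinimax itself additionally solves a cubic-regularized Newton-type subproblem on an envelope function, which is not reproduced here.

def grads(x, y):
    # gradients of f(x, y) = 0.5*x**2*y - y**2
    gx = x * y
    gy = 0.5 * x**2 - 2.0 * y
    return gx, gy

def gda(x0=1.0, y0=0.5, lr_x=0.05, lr_y=0.1, iters=500):
    x, y = x0, y0
    for _ in range(iters):
        gx, gy = grads(x, y)
        x -= lr_x * gx  # descent on the min variable
        y += lr_y * gy  # ascent on the max variable
    return x, y

print(gda())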

---

Title: Cold Start Streaming Learning for Deep Networks

Abstract: The ability to dynamically adapt neural networks to newly available data without performance deterioration would revolutionize deep learning applications. Streaming learning (i.e., learning from one data example at a time) has the potential to enable such real-time adaptation, but current approaches i) freeze a majority of network parameters during streaming and ii) depend on offline base initialization procedures over large subsets of data, which damages performance and limits applicability. To mitigate these shortcomings, we propose Cold Start Streaming Learning (CSSL), a simple, end-to-end approach for streaming learning with deep networks that uses a combination of replay and data augmentation to avoid catastrophic forgetting.

Because CSSL updates all model parameters during streaming, the algorithm is capable of beginning streaming from a random initialization, making base initialization optional. Going further, the algorithm's simplicity allows theoretical convergence guarantees to be derived using analysis of the Neural Tangent Random Feature (NTRF). In experiments on the CIFAR100, ImageNet, and Core50 datasets, we find that CSSL outperforms existing streaming learning baselines. Additionally, we propose a novel multi-task streaming learning setting and show that CSSL performs favorably in this domain. Put simply, CSSL performs well and demonstrates that the complicated, multi-step training pipelines adopted by most streaming methodologies can be replaced with a simple, end-to-end learning approach without sacrificing performance.

URL: https://openreview.net/forum?id=Q2MAi8YAL5
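
As a hedged sketch of the replay-plus-augmentation recipe described above: each incoming example is batched with a few examples drawn from a bounded replay buffer, augmented, and used to update every parameter of the model. The buffer policy, augmentation, and interfaces below are placeholders, not the exact CSSL procedure.

import random
import torch
import torch.nn.functional as F

def stream_step(model, optimizer, buffer, example, max_buffer=10000,
                replay_batch=32,
                augment=lambda x: x + 0.01 * torch.randn_like(x)):
    # one streaming update: mix the new example with replayed ones and train
    x_new, y_new = example
    replay = random.sample(buffer, min(replay_batch, len(buffer))) if buffer else []
    xs = torch.stack([x_new] + [x for x, _ in replay])
    ys = torch.tensor([y_new] + [y for _, y in replay])
    loss = F.cross_entropy(model(augment(xs)), ys)  # all parameters are trainable
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # keep the buffer bounded: append while there is room, else overwrite a random slot
    if len(buffer) < max_buffer:
        buffer.append((x_new, y_new))
    else:
        buffer[random.randrange(max_buffer)] = (x_new, y_new)
    return loss.item()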

---

Title: FLUID: A Unified Evaluation Framework for Flexible Sequential Data

Abstract: Modern machine learning methods excel when training data is IID, large-scale, and well labeled. Learning in less ideal conditions remains an open challenge. The sub-fields of few-shot, continual, transfer, and representation learning have made substantial strides in learning under adverse conditions, each affording distinct advantages through its methods and insights. These methods address different challenges, such as data arriving sequentially or scarce training examples; however, the difficult conditions an ML system will face over its lifetime often cannot be anticipated prior to deployment. Therefore, general ML systems that can handle the many challenges of learning in practical settings are needed. To foster research towards the goal of general ML methods, we introduce a new unified evaluation framework, FLUID (Flexible Sequential Data). FLUID integrates the objectives of few-shot, continual, transfer, and representation learning while enabling comparison and integration of techniques across these subfields. In FLUID, a learner faces a stream of data and must make sequential predictions while choosing how to update itself, adapting quickly to novel classes, and dealing with changing data distributions, all while accounting for the total amount of compute. We conduct experiments on a broad set of methods that shed new light on the advantages and limitations of current techniques and indicate new research problems to solve. As a starting point towards more general methods, we present two new baselines which outperform other evaluated methods on FLUID.

URL: https://openreview.net/forum?id=UvJBKWaSSH
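
A rough sketch of the sequential protocol the abstract describes: the learner predicts on each incoming example before its label is revealed, may then decide whether and how to update itself, and is charged for the compute it spends. The learner interface and compute accounting below are hypothetical, not FLUID's actual API.

def run_stream(learner, stream):
    # stream yields (x, y) pairs one at a time
    correct, total, compute = 0, 0, 0.0
    for x, y in stream:
        pred = learner.predict(x)            # predict before seeing the label
        correct += int(pred == y)
        total += 1
        if learner.should_update(x, y):      # the learner chooses when to adapt
            compute += learner.update(x, y)  # assume update() reports its compute cost
    return {"accuracy": correct / total, "compute": compute}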

---
