Séminaire Parisien d'Optimisation: Monday, June 3, at IHP


Robert Gower

May 29, 2019, 4:08:25 AM
to SMILE in Paris
SEMINAIRE PARISIEN D'OPTIMISATION (SPO)

   Institut Henri Poincaré
   11, rue Pierre et Marie Curie
   Paris 5ème

   https://sites.google.com/site/spoihp/

  Monday, June 3, 2019 - Room 01

       * 3:00 pm:   Sebastian Stich (EPFL)

   Gradient Compression with Error Feedback for Distributed Optimization


       * 4:15 pm:   Dan Garber (Technion)

   Highly-efficient Algorithms for Online Learning with Semi-adversarial 
Sequences


-------------------------------------------------------------------------

Sebastian Stich

   Gradient Compression with Error Feedback for Distributed Optimization

Abstract:

The communication overhead is a key bottleneck that hinders perfect 
scalability of distributed optimization algorithms. Various recent works 
propose to use quantization or sparsification techniques to reduce the 
amount of data that needs to be communicated between workers.
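
To make the idea concrete, here is a minimal sketch (in Python/NumPy) of two 
commonly used compression operators, top-k sparsification and scaled sign 
quantization; the function names and details are illustrative and not taken 
from the talk.

    import numpy as np

    def top_k(g, k):
        # Sparsification: keep only the k largest-magnitude entries of g.
        out = np.zeros_like(g)
        idx = np.argpartition(np.abs(g), -k)[-k:]
        out[idx] = g[idx]
        return out

    def scaled_sign(g):
        # 1-bit quantization: transmit only signs, rescaled by the mean magnitude.
        return np.sign(g) * np.abs(g).mean()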

We analyze Stochastic Gradient Descent (SGD) with compression (e.g., 
sparsification or quantization) of the exchanged messages and show that 
these schemes converge at the same rate as vanilla SGD when equipped 
with an error compensation technique (i.e., keeping track of accumulated 
errors in memory). We discuss both centralized and decentralized 
implementations, the latter based on a modification of the gossip protocol.
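
Below is a minimal single-worker sketch of SGD with compression and error 
feedback, assuming a generic stochastic gradient oracle and a placeholder 
scaled-sign compressor; it only illustrates the error-memory idea described 
above, not the speaker's exact algorithm or analysis.

    import numpy as np

    def ef_sgd(x0, stochastic_gradient, compress, lr=0.1, n_steps=1000):
        # SGD where only a compressed update is applied/communicated; the
        # compression error is kept in memory and added back before the next step.
        x = x0.copy()
        e = np.zeros_like(x0)              # accumulated compression error
        for _ in range(n_steps):
            g = stochastic_gradient(x)
            p = lr * g + e                 # add back the past error
            delta = compress(p)            # what would actually be communicated
            e = p - delta                  # remember what was lost
            x = x - delta                  # apply the compressed update
        return x

    # Toy usage: noisy least squares with a scaled-sign compressor.
    rng = np.random.default_rng(0)
    A = rng.normal(size=(200, 20))
    b = rng.normal(size=200)
    grad = lambda x: A.T @ (A @ x - b) / len(b) + 0.01 * rng.normal(size=x.shape)
    compress = lambda v: np.sign(v) * np.abs(v).mean()
    x_hat = ef_sgd(np.zeros(20), grad, compress, lr=0.05, n_steps=2000)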

------
Dan Garber

   Highly-efficient Algorithms for Online Learning with Semi-adversarial 
Sequences


Abstract:

Online Learning is an appealing and successful paradigm for sequential 
prediction under uncertainty. In particular, efficient online algorithms 
for regret minimization usually require only convexity and boundedness 
assumptions on the model (i.e., the feasible set and loss functions), 
and thus can be applied, with provable performance guarantees, to a 
great variety of prediction scenarios, going far beyond the more 
traditional models that assume stochastic i.i.d. data.

On the downside, for several problems of interest, all current online 
learning algorithms fail to scale to large instances. For instance, 
all current algorithms for Online Principal Component Analysis require 
quadratic memory and at least quadratic runtime per iteration. This is 
in stark contrast to the offline or stochastic i.i.d. variants of the 
problem, for which there exist algorithms that require only linear 
memory and linear runtime per iteration. Another example is when the 
loss functions are exp-concave (e.g., linear regression). In this case, 
the only current algorithm that achieves (optimal) logarithmic regret 
requires quadratic memory and must solve a linear system of equations on 
each iteration, whereas the offline problem can be solved via 
highly efficient gradient methods which require only linear memory.

In this talk I will show that imposing some more structure on the data 
can result in much more efficient regret minimization algorithms. In 
particular, I will consider cases in which the data follows an (unknown) 
stationary i.i.d. distribution combined with adversarial perturbations. 
I will show how such models make it possible to 1) obtain a linear-memory 
and linear-runtime algorithm for Online PCA with a nearly optimal regret 
bound, and 2) obtain a logarithmic regret bound for the vanilla Online 
Gradient Descent method for several problems of interest with 
exp-concave losses such as Online LASSO and Online Portfolio Selection.
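
For reference, here is a minimal sketch of the vanilla Online Gradient 
Descent method mentioned above, with projection onto a Euclidean ball and a 
decaying step size; the step-size schedule, loss oracle, and toy data are 
placeholders for illustration and are not the specific choices or guarantees 
from the talk.

    import numpy as np

    def online_gradient_descent(d, grad_oracle, radius=1.0, n_rounds=1000):
        # Vanilla OGD: play x_t, observe the gradient of the round-t loss,
        # take a gradient step, and project back onto a Euclidean ball.
        x = np.zeros(d)
        iterates = []
        for t in range(1, n_rounds + 1):
            iterates.append(x.copy())
            g = grad_oracle(t, x)          # gradient of the t-th loss at x_t
            x = x - g / t                  # decaying step size (illustrative only)
            norm = np.linalg.norm(x)
            if norm > radius:              # projection onto the feasible ball
                x = x * (radius / norm)
        return iterates

    # Toy usage: online least-squares rounds with i.i.d. data plus noise.
    rng = np.random.default_rng(1)
    x_star = rng.normal(size=5)
    x_star /= np.linalg.norm(x_star)
    def grad_oracle(t, x):
        a = rng.normal(size=5)
        y = a @ x_star + 0.1 * rng.normal()
        return (a @ x - y) * a             # gradient of 0.5 * (a @ x - y)**2
    iterates = online_gradient_descent(5, grad_oracle, radius=1.0, n_rounds=2000)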