Gradient Compression with Error Feedback for Distributed Optimization
Abstract:
Communication overhead is a key bottleneck that hinders the perfect scalability of distributed optimization algorithms. Various recent works propose quantization or sparsification techniques to reduce the amount of data that must be communicated between workers.
We analyze Stochastic Gradient Descent (SGD) with compression (e.g., sparsification or quantization) of the exchanged messages and show that these schemes converge at the same rate as vanilla SGD when equipped with an error-compensation technique (i.e., keeping track of the accumulated compression errors in memory). We discuss both centralized and decentralized implementations, the latter based on a modification of the gossip protocol.
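A minimal single-worker sketch of the error-compensation idea, using top-k sparsification as the compressor. The compressor choice, function names, and hyperparameters here are illustrative assumptions, not the talk's actual implementation:

```python
import numpy as np

def top_k(x, k):
    """Top-k sparsification: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def ef_sgd_step(w, grad, memory, lr, k):
    """One error-feedback SGD step on a single worker (illustrative sketch).

    The worker compresses (lr * grad + memory), applies/transmits only the
    compressed message, and stores the compression error back into memory
    so the dropped mass is re-injected on later steps.
    """
    corrected = lr * grad + memory   # add back previously dropped mass
    message = top_k(corrected, k)    # compressed update actually sent
    memory = corrected - message     # error kept locally for next step
    w = w - message                  # apply the compressed update
    return w, memory
```

Without the memory term, coordinates that are repeatedly zeroed out by the compressor would simply be lost; with it, every gradient component is eventually applied, which is what restores the vanilla-SGD rate.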
------ Dan Garber
Highly-efficient Algorithms for Online Learning with Semi-adversarial Sequences
Abstract:
Online Learning is an appealing and successful paradigm for sequential prediction under uncertainty. In particular, efficient online algorithms for regret minimization usually require only convexity and boundedness assumptions on the model (i.e., the feasible set and loss functions), and thus can be applied, with provable performance guarantees, to a great variety of prediction scenarios, going far beyond the more traditional models that assume stochastic i.i.d. data.
On the downside, for several problems of interest, all current online learning algorithms fail to scale to large instances. For instance, all current algorithms for Online Principal Component Analysis require quadratic memory and at least quadratic runtime per iteration. This is in stark contrast to the offline or stochastic i.i.d. variants of the problem, for which there exist algorithms that require only linear memory and linear runtime per iteration. Another example is when the loss functions are exp-concave (e.g., linear regression). In this case, the only current algorithm achieving (optimal) logarithmic regret requires quadratic memory and solving a linear system of equations on each iteration, whereas the offline problem can be solved via highly efficient gradient methods that require only linear memory.
In this talk I will show that imposing some more structure on the data can yield much more efficient regret minimization algorithms. In particular, I will consider cases in which the data follows an (unknown) stationary i.i.d. distribution combined with adversarial perturbations. I will show how such models make it possible to 1) obtain a linear-memory and linear-runtime algorithm for Online PCA with a nearly optimal regret bound, and 2) obtain a logarithmic regret bound for the vanilla Online Gradient Descent method for several problems of interest with exp-concave losses, such as Online LASSO and Online Portfolio Selection.
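For concreteness, the vanilla Online Gradient Descent method mentioned above can be sketched as follows; the function names and the choice of a Euclidean-ball feasible set are illustrative assumptions, not details from the talk:

```python
import numpy as np

def project_ball(w, radius):
    """Euclidean projection onto the L2 ball of the given radius."""
    n = np.linalg.norm(w)
    return w if n <= radius else w * (radius / n)

def ogd(loss_grads, dim, radius, eta):
    """Vanilla projected Online Gradient Descent (illustrative sketch).

    loss_grads: iterable of callables; the t-th callable maps the current
    iterate to the gradient of the round-t loss. Uses O(dim) memory and
    O(dim) time per round. Returns the sequence of iterates played.
    """
    w = np.zeros(dim)
    played = []
    for g in loss_grads:
        played.append(w.copy())           # commit the prediction for round t
        w = project_ball(w - eta * g(w), radius)  # gradient step, then project
    return played
```

The per-round cost is a single gradient evaluation plus a projection, which is exactly the linear-memory, linear-runtime profile that makes logarithmic-regret guarantees for this method attractive compared to quadratic-memory alternatives.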