first meeting on Friday!


Leif Johnson

Feb 10, 2015, 16:53:16
To ut-f...@googlegroups.com
Hi folks -

Sorry about the late notice for this meeting. We'll meet for the first
time this semester on Friday the 13th, in GDC 3.516 at 2:30pm. This
week we'll discuss the following paper (it's relatively short):

Do Deep Nets Really Need to be Deep?
Lei Jimmy Ba, Rich Caruana
http://arxiv.org/pdf/1312.6184v5.pdf

Currently, deep neural networks are the state of the art on problems
such as speech recognition and computer vision. In this extended
abstract, we show that shallow feed-forward networks can learn the
complex functions previously learned by deep nets and achieve
accuracies previously only achievable with deep models. Moreover, in
some cases the shallow neural nets can learn these deep functions
using a total number of parameters similar to the original deep model.
We evaluate our method on the TIMIT phoneme recognition task and are
able to train shallow fully-connected nets that perform similarly to
complex, well-engineered, deep convolutional architectures. Our
success in training shallow neural nets to mimic deeper models
suggests that there probably exist better algorithms for training
shallow feed-forward nets than those currently available.
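
To make the mimic-learning idea concrete before Friday, here is a rough
numpy sketch of the core trick as I read it: instead of training the
shallow student on the original 0/1 labels, regress it onto the logits of
the trained teacher with an L2 loss. The random "teacher" net and all the
sizes below are stand-ins I made up for illustration, not the authors' code.

import numpy as np

rng = np.random.RandomState(0)
n, d, k, h = 2000, 20, 5, 256      # samples, input dim, classes, hidden units
X = rng.randn(n, d)

# Stand-in "teacher": a fixed random two-layer net whose logits we mimic.
W1t, W2t = rng.randn(d, 64), rng.randn(64, k)
teacher_logits = np.tanh(X @ W1t) @ W2t

# One-hidden-layer student trained with full-batch gradient descent
# on the L2 logit-matching loss.
W1 = 0.1 * rng.randn(d, h)
W2 = 0.1 * rng.randn(h, k)
lr = 1e-2
for epoch in range(500):
    Hid = np.tanh(X @ W1)              # student hidden activations
    pred = Hid @ W2                    # student logits
    err = pred - teacher_logits        # grad of 0.5 * ||pred - logits||^2
    gW2 = Hid.T @ err / n
    gW1 = X.T @ ((err @ W2.T) * (1 - Hid ** 2)) / n
    W1 -= lr * gW1
    W2 -= lr * gW2

print('logit-matching MSE:', np.mean((np.tanh(X @ W1) @ W2 - teacher_logits) ** 2))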

--
http://www.cs.utexas.edu/~leif

Karl Pichotta

Feb 10, 2015, 16:56:43
To Leif Johnson, ut-f...@googlegroups.com
Y'all wanna read the longer NIPS version instead?!?!?!?!


?!?!

_k


Leif Johnson

Feb 10, 2015, 16:57:56
To Karl Pichotta, ut-f...@googlegroups.com
Let's read the NIPS one, it has graphs. :)

lmj
--
http://www.cs.utexas.edu/~leif

Dinesh Jayaraman

Feb 10, 2015, 17:01:03
To Leif Johnson, Karl Pichotta, ut-f...@googlegroups.com
Geoff Hinton has apparently been working independently on something very similar to the "model compression" idea of this paper, and going by this talk, he's gone a fair deal further with it: https://www.youtube.com/watch?v=EK61htlw8hY
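
The key trick in the talk, as I understand it, is to soften the teacher's
softmax with a temperature T so the small net can also learn from the
relative probabilities the big net assigns to the wrong classes. A tiny
numpy sketch of that idea (the logits below are invented for illustration):

import numpy as np

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

teacher_logits = np.array([9.0, 4.0, 1.0])   # e.g. "dog", "cat", "car"
print(softmax(teacher_logits, T=1.0))        # nearly one-hot: little extra signal
print(softmax(teacher_logits, T=5.0))        # softened: exposes class similarities

# The student is then trained with cross-entropy against the softened targets:
def soft_target_loss(student_logits, teacher_logits, T):
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return -np.sum(p * np.log(q + 1e-12))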

Dinesh



Leif Johnson

Feb 11, 2015, 11:18:46
To dineshjay...@gmail.com, Karl Pichotta, ut-f...@googlegroups.com
Excellent video -- I'm always amazed at how Hinton manages to seem so
reasonable and be so informative in his talks.

lmj
--
http://www.cs.utexas.edu/~leif

Dinesh Jayaraman

Feb 11, 2015, 20:48:31
To Leif Johnson, Karl Pichotta, ut-flare

Dinesh Jayaraman

Feb 13, 2015, 17:54:27
To Dinesh Jayaraman, Karl Pichotta, ut-flare
Here's a reference I found from a quick search for the local minima result I mentioned: http://papers.nips.cc/paper/5486-identifying-and-attacking-the-saddle-point-problem-in-high-dimensional-non-convex-optimization.pdf

Again, I haven't actually read this paper either, but from a quick perusal, the result about local minima being close to global minima seems to have been proved for a more general class of high-dimensional optimization problems rather than specifically for deep neural nets (see Section 2). The main point of the paper, though, appears to be something Yoshua Bengio also mentioned while he was here: gradient descent on deep networks runs into saddle points (critical points where the curvature is negative along at least one direction and positive along others) far more often than genuine local minima, which is interesting in itself.
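
A quick way to see the distinction numerically: at a critical point, look at
the signs of the Hessian's eigenvalues. All positive means a local minimum;
mixed signs mean a saddle. A toy sketch (the function is mine, not from the
paper):

import numpy as np

def f(v):
    x, y = v
    return x**2 - y**2        # classic saddle at the origin

def hessian(f, v, eps=1e-4):
    # central finite differences for all second partial derivatives
    n = len(v)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i] * eps, np.eye(n)[j] * eps
            H[i, j] = (f(v + ei + ej) - f(v + ei - ej)
                       - f(v - ei + ej) + f(v - ei - ej)) / (4 * eps**2)
    return H

print(np.linalg.eigvalsh(hessian(f, np.zeros(2))))
# ~[-2, 2]: negative curvature in one direction, positive in the other -> saddle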

Dinesh

Leif Johnson

Feb 16, 2015, 10:34:52
To Dinesh Jayaraman, Karl Pichotta, ut-flare
This reminds me of a paper I saw recently that explores whether local
minima actually show up when training some types of networks (short answer:
not so much, though IMO it needs more work): http://arxiv.org/abs/1412.6544
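
The main experiment there is simple enough to sketch: evaluate the loss
along the straight line between the initial parameters theta_0 and the
trained parameters theta_f, i.e. theta(a) = (1 - a) * theta_0 + a * theta_f.
Here's a toy version with a stand-in quadratic loss in place of a real
network (all names invented for illustration):

import numpy as np

def loss(theta):
    return np.sum((theta - 1.0) ** 2)   # stand-in loss; minimum at theta = 1

theta_0 = np.random.RandomState(0).randn(10)   # "initial" parameters
theta_f = np.ones(10)                          # pretend these came from training

for a in np.linspace(0.0, 1.0, 11):
    theta = (1 - a) * theta_0 + a * theta_f
    print(f'alpha={a:.1f}  loss={loss(theta):.3f}')
# A monotone curve is the paper's "no obstacles along the line" finding.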

Maybe we should have a look at a paper like this for one of our
meetings this semester.

lmj


Dinesh Jayaraman

Feb 16, 2015, 10:59:35
To Leif Johnson, Karl Pichotta, ut-flare
I'd like to read some of this stuff in detail too.

And while we're on it, I found the actual paper I was thinking of, the one with the result that deep-net local minima lie in a narrow band: "The Loss Surface of Multilayer Networks": http://www.columbia.edu/~aec2163/NonFlash/Papers/PAPER_AMMGY.pdf


Dinesh