Deep Learning using Linear Support Vector Machines

xudong cao

Nov 27, 2013, 1:18:51 AM
to pylea...@googlegroups.com


Has anybody implemented the method in this paper? It sounds like a free lunch for deep learning.



Kyle Kastner

Nov 27, 2013, 1:22:45 AM
to pylea...@googlegroups.com
I have started an implementation as of today, but it may take a while to verify that everything is OK. Hopefully by the end of the week.



xudong cao

Nov 27, 2013, 1:37:19 AM
to pylea...@googlegroups.com
The original paper reports that better results are achieved by replacing the softmax layer with a linear SVM layer, which changes the objective function for classification. This implies that maximizing a margin is a better objective.

In short, it improves accuracy without adding any cost in training or testing.
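
For concreteness: the SVM layer in the paper amounts to training the top layer with a squared hinge (L2-SVM) loss on one-vs-rest +/-1 targets instead of cross-entropy. A minimal numpy sketch of that loss and its gradient (the function names and encoding details are mine, not from the paper's code):

    import numpy as np

    def l2_svm_loss(scores, labels):
        """Squared hinge (L2-SVM) loss on one-vs-rest +/-1 targets.

        scores : (batch, n_classes) raw outputs of the final linear layer
        labels : (batch,) integer class labels
        """
        n, k = scores.shape
        t = -np.ones((n, k))
        t[np.arange(n), labels] = 1.0  # +1 for the true class, -1 elsewhere
        margins = np.maximum(0.0, 1.0 - t * scores)
        return np.mean(np.sum(margins ** 2, axis=1))

    def l2_svm_grad(scores, labels):
        """Gradient of l2_svm_loss with respect to the scores."""
        n, k = scores.shape
        t = -np.ones((n, k))
        t[np.arange(n), labels] = 1.0
        margins = np.maximum(0.0, 1.0 - t * scores)
        return -2.0 * t * margins / n

At test time nothing changes: you still take the argmax of the scores, so inference cost is identical to softmax.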

David Warde-Farley

Nov 27, 2013, 2:24:20 AM
to pylea...@googlegroups.com
Yep, we've read the paper, thank you for the summary.

xudong cao

Dec 10, 2013, 6:18:08 AM
to pylea...@googlegroups.com
Cool, does it work as well as the paper reported?

Ian Goodfellow

Dec 10, 2013, 1:47:11 PM
to pylea...@googlegroups.com
I haven't re-implemented it myself but if Charlie Tang says it works, it works.

Kyle Kastner

Dec 10, 2013, 1:55:33 PM
to pylea...@googlegroups.com
This is a good point to say that work is still ongoing (I called the layer HingeLoss for now; is there a better name?). It seems not to crash, but I am still working on a good test/example that won't take as long as "train 10 CIFAR-10 nets"... any ideas?

xudong cao

Dec 10, 2013, 9:01:39 PM
to pylea...@googlegroups.com
How about training MNIST with a fully connected neural network? That would avoid the time-consuming convolutions.
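
For example, something along these lines would make a fast smoke test: a tiny fully connected net trained with the squared hinge loss sketched earlier (synthetic data standing in for flattened MNIST; the sizes and learning rate are placeholder choices):

    import numpy as np

    rng = np.random.RandomState(0)
    X = rng.randn(256, 784)            # stand-in for flattened MNIST digits
    y = rng.randint(0, 10, size=256)

    n_hidden, n_classes, lr = 128, 10, 0.01
    W1 = rng.randn(784, n_hidden) * 0.01
    b1 = np.zeros(n_hidden)
    W2 = rng.randn(n_hidden, n_classes) * 0.01
    b2 = np.zeros(n_classes)

    t = -np.ones((len(y), n_classes))  # one-vs-rest +/-1 targets
    t[np.arange(len(y)), y] = 1.0

    for epoch in range(20):
        h = np.maximum(0.0, X.dot(W1) + b1)        # ReLU hidden layer
        scores = h.dot(W2) + b2                    # linear "SVM" output layer
        margins = np.maximum(0.0, 1.0 - t * scores)
        loss = np.mean(np.sum(margins ** 2, axis=1))
        # Backprop through the squared hinge and the ReLU.
        dscores = -2.0 * t * margins / len(y)
        dW2 = h.T.dot(dscores); db2 = dscores.sum(axis=0)
        dh = dscores.dot(W2.T) * (h > 0)
        dW1 = X.T.dot(dh); db1 = dh.sum(axis=0)
        for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
            p -= lr * g
        print(epoch, loss)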



Esmaeil zahedi

Dec 16, 2013, 4:35:40 PM
to pylea...@googlegroups.com
Hello Kyle Kastner,
I need to implement this paper for my machine learning course project. If you have implemented it, please help me.

Thanks

Esmaeil zahedi

Dec 16, 2013, 4:38:19 PM
to pylea...@googlegroups.com
Hello,

Can someone help me, please? I need to implement this paper for my machine learning course project, but I can't manage to implement it.

Please help me.

Ian Goodfellow

Dec 16, 2013, 4:41:38 PM
to pylearn-dev
Esmaeil, I've changed your posting permissions to "moderated," meaning
the other moderators and I must approve your posts before they can be
sent out. This mailing list is for discussing the development of
pylearn2, not begging other people to do your homework for you.

Esmaeil zahedi

Dec 16, 2013, 4:49:23 PM
to pylea...@googlegroups.com
OK! Can somebody implement this paper and send me the code?

Esmaeil zahedi

Dec 16, 2013, 4:55:48 PM
to pylea...@googlegroups.com
My question is related to this topic, because I want to implement this paper in pylearn2.

I am asking whether I can help implement it.

Kyle Kastner

Dec 16, 2013, 8:46:53 PM
to pylea...@googlegroups.com
I am still verifying that the implementation is correct. It seems to work better than softmax for the n_classes=2 problems I have tried (separating MNIST 0-4 from 5-9), but that doesn't mean there isn't a bug somewhere. I also had some issues with NaN in L1_W that seem to have gone away, but I want to confirm that on the Kaggle dogs-vs-cats data. It would be ideal to recreate the paper's results, but that will take some time. I still have some doubts about my cost being correct, since there is no w^Tw term.
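
For what it's worth, as I read the paper's L2-SVM formulation, the per-class objective does include that weight penalty on the output layer, roughly

    min_w  0.5 * w'w + C * sum_i max(0, 1 - w'x_i * t_i) ** 2

so one quick check would be adding the w'w term on top of the hinge cost. A sketch, reusing the l2_svm_loss function from the earlier sketch in this thread:

    import numpy as np

    def l2_svm_objective(scores, labels, W_out, C=1.0):
        # Squared hinge term plus an explicit penalty on the
        # output-layer weights, as in the paper's primal objective.
        return 0.5 * np.sum(W_out ** 2) + C * l2_svm_loss(scores, labels)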

You can always look at the breze-no-salt link I posted earlier if you want to see another implementation - it seems fairly readable in my opinion.

Kyle Kastner

Dec 16, 2013, 8:55:00 PM
to pylea...@googlegroups.com
Also, the testing file is here:
https://gist.github.com/kastnerkyle/7994667

Good luck!

Abhishek Thakur

May 15, 2014, 10:51:52 AM
to pylea...@googlegroups.com
Can you tell me why the NaN in L1_W arises? I got the same error today while using the hinge loss as the output layer, from your code.

Kyle Kastner

May 15, 2014, 11:13:48 AM
to pylea...@googlegroups.com
I am not sure yet; I have not had much time to look at this recently. I hope to get back to it in June, but any information you find while investigating would be useful! For what it is worth, it seemed "sensitive" to me, though it could be that this cost is more sensitive to input scaling, or that there is a bug somewhere in this implementation.
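
One thing that might be worth trying (an assumption on my part, not a confirmed fix): standardize the inputs and clip the gradient updates, since large margins under an unbounded hinge cost can blow up quickly. A sketch:

    import numpy as np

    def standardize(X, eps=1e-8):
        # Zero-mean, unit-variance features keep the hinge margins in a
        # sane range; a hypothetical mitigation, not a verified fix.
        return (X - X.mean(axis=0)) / (X.std(axis=0) + eps)

    def clipped_update(param, grad, lr=0.01, clip=1.0):
        # Elementwise gradient clipping as a cheap guard against NaNs.
        np.clip(grad, -clip, clip, out=grad)
        param -= lr * grad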



Arash

Mar 14, 2016, 4:14:32 PM
to pylearn-dev, kastn...@gmail.com
Hi Kyle,
I am trying to write the code for this paper in MATLAB, and I have an issue with NaN as well. How did you fix it? Could you please explain it to me? Thanks.