L1Penalty is an inline module that, in its forward propagation, copies the input Tensor directly to
the output and computes an L1 loss of the latent state (the input), storing it in the module's loss field.
During backward propagation: gradInput = gradOutput + gradLoss.
This module can be used in autoencoder architectures to apply L1 losses to internal latent state
without having to use Identity and parallel containers to carry the internal code to an output
criterion.
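The forward/backward rules above can be sketched in numpy (a hypothetical re-implementation, not the Torch source; `l1weight` stands in for the module's L1 weight parameter):

```python
import numpy as np

def l1penalty_forward(x, l1weight=1.0):
    loss = l1weight * np.abs(x).sum()   # L1 loss of the latent state
    return x.copy(), loss               # output is the input, unchanged

def l1penalty_backward(x, grad_output, l1weight=1.0):
    grad_loss = l1weight * np.sign(x)   # d/dx of l1weight * |x|
    return grad_output + grad_loss      # gradInput = gradOutput + gradLoss

x = np.array([-2.0, 0.5, 3.0])
out, loss = l1penalty_forward(x)                   # loss = 5.5, out equals x
grad_in = l1penalty_backward(x, np.zeros_like(x))  # [-1., 1., 1.]
```

With zero incoming gradient, the penalty alone contributes sign(x) scaled by the L1 weight, which is what drives the latent code toward sparsity.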
--
You received this message because you are subscribed to a topic in the Google Groups "torch7" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/torch7/__iAj8XJQtg/unsubscribe.
To unsubscribe from this group and all its topics, send an email to torch7+unsubscribe@googlegroups.com.
To post to this group, send email to tor...@googlegroups.com.
Visit this group at https://groups.google.com/group/torch7.
For more options, visit https://groups.google.com/d/optout.
Thank you for your help, but part of my question is specifically about why my example seems to work with the penalty before the linear layer.
Sorry, I don't know the answer to your question, but... just looking at the source code, L1Penalty does not do anything special. In the forward pass it carries your input through unchanged, but it stores a self.loss which is calculated in the process. This self.loss field is not accessed by any of the parent torch modules; it is simply there, isolated.

During the backward pass, however, L1Penalty does the following to your forward output:

self.gradInput:resizeAs(input):copy(input):sign():mul(m)

It takes the sign of the input tensor and multiplies it by the value m, and that term enters the gradients used to update the weights. So if it is applied after some layer, the weights of that layer receive an extra L1 (sign) gradient, and so on.

Now, look at your example. You do not do anything to your gradients except changing the sign, because L1Penalty(1) just multiplies the sign of the input by 1, so it is not L1Penalty that changes your gradInput but Linear itself.
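The point about L1Penalty(1) can be seen by sketching the Lua chain self.gradInput:resizeAs(input):copy(input):sign():mul(m) in numpy (a hypothetical illustration; m is the constructor argument, e.g. m = 1 for L1Penalty(1)):

```python
import numpy as np

# Penalty's own gradient contribution: sign of the input, scaled by m.
def penalty_grad(latent, m):
    return np.sign(latent) * m

latent = np.array([-0.3, 0.0, 2.5])
penalty_grad(latent, 1)    # with m = 1, just the sign: [-1., 0., 1.]
penalty_grad(latent, 0.5)  # scaled by m:              [-0.5, 0., 0.5]
```

So with m = 1 the penalty only flips each gradient component to ±1 (or 0); any other shaping of the gradient in the example comes from the Linear layer.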
On Thursday, August 17, 2017 at 9:53:34 PM UTC+2, Gregory Gundersen wrote:
Thank you for your help, but part of my question is specifically about why my example seems to work with the penalty before the linear layer.