Feedback on Saliency Map & Deep Dreaming Code


Dieterich Lawson

Aug 10, 2015, 8:57:42 PM
to Caffe Users
I'm trying to generate saliency maps, as described in Simonyan et al., in Caffe, and would appreciate others' opinions on the following code.

Given an image and a class, the goal is to get the gradient of the activation of that class's neuron with respect to the input image, evaluated at the given image (what a mouthful).

My approach is motivated by the fact that, to my knowledge, it is impossible to get Caffe to compute the gradient of an arbitrary neuron's activation with respect to the input of the net; out of the box you can only backpropagate the gradient of a loss. To get around this I perform net surgery and then manually set gradients before the backward pass.

So, first I create a new deploy.prototxt that removes the softmax layer (as recommended in the paper) and renames the final fully connected layer, giving it a single output instead of the original 1000 (one per class). I also set force_backward to true. Here are the relevant parts of my original deploy.prototxt:

name: "VGG_128_ft"
input: "data"
input_dim: 50
input_dim: 3
input_dim: 224
input_dim: 224
layer {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: "Convolution"
  convolution_param {
    num_output: 96
    kernel_size: 7
    stride: 2
  }
}

... omitted ...
 
layer {
  bottom: "fc7"
  top: "fc8"
  name: "fc8"
  type: "InnerProduct"
  inner_product_param {
    num_output: 1000
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "fc8"
  top: "prob"
}

and here are the relevant parts of the new prototxt that I'm calling deploy_oneclass.prototxt that I will use to do net surgery:

name: "VGG_128_ft"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 224
input_dim: 224
force_backward: true
layer {
  bottom: "data"
  top: "conv1"
  name: "conv1"
  type: "Convolution"
  convolution_param {
    num_output: 96
    kernel_size: 7
    stride: 2
  }
}
 
... omitted ...
 
layer {
  bottom: "fc7"
  top: "fc8_oneclass"
  name: "fc8_oneclass"
  type: "InnerProduct"
  inner_product_param {
    num_output: 1
  }
}

Then in python I load both nets with the same set of parameters.

import numpy as np
import caffe

# load full net
net = caffe.Net("deploy.prototxt", "model.caffemodel", caffe.TEST)
# load oneclass net
net_oneclass = caffe.Net("deploy_oneclass.prototxt", "model.caffemodel", caffe.TEST)

and transplant the parameters for the class of interest from the full net to the new oneclass net. Assuming we are interested in class 31, the following transplants the parameters:

# params[0] holds the weights, params[1] the biases of the InnerProduct layer
net_oneclass.params['fc8_oneclass'][0].data[0] = net.params['fc8'][0].data[31]
net_oneclass.params['fc8_oneclass'][1].data[0] = net.params['fc8'][1].data[31]

Assume we have loaded an image into the variable 'im'. Then we can do a forward pass with the 'oneclass' net:

out = net_oneclass.forward_all(data=np.asarray([im])) 
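One sanity check on the transplant (not part of the method itself, just a quick comparison) would be to check that the single output of the oneclass net matches the full net's raw fc8 score for class 31 on the same image, something like:

# sanity check: compare the oneclass output to the full net's class-31 score
out_full = net.forward_all(blobs=['fc8'], data=np.asarray([im]))
print(out_full['fc8'][0][31])      # raw class-31 score from the full net
print(out['fc8_oneclass'][0][0])   # should be (nearly) identical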

We would now like to get the gradient of the activation of the final neuron in the oneclass net. However, there is no loss defined, so if we just call net_oneclass.backward() no diffs will be generated. To get around this we manually set the diff of the final layer of net_oneclass to 1 (the gradient of the activation with respect to itself is 1), and then call net_oneclass.backward() as normal.
 
net_oneclass.blobs['fc8_oneclass'].diff[0][0] = 1
diffs = net_oneclass.backward() 
 
Then, as described in the paper, we take the elementwise absolute value of the gradients and the max over the channels:

sal_map = diffs['data'][0]                   # gradient w.r.t. the input image, 3 x 224 x 224
sal_map = np.absolute(sal_map).max(axis=0)   # max of |gradient| over the color channels
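To eyeball the result, something like the following (assuming matplotlib is available) should show the salient pixels as bright regions:

import matplotlib.pyplot as plt

plt.imshow(sal_map, cmap='gray')   # sal_map is 224 x 224 after the max over channels
plt.axis('off')
plt.show()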
 
The resulting sal_map should be the saliency map. However, my results are not quite right and seem essentially random. My questions are:
  • Am I correct in doing net surgery and manually setting the gradient to 1, or is there an easier way to obtain the derivative of the activation of a specific neuron?
  • Can anyone verify that this approach produces decent output on a different net? (I'm using VGG_128 from the model zoo.)
  • Or is there an easier way to do saliency maps for a specific class?
  • This approach can also be used to generate "deep dreaming"-esque images. I'm working on plugging it into a nonconvex optimization solver like scipy's optimize, which needs a function value to maximize (the activation of the neuron) and a Jacobian (the diffs from the net); a rough sketch of what I have in mind is below. If anyone else wants to work on this and check whether it produces decent results, it would be very appreciated.
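Here is the kind of thing I mean, only a sketch under the assumptions above (same net_oneclass and class 31, no regularization even though the paper recommends it); objective_and_grad is a hypothetical helper name:

from scipy.optimize import minimize

def objective_and_grad(x):
    # Hypothetical helper: return the negated class activation and its negated
    # gradient w.r.t. the image, so that scipy's minimizer maximizes the score.
    im = x.reshape(3, 224, 224).astype(np.float32)
    out = net_oneclass.forward_all(data=np.asarray([im]))
    score = out['fc8_oneclass'][0][0]
    net_oneclass.blobs['fc8_oneclass'].diff[0][0] = 1
    grad = net_oneclass.backward()['data'][0]
    return -float(score), -grad.astype(np.float64).ravel()

# start from a zero (or dataset-mean) image and run a quasi-Newton solver
x0 = np.zeros(3 * 224 * 224)
res = minimize(objective_and_grad, x0, jac=True, method='L-BFGS-B',
               options={'maxiter': 100})
dream = res.x.reshape(3, 224, 224)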

Abhishek Das

Aug 11, 2015, 6:42:44 AM
to Caffe Users
Hi Dieterich,

There's no need to use two different nets and transplant parameters from one to the other. You can reuse the same network prototxt: just set force_backward: true, change the input batch dimension to 1, delete the final softmax layer, and, before the backward pass, set the diff of the last layer (fc8) to 1 for the desired class and 0 for the others. Assuming you're interested in class 31, this would look something like:

label = np.zeros(net.blobs['fc8'].diff.shape)   # same shape as the fc8 gradient blob
label.flat[31] = 1                              # 1 for class 31, 0 for all the others
net.blobs['fc8'].diff[...] = label
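Putting it together with the rest of your pipeline, the single-net version would look roughly like this (only a sketch, assuming 'im' is preprocessed as before and the softmax layer has been deleted so that fc8 is the last layer):

out = net.forward_all(data=np.asarray([im]))    # forward pass on the single net

net.blobs['fc8'].diff[...] = label              # class-31 gradient, set as above
diffs = net.backward()                          # backprop all the way to the input

sal_map = np.absolute(diffs['data'][0]).max(axis=0)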

Cheers