What's the output of LSTM in a many-to-many scenario in Caffe (C or H)?

auro tripathy

Jul 28, 2016, 3:34:07 PM
to Caffe Users
The code for a single LSTM cell is below (copied from Caffe src/layers). 

My question is: which of the top outputs is connected to the next layer (typically an embedding or a SoftMax)?

Is it the C-top or the H-top at time t?

From the code, here are the definitions of C and H. 


    Dtype* C = top[0]->mutable_cpu_data();
    Dtype* H = top[1]->mutable_cpu_data();

The LSTM cell code

    template <typename Dtype>
    void LSTMUnitLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
        const vector<Blob<Dtype>*>& top) {
      const int num = bottom[0]->shape(1);
      const int x_dim = hidden_dim_ * 4;
      const Dtype* C_prev = bottom[0]->cpu_data();
      const Dtype* X = bottom[1]->cpu_data();
      const Dtype* cont = bottom[2]->cpu_data();
      Dtype* C = top[0]->mutable_cpu_data();
      Dtype* H = top[1]->mutable_cpu_data();
      for (int n = 0; n < num; ++n) {
        for (int d = 0; d < hidden_dim_; ++d) {
          const Dtype i = sigmoid(X[d]);
          const Dtype f = (*cont == 0) ? 0 :
              (*cont * sigmoid(X[1 * hidden_dim_ + d]));
          const Dtype o = sigmoid(X[2 * hidden_dim_ + d]);
          const Dtype g = tanh(X[3 * hidden_dim_ + d]);
          const Dtype c_prev = C_prev[d];
          const Dtype c = f * c_prev + i * g;
          C[d] = c;
          const Dtype tanh_c = tanh(c);
          H[d] = o * tanh_c;
        }
        C_prev += hidden_dim_;
        X += x_dim;
        C += hidden_dim_;
        H += hidden_dim_;
        ++cont;
      }
    }
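
For reference, the per-element update in the loop above can be written out as equations. This is only a restatement of the code, under the assumption that `X` holds four pre-activation blocks in the order input gate, forget gate, output gate, candidate; the superscript labels are my own notation, not Caffe's.

```latex
\begin{aligned}
i_t &= \sigma\big(x_t^{(i)}\big) \\
f_t &= \mathrm{cont}_t \cdot \sigma\big(x_t^{(f)}\big) \\
o_t &= \sigma\big(x_t^{(o)}\big) \\
g_t &= \tanh\big(x_t^{(g)}\big) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```

Here `C` in the code stores c_t and `H` stores h_t; when `cont` is 0 the forget gate is 0, so the previous cell state is dropped.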

Question

If we define a layer like the one below, which output does the top (named `lstm1`) refer to, C or H?

    layer {
      name: "lstm1"
      type: "LSTM"
      bottom: "fc6-reshape"
      bottom: "reshape-cm"
      top: "lstm1"
      recurrent_param {
        num_output: 8
        weight_filler {
          type: "uniform"
          min: -0.01
          max: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }

FELIPE PETROSKI SUCH

Jul 29, 2016, 12:54:29 PM
to Caffe Users
The correct answer would be both. The outputs are in the order in which they are specified, as you can see from the code below. If you only specify one, you would get C.

    Dtype* C = top[0]->mutable_cpu_data();
    Dtype* H = top[1]->mutable_cpu_data();

    layer {
      name: "lstm1"
      type: "LSTM"
      bottom: "fc6-reshape"
      bottom: "reshape-cm"
      top: "C"
      top: "H"
      # recurrent_param as in the original layer definition
    }

FELIPE PETROSKI SUCH

Jul 29, 2016, 12:59:36 PM
to Caffe Users
C is the normal output you would want; H is the hidden output. Just to clarify, normally you would only use one output (C).

auro tripathy

Jul 29, 2016, 2:14:19 PM
to Caffe Users
Thank you, Felipe! I agree: if you only specify one (top), you would get C.

Last question:

Is it the C that should be connected to the next layer (an embedding) or the H?

C is the cell memory. H is C passed through the tanh activation and then scaled by the sigmoid output gate activation.
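
To make the difference between the two concrete, here is a minimal standalone sketch of one step of the cell for a single sequence, mirroring the loop body in `LSTMUnitLayer::Forward_cpu`. The names `LstmStepOutput` and `lstm_unit_step` are hypothetical helpers of mine, not part of Caffe, and I am assuming the gate blocks in `x` are laid out in the same order the Caffe code reads them (i, f, o, g).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the two outputs of one step (not a Caffe type).
struct LstmStepOutput {
  std::vector<float> c;  // cell state, what LSTMUnitLayer writes to top[0]
  std::vector<float> h;  // hidden state, what LSTMUnitLayer writes to top[1]
};

inline float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// One step for a single sequence. x holds four pre-activation blocks of size
// hidden_dim in the order [i, f, o, g]; cont is the sequence-continuation flag.
LstmStepOutput lstm_unit_step(const std::vector<float>& c_prev,
                              const std::vector<float>& x,
                              float cont) {
  const std::size_t hidden_dim = c_prev.size();
  assert(x.size() == 4 * hidden_dim);
  LstmStepOutput out{std::vector<float>(hidden_dim),
                     std::vector<float>(hidden_dim)};
  for (std::size_t d = 0; d < hidden_dim; ++d) {
    const float i = sigmoid(x[d]);
    // cont == 0 marks the start of a sequence: the old cell state is dropped.
    const float f = (cont == 0.0f) ? 0.0f : cont * sigmoid(x[hidden_dim + d]);
    const float o = sigmoid(x[2 * hidden_dim + d]);
    const float g = std::tanh(x[3 * hidden_dim + d]);
    out.c[d] = f * c_prev[d] + i * g;         // C: unbounded cell memory
    out.h[d] = o * std::tanh(out.c[d]);       // H: gated, squashed into (-1, 1)
  }
  return out;
}
```

Calling `lstm_unit_step({2.0f}, {0.0f, 0.0f, 0.0f, 0.0f}, 1.0f)` gives a cell value of 1.0 and a hidden value of `0.5 * tanh(1.0)`. The cell value can grow past 1 over time, while each hidden value always stays strictly between -1 and 1.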

Virendra Kumar Pathak

Jan 27, 2018, 4:59:50 PM
to Caffe Users
Hi Auro,

So what would be the correct answer to your question?