Hi Everyone,
I recently posted this on StackOverflow but this is probably a better place for it:
I'm trying to put together a really simple 3-layer neural network in Lasagne: 30 input neurons, a 10-neuron hidden layer, and a 1-neuron output layer. I'm using the binary_crossentropy loss function and sigmoid nonlinearities. I want to put L1 regularization on the weights entering the output layer and L2 regularization on the weights from the input to the hidden layer. My code is very close to the example on the regularization page of the Lasagne documentation and to the MLP example.
The L1 regularization works fine, but whenever I add the L2 penalty term to the loss function, the loss comes back as nan. Everything works when I remove the term l2_penalty * l2_reg_param from the last line below, and I can apply L1 regularization to the hidden layer l_hid1 without any issues.
This is my first foray into Theano and Lasagne, so I suspect the error is something pretty simple that I just don't know enough to see.
Here's the net setup code:
import lasagne
from lasagne.regularization import regularize_layer_params, l1, l2

# input_var, target_var, l1_reg_param, and l2_reg_param are defined elsewhere
l_in = lasagne.layers.InputLayer(shape=(942, 1, 1, 30), input_var=input_var)
l_hid1 = lasagne.layers.DenseLayer(l_in, num_units=10, nonlinearity=lasagne.nonlinearities.sigmoid, W=lasagne.init.GlorotUniform())
network = lasagne.layers.DenseLayer(l_hid1, num_units=1, nonlinearity=lasagne.nonlinearities.sigmoid)
prediction = lasagne.layers.get_output(network)

# L2 on the input->hidden weights, L1 on the hidden->output weights
l2_penalty = regularize_layer_params(l_hid1, l2)
l1_penalty = regularize_layer_params(network, l1)

loss = lasagne.objectives.binary_crossentropy(prediction, target_var)
loss = loss.mean()
loss = loss + l1_penalty * l1_reg_param + l2_penalty * l2_reg_param
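For reference, the composed loss is just the mean cross-entropy plus the two scaled penalties, so the total can only be nan if one of the three terms is. Here's a NumPy analogue of that arithmetic with made-up numbers (all values and coefficients below are hypothetical, just to confirm the combination stays finite when both penalties are finite):

```python
import numpy as np

# hypothetical sigmoid outputs and binary targets
preds   = np.array([0.9, 0.2, 0.7], dtype=np.float32)
targets = np.array([1.0, 0.0, 1.0], dtype=np.float32)

# binary cross-entropy, averaged over the batch
bce = -np.mean(targets * np.log(preds) + (1 - targets) * np.log(1 - preds))

l1_penalty, l2_penalty = 4.2, 12.5           # hypothetical penalty values
l1_reg_param, l2_reg_param = 1e-4, 1e-4      # hypothetical coefficients
loss = bce + l1_penalty * l1_reg_param + l2_penalty * l2_reg_param

print(np.isfinite(loss))   # True: finite terms give a finite loss
```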
I've been testing the l1 and l2 functions directly as shown below: l2 works fine when I pass it a NumPy array, but not when it gets the layer's weight matrix.
import numpy as np

print(l2(l_hid1.W).eval())
>> nan

debug_arr = np.ones((30, 10), dtype=np.float32)
print(l2(debug_arr).eval())
>> 300.0
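Since the L2 penalty is just the sum of squared entries, a single non-finite weight makes the whole sum nan. Here's a NumPy stand-in for the same arithmetic (l2_sum is my own helper, not Lasagne's l2) reproducing both results above:

```python
import numpy as np

def l2_sum(W):
    # same arithmetic as an L2 penalty: sum of squared entries
    return np.sum(W ** 2)

W = np.ones((30, 10), dtype=np.float32)
print(l2_sum(W))        # 300.0, matching the all-ones debug array above

W[0, 0] = np.nan        # a single bad entry poisons the whole sum
print(l2_sum(W))        # nan
```

Which makes me suspect the nan is already sitting in l_hid1.W before the penalty is ever computed; printing l_hid1.W.get_value() should confirm or rule that out.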
I'm on Theano 0.9.dev2 and Lasagne 0.2.dev1, if that helps.
Thanks for any advice!