Binary Cross-Entropy - Works on Keras But Not on Lasagne?


André Lopes

Apr 15, 2016, 11:29:10 AM
to lasagne-users
I'm using the same convolutional neural network structure in Keras and Lasagne.
Right now I've just changed to a simple network to see if it changed anything, but it didn't.

In Keras it works fine: it outputs values between 0 and 1 with good accuracy.
In Lasagne, the values mostly come out wrong; the output seems to be the same for every input.

In short: it trains and predicts fine in Keras, but not in my Lasagne version.



Structure on Lasagne:


def structure(w=5, h=5):
    try:
        input_var = T.tensor4('inputs')
        target_var = T.bmatrix('targets')

        network = lasagne.layers.InputLayer(shape=(None, 1, h, w), input_var=input_var)

        network = lasagne.layers.Conv2DLayer(
            network, num_filters=64, filter_size=(3, 3), stride=1, pad=0,
            nonlinearity=lasagne.nonlinearities.rectify,
            W=lasagne.init.GlorotUniform())

        network = lasagne.layers.Conv2DLayer(
            network, num_filters=64, filter_size=(3, 3), stride=1, pad=0,
            nonlinearity=lasagne.nonlinearities.rectify,
            W=lasagne.init.GlorotUniform())

        network = lasagne.layers.MaxPool2DLayer(
            network, pool_size=(2, 2), stride=None, pad=(0, 0), ignore_border=True)

        network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=0.5),
            num_units=256,
            nonlinearity=lasagne.nonlinearities.rectify,
            W=lasagne.init.GlorotUniform())

        network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=0.5),
            num_units=1,
            nonlinearity=lasagne.nonlinearities.sigmoid)

        print "...Output", lasagne.layers.get_output_shape(network)

        return network, input_var, target_var

    except Exception as inst:
        print ("Failure to Build NN !", inst.message, type(inst), inst.args, inst)
        return None
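(For reference, with the 17x17 patches mentioned later in the thread, this would be called as, e.g.:)

network, input_var, target_var = structure(w=17, h=17)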



On Keras:

def getModel(w, h):
    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Activation, Flatten
    from keras.layers import Convolution2D, MaxPooling2D
    from keras.optimizers import SGD

    model = Sequential()

    model.add(Convolution2D(64, 3, 3, border_mode='valid', input_shape=(1, h, w)))
    model.add(Activation('relu'))
    model.add(Convolution2D(64, 3, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Convolution2D(128, 3, 3, border_mode='valid'))
    model.add(Activation('relu'))
    model.add(Convolution2D(128, 3, 3))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    model.add(Flatten())

    model.add(Dense(256))
    model.add(Activation('relu'))
    model.add(Dropout(0.25))

    model.add(Dense(128))
    model.add(Activation('relu'))
    model.add(Dropout(0.25))

    model.add(Dense(1))
    model.add(Activation('sigmoid'))

    sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(loss='binary_crossentropy', optimizer='sgd')

    return model


And to train on Keras:
model.fit(x, y, batch_size=512, nb_epoch=500, verbose=2, validation_split=0.2, shuffle=True, show_accuracy=True)




And to train and predict on Lasagne:



To train:

prediction = lasagne.layers.get_output(network)

loss = lasagne.objectives.binary_crossentropy(prediction, target_var)
loss = loss.mean()

params = lasagne.layers.get_all_params(network, trainable=True)

# updates = lasagne.updates.sgd(loss, params, learning_rate=learning_rate)
updates = lasagne.updates.nesterov_momentum(loss_or_grads=loss, params=params, learning_rate=learning_rate, momentum=momentum_rho)

test_prediction = lasagne.layers.get_output(network, deterministic=True)
test_loss = lasagne.objectives.binary_crossentropy(test_prediction, target_var)
test_loss = test_loss.mean()

# Accuracy
test_acc = lasagne.objectives.binary_accuracy(test_prediction, target_var)
test_acc = test_acc.mean()

train_fn = theano.function([input_var, target_var], loss, updates=updates)
val_fn = theano.function([input_var, target_var], [test_loss, test_acc])


And I'm using these iterators, which I hope aren't the cause of it. Maybe they are?


def iterate_minibatches_getOutput(self, inputs, batchsize):
    for start_idx in range(0, len(inputs) - batchsize + 1, batchsize):
        excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt]

def iterate_minibatches(self, inputs, targets, batchsize, shuffle=False):
    assert len(inputs) == len(targets)
    if shuffle:
        indices = np.arange(len(inputs))
        np.random.shuffle(indices)
    for start_idx in range(0, len(inputs) - batchsize + 1, batchsize):
        if shuffle:
            excerpt = indices[start_idx:start_idx + batchsize]
        else:
            excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt], targets[excerpt]
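(For reference, a minimal training loop driving train_fn and val_fn with these iterators, treated here as plain functions without self, might look like the sketch below; the epoch count and batch size are illustrative, not from the original post.)

num_epochs = 30   # illustrative
batch_size = 512  # illustrative

for epoch in range(num_epochs):
    # Accumulate the training loss over all minibatches of one epoch.
    train_err, train_batches = 0.0, 0
    for inputs, targets in iterate_minibatches(x_train, y_train, batch_size, shuffle=True):
        train_err += train_fn(inputs, targets)
        train_batches += 1

    # Evaluate on the validation set with the deterministic functions.
    val_err, val_acc, val_batches = 0.0, 0.0, 0
    for inputs, targets in iterate_minibatches(x_val, y_val, batch_size, shuffle=False):
        err, acc = val_fn(inputs, targets)
        val_err += err
        val_acc += acc
        val_batches += 1

    print "Epoch %d: train loss %.8f, val loss %.8f, val acc %.2f %%" % (
        epoch + 1, train_err / train_batches, val_err / val_batches,
        val_acc / val_batches * 100)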



To predict:

test_prediction = lasagne.layers.get_output(self.network, deterministic=True)
predict_fn = theano.function([self.input_var], test_prediction)

index = 0
for batch in self.iterate_minibatches_getOutput(inputs=submission_feature_x, batchsize=self.batch_size):
    inputs = batch
    y = predict_fn(inputs)
    start = index * self.batch_size
    end = (index + 1) * self.batch_size
    predictions[start:end] = y
    index += 1

print "debug -->", predictions[0:10]
print "debug max ---->", np.max(predictions)
print "debug min ----->", np.min(predictions)

This prints:
debug --> [[ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.3252553 ]
 [ 0.32534513]]
debug max ----> 1.0
debug min -----> 0.0

The results are totally wrong.
What confuses me, however, is that the same setup outputs fine on Keras.



Also, the validation acc never changes:

Epoch 2 of 30 took 9.5846s
  Training loss:                0.22714619
  Validation loss:              0.17278196
  Validation accuracy:          95.85454545 %
Epoch 3 of 30 took 9.6437s
  Training loss:                0.22646923
  Validation loss:              0.17249792
  Validation accuracy:          95.85454545 %
Epoch 4 of 30 took 9.6464s
  Training loss:                0.22563262
  Validation loss:              0.17235395
  Validation accuracy:          95.85454545 %
Epoch 5 of 30 took 10.5069s
  Training loss:                0.22464556
  Validation loss:              0.17226825
  Validation accuracy:          95.85454545 %
...
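(Worth noting: a validation accuracy frozen at a single value usually means every prediction falls on the same side of the 0.5 threshold, so the reported accuracy is simply the majority-class rate. A quick check, assuming y_val holds the validation labels:)

import numpy as np

# If this matches the frozen 95.85 %, the net is predicting one class for every sample.
print "majority-class rate:", max(np.mean(y_val), 1.0 - np.mean(y_val))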



Please help!
What am I doing wrong?






André Lopes

Apr 15, 2016, 11:45:49 AM
to lasagne-users
Just wanted to add that I've read all over the Lasagne group, re-checked my code, posted on Stack Overflow, and Googled around.

I'm out of ideas. Everything seems fine to me.

These are the shapes being used:

x_train.shape  (102746, 1, 17, 17)
y_train.shape  (102746, 1)
x_val.shape  (11416, 1, 17, 17)
y_val.shape  (11416, 1)

emolson

Apr 15, 2016, 12:33:59 PM
to lasagne-users
One thing I notice: your Keras model is not using the SGD optimizer you're configuring (you need to pass the object, not the string 'sgd').

If you're looking for debugging help, it's much more useful to post either a small snippet or a full executable script. There are a lot of possibly relevant details missing from what you've shown.
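(Concretely: with optimizer='sgd', Keras builds a default SGD and the configured object is never used. A minimal fix would be:)

sgd = SGD(lr=0.001, decay=1e-6, momentum=0.9, nesterov=True)
# Pass the object, not the string, so the configured lr/momentum take effect:
model.compile(loss='binary_crossentropy', optimizer=sgd)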

André Lopes

Apr 15, 2016, 12:37:53 PM
to lasagn...@googlegroups.com
Well, the Keras script is working; just the Lasagne one isn't!

The full script... I'm not sure I can do that, due to dependencies.

What other information do I need to post?



goo...@jan-schlueter.de

Apr 15, 2016, 12:40:10 PM
to lasagne-users
Also your models are not the same. The Keras one has more layers and lower dropout. But I'd first check that the inputs and targets are actually the same in both cases. As Eben said, you may also want to post full runnable scripts somewhere (e.g., gist.github.com), the snippets alone don't show the full picture.

André Lopes

Apr 15, 2016, 12:41:02 PM
to lasagn...@googlegroups.com
The inputs and targets are the same.
I changed the model before posting, but even when the two models are the same, I get the same results.



André Lopes

Apr 15, 2016, 12:47:55 PM
to lasagn...@googlegroups.com
I corrected the Keras "sgd" to the sgd object, and it still works fine.

I also changed the target tensor to fmatrix, but still nothing.

I don't understand what's going on...
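(For reference, an fmatrix target expects float32 labels; a sketch of keeping the dtypes in sync, assuming y_train is a NumPy array:)

import numpy as np
import theano.tensor as T

target_var = T.fmatrix('targets')     # float32 targets instead of bmatrix (int8)
y_train = y_train.astype(np.float32)  # labels must match the tensor dtype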

emolson

Apr 15, 2016, 12:48:28 PM
to lasagne-users
Well, the Keras script is not using the optimizer parameters you think it is! If you're using the same ones in Lasagne, that would cause a difference.
Some relevant things: learning rate / momentum, training loop code, data loading / preprocessing.

Personally, I'm willing to poke at a runnable script, but I'm not going to carefully check for errors in incomplete code.
If sharing the data is an issue, you can replace it with randomly generated samples.

goo...@jan-schlueter.de

Apr 15, 2016, 12:49:56 PM
to lasagne-users
> Also, the validation acc never changes:

But both the training and validation loss go down during training. Maybe you need more patience, or a larger learning rate.
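(In Lasagne that is just the learning_rate argument of the update rule, e.g. a sketch with illustrative values:)

# Illustrative: raise the learning rate passed to the update rule.
updates = lasagne.updates.nesterov_momentum(
    loss_or_grads=loss, params=params, learning_rate=0.1, momentum=0.9)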

André Lopes

Apr 15, 2016, 1:01:00 PM
to lasagn...@googlegroups.com
I changed the learning rate to 0.1, and the validation accuracy changed and the results came out perfectly!

Why is it so sensitive to the learning rate like this?

Also, what's the correct target tensor type to use? float32? If I use float64, can I get better accuracy? I'm using float32 as the y_train dtype; should I also change that?

Any extra tips for binary classification are appreciated!

Thanks a lot, guys! I really appreciate the patience, Jan and emolson; I'm really grateful. I really love Lasagne.


