Accuracy always equal to zero


zaff...@gmail.com

May 22, 2016, 3:28:43 PM
to lasagne-users
Dear Lasagne community,
let me explain my case.

I would like to do a regression (a classification is also fine) starting from a dataset of shape (something, 140, 50, 50).
In other words: n examples, 140 channels, 50x50 images.
For the training step I have roughly 16000 cases, 6000 for validation and 1000 for testing.

The labels range from 0.6 to 0.8.
Of course I can scale them to get integers.
Either regression or classification (e.g. among 0.60, 0.65, 0.70, 0.75, 0.80) is fine for me.

The problem is that accuracy is always 0.0%.
The code is posted below.

Any idea?

Thank you very much.
Best.

Paolo


def build_cnn(single_entry_shape, input_var=None):

    network = lasagne.layers.InputLayer(shape=(None, single_entry_shape[0], single_entry_shape[1], single_entry_shape[2]),
                                        input_var=input_var)

    network = lasagne.layers.Conv2DLayer(
            network, num_filters=200, filter_size=(11, 11), 
            nonlinearity=lasagne.nonlinearities.leaky_rectify,
            W=lasagne.init.HeNormal())

    network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2))
    
    network = lasagne.layers.Conv2DLayer(
            network, num_filters=100, filter_size=(5, 5), 
            nonlinearity=lasagne.nonlinearities.leaky_rectify)
    network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2))
    
    network = lasagne.layers.Conv2DLayer(
            network, num_filters=50, filter_size=(3, 3), 
            nonlinearity=lasagne.nonlinearities.leaky_rectify)
    network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2))
    
    network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=.5),
            num_units=20,
            nonlinearity=lasagne.nonlinearities.linear)
    
    network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=.5),
            num_units=4,
            nonlinearity=lasagne.nonlinearities.linear)

    return network

num_epochs = 100
batchsize = 300
single_entry_shape = X_train.shape[1:]

input_var = T.tensor4('inputs')
target_var = T.ivector('targets')

network = build_cnn(single_entry_shape, input_var)

prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.categorical_crossentropy(prediction, target_var)
loss = loss.mean()

params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(
        loss, params, learning_rate=0.00000001, momentum=0.9)

test_prediction = lasagne.layers.get_output(network, deterministic=True)
test_loss = lasagne.objectives.categorical_crossentropy(test_prediction, target_var)
test_loss = test_loss.mean()

test_acc = T.mean(T.eq(T.argmax(test_prediction, axis=1), target_var),
                    dtype=theano.config.floatX)

train_fn = theano.function([input_var, target_var], loss, updates=updates, allow_input_downcast=True)

val_fn = theano.function([input_var, target_var], [test_loss, test_acc], allow_input_downcast=True)

print("Starting training...")

for epoch in range(num_epochs):
    train_err = 0
    train_batches = 0
    start_time = time.time()
    for batch in iterate_minibatches(X_train, y_train, batchsize, shuffle=True):
        inputs, targets = batch
        train_err += train_fn(inputs, targets)
        train_batches += 1

    val_err = 0
    val_acc = 0
    val_batches = 0
    for batch in iterate_minibatches(X_val, y_val, batchsize, shuffle=False):
        inputs, targets = batch
        err, acc = val_fn(inputs, targets)
        val_err += err
        val_acc += acc
        val_batches += 1

    print("Epoch {} of {} took {:.3f}s".format(
        epoch + 1, num_epochs, time.time() - start_time))
    print("  training loss:\t\t{:.6f}".format(train_err / train_batches))
    print("  validation loss:\t\t{:.6f}".format(val_err / val_batches))
    print("  validation accuracy:\t\t{:.2f} %".format(
        val_acc / val_batches * 100))

test_err = 0
test_acc = 0
test_batches = 0
for batch in iterate_minibatches(X_test, y_test, batchsize, shuffle=False):
    inputs, targets = batch
    err, acc = val_fn(inputs, targets)
    test_err += err
    test_acc += acc
    test_batches += 1
print("Final results:")
print("  test loss:\t\t\t{:.6f}".format(test_err / test_batches))
print("  test accuracy:\t\t{:.2f} %".format(
    test_acc / test_batches * 100))
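
(For reference, iterate_minibatches is not defined in this post; it is presumably the standard minibatch helper from the Lasagne MNIST example, roughly:)

import numpy as np

def iterate_minibatches(inputs, targets, batchsize, shuffle=False):
    # Yield (inputs, targets) minibatches, optionally shuffled,
    # dropping the last incomplete batch.
    assert len(inputs) == len(targets)
    if shuffle:
        indices = np.arange(len(inputs))
        np.random.shuffle(indices)
    for start_idx in range(0, len(inputs) - batchsize + 1, batchsize):
        if shuffle:
            excerpt = indices[start_idx:start_idx + batchsize]
        else:
            excerpt = slice(start_idx, start_idx + batchsize)
        yield inputs[excerpt], targets[excerpt]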

dario tonelli

May 22, 2016, 5:49:54 PM
to lasagne-users
Hi, labels must be ints between 0 and num_classes - 1.
I can't tell whether your problem is regression or classification, but reading your code I think it is classification.
Try converting your labels to ints (sketched below) and everything should work fine.
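
For example, a minimal sketch (assuming numpy as np and float labels y on the 0.05 grid from 0.60 to 0.80 described above; y_int is a hypothetical name):

import numpy as np

# Map {0.60, 0.65, 0.70, 0.75, 0.80} to class indices {0, 1, 2, 3, 4}.
y_int = np.round((y - 0.60) / 0.05).astype(np.int32)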

zaff...@gmail.com

May 23, 2016, 11:19:16 AM
to lasagne-users, zaff...@gmail.com
Thank you very much for your help.

The labels are int now (3 classes).
If I run it, from the first epoch I get this (even with learning_rate=0.000000000000000001):
training loss: nan
validation loss: nan
validation accuracy: 29.76 %

Any idea?
I normalized the data so that the mean is 0 and the std is 1.

What am I missing?

Best.
Paolo

dario tonelli

May 23, 2016, 2:42:24 PM
to lasagne-users, zaff...@gmail.com
Hi Paolo, make sure your labels are 0, 1, 2 and not 1, 2, 3.
How did you standardize the data?
The classic approach is to accumulate statistics (mean, std) on the training set only, and then standardize the train, validation, and test sets with those training-set statistics (a sketch follows below).
There is a useful module in scikit-learn called preprocessing that does exactly what I wrote above.
Also try training without standardization, and try changing the update rule to adam with lr = 0.001.
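
A minimal sketch of that standardization (plain numpy; a single global mean/std over the (N, 140, 50, 50) training array is one reasonable choice):

# Statistics from the training set only, applied to all three splits.
mean = X_train.mean()
std = X_train.std()
X_train = (X_train - mean) / std
X_val = (X_val - mean) / std
X_test = (X_test - mean) / std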

zaff...@gmail.com

May 23, 2016, 6:03:58 PM
to lasagne-users, zaff...@gmail.com
After editing the network a little, the accuracy is no longer zero.

Yes, my labels are 0, 1 and 2.
I normalized the data by subtracting the mean and dividing by the std (computed only on the training set and applied to validation and test as well).

The problem now is that I always get label 0.
I predict a new case by using:

new_case = X_test[some_index,:,:,:].reshape((1,140,50,50))
lasagne.layers.get_output(network, new_case)

Any idea?
Is something wrong?
Thanks a lot for your help!

Best.
Paolo



dario tonelli

May 23, 2016, 6:58:58 PM
to lasagne-users
Why are you using linear nonlinearities in a classification problem? You have to use a softmax nonlinearity with a cross-entropy loss. If you want regression, you have to put a linear nonlinearity in the last layer and set X_train = y_train.
If you want classification, this is the way (see the sketch after the list):

Input (None, something, x, y)
Dense rectify
Dense rectify
Output softmax

Input var = tensor4
Targets = ivector
Loss = cat cross entropy(net out, target).mean
Update rule = adam, nesterov, adagrad
Lr = consistent with the update rule
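
In Lasagne terms, a minimal sketch of that recipe (reusing the variable names from the code above):

# Output layer: softmax over the 3 classes.
network = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(network, p=.5),
        num_units=3,
        nonlinearity=lasagne.nonlinearities.softmax)

prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.categorical_crossentropy(prediction, target_var).mean()
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.adam(loss, params, learning_rate=0.001)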

zaff...@gmail.com

May 24, 2016, 8:10:05 AM
to lasagne-users
Hi,
currently my network is the one below:

def build_cnn(single_entry_shape, input_var=None):

    network = lasagne.layers.InputLayer(shape=(None, single_entry_shape[0], single_entry_shape[1], 
                                               single_entry_shape[2]),
                                        input_var=input_var)

    network = lasagne.layers.Conv2DLayer(
            network, num_filters=600, filter_size=(11, 11), 
            nonlinearity=lasagne.nonlinearities.leaky_rectify,
            W=lasagne.init.HeNormal())
    network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2))
    
    network = lasagne.layers.Conv2DLayer(
            network, num_filters=300, filter_size=(5, 5), 
            nonlinearity=lasagne.nonlinearities.leaky_rectify)
    network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2))
    
    network = lasagne.layers.Conv2DLayer(
            network, num_filters=200, filter_size=(3, 3), 
            nonlinearity=lasagne.nonlinearities.leaky_rectify)
    network = lasagne.layers.MaxPool2DLayer(network, pool_size=(2, 2))

    network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=.5),
            num_units=100,
            nonlinearity=lasagne.nonlinearities.leaky_rectify)
    
    network = lasagne.layers.DenseLayer(
            lasagne.layers.dropout(network, p=.5),
            num_units=3,
            nonlinearity=lasagne.nonlinearities.leaky_rectify)

    return network




num_epochs = 100
batchsize = 200
single_entry_shape = X_train.shape[1:]

input_var = T.tensor4('inputs')
target_var = T.ivector('targets')

network = build_cnn(single_entry_shape, input_var)

prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.multiclass_hinge_loss(prediction, target_var)
loss = loss.mean()

params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(
        loss, params, learning_rate=0.01, momentum=0.9)

test_prediction = lasagne.layers.get_output(network, deterministic=True)
test_loss = lasagne.objectives.multiclass_hinge_loss(test_prediction, target_var)
test_loss = test_loss.mean()
test_acc = T.mean(T.eq(T.argmax(test_prediction, axis=1), target_var),
                    dtype=theano.config.floatX)

train_fn = theano.function([input_var, target_var], loss, updates=updates)

val_fn = theano.function([input_var, target_var], [test_loss, test_acc])

The labels are ints between 0 and 2.
The data are normalized (mean subtracted, divided by the std).

The last printed lines are:

Epoch 98 of 100 took 462.350s
  training loss: 0.061247
  validation loss: 0.397025
  validation accuracy: 84.07 %
Epoch 99 of 100 took 462.469s
  training loss: 0.060429
  validation loss: 0.397124
  validation accuracy: 83.11 %
Epoch 100 of 100 took 462.125s
  training loss: 0.059886
  validation loss: 0.411047
  validation accuracy: 83.21 %
Final results:
  test loss: 2.743840
  test accuracy: 19.00 %


If I try to predict a new case by using

test = X_test[ind,:,:,:].reshape((1,140,50,50))
lasagne.layers.get_output(network, test)

I always get a label equal to zero.

Thanks a lot.

Paolo

dario tonelli

May 24, 2016, 9:55:15 AM
to lasagne-users, zaff...@gmail.com
That is incorrect; the last layer must be softmax:

network = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(network, p=.5),
        num_units=3,
        nonlinearity=lasagne.nonlinearities.softmax)

And the loss function must be categorical cross-entropy:

loss = lasagne.objectives.categorical_crossentropy(prediction, target_var).mean()


In addition, your test accuracy function is probably not correct, because you do argmax on axis 1 with a tensor4 input.
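
For the prediction snippet above: get_output called on a data array only builds a symbolic expression, so a minimal sketch of a compiled prediction function may help (predict_fn is a hypothetical name):

# Compile once; deterministic=True disables dropout at test time.
predict_fn = theano.function(
    [input_var],
    T.argmax(lasagne.layers.get_output(network, deterministic=True), axis=1),
    allow_input_downcast=True)

new_case = X_test[some_index].reshape((1, 140, 50, 50))
print(predict_fn(new_case))  # predicted class index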

zaff...@gmail.com

May 24, 2016, 6:03:43 PM
to lasagne-users, zaff...@gmail.com
Thank you very much Dario!
I modified the code according to your tips and it's much better now.

Regarding the test accuracy function, axis=1 is the only setting that works properly.

Just another question:
what part must be modified in order to run a regression?
In which format must the labels be passed?

Thank you very much.
Best.

Paolo




dario tonelli

May 25, 2016, 2:17:03 AM
to lasagne-users, zaff...@gmail.com
Regression means that targets = inputs; you have to modify the last layer of your net to use a linear nonlinearity instead of softmax, and you have to make the targets a tensor4 just like the inputs. In addition, you have to change your loss to squared error.

zaff...@gmail.com

May 27, 2016, 12:03:52 PM
to lasagne-users, zaff...@gmail.com
Thank you very much...grazie!

Best.
Paolo



Jan Schlüter

Jun 2, 2016, 8:41:52 AM
to lasagne-users
Regression means that targets = inputs,

Just to clarify this point: No, targets = inputs is auto-encoding. Regression means you have real-valued targets, which is usually solved with linear output units and mean squared error (as you said). For real-valued input data, an auto-encoder solves a regression task, but for binary data or near-binary data (such as MNIST), the auto-encoder solves binary classification tasks. So auto-encoding and regression are neither the same, nor special cases of one another.
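
In Lasagne terms, a minimal sketch of the regression variant for the code in this thread (one real-valued target per example; target_var becomes a float vector, other names as above):

# Float targets instead of an ivector of class indices.
target_var = T.fvector('targets')

# Last layer: a single linear output unit.
network = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(network, p=.5),
        num_units=1,
        nonlinearity=lasagne.nonlinearities.linear)

prediction = lasagne.layers.get_output(network)
# Flatten the (batchsize, 1) output to match the (batchsize,) targets.
loss = lasagne.objectives.squared_error(prediction.flatten(), target_var).mean()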

zaff...@gmail.com

Jun 2, 2016, 1:02:56 PM
to lasagne-users, zaff...@gmail.com
Thanks a lot!
So for regression I changed the objective and the last nonlinearity.
Must the labels be float or int (between 0 and 1, or between 0 and 100)?

This part must be unchanged, right?
input_var = T.tensor4('inputs')
target_var = T.ivector('targets')

The last "num_units=" must be set to the total amount of lable classes to predict, right?

Thanks a lot.
Best.

Paolo

