How to mask binary crossentropy loss?


eric...@gmail.com

unread,
Mar 15, 2017, 2:00:23 PM3/15/17
to Keras-users
Hey,
so I'm trying to solve the following problem:
I have sequential data, where each sequence element corresponds to one item, and I know all the items in the sequence.
The task is to predict which items in the sequence will be bought.

The output needs to be a 1-D vector of size 50k (the number of items) with value 1 or 0 at each item index (1 = buy, 0 = no buy).

My network is an RNN with a Sigmoid layer at the end.

With binary crossentropy the network predicts only 0s, so the accuracy is 99.99...% but recall and precision are 0. This is because there are only a few items and buys per sequence.

My idea was to train the network only on the items which are actually in the sequence.

Can this be done by masking the loss function so it ignores all output values except the ones corresponding to the items in the current sequence?

My current code is:

import tensorflow as tf
from keras import backend as K

def binary_crossentropy(y_true, y_pred):
    shape = tf.shape(y_true)
    y_true = tf.reshape(y_true, [-1])
    y_pred = tf.reshape(y_pred, [-1])
    # mask is 0 where y_true == -1 (item not in the sequence), 1 elsewhere
    mask = tf.cast(tf.not_equal(y_true, -1), tf.float32)
    y_true = tf.multiply(y_true, mask)
    y_pred = tf.multiply(y_pred, mask)
    y_true = tf.reshape(y_true, shape)
    y_pred = tf.reshape(y_pred, shape)
    return K.mean(K.binary_crossentropy(y_pred, y_true), axis=-1)

Here y_true is -1 when the corresponding item is not in the sequence, 0 if the item is not bought, and 1 if it is.
I'm basically setting the entries of y_pred for items that are not in the sequence to 0 (which is the correct value anyway).
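To sanity-check what this masking does outside the TensorFlow graph, here is a minimal NumPy sketch of the same computation (the function name is my own); it shows that the masked positions contribute essentially zero loss, whatever the network predicted there:

```python
import numpy as np

def masked_bce(y_true, y_pred, eps=1e-7):
    # mask: 1.0 where the item is in the sequence, 0.0 where y_true == -1
    mask = (y_true != -1).astype(np.float32)
    yt = y_true * mask          # -1 sentinels become 0 (the "correct" value)
    yp = y_pred * mask          # predictions for absent items forced to 0
    yp = np.clip(yp, eps, 1 - eps)
    bce = -(yt * np.log(yp) + (1 - yt) * np.log(1 - yp))
    return bce.mean(axis=-1)

y_true = np.array([[1., 0., -1., -1.]])
y_pred = np.array([[0.9, 0.2, 0.7, 0.4]])
# the two masked positions add ~0 loss regardless of y_pred there
print(masked_bce(y_true, y_pred))
```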

I also masked the accuracy:

def binary_accuracy(y_true, y_pred):
    shape = tf.shape(y_true)
    y_true = tf.reshape(y_true, [-1])
    y_pred = tf.reshape(y_pred, [-1])
    mask = tf.cast(tf.not_equal(y_true, -1), tf.float32)
    y_true = tf.multiply(y_true, mask)
    y_pred = tf.multiply(y_pred, mask)
    y_true = tf.reshape(y_true, shape)
    y_pred = tf.reshape(y_pred, shape)
    return K.mean(K.equal(y_true, K.round(y_pred)))

But it shows 100% accuracy right from the beginning, so I guess it's not working correctly.
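The 100% accuracy is most likely because the masked positions still get compared: after multiplying by the mask, both y_true and y_pred are 0 there, so K.equal counts every out-of-sequence item (the vast majority of the 50k outputs) as correct. A minimal NumPy sketch (my own naming) that instead averages only over the in-sequence entries:

```python
import numpy as np

def masked_accuracy(y_true, y_pred):
    valid = (y_true != -1)     # keep only items that are in the sequence
    return np.mean(y_true[valid] == np.round(y_pred[valid]))

y_true = np.array([[1., 0., -1., -1.]])
y_pred = np.array([[0.4, 0.2, 0.7, 0.9]])
# one of the two in-sequence items is classified correctly -> 0.5,
# even though the two masked predictions happen to round to 1
print(masked_accuracy(y_true, y_pred))
```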


Thanks!

eric...@gmail.com

unread,
Mar 15, 2017, 6:02:43 PM3/15/17
to Keras-users, eric...@gmail.com
Edit: Solution below (not the best code quality)

def binary_crossentropy(y_true, y_pred):
    # zero out the entries where y_true == -1 before computing the loss
    mask = tf.cast(tf.not_equal(y_true, -1), tf.float32)
    return K.mean(K.binary_crossentropy(tf.multiply(y_pred, mask),
                                        tf.multiply(y_true, mask)), axis=-1)


def binary_accuracy(y_true, y_pred):
    t0 = tf.equal(y_true, 0)
    t1 = tf.equal(y_true, 1)
    p0 = tf.equal(tf.round(y_pred), 0)
    p1 = tf.equal(tf.round(y_pred), 1)
    # count only positions where y_true is 0 or 1, i.e. not the -1 mask value
    everything = tf.reduce_sum(tf.cast(t0, tf.int32)) + tf.reduce_sum(tf.cast(t1, tf.int32))
    positives = (tf.reduce_sum(tf.cast(tf.logical_and(t0, p0), tf.int32)) +
                 tf.reduce_sum(tf.cast(tf.logical_and(t1, p1), tf.int32)))
    # cast to float so this is not integer division
    return tf.cast(positives, tf.float32) / tf.cast(everything, tf.float32)


def precision(y_true, y_pred):
    mask = tf.cast(tf.not_equal(y_true, -1), tf.float32)
    true_positives = K.sum(K.round(K.clip(tf.multiply(y_true, mask) * tf.multiply(y_pred, mask), 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(tf.multiply(y_pred, mask), 0, 1)))
    return true_positives / (predicted_positives + K.epsilon())


def recall(y_true, y_pred):
    mask = tf.cast(tf.not_equal(y_true, -1), tf.float32)
    true_positives = K.sum(K.round(K.clip(tf.multiply(y_true, mask) * tf.multiply(y_pred, mask), 0, 1)))
    possible_positives = K.sum(K.round(K.clip(tf.multiply(y_true, mask), 0, 1)))
    return true_positives / (possible_positives + K.epsilon())

