Keras Siamese Network Implementation - Training Accuracy Constant Loss Fluctuating

rmsh...@gmail.com

Feb 8, 2020, 2:26:06 AM
to Keras-users
Hi Team!

I'm having a bit of trouble training my Siamese network model, written using the Keras functional API. My training accuracy isn't improving (it's almost constant) and my loss is decreasing very slowly. My network is shown below:

# Siamese networks (imports shown for completeness)
from keras.layers import (Input, Embedding, Conv2D, Reshape, LSTM,
                          TimeDistributed, Dense, Conv1D, concatenate,
                          Bidirectional, GlobalMaxPooling1D)
from keras.models import Model

# Branch for person 1
person1_inp = Input(shape=(38, 200))
x = Embedding(int(vocab_size), 32)(person1_inp)
x = Conv2D(32, 5, activation='relu')(x)
x = Reshape(target_shape=(256, 833))(x)
x = LSTM(32, dropout=0.1, recurrent_dropout=0.5, return_sequences=True)(x)
x = TimeDistributed(Dense(64))(x)
x = Conv1D(32, 7, activation='relu')(x)

# Branch for person 2
person2_inp = Input(shape=(38, 200))
y = Embedding(int(vocab_size), 96)(person2_inp)
y = Conv2D(32, 5, activation='relu')(y)
y = Reshape(target_shape=(256, 833))(y)
y = LSTM(32, dropout=0.1, recurrent_dropout=0.5, return_sequences=True)(y)
y = TimeDistributed(Dense(64))(y)
y = Conv1D(32, 7, activation='relu')(y)

# Merge the two branches and classify
z = concatenate([x, y])
z = Bidirectional(LSTM(16, dropout=0.1, recurrent_dropout=0.5, return_sequences=True))(z)
z = GlobalMaxPooling1D()(z)
z = Dense(4, activation='relu')(z)
out = Dense(1, activation='softmax')(z)
model = Model([person1_inp, person2_inp], out)
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])

I am using this to find the similarity of a dialogue between two individuals. I am using Gensim's Doc2Vec embeddings to transform the text data into vectors (vocab size: 4117). My data is split into 56 positive cases and 64 negative cases (yes, I know the dataset is small, but that's all I have for the time being).
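For reference, my Doc2Vec step looks roughly like this (a simplified sketch; `dialogues` and the parameter values are illustrative, not my exact code):

from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# `dialogues` stands for a list of token lists, one per utterance
documents = [TaggedDocument(words=tokens, tags=[i])
             for i, tokens in enumerate(dialogues)]
d2v = Doc2Vec(documents, vector_size=200, min_count=1, epochs=40)

# Each utterance becomes a 200-dimensional vector, matching the
# Input(shape=(38, 200)) layers above
vec = d2v.infer_vector(dialogues[0])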

Here is my training history:

[Attachment: accuracyandloss.PNG (training accuracy and loss curves)]

I also tested scikit-learn statistical models on the same dataset, using the cosine similarities between the two Doc2Vec embedding pairs; the results are attached below:

[Attachment: sklearnresults.PNG (scikit-learn baseline results)]
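
For reference, that baseline was essentially the following (a simplified sketch; `vecs1`, `vecs2`, and `labels` are placeholder names for my pair vectors and labels):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import cosine_similarity

# One cosine-similarity feature per dialogue pair
sims = np.array([cosine_similarity(a.reshape(1, -1), b.reshape(1, -1))[0, 0]
                 for a, b in zip(vecs1, vecs2)])
clf = LogisticRegression().fit(sims.reshape(-1, 1), labels)
print(clf.score(sims.reshape(-1, 1), labels))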


And finally, my vectorized dataset looks like this:


[Attachment: data-snip.PNG (preview of the vectorized dataset)]


Now I can't figure out whether the Siamese network is the wrong choice, whether I'm messing up my model layers, or whether the dataset is flawed. Any help here would be appreciated. Thank you!

Philip May

Feb 8, 2020, 6:25:19 AM
to Keras-users
This is not a Siamese network; it is a two-headed input network. You create two different networks and concatenate them later, so no weights are shared between the branches. See here for an example of a Siamese network where the weights are shared: https://towardsdatascience.com/one-shot-learning-with-siamese-networks-using-keras-17f34e75bb3d
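
The key point is that both inputs have to pass through one set of shared weights. A minimal sketch of the pattern (the layer sizes here are arbitrary):

from keras.layers import Input, Dense, Subtract
from keras.models import Model

# Define one encoder...
enc_in = Input(shape=(200,))
encoder = Model(enc_in, Dense(32, activation='relu')(enc_in))

a = Input(shape=(200,))
b = Input(shape=(200,))

# ...and call it on both inputs, so the branches share the same weights
diff = Subtract()([encoder(a), encoder(b)])
out = Dense(1, activation='sigmoid')(diff)
siamese = Model([a, b], out)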

rmsh...@gmail.com

Feb 8, 2020, 8:15:39 AM
to Keras-users
Hi Philip! Yes, I figured that out earlier and edited my Keras model as follows:

from keras import backend as K
from keras.layers import (Input, Embedding, Conv2D, TimeDistributed, LSTM,
                          Activation, Subtract, Multiply, Lambda, Concatenate,
                          Bidirectional, GlobalMaxPooling1D, Dense)
from keras.models import Model

def euclidean_distance(vects):
    x, y = vects
    sum_square = K.sum(K.square(x - y), axis=1, keepdims=True)
    return K.sqrt(K.maximum(sum_square, K.epsilon()))

ch_inp = Input(shape=(1, 200))
csr_inp = Input(shape=(1, 200))

# One shared encoder, applied to both inputs below
inp = Input(shape=(1, 200))
net = Embedding(int(vocab_size), 16)(inp)
net = Conv2D(16, 1, activation='relu')(net)
net = TimeDistributed(LSTM(8, return_sequences=True))(net)
out = Activation('relu')(net)

sia = Model(inp, out)

x = sia(csr_inp)
y = sia(ch_inp)

# Squared element-wise difference between the two encodings
sub = Subtract()([x, y])
mul = Multiply()([sub, sub])

# (mul_x and mul_y are computed here but not used below)
mul_x = Multiply()([x, x])
mul_y = Multiply()([y, y])
sub_xy = Subtract()([x, y])

euc = Lambda(euclidean_distance)([x, y])
z = Concatenate(axis=-1)([euc, sub_xy, mul])
z = TimeDistributed(Bidirectional(LSTM(4)))(z)
z = Activation('relu')(z)
z = GlobalMaxPooling1D()(z)
z = Dense(2, activation='relu')(z)
out = Dense(1, activation='sigmoid')(z)

model = Model([ch_inp, csr_inp], out)
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
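
Training is then just the usual two-input fit call (a sketch; `ch_data`, `csr_data`, and `labels` are placeholder names for my arrays, and the epoch/batch settings are illustrative):

model.fit([ch_data, csr_data], labels,
          epochs=50, batch_size=32, validation_split=0.2)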

I also increased my training samples from 120 to 1875, and my model's accuracy increased to 70.28%. But for some reason it's still constant (neither the training nor the validation accuracy is improving or changing). Anything else I should change?

Lance Norskog

Feb 9, 2020, 9:08:29 PM
to rmsh...@gmail.com, Keras-users
These LSTM sizes seem very small. 4 units? I'm not sure I have seen a Keras example with fewer than 16 units in an LSTM.

The fact that loss keeps dropping but accuracy stays constant says (to me) that this is as good as it can be.
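
One quick way to check (a sketch; `x1` and `x2` stand for your two input arrays) is to look at the raw predicted probabilities. If they all sit on one side of 0.5, the thresholded predictions, and therefore the accuracy, stay constant even while the loss keeps shrinking:

import numpy as np

# Spread of predicted probabilities and fraction above the 0.5 threshold
probs = model.predict([x1, x2]).ravel()
print(probs.min(), probs.max(), np.mean(probs > 0.5))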

Cheers,

Lance
--
Lance Norskog
lance....@gmail.com
Redwood City, CA