Matrix size-incompatible


sam mohel

Sep 11, 2021, 7:02:45 PM
to Keras-users

(0) Invalid argument: Matrix size-incompatible: In[0]: [53,1132], In[1]: [1240,512]
    [[node functional_1/dense/MatMul (defined at fit.py:324)]]
(1) Invalid argument: Matrix size-incompatible: In[0]: [53,1132], In[1]: [1240,512]
    [[node functional_1/dense/MatMul (defined at fit.py:324)]]
    [[gradient_tape/functional_1/embedding/embedding_lookup/Reshape_1/_30]]

The model is:

from tensorflow.keras.layers import Input, Dropout, Dense, Embedding, LSTM, add
from tensorflow.keras.models import Model

def define_model(vocab_size, max_length, curr_shape):
    # image-feature branch
    inputs1 = Input(shape=curr_shape)
    fe1 = Dropout(0.5)(inputs1)
    fe2 = Dense(256, activation='relu')(fe1)
    # caption (sequence) branch
    inputs2 = Input(shape=(max_length,))
    se1 = Embedding(vocab_size, 256, mask_zero=True)(inputs2)
    se2 = Dropout(0.5)(se1)
    se3 = LSTM(256)(se2)
    # decoder
    decoder1 = add([fe2, se3])
    decoder2 = Dense(256, activation='relu')(decoder1)
    outputs = Dense(vocab_size, activation='softmax')(decoder2)
    model = Model(inputs=[inputs1, inputs2], outputs=outputs)
    model.compile(loss='categorical_crossentropy', optimizer='adam')
    return model

where curr_shape is 1120 (so the image-feature input should be a 1120-dimensional vector).
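For what it's worth, the error itself is just the inner-dimension rule of matrix multiplication: In[0] is (53, 1132), In[1] (the Dense kernel) is (1240, 512), and 1132 ≠ 1240, so the MatMul cannot run. Note that neither number is the 1120 the model was built with, which suggests the features fed at fit time don't have the length the Input layer expects. A minimal NumPy sketch of the same failure (shapes copied from the error message):

```python
import numpy as np

# Shapes copied from the error message:
a = np.zeros((53, 1132))   # In[0]: batch of 53 samples, 1132 features each
b = np.zeros((1240, 512))  # In[1]: a Dense kernel expecting 1240 input features

try:
    a @ b  # fails: inner dimensions 1132 and 1240 do not match
except ValueError as e:
    print("matmul failed:", e)

# The multiply only works when the inner dimensions agree:
b_ok = np.zeros((1132, 512))
print((a @ b_ok).shape)  # (53, 512)
```

So the fix is to make the feature vectors fed at training time the same length as the shape the Input layer was built with.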

Appreciate any help

sam mohel

Sep 14, 2021, 2:36:27 PM
to Keras-users
Is there any help, please?

Lance Norskog

Sep 16, 2021, 3:05:15 AM
to sam mohel, Keras-users
I don't know this problem, but here is what I have learned about tracking these problems down:

1) Make the simplest network that you can, that shows the problem.
2) Replace all of the size factors (256, 1120, etc.) with small prime numbers. This makes it much easier to see what input sizes were multiplied and added together.
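Tip 2 in practice (NumPy sketch; the sizes are arbitrary small primes, purely for illustration): because each prime appears in only one place, every number in the error message points at exactly one dimension.

```python
import numpy as np

# Swap the real sizes (256, 1120, ...) for distinct small primes.
batch, feat_in, units = 3, 5, 11

x = np.zeros((batch, feat_in))          # (3, 5)
w_bad = np.zeros((feat_in + 2, units))  # (7, 11): deliberately wrong input size

try:
    x @ w_bad
except ValueError as e:
    # The message mentions 5 and 7, so it is obvious which
    # two dimensions disagreed.
    print(e)
```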

Cheers,

Lance Norskog




sam mohel

Sep 16, 2021, 5:27:11 PM
to Keras-users
Thanks for replying. I tried to handle the problem, and after 50 epochs of image captioning (dataset: 8000 training images, 1000 testing) I got the attached performance (1.png). I also tried BatchNormalization and a learning rate of 0.0001, but the performance became poor.

Dennis S

Sep 16, 2021, 5:33:11 PM
to sam mohel, Keras-users
Getting your network to run (your original query) is one question; poor performance of a model is a completely different matter. You would have to provide a lot more information before anyone can really help you here, and even then it might be tough to understand the core problem you're trying to solve. But you can ask yourself the following and try to debug it yourself:
  • What is the core issue I'm trying to solve?
  • What makes me think that I have chosen the right approach/network? Are you duplicating a whitepaper or article? 
  • Should I be trying other methods? NNs are cool but don't solve everything. 
  • Do I have the right processing power for this problem? It sounds like you're doing image processing of some kind? I would recommend at least a Pentium V ;)
Good luck

Thanks,

Dennis




Lance Norskog

Sep 17, 2021, 12:27:25 AM
to sam mohel, Keras-users
The one thing that jumps out at me is the Add of two very different vector spaces. Since they do not mean the same thing, adding the numbers may give garbage. If you Concatenate the two outputs, that lets the Dense network pick and choose what it finds interesting.
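The difference, sketched in NumPy (shapes illustrative): add squashes the two branches into the same slots element-wise, while concatenation keeps them side by side so the next layer gets its own weight for every element of each branch.

```python
import numpy as np

# Two branch outputs with the same batch size (shapes illustrative)
img = np.ones((4, 256))       # image-feature branch (like fe2)
txt = np.full((4, 256), 2.0)  # LSTM branch (like se3)

summed = img + txt                             # (4, 256): values mixed element-wise
stacked = np.concatenate([img, txt], axis=-1)  # (4, 512): both kept intact

print(summed.shape, stacked.shape)  # (4, 256) (4, 512)
```

In Keras that would be `Concatenate()([fe2, se3])` in place of `add([fe2, se3])`; the `Dense(256)` that follows then learns its own mixing weight for every element of each branch.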

Also, a trick: the output vocabulary is usually a sparse encoding of the actual information going through the network. You often find an "accordion" style of Dense layers:
Dense(vocab_size, 'relu')
Dense(vocab_size//2, 'relu')
Dense(vocab_size//4, 'relu')
Dense(vocab_size//2, 'relu')
Dense(vocab_size, 'softmax') <- output

This has a way of condensing the information and re-encoding it. I have found that this trick makes some networks train noticeably faster.
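A shape-only sketch of that accordion (NumPy arrays standing in for the Dense kernels; the vocab_size of 1000 and the incoming width of 256 are illustrative):

```python
import numpy as np

vocab_size, batch = 1000, 8

# Layer widths: vocab -> vocab/2 -> vocab/4 -> vocab/2 -> vocab
dims = [256, vocab_size, vocab_size // 2, vocab_size // 4,
        vocab_size // 2, vocab_size]

x = np.zeros((batch, dims[0]))  # whatever the previous layer emits
for d_in, d_out in zip(dims[:-1], dims[1:]):
    w = np.zeros((d_in, d_out))  # stand-in for a Dense kernel
    x = np.maximum(x @ w, 0.0)   # Dense + ReLU, traced as shapes only

print(x.shape)  # (8, 1000)
```

The narrow middle layer (vocab_size//4) forces the network to compress the information before re-expanding it to the output vocabulary.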
