Problem implementing a parallel CNN+LSTM


Matteo Scucchia

Apr 28, 2021, 8:53:05 AM4/28/21
to Keras-users
Hi everyone,
I'm trying to implement a specific architecture in Keras and I'm running into trouble. The architecture is explained in this paper https://www.mdpi.com/2076-3417/10/16/5426 and I have also attached images here to be more explicit.

Let me explain what I want as simply as I can:
- two CNNs in parallel
- a concatenation layer for their outputs
- one dense layer
- an LSTM layer
- two dense layers in parallel
- two output layers
 
This is my code so far:

from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Concatenate, Dense, LSTM, TimeDistributed
from tensorflow.keras.models import Model, Sequential

def build_model():
    # two parallel CNN backbones (randomly initialised, 6-channel inputs)
    rgb = MobileNetV2(input_shape=(640, 480, 6), include_top=False, weights=None, pooling='avg')
    depth = MobileNetV2(input_shape=(640, 480, 6), include_top=False, weights=None, pooling='avg')
    # rename the layers so the two branches do not clash
    for layer in rgb.layers:
        layer._name = layer.name + "_rgb"
    for layer in depth.layers:
        layer._name = layer.name + "_depth"
    # concatenate the two pooled feature vectors and add one dense layer
    rgbd = Concatenate(name="rgbd_concatenate")([rgb.output, depth.output])
    out = Dense(rgbd.shape[1], activation='relu')(rgbd)
    cnn = Model(inputs=[rgb.input, depth.input], outputs=[out])

    # this is the part I cannot get to work:
    """
    return Sequential(
        [
            TimeDistributed(cnn, input_shape=[???]),
            LSTM(256, return_sequences=True),
            TimeDistributed(Dense(256, activation='relu')),
            TimeDistributed(Dense(256, activation='relu')),
            TimeDistributed(Dense(7))
        ]
    )
    """
    
The model is fine so far, but when I try to use the commented part, I don't understand what shape I have to give the input in order to connect the CNN to the LSTM using a TimeDistributed layer. In addition, I think TimeDistributed cannot accept two inputs, so I don't know how to pass my input shape to TimeDistributed.
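To illustrate, here is a minimal sketch (with made-up shapes) of the single-input case that TimeDistributed does handle:

from tensorflow.keras.layers import Input, Conv2D, TimeDistributed

# TimeDistributed applies a layer to every timestep of ONE 5D input
# (batch, timesteps, H, W, C); it has no notion of a second input.
frames = Input(shape=(8, 64, 64, 3))               # 8 frames of 64x64x3
features = TimeDistributed(Conv2D(16, 3))(frames)  # Conv2D run per frame
print(features.shape)                              # (None, 8, 62, 62, 16)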
Does anyone know how to implement this kind of architecture?

Thanks in advance for your attention.
Attachments: lstm.PNG, cnn_parallel.PNG

Lance Norskog

Apr 28, 2021, 12:14:00 PM4/28/21
to Matteo Scucchia, Keras-users
Have you seen this example? There is a special ConvLSTM2D layer; I think it is meant for this use case.
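Something along these lines (made-up shapes, untested):

from tensorflow.keras.layers import Input, ConvLSTM2D
from tensorflow.keras.models import Model

# ConvLSTM2D convolves over (H, W, C) at every timestep while keeping a
# recurrent state, so it combines the CNN and the LSTM in a single layer.
frames = Input(shape=(8, 64, 64, 3))  # (timesteps, H, W, C)
x = ConvLSTM2D(32, kernel_size=3, return_sequences=True)(frames)
model = Model(frames, x)
model.summary()  # output shape: (None, 8, 62, 62, 32)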




--
Lance Norskog
lance....@gmail.com
Redwood City, CA

Matteo Scucchia

Apr 28, 2021, 1:47:27 PM4/28/21
to Keras-users
Hi,
Thank you for the reply. For my experiment I need this specific architecture, so I cannot use ConvLSTM2D. I finally solved it by wrapping the two parallel CNNs in two separate TimeDistributed layers.
This is my code; the model is now created correctly:

from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense, Input, LSTM, TimeDistributed, concatenate
from tensorflow.keras.models import Model

def build_model():
    # two parallel CNN backbones (randomly initialised, 6-channel inputs)
    rgb = MobileNetV2(input_shape=(640, 480, 6), include_top=False, weights=None, pooling='avg')
    depth = MobileNetV2(input_shape=(640, 480, 6), include_top=False, weights=None, pooling='avg')
    # rename the layers so the two branches do not clash
    for layer in rgb.layers:
        layer._name = layer.name + "_rgb"
    for layer in depth.layers:
        layer._name = layer.name + "_depth"

    # sequences of 32 frames; each backbone is applied to every frame independently
    rgb_input = Input(shape=(32, 640, 480, 6))
    depth_input = Input(shape=(32, 640, 480, 6))
    rgb_cnn = TimeDistributed(rgb)(rgb_input)
    depth_cnn = TimeDistributed(depth)(depth_input)

    # concatenate the per-frame features, then one dense layer and the LSTM
    rgbd = concatenate([rgb_cnn, depth_cnn])
    dense = TimeDistributed(Dense(rgbd.shape[2], name="rgbd_dense"))(rgbd)
    lstm = LSTM(256, name="lstm_rgbd", return_sequences=True)(dense)

    # rotation head: 4 values per timestep (e.g. a quaternion)
    rotation = TimeDistributed(Dense(256, name="rotation_dense_1"))(lstm)
    rotation = TimeDistributed(Dense(256, name="rotation_dense_2"))(rotation)
    rotation_output = TimeDistributed(Dense(4, name="rotation_output"))(rotation)

    # translation head: 3 values per timestep
    translation = TimeDistributed(Dense(256, name="translation_dense_1"))(lstm)
    translation = TimeDistributed(Dense(256, name="translation_dense_2"))(translation)
    translation_output = TimeDistributed(Dense(3, name="translation_output"))(translation)

    model = Model(inputs=[rgb_input, depth_input], outputs=[rotation_output, translation_output])

    return model
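
A quick sanity check with dummy data (untested sketch; the losses are just placeholders):

import numpy as np

model = build_model()
model.compile(optimizer='adam', loss='mse')  # placeholder loss, applied to each head

# one batch of 2 sequences, 32 frames each, 640x480 with 6 channels
rgb_batch = np.zeros((2, 32, 640, 480, 6), dtype='float32')
depth_batch = np.zeros((2, 32, 640, 480, 6), dtype='float32')

rotations, translations = model.predict([rgb_batch, depth_batch])
print(rotations.shape)     # (2, 32, 4)
print(translations.shape)  # (2, 32, 3)

Since the model has two outputs, compile() also accepts a separate loss (and loss weight) per head, which is handy for balancing the rotation and translation terms.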