save/restore

permezel

Jan 30, 2025, 4:48:33 AM
to Keras-users
I am using the model described in https://keras.io/examples/generative/text_generation_with_miniature_gpt/. I train it on some text, save it, and also save the vocabulary.
I now want to restore the model and use it to generate text.

Restoring simply does not work, and I have been down all sorts of blind alleys trying to make it work.

For example, here is the TransformerBlock:
```
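import keras
from keras import layers, ops

# causal_attention_mask() is defined in the linked Keras example.


class TransformerBlock(layers.Layer):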
    def __init__(self, embed_dim, num_heads, ff_dim, rate=0.1):
        super().__init__()
        self.att = layers.MultiHeadAttention(num_heads, embed_dim)
        self.ffn = keras.Sequential(
            [
                layers.Dense(ff_dim, activation="relu"),
                layers.Dense(embed_dim),
            ]
        )
        self.layernorm1 = layers.LayerNormalization(epsilon=1e-6)
        self.layernorm2 = layers.LayerNormalization(epsilon=1e-6)
        self.dropout1 = layers.Dropout(rate)
        self.dropout2 = layers.Dropout(rate)

    def call(self, inputs):
        input_shape = ops.shape(inputs)
        batch_size = input_shape[0]
        seq_len = input_shape[1]
        causal_mask = causal_attention_mask(batch_size, seq_len, seq_len, "bool")
        attention_output = self.att(inputs, inputs, attention_mask=causal_mask)
        attention_output = self.dropout1(attention_output)
        out1 = self.layernorm1(inputs + attention_output)
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output)
        return self.layernorm2(out1 + ffn_output)
```
How should this be modified so that it works in both the Python file that trains and saves the model and the Python file that restores it? I have tried adding get_config and also decorating the class, but this requires changes on the restore.py side, because __init__ is then called with various different parameters.
