Getting good results while training, but can't get the same when loading a saved model

Alexandros Drymonitis

Jul 20, 2023, 10:57:22 AM
to Keras-users
Hi,
I'm using an LSTM RNN from this tutorial, trained on a dataset I created myself. The dataset consists of Bach chorales, in a form similar to LilyPond notation. Part of the dataset looks like this:

\bar 1 {
\Soprano b'4 cis''4 d''4 cis''4
\Alto fis'4 fis'4 fis'4 fis'8 e'8
\Tenor d'4 cis'4 b4 ais4
\Bass b4 ais4 b4 fis4
}

\bar 2 {
\Soprano b'4 cis''8 d''16 e''16 d''4 cis''4
\Alto d'4 g'4 fis'4 fis'4
\Tenor b4 b4 b4 ais4
\Bass g8 fis8 e4 b,4 fis4
}

\bar 3 {
\Soprano d''4 cis''4 b'4 cis''4
\Alto fis'4 fis'4 fis'8 gis'8 ais'4
\Tenor b4 ais4 b4 e'4
\Bass b4 fis8 e8 d4 cis4
}

\bar 4 {
\Soprano d''4 e''4 fis''2
\Alto b'8 d''8 cis''8 b'8 ais'2
\Tenor fis'4 g'4 cis'2
\Bass b,8 a,8 g,4 fis,2
}

My dataset consists of approximately 260,000 characters, whereas the Keras tutorial linked above suggests a corpus of at least 100,000 characters.
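
For context, the training script cuts this corpus into semi-redundant 40-character windows and one-hot encodes them, as the tutorial does. Roughly (a sketch of the tutorial's preprocessing; x, y, chars and char_indices are the names the training loop below relies on):

maxlen = 40  # length of each character window
step = 3     # stride between consecutive windows
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i : i + maxlen])
    next_chars.append(text[i + maxlen])

# one-hot encode: x has shape (num_windows, maxlen, num_chars),
# y holds the character that follows each window
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=bool)
y = np.zeros((len(sentences), len(chars)), dtype=bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1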

While training the network, the generated output looks quite coherent. Here's an example of the network's output during training:

\bar 25 {
        \soprano bes'4 bes'4 g'8 f'8
        \alto aes'8 g'8 c''4 bes'8 aes'8 a'4
        \tenor g4 f'4 bes4 bes4
        \bass g4 c4 f,8 e8 d4
}

\bar 7 {
        \soprano a'4 g'4 g'8 a'8
        \alto e'4 d'4 c'4
        \tenor a4 bes4 bes4
        \bass f8 e8 d8 c8 bes,8 g,8 c4
}

\bar 7 {
        \soprano c''4 d''4 e''2
        \alto fis'4 a'4 a'8 g'8 a'2
        \tenor c'8. bes16 g'



To the training chunk of the tutorial code, I have added the last bit, which saves the model when training is done:

for epoch in range(epochs):
    model.fit(x, y, batch_size=batch_size, epochs=1)
    print()
    print("Generating text after epoch: %d" % epoch)

    start_index = random.randint(0, len(text) - maxlen - 1)
    for diversity in [0.2, 0.5, 1.0, 1.2]:
        print("...Diversity:", diversity)

        generated = ""
        sentence = text[start_index : start_index + maxlen]
        print('...Generating with seed: "' + sentence + '"')

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.0
            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]
            sentence = sentence[1:] + next_char
            generated += next_char

        print("...Generated: ", generated)
        print()
    if epoch == epochs - 1 and save_model:
        model.save(model_name)
        print(f"saved model to {model_name}")

When I try to load the saved model in another Python file and seed the network with "\\bar", the output looks like this:

i481e'eei'ie1ie'e1e','4481ee2
e188',1i1,,888e1ei4e''e
8ee81
i''eee4811i1
'28
eee21eeii
ei2ie81iei
e1eee1'8'
e'e12'di42i8e'e848
8ii18i''82e'eie8'188''1e,'8ee4181iiiei'8
1i
e'eee8'e481ee8i'8e8ie8e88e41e1'8ge'288'e4ee4ai1,8'e'',8e8ei48'ie2184'88i'8
e8eeeeie'88e1'eee1ee4i8i4,4e'8bee1'i1eb88ee4
1e'4ei,,ieeee2
ee8'8'44ie1e,ei48i1'4'i8ei'2,88e'e'eei8'4'ieee4eeiii'8ie,,1e'iei28i'ee'44'488e18ei2i,eeei'ii'1


As you can see, the output I get from the saved model is not even close to what the network produces during training. The output above comes from a model saved after 10 epochs of training. If I train for more epochs (the tutorial mentions that at least 20 epochs are required), the output gets even worse. Below is the output of a saved model trained for 20 epochs:

8211'21'82'2'22''244''42222'824'8111'4'424228'828'1221824'812221424'2,'288212821i12122222121''42'1222221888122'218'2212i81241''2'88''1'42'42228881211'2442'18'22821242224'8842118'4482182241211'122121242281822244222212848881'482''''2284'228s2848128842'2''282'42i'222'221228,22222482'28''4'21'1'1122'224'2'1'8824822''4'42822'1i21'22182'122222''82'''122428812'42'2'2142844842111''21242'82212'2'2182112'22

And here is the code that loads the saved model:

"""
Keras LSTM text generator taken from
"""

from tensorflow import keras

import numpy as np
import io
import sys


# prepare the text sampling function
def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype("float64")
    #print(f"dividing by {temperature}")
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)


if __name__ == "__main__":

    if len(sys.argv) < 3:
        print("Usage: lstm_predict.py /path/to/corpus.txt /path/to/model")
        exit()
    corpus_path = sys.argv[1]
    model_name = sys.argv[2]

    model = keras.models.load_model(model_name)
    with io.open(corpus_path, encoding="utf-8") as f:
        text = f.read().lower()
        #text = text.replace("\n", " ")  # We remove newlines chars for nicer display
    print("Corpus length:", len(text))

    chars = sorted(list(set(text)))
    print("Total chars:", len(chars))
    char_indices = dict((c, i) for i, c in enumerate(chars))
    indices_char = dict((i, c) for i, c in enumerate(chars))

    # cut the text in semi-redundant sequences of maxlen characters
    maxlen = 40

    while True:
        diversity = float(input("Provide diversity: "))
        sentence = input("Provide seed: ")
        generated = ""

        for i in range(400):
            x_pred = np.zeros((1, maxlen, len(chars)))
            for t, char in enumerate(sentence):
                x_pred[0, t, char_indices[char]] = 1.0
            preds = model.predict(x_pred, verbose=0)[0]
            next_index = sample(preds, diversity)
            next_char = indices_char[next_index]
            sentence = sentence[1:] + next_char
            generated += next_char
        print(generated)
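
One difference I notice between the two scripts: during training the seed is always exactly maxlen (40) characters long, whereas here I type a much shorter seed such as "\\bar", so most timesteps of x_pred stay all-zero. In case that matters, a padding sketch I could try (hypothetical, untested):

# hypothetical: left-pad a short seed with spaces so the network always
# sees a full maxlen-character window, as it did during training
sentence = input("Provide seed: ")
if len(sentence) < maxlen:
    sentence = " " * (maxlen - len(sentence)) + sentence
sentence = sentence[-maxlen:]  # truncate if the seed is too long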


Can anyone shed some light on this?

Thank you

Alexandros Drymonitis

Jul 20, 2023, 12:44:57 PM
to Keras-users

One extra piece of information: when I save the model, I get the following message:

WARNING:absl:Found untraced functions such as lstm_cell_layer_call_fn, lstm_cell_layer_call_and_return_conditional_losses while saving (showing 2 of 2). These functions will not be directly callable after loading.

This looks harmless to me, as I don't intend to call any of these functions directly, but perhaps this has something to do with the issue?
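
If it does matter, one workaround I could try is to sidestep the SavedModel tracing altogether by saving only the weights and rebuilding the architecture in the loading script. A sketch (untested; build_model is a hypothetical helper that recreates the training-time architecture):

# at the end of training: save weights only, in HDF5 format
model.save_weights("lstm_weights.h5")

# in the loading script: rebuild the same architecture, then load the weights
model = build_model()  # hypothetical: same layers as in training
model.load_weights("lstm_weights.h5")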

Alexandros Drymonitis

Jul 21, 2023, 1:04:22 PM
to Keras-users
I didn't find a solution, but I found this tutorial, which creates and saves a well-functioning RNN that also works as expected when the saved model is loaded into a new Python script. So I'm giving up on the tutorial from my original post and will continue with the one from this post.