I'm working with the Arabic `langdata_lstm` data, whose `training_text` file contains only 84 lines of training text. I believe that is too few to build/train a reliable model. Reading the `training_text` file, I can see that it is randomly generated text with no meaning. At first I thought this was specific to Arabic, but I later found that it is the same for all other languages.
My questions are:
1. What specifications are followed when generating these `training_text` files? (For example, I can see that no line is longer than 60 characters; is this one of the specifications?)
2. Could I simply extend the `training_text` file, then generate my training data with custom fonts and start training directly? Or are there other files that must be changed after changing this file? If so, which ones, and how do I regenerate them?
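For anyone who wants to reproduce the numbers above (line count and maximum line length), here is a minimal sketch of how they could be checked. The file name `ara.training_text` and the helper names `line_stats` and `file_stats` are assumptions for illustration only, not part of `langdata_lstm` or any Tesseract tool. Note that `len()` counts Unicode code points, so Arabic text with combining diacritics may report longer lines than what is visually on the page.

```python
from pathlib import Path


def line_stats(lines):
    """Return (line_count, max_line_length) for an iterable of text lines.

    Length is measured in Unicode code points after stripping the
    trailing newline; combining marks are counted individually.
    """
    lengths = [len(line.rstrip("\r\n")) for line in lines]
    return len(lengths), max(lengths, default=0)


def file_stats(path):
    """Apply line_stats to a UTF-8 encoded text file."""
    with open(path, encoding="utf-8") as handle:
        return line_stats(handle)


if __name__ == "__main__":
    # Hypothetical location of the training text; adjust to your checkout.
    path = Path("ara.training_text")
    if path.exists():
        count, longest = file_stats(path)
        print(f"{count} lines, longest line has {longest} code points")
```

Running the same check on an extended file would at least show whether it still matches the line-length pattern of the original.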
Best Regards