Hello,
I am new to NLP and also new to fairseq.
I am trying to train a new model on my own parallel corpus. I read the fairseq documentation on how to train a new model. In the preprocessing step, it uses this command: "bash prepare-iwslt14.sh".
My main question is: if we do not have such a script, how should we prepare the data for our own language pair? What exactly are the preprocessing steps when there is no ready-made script like "prepare-iwslt14.sh"?
Thank you in advance for your help.