Alsalam Alaikum,
I am working on a semantic translation model focused on generating fluent and eloquent Arabic translations. I currently use metrics such as BERTScore to assess translation accuracy (that is, whether meaning is preserved), but it is hard to measure fluency with an automatic metric, especially in Arabic.
I am interested in incorporating Reinforcement Learning from Human Feedback (RLHF) to improve the model's fluency and overall translation quality. To this end, I would like to know the best way to carry out RLHF with fluent Arabic speakers/readers. I am also curious whether you believe RLHF is the right approach to achieving better fluency in this case, or whether there are alternative methods you would recommend.
Any guidance, recommendations, or relevant resources would be very helpful.
Much appreciated,
Abdulmohsen Abanmy