new translate problem which can deal with corrections


Arben Sabani

unread,
Apr 14, 2019, 8:21:49 AM4/14/19
to tensor2tensor
Hi Lukasz,

I am working on my bachelor thesis, and the topic of the thesis is to train an MT model using the Transformer and tensor2tensor.
The basic idea is pretty simple:
- you send the source text to the MT model
- you get an initial version of the target text back from it
- since in many cases post-editing is required (the MT output is not perfect, or the human translator has something different in mind and is not 100% happy with the output), the translator will start to make corrections to the output
- let's further assume the translator works from left to right and encounters the first word they don't like or want to change
- they change that word
- next, the corrections made by the human translator are sent back to the MT model together with the source text
- the MT model will then, hopefully, generate a better next version of the translation, taking both the corrections and the source text into account (see the sketch right after this list)
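
To make the training side concrete, here is a minimal Python sketch of how such (source + corrected prefix, target) training pairs could be generated from a parallel corpus. The <CORR> separator and the function name are placeholders of my own for illustration, not part of tensor2tensor:

    import random

    SEP = " <CORR> "  # hypothetical separator between source text and corrected prefix

    def make_prefix_example(source, reference, rng=random):
        """Build one training pair: the input is the source sentence plus a
        simulated human-corrected prefix of the reference; the target is the
        full reference, so the model learns to regenerate the prefix and then
        continue it consistently."""
        words = reference.split()
        k = rng.randint(0, len(words))  # prefix length; k=0 keeps plain MT pairs in the mix
        prefix = " ".join(words[:k])
        return source + SEP + prefix, reference

    inp, tgt = make_prefix_example(
        "They can paint their own carnival mask, Week 8 and 9.",
        "In den Wochen 8 und 9 können sie ihre eigene Karnevalsmaske malen.")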

Now the question is: why should the next version of the output be better?
- My assumption is that if I can train the model to (re)generate the corrections made by the human translator, the words the MT model generates after the corrections should be closer to the version the human translator has in mind. In a way, the human translator gives the model a hint (the corrections) about the "direction" in which it should translate.
- If I am not mistaken, the output layer of the model generates the next word by computing a conditional probability: the probability P(y_{n+1} | x, y_1, ..., y_n) of word y_{n+1} given the source text x and the words y_1, ..., y_n generated so far.
- Now, if the model learns to regenerate the words y_1, ..., y_n (the corrections made by the human translator), shouldn't the probability be higher that y_{n+1} is closer to the version the human translator has in mind? (See the decoding sketch right after this list.)
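
To illustrate the mechanism, here is a minimal Python sketch of greedy decoding with a forced prefix. next_token_logits stands in for a call into the trained Transformer (with the source text already encoded) and is not a tensor2tensor API:

    import numpy as np

    def greedy_continue(next_token_logits, prefix_ids, eos_id, max_len=50):
        """Greedy decoding where y_1..y_n are pinned to the human corrections,
        so every later token y_{n+1} is picked from P(y_{n+1} | x, y_1..y_n)
        conditioned on the corrected prefix instead of on the model's own
        first guess."""
        ys = list(prefix_ids)               # y_1..y_n: the forced corrections
        while len(ys) < max_len:
            logits = next_token_logits(ys)  # model scores for the next token
            y = int(np.argmax(logits))      # greedy choice of y_{n+1}
            ys.append(y)
            if y == eos_id:
                break
        return ys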

I just wanted to know whether my assumption makes sense to you from a theoretical point of view, or whether you could point me to some papers that deal with this topic.

I have already achieved some results by training an English-German model. The model performs pretty well given the data I used and the time I spent on training:


This is an example from a real localization project where MT was used and post-editing was required:

Source: "They can paint their own carnival mask and grow their own plants, Week 8 and 9."
MT output (from one of the big MT providers): "Sie können ihre eigene Karnevalsmaske malen und ihre eigenen Pflanzen züchten, Woche 8 und 9."

Reference translation (after human post-edit): "In den Wochen 8 und 9 können sie ihre eigene Karnevalsmaske malen und ihre eigenen Pflanzen züchten."


Now if you take the same source text and use the model I trained:

Initial version of the MT output (similar to the example above):
Sie können ihre eigene Karneval-Maske malen und ihre eigenen Pflanzen züchten, Woche 8 und 9.

Now you make the following corrections, either by just adding "In den" at the start of the initial MT output or by modifying its first two words, and send the corrections together with the source text to the MT model by pressing "Ctrl+Space":
In den Sie können ihre eigene Karneval-Maske malen und ihre eigenen Pflanzen züchten, Woche 8 und 9.
or
In den ihre eigene Karneval-Maske malen und ihre eigenen Pflanzen züchten, Woche 8 und 9.

Next version of the MT output after making the correction "In den" (almost identical to the reference translation):
In den Wochen 8 und 9 können sie ihre eigene Karneval-Maske malen und ihre eigenen Pflanzen züchten.
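
In code, pressing "Ctrl+Space" essentially just assembles the same input format the model was trained on (again with my hypothetical <CORR> separator) and sends it off as an ordinary translation request:

    SEP = " <CORR> "  # same hypothetical separator as at training time

    source = ("They can paint their own carnival mask and grow their own "
              "plants, Week 8 and 9.")
    corrected_prefix = "In den"

    # The decoder input: source text plus the human-corrected prefix. This
    # string is fed to the trained model like any normal translation request.
    model_input = source + SEP + corrected_prefix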

It would be great to get some feedback from you; it would help me a lot.

Thanks a lot,

Arben

Arben Sabani

unread,
Apr 14, 2019, 9:04:34 AM4/14/19
to tensor2tensor
I mentioned Lukasz in this post, but any comments and feedback are welcome :-)

Additional note:

Initial BLEU (newstest2013, further training required):
BLEU_uncased =  25.52
BLEU_cased =  25.01

BLEU after corrections to the initial MT output (in each sentence I changed the first word, going from left to right, that differs from the reference translation):
BLEU_uncased =  32.35
BLEU_cased =  32.07

The BLEU improvement is caused by the changed word, of course, but not only by that: the structure of the output sentences has changed, and the MT model has, where required, generated additional words.
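
For anyone who wants to compute such cased/uncased scores the same way, here is one common way of doing it with the sacrebleu package (the file names are placeholders; one hypothesis and one reference per line):

    import sacrebleu  # pip install sacrebleu

    hyps = open("mt_output.de", encoding="utf-8").read().splitlines()
    refs = open("newstest2013.ref.de", encoding="utf-8").read().splitlines()

    print("BLEU_cased   =", round(sacrebleu.corpus_bleu(hyps, [refs]).score, 2))
    print("BLEU_uncased =", round(sacrebleu.corpus_bleu(hyps, [refs], lowercase=True).score, 2))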

Thanks,

Arben

Lukasz Kaiser

unread,
Apr 28, 2019, 12:48:00 PM4/28/19
to Arben Sabani, tensor2tensor
Hi Arben,

It looks like very nice work and you're already getting great results! As for related papers, it's a bit different but I think this recent paper is definitely worth reading:

Congratulations on your results!

Lukasz


Arben Sabani

unread,
Apr 28, 2019, 4:42:38 PM4/28/19
to tensor2tensor
Thanks a lot Lukasz. I will have a look at the paper :-)


est namoc

unread,
Jun 25, 2019, 7:34:25 AM6/25/19
to tensor2tensor
Hi Arben,

I am working on a similar topic; you might want to look at https://arxiv.org/abs/1905.11006 as well. For the editing model itself, check https://papers.nips.cc/paper/6775-deliberation-networks-sequence-generation-beyond-one-pass-decoding

Arben Sabani

unread,
Jul 13, 2019, 8:47:48 AM7/13/19
to tensor2tensor
Thanks a lot, very interesting articles. I had a similar idea about how translation should work, and I like the approach in the first paper.

best

Arben