Thanks for the kind words, Kevin.
I should point out, though, that my creative contribution to that project has been minimal. Yes, it was my idea to try varying translations across multiple axes, but it was Claude that actually came up with the specific axes for the Terada Torahiko essays (other than two or three that I suggested). It also devised and implemented the idea of testing the validity of the parameters by having other LLMs evaluate sample translations at various settings. That would not have occurred to me.
Regarding the issue of human vs. LLM-driven machine translation: At least for the sorts of translations I do and have been testing with LLMs, the quality differences had already entered the range of individual preference a year or more ago. Once it became possible to have LLMs check each other’s translations for omissions and mistakes, the differences from human translation came down to subjective judgment and taste. When I use a multi-LLM workflow to translate a speech now, I still make a lot of changes to their final draft before submitting it. But the changes I make are no longer corrections of the text as a translation. Instead, they reflect my own preferences for English writing and speaking style; other human translators and editors might very well prefer the machines’ versions.
And regarding that experiment in which I had the same text translated sixteen different ways with various combinations of parameter settings: Are we safe in saying that it would be impossible to get such a set of translations from human translators? At least, I myself would not be able to create a set of translations with the variations controlled to specified parameters like that, and it would seem just as difficult to assemble a team of human translators to do so.
Tom Gally