In Google’s announcement of Gemini 1.5 yesterday, they mentioned their AI Studio, which I hadn’t known about. I had the afternoon free today, so I spent some time playing with it using the Gemini 1.0 Pro model.
One thing you can do in the AI Studio is create “structured prompts” that include uploaded examples of input and output for the model to refer to. I gave it a try with translation. I prepared a CSV file containing about 25,000 characters of Japanese speeches and my own previous English translations of them, with one paragraph per row. I uploaded that CSV file to the AI Studio and had Gemini 1.0 Pro translate a similar speech into English with those examples as reference. I then compared the resulting translation both to the Gemini translation done through the usual web interface and to my own translation, which I did last month.
The version that had been prompted with my previous translations was a bit more accurate than the zero-shot Gemini translation, which included a summary at the end that I had not asked for. Otherwise, there wasn't too much of a difference. (My own translation was better than either, I hope.)
But the speeches I used are pretty general in content and don’t use much specialized vocabulary. It would be interesting to try this for translation tasks that require specific vocabulary and a particular sentence style; one could, for example, upload bilingual glossaries and sentence pairs for the input and output examples. If that works, then one could prepare a different prompt-and-example set for each type of translation job. The translations could be done either within the AI Studio or in a local program that calls Google’s API. (The Studio will produce the code for the local program.)
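As a rough illustration of what such a prompt-and-example set might look like in a local program, here is a minimal sketch that reads paragraph pairs from CSV text and assembles a few-shot translation prompt. The two-column layout (Japanese first, English second), the function names `load_pairs` and `build_prompt`, and the wording of the instruction line are all assumptions made for this example; it is not the code that the AI Studio generates, and it stops short of calling any API.

```python
import csv
import io


def load_pairs(csv_text):
    """Read (source, translation) pairs from two-column CSV text, one paragraph per row."""
    reader = csv.reader(io.StringIO(csv_text))
    # Skip rows that do not have both a source and a translation column.
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


def build_prompt(pairs, new_paragraph):
    """Assemble a few-shot prompt: instruction, example pairs, then the new input."""
    parts = [
        "Translate the Japanese paragraph into English, "
        "following the style of the examples."
    ]
    for ja, en in pairs:
        parts.append(f"Japanese: {ja}\nEnglish: {en}")
    # Leave the final English slot empty for the model to fill in.
    parts.append(f"Japanese: {new_paragraph}\nEnglish:")
    return "\n\n".join(parts)


# Hypothetical sample data standing in for the real CSV file.
sample_csv = (
    '"こんにちは。","Hello."\n'
    '"ありがとうございます。","Thank you very much."\n'
)
pairs = load_pairs(sample_csv)
prompt = build_prompt(pairs, "本日はお集まりいただき、ありがとうございます。")
print(prompt)
```

The resulting string would then be passed to whichever API client one uses. For a specialized job, the same structure could hold glossary entries or sentence pairs in place of full paragraphs.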
The context window for Gemini 1.0 Pro in the AI Studio is currently 30,720 tokens, and my uploaded samples came close to hitting that limit. Supposedly Gemini 1.5 allows much larger context windows as well as superior performance. I have applied for developer access to Gemini 1.5. If I get it and the model does perform significantly better, I will let you all know.
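When the examples come close to the context limit, one option is to keep only as many as fit. The sketch below shows that idea under loud assumptions: `estimate_tokens` is a crude placeholder that counts one token per character and is not the model's real tokenizer, and the `reserve` parameter (room left for the new input and the output) is a made-up figure. Only the 30,720-token limit comes from the AI Studio.

```python
def estimate_tokens(text):
    # Crude placeholder, NOT the real tokenizer: one token per character.
    # A real program would ask the API or its tokenizer for the actual count.
    return len(text)


def fit_examples(pairs, budget, reserve=0, count=estimate_tokens):
    """Keep example pairs, in order, until the next one would exceed budget - reserve."""
    kept, used = [], 0
    limit = budget - reserve
    for src, tgt in pairs:
        cost = count(src) + count(tgt)
        if used + cost > limit:
            break  # stop at the first pair that does not fit
        kept.append((src, tgt))
        used += cost
    return kept, used


CONTEXT_LIMIT = 30720  # Gemini 1.0 Pro limit shown in the AI Studio

# Hypothetical example pairs of varying length.
examples = [
    ("あ" * 10000, "a" * 12000),
    ("い" * 5000, "b" * 6000),
    ("う" * 100, "c" * 120),
]
kept, used = fit_examples(examples, CONTEXT_LIMIT, reserve=2000)
print(len(kept), used)
```

Passing a different `count` function lets the same loop work with real token counts once they are available.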
The speeches I used are all available on the public web, so I didn’t feel hesitant about uploading them to the AI Studio. I would not do that with confidential or otherwise sensitive material.
Tom Gally