Looking for a production-ready Google voice module

Mehedi Hasan Shihab

Oct 27, 2025, 6:40:08 AM
to Google Developers

Hi,
I’m looking for a Google-provided, production-ready voice module for my project, similar in function to OpenAI’s Whisper, but focused specifically on real-time speech-to-speech conversation.

Does Google (through Google Cloud, Gemini, or Vertex AI) offer a high-performance API or SDK that I can purchase and integrate?

The ideal service would need to:

Transcribe incoming audio from a user in real-time.

Process that transcription (e.g., send it to an AI for a response).

Synthesize the text response back into natural-sounding speech.

Deliver this synthesized voice back to the user with minimal latency to enable a fluid, real-time conversation.
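
To make the four steps above concrete, here is a minimal sketch of the loop I have in mind. The functions transcribe, respond, and synthesize are hypothetical placeholders that I made up for illustration; they are not real Google APIs, and a real service would stream audio rather than pass text disguised as bytes.

```python
from typing import Iterable, Iterator


# NOTE: all three stage functions are hypothetical stand-ins, not real APIs.
def transcribe(chunk: bytes) -> str:
    """Placeholder speech-to-text: treats the audio chunk as UTF-8 text."""
    return chunk.decode("utf-8")


def respond(text: str) -> str:
    """Placeholder AI step: echoes the transcription back."""
    return f"You said: {text}"


def synthesize(text: str) -> bytes:
    """Placeholder text-to-speech: encodes the reply as bytes."""
    return text.encode("utf-8")


def conversation_loop(audio_chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Run transcribe -> respond -> synthesize on each incoming chunk.

    Replies are yielded one at a time so playback can begin before the
    whole input has been consumed, which is what keeps latency low.
    """
    for chunk in audio_chunks:
        text = transcribe(chunk)
        if not text.strip():
            # Skip silence or empty transcriptions.
            continue
        yield synthesize(respond(text))


if __name__ == "__main__":
    for reply_audio in conversation_loop([b"hello", b"  ", b"bye"]):
        print(reply_audio)
```

Ideally, a single Google service would replace all three placeholders so I don't have to chain separate speech-to-text, model, and text-to-speech calls myself.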

In my research, I found a model named gemini-2.5-flash-native-audio-preview-09-2025. This seems promising, but I have a few specific questions:

Is this the correct model for my real-time speech-to-speech use case?

What is the status of this model? The “preview” tag suggests it may not be stable enough for a live, production-level project. Can you confirm whether it is production-ready?

If this is the right choice, what is the recommended integration path or SDK for building a full-duplex voice assistant around this model?

Regards,
Mehedi
