Monday, May 25, 2026

OpenAI introduced new voice AI models for live speech, translation, and transcription

OpenAI presented new audio models for real-time voice AI services

OpenAI announced the launch of three new audio models for its API, enabling voice AI services with real-time features such as translation, transcription, and support for complex dialogues.

The first of the new models, GPT-Realtime-2, is built for richer dialogue and can sustain longer, more complex conversations. It can use multiple tools simultaneously, respond to changes in context, and handle specialized terminology. Developers can also set the model’s reasoning level anywhere from minimal to high. On the Big Bench Audio and Audio MultiChallenge benchmarks, GPT-Realtime-2 scored higher than the previous version.

The second model, GPT-Realtime-Translate, is designed for instant voice translation. It supports more than 70 input languages and 13 output languages, and is already being tested in international calls and customer support, including at Deutsche Telekom and the startup BolnaAI.

The third model, GPT-Realtime-Whisper, is designed for real-time speech transcription, which makes it well suited to subtitling, note-taking during calls, and automating the work of voice agents.

All three models are now available through the Realtime API. Pricing differs by model: GPT-Realtime-2 costs $32 per million input audio tokens and $64 per million output audio tokens, GPT-Realtime-Translate costs $0.034 per minute, and GPT-Realtime-Whisper costs $0.017 per minute.

The release is a notable step for voice interaction in AI: real-time translation could ease international communication, and live transcription and voice agents could automate business processes such as customer support. Experts predict that OpenAI’s new models could mark a significant advance in natural language processing.

Model                    Function                  Cost
GPT-Realtime-2           Enhanced dialogue         $32 per million input audio tokens, $64 per million output audio tokens
GPT-Realtime-Translate   Real-time translation     $0.034 per minute
GPT-Realtime-Whisper     Real-time transcription   $0.017 per minute
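
To see what these prices mean in practice, the arithmetic can be sketched in a few lines of Python. This is only a rough illustration built on the figures quoted above: the function names token_cost and minute_cost are hypothetical, and the article does not say how many audio tokens correspond to a minute of speech, so token counts have to be supplied by the caller.

```python
from decimal import Decimal

# Prices in USD, exactly as quoted in the article.
REALTIME_2_INPUT_PER_MILLION = Decimal("32")    # per million input audio tokens
REALTIME_2_OUTPUT_PER_MILLION = Decimal("64")   # per million output audio tokens
PER_MINUTE = {
    "GPT-Realtime-Translate": Decimal("0.034"),
    "GPT-Realtime-Whisper": Decimal("0.017"),
}


def token_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated GPT-Realtime-2 cost for the given audio token counts (hypothetical helper)."""
    million = Decimal(1_000_000)
    return (
        Decimal(input_tokens) * REALTIME_2_INPUT_PER_MILLION
        + Decimal(output_tokens) * REALTIME_2_OUTPUT_PER_MILLION
    ) / million


def minute_cost(model: str, minutes: float) -> Decimal:
    """Estimated cost for one of the per-minute models (hypothetical helper)."""
    # Convert through str so a float such as 12.5 is not expanded into binary noise.
    return PER_MINUTE[model] * Decimal(str(minutes))


if __name__ == "__main__":
    # Token counts here are made-up inputs, not measurements.
    print("GPT-Realtime-2:", token_cost(500_000, 250_000), "USD")
    print("Translate, 60 min:", minute_cost("GPT-Realtime-Translate", 60), "USD")
    print("Whisper, 60 min:", minute_cost("GPT-Realtime-Whisper", 60), "USD")
```

At these rates, an hour of translation comes to about $2.04 and an hour of transcription to about $1.02, while the GPT-Realtime-2 figure depends entirely on how many audio tokens a session consumes.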
