OpenAI introduced new voice AI models for live speech, translation, and transcription

OpenAI presented new audio models for real-time voice AI services

OpenAI announced the launch of three new audio models for its API, enabling voice AI services with real-time features such as translation, transcription, and support for complex dialogues.

The first of the new models, GPT-Realtime-2, is built to sustain longer and more complex conversations. The model can use multiple tools simultaneously, respond to changes in context, and work with specialized terminology. Developers can also adjust the model’s reasoning level from minimal to high. On the Big Bench Audio and Audio MultiChallenge benchmarks, GPT-Realtime-2 showed improved results compared to the previous version.

The second model, GPT-Realtime-Translate, is designed for instant voice translation. It supports over 70 input languages and 13 output languages and is already being tested in international calls and customer support, including at Deutsche Telekom and the startup BolnaAI.

The third model, GPT-Realtime-Whisper, is designed for real-time speech transcription, which suits it to subtitling, note-taking during calls, and automating the work of voice agents.

All three models are now available through the Realtime API. Pricing differs by model: GPT-Realtime-2 costs $32 per million input audio tokens and $64 per million output audio tokens, GPT-Realtime-Translate costs $0.034 per minute, and GPT-Realtime-Whisper costs $0.017 per minute.
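To make the pricing concrete, the arithmetic can be sketched in a few lines of Python. This is only an illustration built on the prices quoted above: the function names and the example token and minute counts are hypothetical, and the sketch assumes straightforward linear billing with no minimums, rounding rules, or other token types.

```python
# Prices in USD, as quoted in the article.
REALTIME_2_INPUT_PER_MILLION = 32.0    # per 1M input audio tokens
REALTIME_2_OUTPUT_PER_MILLION = 64.0   # per 1M output audio tokens
TRANSLATE_PER_MINUTE = 0.034
WHISPER_PER_MINUTE = 0.017


def realtime_2_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate GPT-Realtime-2 cost from audio token counts (hypothetical helper)."""
    total = (input_tokens * REALTIME_2_INPUT_PER_MILLION
             + output_tokens * REALTIME_2_OUTPUT_PER_MILLION)
    return total / 1_000_000


def per_minute_cost(minutes: float, rate_per_minute: float) -> float:
    """Estimate cost for a model billed per minute (hypothetical helper)."""
    return minutes * rate_per_minute


if __name__ == "__main__":
    # Example figures are made up purely for illustration.
    print(f"Dialogue:      ${realtime_2_cost(50_000, 20_000):.2f}")
    print(f"Translation:   ${per_minute_cost(10, TRANSLATE_PER_MINUTE):.2f}")
    print(f"Transcription: ${per_minute_cost(10, WHISPER_PER_MINUTE):.2f}")
```

Under these assumptions, a session with 50,000 input and 20,000 output audio tokens would come to about $2.88, while ten minutes of translation or transcription would cost about $0.34 or $0.17 respectively.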

The release is a notable step in the development of voice interaction in AI, with the potential to ease international communication and automate many business processes. Experts predict that OpenAI’s new models could be a significant advancement in natural language processing.

Model	Function	Cost (USD)
GPT-Realtime-2	Enhanced dialogue	$32 per 1M input audio tokens, $64 per 1M output audio tokens
GPT-Realtime-Translate	Real-time translation	$0.034 per minute
GPT-Realtime-Whisper	Real-time transcription	$0.017 per minute
