OpenAI launched new tools in its Realtime API for speaking, live translation, and speech-to-text, aimed at apps like customer support and education.
In short: OpenAI says developers can now build apps that talk, translate, and write down speech in real time using new features in its API.
OpenAI announced new “voice intelligence” features for its Realtime API. An API is a set of building blocks that companies and developers use to add a feature to their own apps, like plugging a new part into a machine.
One new option is GPT-Realtime-2, which OpenAI says can hold more realistic spoken conversations and handle more complex requests than the earlier version. It is meant for apps that need back-and-forth voice conversations rather than simple question-and-answer exchanges.
OpenAI also introduced GPT-Realtime-Translate for live translation. It supports more than 70 input languages, meaning it can understand speech in those languages, and it can speak back in 13 output languages. The company also launched GPT-Realtime-Whisper, a live transcription tool that turns speech into text as someone talks (like captions that appear while you speak).
OpenAI said these tools could be used in customer service, education, media, events, and creator platforms. It also said it has safeguards to reduce misuse, such as spam or fraud, and that it can stop conversations if they appear to break its harmful content rules.
OpenAI’s pricing varies by feature. Translate and Whisper are billed by the minute, while GPT-Realtime-2 is billed by “tokens,” which are small chunks of text that models use for counting usage (similar to paying by the word).
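To make the difference between the two billing styles concrete, here is a minimal sketch of how each kind of charge could be estimated. The rates and function names are hypothetical placeholders chosen for illustration, not OpenAI's actual prices or API; real figures would come from OpenAI's pricing page.

```python
# HYPOTHETICAL rates for illustration only; these are NOT OpenAI's prices.
PER_MINUTE_RATE_USD = 0.01     # placeholder rate for minute-billed features
PER_1K_TOKENS_RATE_USD = 0.02  # placeholder rate per 1,000 tokens


def per_minute_cost(seconds: float, rate_per_minute: float = PER_MINUTE_RATE_USD) -> float:
    """Estimate the cost of a session billed by audio duration."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    return (seconds / 60.0) * rate_per_minute


def per_token_cost(tokens: int, rate_per_1k: float = PER_1K_TOKENS_RATE_USD) -> float:
    """Estimate the cost of a session billed by token count."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return (tokens / 1000.0) * rate_per_1k


if __name__ == "__main__":
    # A five-minute call under minute billing, and a 5,000-token
    # conversation under token billing, both at the placeholder rates.
    print(f"minute-billed: ${per_minute_cost(300):.4f}")
    print(f"token-billed:  ${per_token_cost(5000):.4f}")
```

The practical difference is that a minute-billed cost depends only on how long the audio runs, while a token-billed cost depends on how much is said and generated, so two calls of the same length can cost different amounts.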
More apps may start offering phone-call-style help, live subtitles, or on-the-fly translation without needing a human to type or interpret every line. At the same time, voices that sound convincing can also be used to deceive people, so how well OpenAI's safeguards work in practice will matter.
Source: TechCrunch AI