Mind-blowing ChatGPT-4o real-time translation should terrify Google

With Google I/O set to focus on the increasing talents of the Gemini AI app tomorrow, OpenAI is getting in there first by launching the latest version of ChatGPT – ChatGPT-4o.

The new ChatGPT-4o – the ‘o’ stands for ‘omni’ because of its ability to handle audio, video and text – is headlined by the speed of its real-time translation.

For this iteration of ChatGPT-4, the company says it “trained a single new model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network. Because GPT-4o is our first model combining all of these modalities, we are still just scratching the surface of exploring what the model can do and its limitations.”

For people speaking different languages, this system could deliver incredible rewards. It acts as a real-time go-between, with very little latency between hearing an utterance and repeating it back in the intended language.

If the demonstration showcased during OpenAI’s presentation today is the experience users get, it throws down the gauntlet to Google – the long-time king of mobile language translation thanks to its powerful and brilliant Translate app.

One of the videos below (there are other examples too) shows a man asking ChatGPT to act as a translator.

The man asks the AI to translate everything it hears in English into Italian, and vice versa. Then OpenAI CTO Mira Murati speaks in Italian, and the English response comes very rapidly, with an impressively conversational tone.

Interestingly, the AI refers to the speaker of the original language in the third person (“she said that…”) rather than simply translating the utterance. The AI is informed by the nuances in the user’s voice and can generate voices in “a range of different emotive styles”. OpenAI says it outperforms rivals like Google and Meta in terms of speed too.

Elsewhere, videos published by the company show users interjecting to correct the AI, which quickly shifts course and responds in kind. Check out the faster counting video below. The company also showcased the AI’s incredibly lifelike conversational tone of voice and its ability to recognise its surroundings.

OpenAI says text and image input for GPT-4o is coming today, while the voice and video input will be added to the API in the coming weeks.

