Amazing iPad AI tutor demo points to an incredible new world for students

If you haven’t yet watched yesterday’s OpenAI event, I highly recommend doing so. The headline news was that the latest GPT-4o model works seamlessly with any combination of text, audio, and video.

That includes the ability to ‘show’ the GPT-4o app a screen recording you are taking of another app – and it’s this capability the company showed off with a pretty incredible iPad AI tutor demo …

GPT-4o

OpenAI said that the ‘o’ stands for ‘omni.’

GPT-4o (“o” for “omni”) is a step towards much more natural human-computer interaction—it accepts as input any combination of text, audio, and image and generates any combination of text, audio, and image outputs.

It can respond to audio inputs in as little as 232 milliseconds, with an average of 320 milliseconds, which is similar to human response time in a conversation […] GPT-4o is especially better at vision and audio understanding compared to existing models.

Even the voice aspect of this is a big deal. Previously, ChatGPT could accept voice input, but it converted it to text before working with it. GPT-4o, in contrast, actually understands speech, so completely skips the conversion stage.

As we noted yesterday, free users also get a lot of features previously limited to paying subscribers.

AI iPad tutor demo

One of the capabilities OpenAI demonstrated was the ability of GPT-4o to watch what you’re doing on your iPad screen (in split-screen mode).

The example shows the AI tutoring a student with a math problem. You can hear that, initially, GPT-4o understood the problem and wanted to immediately solve it. But the new model can be interrupted, and in this case it was asked to help the student solve it himself.

Another capability seen here is that, according to OpenAI, the model can detect emotion in speech, and can also express emotions itself. For my tastes, this was rather overdone in the demo version, and that’s reflected here – the AI is maybe a bit on the condescending side. But that’s all tuneable.

Effectively, every student in the world could have a private tutor with this kind of capability.

How much of this will Apple incorporate?

We know that AI is the primary focus of iOS 18, and that Apple is finalizing a deal to bring OpenAI features to its devices. While at the time that was described as being for ChatGPT, it now seems pretty likely that the actual deal is for access to GPT-4o.

But we also know that Apple has been working on its own AI models, with its own data centers running its own chips. For example, Apple has been working on its own way to allow Siri to make sense of app screens.

So we don’t know exactly which GPT-4o capabilities the company will bring to its devices, but this one seems so perfectly Apple that I have to believe it will be included. This is truly using technology to empower people.

Image: OpenAI. Benjamin Mayo contributed to this report.

Author

Ben Lovejoy

Ben Lovejoy is a British technology writer and EU Editor for 9to5Mac. He’s known for his op-eds and diary pieces, exploring his experience of Apple products over time, for a more rounded review. He also writes fiction, with two technothriller novels, a couple of SF shorts and a rom-com!