After first trying out the new Siri capabilities, and exploring the new Writing Tools, next in line for me on the Apple Intelligence front was the new call transcription and summary feature.
This was a feature I’d been keen to try, not least because it could completely transform the experience of interviewing someone by phone …
Recording a call
You can see the UI flow in the main image above. When you make or receive a call, there’s a new button top-left on the screen. Tap this, and an alert displays that all parties will be informed that the call is being recorded.
After a three-second countdown, a voice announcement is made:
This call will be recorded
This is a legal requirement in some US states and many countries around the world.
As recording starts, a banner appears, inviting you to take notes on the call. After that, the banner disappears and you’re just left with a waveform and a button to end recording.
The recording process really couldn’t be easier.
I believe the intention here is that audio call recording will be a system-wide feature, meaning it will work in third-party apps too, but that isn’t yet the case.
Transcription
Once the call is complete, and whether or not you accept the option to take notes, a new note opens with the audio recording embedded into it.
You can then transcribe this, which for a five-minute call took just seconds.
You can also play the recording, and get Apple Music-style time-synced highlighting of the transcription. Or you can do it the other way around: tap any part of the transcription and it will play that part of the recording.
As you might already be able to guess from the above sample, the current transcription performance is … uh … not good.
Greg’s “Yeah” was turned into “Clear straight,” and my question “What’s your normal policy on betas?” was somehow creatively reinterpreted as “What’s your normal Palestine beat?”
Things didn’t get any better from there. There were a lot of very odd substitutions, and line-breaks were rather random. For example:
Greg Gladwell
Thinking I suppose because
Greg Gladwell
It is
Greg Gladwell
One of the coolest things for awhile and will under undeniably be very , very use
Greg Gladwell
Indicted today life [a mangling of “in day-to-day life”]
Greg Gladwell
There’s gonna be you know just be able to summarize things and call or emails rather than enter
At this point it just lost half a sentence.
Something you can also see above is random formatting, like that space before the comma.
This is a first beta of a beta feature, and I have to say it looks like it!
For comparison, here’s the MacWhisper transcription of the same recording:
Very good.
Yeah.
Alright.
It’s nice and clear and straightforward.
Yeah.
What’s your normal policy on beta?
Are you doing this one a bit earlier than usual?
Out of interest a bit, yes.
I’m not going for the iPhone yet.
Just because things are a little bit critical to have it working properly.
Beta 2 is tidied up, I suppose you’d say, compared to beta 1.
And there’s fewer and fewer visual glitches and bits and pieces like that.
There were very old, random things like the Siri app selection widget had a very small and untidy board around the icons and so on.
That’s been tidied up.
Keyboard glitches, there’s one or two still present when you’re putting the visual keyboard on screen.
And just a few little niggles here and there.
It was more untidiness than functionality problems.
Right.
And you’ve gone for the Apple Intelligence thing as well, haven’t you?
Yes, I’ve tried the Apple Intelligence on the iPad because it’s M1.
Yeah.
And everything seemed to be fine.
MacWhisper obviously doesn’t identify who’s speaking, but the quality is massively better. That may in part be due to the greater power of the Mac processor, since all of this is done on-device for privacy.
Summaries
As soon as the transcript is complete, you can also tap on it to be offered a summary. Here’s what it produced for our conversation about the Apple Intelligence beta:
The “Palestine beat” part aside, it’s not terrible, just very, very generic. I’m not sure how useful it would be for most people to have such a general summary, though I guess if you’re a lawyer or someone else with hundreds or thousands of transcriptions, then perhaps indexing these would help you find the right one.
Mostly, then, I’m excited for the future
This is a very convenient way to record calls, so I’ll use it on the rare occasions I need to do so, but the current transcription capabilities are not really at the point of being useful.
But I am very excited about the potential for this once it works well. For example, I wrote a while back about how a MacWhisper transcription saved the day when I had an unusable audio track for a video, but hadn’t initially realized this – which made it far harder to sync with my backup recording.
Running the audio file through MacWhisper meant that, just 90 seconds later, I had a complete, time-stamped transcript. I could then search for a phrase used in the edit, and immediately jump to that part of the audio file to substitute it for the original. A few frame-level nudges saw the video and audio properly lip-synced. The whole process took just a few minutes.
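To give a rough idea of what that search step looks like, here’s a minimal Python sketch. It assumes the transcript is available as a list of (start time in seconds, text) segments. That format, the function names, and the sample timings are all hypothetical, for illustration only, and not MacWhisper’s actual export format.

```python
from typing import List, Optional, Tuple

# Hypothetical transcript format: (start time in seconds, text of the segment).
Segment = Tuple[float, str]


def find_phrase(segments: List[Segment], phrase: str) -> Optional[float]:
    """Return the start time of the first segment containing the phrase, or None."""
    needle = phrase.casefold()
    for start, text in segments:
        if needle in text.casefold():
            return start
    return None


def format_timestamp(seconds: float) -> str:
    """Format whole seconds as MM:SS, handy for jumping to a spot in an audio editor."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


# Illustrative data only: the text echoes lines quoted earlier, the timings are made up.
segments = [
    (0.0, "Very good."),
    (4.2, "What's your normal policy on beta?"),
    (9.8, "Beta 2 is tidied up, I suppose you'd say, compared to beta 1."),
]

start = find_phrase(segments, "tidied up")
if start is not None:
    print(format_timestamp(start))
```

The match is case-insensitive and returns only the first hit, which is enough when you’re hunting for one distinctive phrase.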
I can absolutely see myself using an iPhone as an additional audio recording device during interviews, making it really easy to find quotes and listen to them again.
For telephone interviews in particular, the sheer convenience of immediately having a time-synced transcription will be fantastic.
So … not usable yet, but given the performance of other transcription tools out there, I suspect it won’t take too long until it is.