A more intelligent and connected Siri has been one of the flagship features of recent iOS updates, and Steven Levy at Backchannel has a new interview with Apple execs that features a behind-the-scenes look at what goes into Apple’s artificially intelligent assistant. The piece begins by detailing Siri’s voice recognition upgrade in 2014, when the system moved “to a neural-net based system” for US users, and mentions yet another upgrade coming with iOS 10…
Initial reviews were ecstatic, but over the next few months and years, users became impatient with its shortcomings. All too often, it erroneously interpreted commands. Tweaks wouldn’t fix it.
So Apple moved Siri voice recognition to a neural-net based system for US users on that late July day (it went worldwide on August 15, 2014). Some of the previous techniques remained operational — if you’re keeping score at home, this includes “hidden Markov models” — but now the system leverages machine learning techniques, including deep neural networks (DNN), convolutional neural networks, long short-term memory units, gated recurrent units, and n-grams. (Glad you asked.) When users made the upgrade, Siri still looked the same, but now it was supercharged with deep learning.
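For readers keeping score at home, the humblest item on that list — n-grams — is easy to illustrate. A recognizer uses a language model to prefer word sequences that people actually say over acoustically similar alternatives. Below is a toy bigram model in Python; it is purely illustrative (the corpus, smoothing, and scoring are my own simplifications, not anything from Apple's system):

```python
from collections import defaultdict
import math


class BigramLM:
    """Toy bigram language model of the kind a recognizer's decoder
    might use to score competing transcription hypotheses."""

    def __init__(self, corpus, alpha=1.0):
        self.alpha = alpha                  # Laplace smoothing constant
        self.bigrams = defaultdict(int)
        self.unigrams = defaultdict(int)
        self.vocab = set()
        for sentence in corpus:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            self.vocab.update(tokens)
            for a, b in zip(tokens, tokens[1:]):
                self.bigrams[(a, b)] += 1
                self.unigrams[a] += 1

    def log_prob(self, sentence):
        """Log-probability of a sentence under the smoothed bigram model."""
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        v = len(self.vocab)
        lp = 0.0
        for a, b in zip(tokens, tokens[1:]):
            # Smoothed conditional probability P(b | a)
            p = (self.bigrams[(a, b)] + self.alpha) / (self.unigrams[a] + self.alpha * v)
            lp += math.log(p)
        return lp


# Hypothetical in-domain commands the model is trained on.
corpus = ["set a timer for ten minutes", "set an alarm for seven", "call my mom"]
lm = BigramLM(corpus)

# An acoustically confusable pair: the familiar phrasing scores higher.
assert lm.log_prob("set a timer for ten minutes") > lm.log_prob("set a timer for tin minutes")
```

In a real system a model like this (or its neural successors) rescores the acoustic model's candidate transcriptions; the deep networks Levy lists took over progressively more of that pipeline.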
Levy says Apple didn’t publicize the backend upgrade (until now), but the improvements at the time were dramatic (this was around the time of the iPhone 6, if you recall).
“This was one of those things where the jump was so significant that you do the test again to make sure that somebody didn’t drop a decimal place,” says Eddy Cue, Apple’s senior vice president of internet software and services.
The focus of the piece and the point of the interviews seem to be to communicate that Apple’s Siri is a real competitor to similar efforts by Google and Microsoft.
As we sat down, they handed me a dense, two-page agenda listing machine-learning-imbued Apple products and services — ones already shipping or about to — that they would discuss.
The message: We’re already here. A player. Second to none.
But we do it our way.
Here’s Schiller making the case:
“We’ve been seeing over the last five years a growth of this inside Apple,” says Phil Schiller. “Our devices are getting so much smarter at a quicker rate, especially with our Apple design A series chips. The back ends are getting so much smarter, faster, and everything we do finds some reason to be connected. This enables more and more machine learning techniques, because there is so much stuff to learn, and it’s available to [us].”
And in the piece, Federighi describes Apple’s approach to machine learning as applying to each team’s project versus having a single machine learning team.
“We don’t have a single centralized organization that’s the Temple of ML in Apple,” says Craig Federighi. “We try to keep it close to teams that need to apply it to deliver the right user experience.”
How many people at Apple are working on machine learning? “A lot,” says Federighi after some prodding.
Levy uses the iPad Pro’s Apple Pencil with palm rejection as an example:
One example of this is the Apple Pencil that works with the iPad Pro. In order for Apple to include its version of a high-tech stylus, it had to deal with the fact that when people wrote on the device, the bottom of their hand would invariably brush the touch screen, causing all sorts of digital havoc. Using a machine learning model for “palm rejection” enabled the screen sensor to detect the difference between a swipe, a touch, and a pencil input with a very high degree of accuracy. “If this doesn’t work rock solid, this is not a good piece of paper for me to write on anymore — and Pencil is not a good product,” says Federighi. If you love your Pencil, thank machine learning.
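To make the palm-rejection idea concrete: each contact on the screen comes with geometric features, and a classifier decides whether it is a deliberate input or a resting hand. The sketch below is a hand-written threshold rule standing in for Apple's learned model — the feature names and thresholds are invented for illustration only:

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """Hypothetical features reported for one touch contact."""
    major_axis_mm: float   # long axis of the contact ellipse
    minor_axis_mm: float   # short axis
    pressure: float        # normalized 0..1


def classify(c: Contact) -> str:
    """Label a contact 'pencil', 'palm', or 'finger' (toy heuristic,
    not Apple's model, which is learned rather than hand-tuned)."""
    area = c.major_axis_mm * c.minor_axis_mm
    if area < 4 and c.pressure > 0.2:
        return "pencil"    # tiny, firm contact: stylus tip
    if area > 120:
        return "palm"      # large blob: resting hand, ignore it
    return "finger"


touches = [Contact(1.5, 1.2, 0.6), Contact(18, 9, 0.3), Contact(7, 6, 0.4)]
print([classify(c) for c in touches])  # → ['pencil', 'palm', 'finger']
```

The point Federighi makes is that a fixed rule like this is too brittle for real hands; a trained model over many such features is what gets the accuracy "rock solid."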
The piece is probably the most detailed look at Siri that we’ve seen to date, and if there’s any news to be made from it, it’s this on Siri and iOS 10:
With iOS 10, scheduled for full release this fall, Siri’s voice becomes the last of the four components to be transformed by machine learning. Again, a deep neural network has replaced a previously licensed implementation. Essentially, Siri’s remarks come from a database of recordings collected in a voice center; each sentence is a stitched-together patchwork of those chunks. Machine learning, says Gruber, smooths them out and makes Siri sound more like an actual person.
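The “stitched-together patchwork” Gruber describes is concatenative synthesis: prerecorded chunks joined end to end, with the seams being where the robotic quality creeps in. One classic smoothing trick is a short crossfade at each join; the sketch below shows that idea on raw sample lists (Apple's actual smoothing is a learned model — this is only the simplest illustration of blending a seam):

```python
def crossfade_join(a, b, overlap=3):
    """Concatenate two sample lists, linearly blending `overlap`
    samples at the seam so the join is less audible."""
    if overlap == 0 or not a or not b:
        return a + b
    head, tail = a[:-overlap], a[-overlap:]
    mixed = [
        # Weight slides from mostly-a to mostly-b across the overlap.
        tail[i] * (1 - (i + 1) / (overlap + 1)) + b[i] * ((i + 1) / (overlap + 1))
        for i in range(overlap)
    ]
    return head + mixed + b[overlap:]


unit1 = [0.0, 0.5, 1.0, 1.0]   # end of one recorded chunk
unit2 = [0.0, 0.0, 0.5, 0.2]   # start of the next
print(crossfade_join(unit1, unit2))
```

A hard cut from `1.0` straight to `0.0` would click; the blended seam ramps between the two chunks instead, which is the per-join version of what "making Siri sound more like an actual person" requires at scale.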
That may explain Barbra Streisand’s September 30th comment from earlier this week, if an upgrade to Siri is scheduled separately from iOS 10, which is expected earlier in the month. (I don’t think there was an iOS update, for example, on August 15, 2014, when the upgrade mentioned at the top went global.)
The whole piece is packed with interesting stories about Siri dating back to the team being acquired by Apple under Steve Jobs, Apple’s conservative approach to privacy, and what Apple is doing with Differential Privacy in iOS 10. Give it a read here.
Image Credit: Michelle Le