Apple details personalized ‘Hey Siri’ voice recognition in latest Machine Learning Journal entry

Zac Hall | Apr 16 2018 - 8:23 am PT

Apple’s Siri team has published a new Machine Learning Journal entry that details some of the process behind making voice-activated ‘Hey Siri’ work with just your voice. Apple previously documented part of the process behind pulling off voice-activated Siri in general last fall, and the first Machine Learning Journal entry of this year focuses on the challenge of speaker recognition.

As referenced in the previous entry, Apple says the phrase ‘Hey Siri’ was chosen in part because a number of users were already using it naturally when activating Siri with a hardware button.

The phrase “Hey Siri” was originally chosen to be as natural as possible; in fact, it was so natural that even before this feature was introduced, users would invoke Siri using the home button and inadvertently prepend their requests with the words, “Hey Siri.”

The new entry describes three challenges with activating Siri by voice: the main user saying a similar phrase to Hey Siri, another user saying Hey Siri, or another user saying a similar phrase to Hey Siri.

By limiting activation to the main user’s voice, the design ideally prevents two out of those three issues. The entry scratches the surface of how Apple approaches that problem:

We measure the performance of a speaker recognition system as a combination of an Imposter Accept (IA) rate and a False Reject (FR) rate. It is important, however, to distinguish (and equate) these values from those used to measure the quality of a key-phrase trigger system.
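To make the two metrics concrete, here is a minimal sketch of how an Imposter Accept rate and a False Reject rate could be computed from a set of speaker-match scores. It is an illustration only, not Apple's implementation: the function name `ia_fr_rates`, the score values, and the fixed decision threshold are all assumptions made for the example.

```python
from typing import Sequence, Tuple


def ia_fr_rates(
    target_scores: Sequence[float],
    imposter_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (imposter_accept_rate, false_reject_rate) at a given threshold.

    Hypothetical helper: a trial is accepted when its score is >= threshold.
    target_scores come from the enrolled user, imposter_scores from anyone else.
    """
    if not target_scores or not imposter_scores:
        raise ValueError("both score lists must be non-empty")

    # Imposter Accept: another speaker's utterance is wrongly accepted.
    imposter_accepts = sum(1 for s in imposter_scores if s >= threshold)
    # False Reject: the enrolled user's utterance is wrongly rejected.
    false_rejects = sum(1 for s in target_scores if s < threshold)

    ia_rate = imposter_accepts / len(imposter_scores)
    fr_rate = false_rejects / len(target_scores)
    return ia_rate, fr_rate


# Made-up scores purely for illustration.
target = [0.9, 0.8, 0.4, 0.7]
imposters = [0.1, 0.6, 0.3, 0.2, 0.5]
ia, fr = ia_fr_rates(target, imposters, threshold=0.5)
print(f"IA rate: {ia:.2f}, FR rate: {fr:.2f}")
```

Raising the threshold in a sketch like this lowers the IA rate at the cost of a higher FR rate, which is why the two numbers are reported together.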

As with each Machine Learning Journal entry, the piece then takes a relatively detailed look at Apple’s implementation before touching on the unsolved problems with the feature: using Hey Siri in a noisy environment or large room.

One of our current research efforts is focused on understanding and quantifying the degradation in these difficult conditions in which the environment of an incoming test utterance is a severe mismatch from the existing utterances in a user’s speaker profile.

Voice-activated Siri started with the iPhone 6, as the piece notes, although the original version only worked when the device was charging. Today, Hey Siri works on new iPhones, iPads, and Apple Watches without charging, and it’s the primary controller for HomePod. In the future, the same Hey Siri feature may be how we interact with AirPods as well.

The full entry — which is based on research submitted for the International Conference on Acoustics, Speech, and Signal Processing — offers a rare close look at the amount of thinking behind a feature that hopefully feels natural to the user.
