Researchers at three universities have accused Apple of taking an ‘immense risk’ with the security of user data thanks to what they say is a poor implementation of differential privacy.
Differential privacy is a method of allowing Apple and other companies to analyse user data in a way intended to be completely anonymous. Enough noise is injected into the data that it is supposed to be impossible to match any of that data to a specific individual.
However, security researchers have for the second time questioned how well Apple’s implementation works in practice …
A cryptography professor from Johns Hopkins last year accused Apple of failing to adequately test its approach, and using a custom implementation whose safety could not be independently verified because Apple chooses not to share the code.
Wired reports that researchers at the University of Southern California, Indiana University, and China’s Tsinghua University have now gone much further than this. They reverse-engineered some of the differential privacy code used by Apple and said that the protection was so poor as to be pointless.
If you’re not up to speed on differential privacy, we ran an explainer last year. The bottom line is that Apple gathers personal data from everyone who opts in, and then adds some ‘noise’ to that data designed to make it impossible to work out which data came from which person.
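To make the idea of 'noise' concrete, here is a minimal sketch of randomized response, a classic textbook technique for this kind of collection. It is not Apple's actual mechanism, which the company has not published; the function names, the 50/50 coin flips and the made-up population of 10,000 users are all assumptions for illustration. Each individual answer is deniable, yet the overall rate can still be estimated.

```python
import random

def randomized_response(truth: bool, rng: random.Random) -> bool:
    """Report the true answer half the time, otherwise report a coin flip."""
    if rng.random() < 0.5:
        return truth
    return rng.random() < 0.5

def estimate_true_rate(reports: list[bool]) -> float:
    """Undo the noise in aggregate: observed = 0.25 + 0.5 * true_rate."""
    observed = sum(reports) / len(reports)
    return 2.0 * (observed - 0.25)

# Hypothetical population: 30% of 10,000 users have the sensitive attribute.
rng = random.Random(0)
population = [i < 3000 for i in range(10000)]
reports = [randomized_response(answer, rng) for answer in population]
estimate = estimate_true_rate(reports)
print(f"estimated rate: {estimate:.3f} (true rate: 0.300)")
```

No single report proves anything about the person who sent it, because a 'yes' may simply be the result of a coin flip, but the estimate across everyone lands close to the real figure.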
The effectiveness of that noise is measured by something called the epsilon value, where the lower the number, the greater the protection. Many security researchers treat a value of around 1 as the upper limit for a strong guarantee, with protection weakening as the number climbs above that. The researchers say the numbers used by Apple are scary.
The research team determined that macOS’s implementation of differential privacy uses an epsilon of 6, while iOS 10 has an epsilon of 14. As that epsilon value increases, the risk that an individual user’s specific data can be ascertained increases exponentially.
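The 'exponentially' is meant literally. Under the standard definition of differential privacy, a single report can shift an observer's odds about you by a factor of at most e raised to the power of epsilon. A short sketch of that arithmetic, using the epsilon values reported above (the function name and labels are ours, and these are worst-case bounds rather than measured leakage):

```python
import math

def worst_case_odds_factor(epsilon: float) -> float:
    """Largest factor by which one report can multiply an observer's odds."""
    return math.exp(epsilon)

reported = [
    ("value researchers consider reassuring", 1),
    ("macOS, as reported", 6),
    ("iOS 10, as reported", 14),
]
for label, epsilon in reported:
    factor = worst_case_odds_factor(epsilon)
    print(f"{label}: epsilon={epsilon}, odds factor up to {factor:,.1f}")
```

An epsilon of 1 caps the shift at under 3x; an epsilon of 6 allows roughly 400x, and an epsilon of 14 allows more than a million-fold.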
And it’s not just random researchers who hold this view, says the Wired piece: one of the inventors of differential privacy, Frank McSherry, agrees.
“Anything much bigger than one is not a very reassuring guarantee,” McSherry says. “Using an epsilon value of 14 per day strikes me as relatively pointless” as a safeguard […]
“Apple has put some kind of handcuffs on in how they interact with your data,” he says. “It just turns out those handcuffs are made out of tissue paper.”
McSherry explains the risk with a very practical example of someone who has a 1-in-a-million medical condition uploading data from the Health app.
After one upload obfuscated with an injection of random data, McSherry says, the company’s data analysts would be able to figure out with 50 percent certainty whether the person had the condition. After two days of uploads, the analysts would know about that medical condition with virtually 100 percent certainty.
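McSherry's figures can be reproduced with a few lines of Bayesian arithmetic. The sketch below assumes the worst case permitted by an epsilon of 14 per day, with each day's upload multiplying the analyst's odds by up to e to the power of 14; the function name is ours, and a real analysis of real data would not necessarily reach this bound.

```python
import math

def worst_case_posterior(prior: float, epsilon: float, uploads: int = 1) -> float:
    """Upper bound on an observer's belief after a number of uploads."""
    prior_odds = prior / (1.0 - prior)
    # Each upload can multiply the odds by at most exp(epsilon).
    posterior_odds = prior_odds * math.exp(epsilon * uploads)
    return posterior_odds / (1.0 + posterior_odds)

one_in_a_million = 1e-6
for days in (1, 2):
    belief = worst_case_posterior(one_in_a_million, epsilon=14, uploads=days)
    print(f"after {days} day(s): up to {belief:.5%} certainty")
```

After one day the bound sits at a little over 50 percent; after two it is within a whisker of 100 percent, matching the example above.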
Apple denies the claim, accusing the researchers of a fundamental misunderstanding in the way they calculated the epsilon values. The company says the researchers added together the separate epsilon values for each type of data to reach a total for a given upload, on the assumption that Apple could correlate all of that data.
Apple says it doesn’t correlate those different kinds of data—that it’s not sure how disparate data types like emoji use and health data could be meaningfully correlated. And it adds that it also doesn’t assemble profiles of individuals over time, institutes limits on storing data that would make that correlation impossible, and throws out any data like IP addresses that could be used as unique identifiers to tie any findings to a particular person.
The researchers argue that this relies on users trusting Apple not to abuse the data, when the whole point of differential privacy is to provide mathematical guarantees that it isn’t possible to do so.
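The disagreement comes down to how privacy budgets combine. If one party can link every report to the same person, the epsilon values add up; if the reports genuinely cannot be linked, each category stands alone. A sketch of the two views, using category names and per-category values that are invented purely for illustration and are not Apple's real figures:

```python
def combined_epsilon(budgets: dict[str, float]) -> float:
    """The researchers' view: linkable reports compose, so budgets add up."""
    return sum(budgets.values())

def largest_single_epsilon(budgets: dict[str, float]) -> float:
    """Apple's view: unlinked categories each stand on their own budget."""
    return max(budgets.values())

# Hypothetical per-category budgets, chosen only to show the arithmetic.
example_budgets = {"emoji": 1.0, "search": 2.0, "health": 0.5}

print("if reports can be correlated:", combined_epsilon(example_budgets))
print("if reports cannot be correlated:", largest_single_epsilon(example_budgets))
```

Which of the two numbers applies depends on whether correlation is impossible or merely something the company chooses not to do, which is exactly the point the researchers are pressing.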
We ran a poll on differential privacy when Apple began using it to collect web-browsing and health data. At that point, the majority of you were either 100% or very comfortable with Apple doing so, while 9% were 100% or very uncomfortable.
We’ve reached out to Apple for comment, and will update with any response. If you want to opt out of Apple’s data collection, you can do so in Settings > Privacy: scroll to Analytics at the bottom and switch off the ‘Share iPhone & Watch Analytics Data’ toggle.