The GDPR gives European citizens a legal right to see all the personal data held about them, among other rights. But a tech writer who requested his data from Apple, Amazon, Facebook and Google found that it isn’t as easy to understand as it should be …
The Verge’s Jon Porter conducted the test.
What I found suggested that while you can certainly get the raw data, actually understanding it is another matter, which makes it harder to make informed decisions about your data.
[The law requires that personal data] be provided to the individual in a “concise, transparent, intelligible and easily accessible form, using clear and plain language” in a “commonly used electronic format.”
But the extent to which companies comply with the ‘intelligible and easily accessible’ part of that varies, with Facebook ironically doing the best job, followed by Amazon, then Apple, and finally Google.
Facebook, wrote Porter, was the gold standard.
Facebook actually had the most comprehensible data of the four services. For starters, every single file Facebook gives you is an HTML file. Each is sorted into its own clearly labeled folder, and an index file gives you an overview of what each document contains. The files themselves are clearly laid out and formatted, and browsing them feels almost like browsing a page on Facebook itself, albeit one that’s stored entirely locally on your computer.
Apple provided a mix of file formats, and didn’t make the data easy to parse.
The majority of the data Apple provided was in file types that were easy to read and understand like CSV, TXT, and JPG, with only a couple of JSON files to confuse things.
But once you get into these files, there’s still a lot of information that’s difficult to understand. A file titled, “Apple ID Account Information” appeared to contain 11 nearly identical records about my Apple account, all created on exactly the same date in 2014, with no explanation as to what they were. Another CSV file with the ambiguous title of “Apps and Service Analytics” appears to contain an entire list of every single one of my App Store searches, but it has so many empty cells that I only noticed it had data in it when I saw its 6.7MB file size.
Amazon was a little better, but Google did very poorly.
All of my location data from Google was contained within a single 61MB JSON file, and opening it with Chrome revealed a bewildering array of fields labeled “timestampMs,” “latitudeE7,” “longitudeE7,” and estimations about whether I was sitting still or in some kind of transport (I assume).
I don’t doubt that this is all the location history information that Google has associated with my account, but without context, this data is meaningless. It’s a series of numbers that I’d have to make a serious effort to even begin to understand and import into another piece of software to properly parse.
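To give a sense of the effort involved, here is a minimal Python sketch of what decoding those fields might look like. It rests on assumptions that are not spelled out in Porter's piece: that the records sit in a top-level list named "locations", that "timestampMs" is a string of milliseconds since the Unix epoch, and that the "E7" fields are degrees multiplied by ten million. The function names and the sample record are made up for illustration.

```python
import json
from datetime import datetime, timezone


def decode_record(record):
    """Turn one raw location record into human-readable values.

    Assumes timestampMs is milliseconds since the Unix epoch and that
    latitudeE7/longitudeE7 are degrees multiplied by 10**7.
    """
    seconds = int(record["timestampMs"]) / 1000
    when = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return {
        "time_utc": when.isoformat(),
        "latitude": record["latitudeE7"] / 1e7,
        "longitude": record["longitudeE7"] / 1e7,
    }


def decode_export(text):
    """Decode every record in an export, assuming a top-level "locations" list."""
    data = json.loads(text)
    return [decode_record(r) for r in data.get("locations", [])]


# Hypothetical one-record sample, not real export data.
sample = (
    '{"locations": [{"timestampMs": "1546300800000", '
    '"latitudeE7": 515074000, "longitudeE7": -1278000}]}'
)
decoded = decode_export(sample)
print(decoded[0])
```

Even a short script like this is more than most consumers should have to write just to see where a company thinks they have been.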
Other Google data was equally hard to understand. Indeed, the company recently received the largest GDPR fine to date for lack of transparency.
You can argue that file formats like CSV comply with the letter of the law. They are commonly used, and most consumers have software that will open them – as text files if nothing else. CSV is also a good format if you want to run some kind of analysis on the data.
But the spirit of the law is that ordinary consumers can easily understand how much a company knows about them, and whether they are happy for it to have that data. I can’t help feeling that Facebook’s approach is the right one here. HTML lets consumers easily browse the data in the familiar environment of a web browser, quickly getting a sense of whether or not they are comfortable with what they see.