Thursday, May 14, 2015

Open letter to Google on RTBF processes

I'm one of 80 signatories to an open letter to Google requesting more transparency from the company over how it processes so-called ‘right to be forgotten’ requests. The letter was drafted and coordinated by Julia Powles at Cambridge University and Ellen P. Goodman at Rutgers University School of Law.

The full text is copied below.
"What We Seek
Aggregate data about how Google is responding to the >250,000 requests to delist links thought to contravene data protection from name search results. We should know if the anecdotal evidence of Google’s process is representative: What sort of information typically gets delisted (e.g., personal health) and what sort typically does not (e.g., about a public figure), in what proportions and in what countries?
Why It’s Important
Google and other search engines have been enlisted to make decisions about the proper balance between personal privacy and access to information. The vast majority of these decisions face no public scrutiny, though they shape public discourse. What’s more, the values at work in this process will/should inform information policy around the world. A fact-free debate about the RTBF is in no one’s interest.
Why Google
Google is not the only search engine, but no other private entity or Data Protection Authority has processed anywhere near the same number of requests (most have dealt with several hundred at most). Google has by far the best data on the kinds of requests being made, the most developed guidelines for handling them, and the most say in balancing informational privacy with access in search. We address this letter to Google, but the request goes out to all search engines subject to the ruling.

One year ago, the European Court of Justice, in Google Spain v AEPD and Mario Costeja González, determined that Google and other search engines must respond to users’ requests under EU data protection law concerning search results on queries of their names. This has become known as the Right to Be Forgotten (RTBF) ruling. The undersigned have a range of views about the merits of the ruling. Some think it rightfully vindicates individual data protection/privacy interests. Others think it unduly burdens freedom of expression and information retrieval. Many think it depends on the facts.
We all believe that implementation of the ruling should be much more transparent for at least two reasons: (1) the public should be able to find out how digital platforms exercise their tremendous power over readily accessible information; and (2) implementation of the ruling will affect the future of the RTBF in Europe and elsewhere, and will more generally inform global efforts to accommodate privacy rights with other interests in data flows.
Google reports that it has received over 250,000 individual requests concerning one million URLs in the past year. It also reports that it has delisted from name search results just over 40% of the URLs that it has reviewed. In various venues, Google has shared some 40 examples of delisting requests granted and denied (including 22 examples on its website), and it has revealed the top sources of material requested to be delisted (amounting to less than 8% of total candidate URLs). Most of the examples surfaced more than six months ago, with minimal transparency since then. While Google’s decisions will seem reasonable enough to most, in the absence of real information about how representative these are, the arguments about the validity and application of the RTBF are impossible to evaluate with rigour.
Beyond anecdote, we know very little about what kind and quantity of information is being delisted from search results, what sources are being delisted and on what scale, what kinds of requests fail and in what proportion, and what are Google’s guidelines in striking the balance between individual privacy and freedom of expression interests.
The RTBF ruling addresses the delisting of links to personal information that is “inaccurate, inadequate, irrelevant, or excessive for the purposes of data processing,” and which holds no public interest. Both opponents and supporters of the RTBF are concerned about overreach. Because there is no formal involvement of original sources or public representatives in the decision-making process, there can be only incidental challenges to information that is delisted, and few safeguards for the public interest in information access. Data protection authorities seem content to rely on search engines’ application of the ruling’s balancing test, citing low appeal rates as evidence that the balance is being appropriately struck. Of course, this statistic reveals no such thing. So the sides do battle in a data vacuum, with little understanding of the facts — facts that could assist in developing reasonable solutions.
Peter Fleischer, Google Global Privacy Counsel, reportedly told the 5th European Data Protection Days on May 4 that, “Over time, we are building a rich program of jurisprudence on the [RTBF] decision.” (Bhatti, Bloomberg, May 6). It is a jurisprudence built in the dark. For example, Mr. Fleischer is quoted as saying that the RTBF is “about true and legal content online, not defamation.” This is an interpretation of the scope and meaning of the ruling that deserves much greater elaboration, substantiation, and discussion.
We are not the only ones who want more transparency. Google’s own Advisory Council on the RTBF in February 2015 recommended more transparency, as did the Article 29 Working Party in November 2014. Both recommended that data controllers should be as transparent as possible by providing anonymised and aggregated statistics as well as the process and criteria used in delisting decisions. The benefits of such transparency extend to those who request that links be delisted, those who might make such requests, those who produce content that is or might be delisted, and the wider public who might or do access such material. Beyond this, transparency eases the burden on search engines by helping to shape implementation guidelines and revealing aspects of the governing legal framework that require clarification.
Naturally, there is some tension between transparency and the very privacy protection that the RTBF is meant to advance. The revelations that Google has made so far show that there is a way to steer clear of disclosure dangers. Indeed, the aggregate information that we seek threatens privacy far less than the scrubbed anecdotes that Google has already released, or the notifications that it is giving to webmasters registered with Google webmaster tools. The requested data is divorced from individual circumstances and requests. Here is what we think, at a minimum, should be disclosed:
  1. Categories of RTBF requests/requesters that are excluded or presumptively excluded (e.g., alleged defamation, public figures) and how those categories are defined and assessed.
  2. Categories of RTBF requests/requesters that are accepted or presumptively accepted (e.g., health information, address or telephone number, intimate information, information older than a certain time) and how those categories are defined and assessed.
  3. Proportion of requests and successful delistings (in each case by % of requests and URLs) that concern categories including (taken from Google anecdotes): (a) victims of crime or tragedy; (b) health information; (c) address or telephone number; (d) intimate information or photos; (e) people incidentally mentioned in a news story; (f) information about subjects who are minors; (g) accusations for which the claimant was subsequently exonerated, acquitted, or not charged; and (h) political opinions no longer held.
  4. Breakdown of overall requests (by % of requests and URLs, each according to nation of origin) according to the WP29 Guidelines categories. To the extent that Google uses different categories, such as past crimes or sex life, a breakdown by those categories. Where requests fall into multiple categories, that complexity too can be reflected in the data.
  5. Reasons for denial of delisting (by % of requests and URLs, each according to nation of origin). Where a decision rests on multiple grounds, that complexity too can be reflected in the data.
  6. Reasons for grant of delisting (by % of requests and URLs, each according to nation of origin). As above, multi-factored decisions can be reflected in the data.
  7. Categories of public figures denied delisting (e.g., public official, entertainer), including whether a Wikipedia presence is being used as a general proxy for status as a public figure.
  8. Source (e.g., professional media, social media, official public records) of material for delisted URLs by % and nation of origin (with top 5–10 sources of URLs in each category).
  9. Proportion of overall requests and successful delistings (each by % of requests and URLs, and with respect to both, according to nation of origin) concerning information first made available by the requestor (and, if so, (a) whether the information was posted directly by the requestor or by a third party, and (b) whether it is still within the requestor’s control, such as on his/her own Facebook page).
  10. Proportion of requests (by % of requests and URLs) where the information is targeted to the requester’s own geographic location (e.g., a Spanish newspaper reporting on a Spanish person about a Spanish auction).
  11. Proportion of searches for delisted pages that actually involve the requester’s name (perhaps in the form of % of delisted URLs that garnered certain threshold percentages of traffic from name searches).
  12. Proportion of delistings (by % of requests and URLs, each according to nation of origin) for which the original publisher or the relevant data protection authority participated in the decision.
  13. Specification of (a) types of webmasters that are not notified by default (e.g., malicious porn sites); (b) proportion of delistings (by % of requests and URLs) where the webmaster additionally removes information or applies robots.txt at source; and (c) proportion of delistings (by % of requests and URLs) where the webmaster lodges an objection.
As of now, only about 1% of requesters denied delisting are appealing those decisions to national Data Protection Authorities. Webmasters are notified in more than a quarter of delisting cases (Bloomberg, May 6). They can appeal the decision to Google, and there is evidence that Google may revise its decision. In the remainder of cases, the entire process is silent and opaque, with very little public process or understanding of delisting.
The ruling effectively enlisted Google into partnership with European states in striking a balance between individual privacy and public discourse interests. The public deserves to know how the governing jurisprudence is developing. We hope that Google, and all search engines subject to the ruling, will open up."
The full list of signatories, who have the additional honour of riding high in the TechnoLlama approval ratings, is available at the Guardian and medium.com.
