Google has been accepting requests from European users exercising their newfound right to be forgotten since May, but has offered little insight into which sites users most often want removed from search engine results. In the intervening time, many have speculated that freedom of the press would take a hit. Now Google has broken its silence and shared data showing that it is social networks, not media websites, that users most want Google to forget.
In its Transparency Report, Google provides some information on the kinds of right to be forgotten (RTBF) requests that it receives and how it processes them. From the launch of the official request process on May 29, Google has reviewed 146,938 requests and evaluated a total of 498,830 URLs for removal. It has removed 41.8% of the URLs reviewed, and opted not to remove 58.2%. The removal rate varied by country, with Google removing 53% of the URLs submitted by German users and only 24.2% of the URLs submitted by Italian users.
Google also provides examples of the types of requests that it has received through the official request process. Examples of requests that Google chose to honor include: “An individual requested that we remove close to 50 links to articles about an embarrassing private exchange that became public. The pages have been removed from search results for his name,” and further, “A woman requested that we remove a decades-old article about her husband’s murder, which included her name. The page has been removed from search results for her name.”
Examples of requests that Google did not honor include: “A financial professional asked us to remove more than 10 links to pages reporting on his arrest and conviction for financial crimes. We did not remove the pages from search results,” and lastly, “An individual asked us to remove links to articles on the Internet that reference his dismissal for sexual crimes committed on the job. We did not remove the pages from search results.”
Google also shared a breakdown of the sites that have been most impacted by RTBF requests. The top site affected, with 3,353 URLs removed, is Facebook. The site with the next-highest total, at 3,299 URLs removed, is Profile Engine, a site that many people have never heard of but that acts as a search engine for finding people on social networks. (We’ll return to Profile Engine in more depth after a closer look at more numbers on the RTBF requests that Google is processing.)
While Google’s Transparency Report is a bit short on details, a service that’s making a business of helping European consumers exercise their right to be forgotten has done some reporting and shared some numbers of its own. Forget.me, a site run by Reputation VIP, helps European users to submit RTBF requests to Google and Bing. In a press release, Reputation VIP reports that it has seen more than 21,000 Forget.me registrations in less than three months. (Communications manager Marion Canette says that the site was launched on June 24, and has since analyzed “numerous requests” submitted via Forget.me.)
The press release and the service’s latest infographic, shared with the Cheat Sheet by Canette, detail what Forget.me found when it studied a selection of 10,787 URLs, submitted to Google through Forget.me by users in France, the United Kingdom, and Germany.
“We can see that the Press websites has not been affected by many Right to be forgotten requests, this category represents just 3.6% of requests. Wikipedia, too, corresponds to just 0.2% of requests. Requests are focused on other types of websites. A quarter of requests concern social networks or blogs. This is often explained by people being unfamiliar with confidentiality rules, posts they later regret or defamation between individuals.”
Additionally, about two-thirds of the URLs submitted in a request show up within the first three pages of Google Search results when a search on the individual’s name is performed — suggesting that there’s some validity to the idea that people don’t necessarily want all content removed, but that they just don’t want it to show up on the first few pages of search results.
Forget.me’s data shows that only 0.2% of the 10,787 URLs were from Wikipedia (a total of 20 URLs), 3.6% were from press websites (384 URLs), and 4.8% were from blogs (516 URLs). A much larger 14.1% were from directories, which list people’s addresses or phone numbers (1,523 URLs). An even larger share, 21.3%, came from social networks (2,295 URLs), and 56.1% came from websites classified as “others,” a category that includes real estate, e-commerce, classified ads, and events sites (6,049 URLs).
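For readers who want to check the arithmetic, each category's share can be recomputed from the URL counts quoted above. The following is a minimal Python sketch, not anything published by Forget.me; the names `url_counts` and `category_shares` and the one-decimal rounding are assumptions made for illustration.

```python
# URL counts per category, as quoted in the article from Forget.me's sample.
url_counts = {
    "Wikipedia": 20,
    "press": 384,
    "blogs": 516,
    "directories": 1523,
    "social networks": 2295,
    "others": 6049,
}


def category_shares(counts):
    """Return each category's share of all URLs, as a percentage rounded to one decimal."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


shares = category_shares(url_counts)
print(f"total URLs: {sum(url_counts.values())}")
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

The counts add up to the 10,787 URLs in the sample, so the recomputed shares can be compared directly against the percentages cited.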
Forget.me also tracked Google’s refusal rate by the type of website, and found that the refusal rate for Wikipedia URLs was 100% and the rate for URLs of press websites was almost as high at 93%. Blog URLs saw a 75% refusal rate, “others” saw a 68% refusal rate, and social networks saw a 61% refusal rate. The lowest refusal rate went to directories, which saw a 28% refusal rate.
On the other side of the coin, the percentage of URLs submitted through Forget.me that were later deindexed was very low, at 8.4% for social networks, 10.1% for directories, 1.2% for blogs, and 18.1% for “others.” Among social networks, the top five domains that saw Google deindexing URLs were Facebook (at 24%), Google sites, including Google+ groups and Picasa (at 12%), YouTube (at 7%), Twitter (at 6%), and LinkedIn (at 6%).
Among the top five directories affected by RTBF requests, one is worth noting: Profile Engine. As we mentioned, Profile Engine is a search engine that enables users to find people on social networks. As its website notes, Profile Engine began in 2007, when it was called “Advanced Search,” “as the world’s first social network search engine.”
“The Profile Engine makes it much easier to find your friends online and provides powerful new search tools for meeting new people, making friends and dating. More than ten million people created detailed searchable profiles on The Profile Engine so that others can find them more easily. A major social network supplied us with a further 420 million public profiles so that we can provide the powerful search features which are lacking on their own site.”
As Quartz reports, Profile Engine acquired the rights to “crawl” through the backend of Facebook and access its user data in 2008. While Profile Engine doesn’t mention Facebook once in its “About” section, Quartz notes that it was originally a search tool for Facebook and provided the “Advanced Search” functionality on the social network. The deal that enabled Profile Engine to access Facebook’s user data was in effect until 2010, when Facebook reportedly shut down Profile Engine’s access and Profile Engine sued Facebook.
Profile Engine’s complaint against Facebook revealed that, “Over 400 million profiles were aggregated, along with over 15 billion ‘friendship’ connections between people and 3 billion ‘likes.’” Profile Engine accused Facebook of falsely describing Profile Engine as “unsafe” or “spammy.” While Profile Engine claims that Facebook is obligated to keep the public information on Profile Engine up to date, that information has not been updated since October 2010. Information that is deleted from Facebook is not deleted from Profile Engine.
According to Google’s Transparency Report, Google has removed nearly 3,300 Profile Engine URLs from search results. Those removals stem from requests by users who see outdated versions of their Facebook profiles popping up in Google search results, thanks to Profile Engine. It’s estimated that Profile Engine acquired and now holds the user data of as many as 450 million individuals.
While those removals reflect users looking to reverse the deal that Profile Engine made with Facebook to crawl their data, it’s worth noting that even more Facebook URLs were removed. From the cursory look that the data provides, a couple of factors seem to be at play in the trend toward European Internet users asking Google to forget Facebook-related results. In some cases, people likely don’t understand, or don’t pay attention to, privacy settings, which can enable search engines to link to their profiles. In others, the typical warnings of parents and teachers go unheeded and social network users don’t post judiciously. Either way, it’s these privacy settings that expose unfortunate posts and exchanges to indexing by search engines like Google.
While people undoubtedly come to regret the personal posts that they, their friends, or maybe even their enemies have made on social networks — and bullying and defamation are unfortunately also a factor — it’s also unsettling that Profile Engine is surfacing old profiles, and that most of the results that come up in a Google search of Profile Engine itself detail how to remove your profile from the site. The proliferation of social media content in search engine results perhaps even lends a bit of credence to parents’ repeated warnings that you shouldn’t post anything online that you wouldn’t want your grandmother to read on the front page of The New York Times — or on Google, as the case may now be.