Search engines support sites against SOPA and PIPA


Following yesterday’s content blackout for some of the world’s biggest websites, what can organisations and SEOs learn from the precautions that were taken?

The decision to block content by organisations such as Wikipedia, BoingBoing, Mozilla and WordPress as a response to SOPA and PIPA bills – the legislation that aims to stop people having access to websites that illegally distribute copyrighted material – gained widespread media attention around the world.

But what would be the impact of censoring a site’s content on an organisation’s search engine optimisation? And what action did these large and well-ranking organisations take to prevent them dropping in the eyes of the search engines?

Potential impact of SOPA and PIPA

Depending on how it’s done, the approach of redirecting users can be considered very ‘black hat’, and many sites (large and small) have suffered in the past – BMW being a prime example.

Another issue is that when search engines crawl for new content (as they do regularly with larger sites) and find that there is little or no content available, the impact on rankings would normally be dramatic. This is especially true for a site like Wikipedia, where every single information page was ultimately censored for 24 hours.

How the search engines stepped in to help

While Bing/Yahoo carried on as normal with no major changes, other than a statement saying that it does “oppose the SOPA bill as currently drafted”, Google announced measures to aid webmasters in censoring their content without getting penalised.

A statement from Pierre Far, Webmaster Trends Analyst at Google UK, said:

“Hello webmasters! We realize many webmasters are concerned about the medium-term effects of today’s blackout. As a precaution, the crawl team at Google has configured Googlebot to crawl at a much lower rate for today [18th January 2012] only so that the Google results of websites participating in the blackout are less likely to be affected.”

Far went on to offer advice to webmasters on how to censor or take down site content without suffering negative effects, mainly by ensuring that the URLs participating in the blackout returned a 503 (Service Unavailable) HTTP status code.

This tells Google that the temporary content on the pages isn’t the ‘real’ content, so it will not be indexed. It also means that if every URL on the site carries the same content (a message against the SOPA and PIPA bills, for example), it will not be treated as duplicate content.
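To make the 503 approach concrete, here is a minimal sketch in Python using the standard library's WSGI support. It is an illustration only, not the method any of the sites mentioned actually used: the name `blackout_app`, the wording of the message and the `Retry-After` value of one day are all assumptions added for the example. `Retry-After` is a standard HTTP header that may accompany a 503, but it does not appear in Far's advice as quoted here.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical protest message; every URL on the site serves the same body.
BLACKOUT_BODY = b"This site is offline today in protest against SOPA and PIPA.\n"


def blackout_app(environ, start_response):
    """WSGI app that answers every request with 503 Service Unavailable."""
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(BLACKOUT_BODY))),
        # Assumption: hint that the outage lasts about a day (in seconds).
        ("Retry-After", "86400"),
    ]
    # The status line is what matters to a crawler: 503 marks the page
    # as temporarily unavailable rather than as new, indexable content.
    start_response("503 Service Unavailable", headers)
    return [BLACKOUT_BODY]
```

To try it locally, the app could be passed to `wsgiref.simple_server.make_server("", 8000, blackout_app)` and the response headers inspected with any HTTP client.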

Far went on to suggest that configuring robots.txt to return a 503 would also cause Google to halt crawling until it received an acceptable status code (200, for example). He added that webmasters most certainly shouldn’t block Googlebot with a Disallow directive in robots.txt, as this can cause crawling problems over a longer period of time.
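The robots.txt advice can be sketched in the same way. The function below is a hypothetical helper, not part of any real framework: it decides what a server might return for /robots.txt depending on whether a blackout is in progress. The blackout window (the whole of 18th January 2012 in UTC) and the contents of the normal robots.txt file are assumptions chosen for illustration.

```python
from datetime import datetime, timezone

# Assumed blackout window: all of 18 January 2012, in UTC.
BLACKOUT_START = datetime(2012, 1, 18, 0, 0, tzinfo=timezone.utc)
BLACKOUT_END = datetime(2012, 1, 19, 0, 0, tzinfo=timezone.utc)

# Assumed everyday robots.txt: an empty Disallow value blocks nothing.
NORMAL_ROBOTS = "User-agent: *\nDisallow:\n"


def robots_response(now):
    """Return (status, headers, body) for /robots.txt at time `now`."""
    if BLACKOUT_START <= now < BLACKOUT_END:
        # During the blackout, answer 503 instead of swapping in a
        # "Disallow: /" rule; the file itself is left unchanged.
        seconds_left = int((BLACKOUT_END - now).total_seconds())
        return 503, {"Retry-After": str(seconds_left)}, ""
    # Outside the window, serve the usual file with a 200.
    return 200, {"Content-Type": "text/plain"}, NORMAL_ROBOTS
```

The point of the design is that the normal robots.txt is never edited, so once the window closes the server simply returns the usual file with a 200 again.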

Not all sites that supported the blackout followed this advice, however. Wikipedia returned a 304 (Not Modified) HTTP status code, which told Googlebot that the content had not changed since the last crawl and that there was nothing new to fetch. The likes of BoingBoing and Mozilla, meanwhile, used the seemingly accepted 503 method.

How can this approach be utilised in the future?

The helpful and transparent advice from Google, which many sites used throughout the blackout, along with the various alternative methods taken up by other sites, should prove that these approaches serve a much more significant purpose for SEO teams.

While their impact will need to be monitored to ensure there have been no long-term negative effects, we can expect them to be used more frequently for campaigns, homepage takeovers, offers and promotions. In doing so, organisations can ensure their search engine optimisation will not be harmed by short-term marketing.

Jane Cragg, SEO manager, Rippleffect
