Thursday, April 3, 2014

Google's latest hit on scraper sites


Scraper sites – those sites that copy content others have created and post it on their own site or blog as their own – have been an irritation to webmasters for many, many years.

Even though, logically, the original content should rank first since it is the originating source, you'll often find scraper sites ranking above the content originator, typically in conjunction with other spam techniques used to push the copied content up.

Even worse, sometimes the original source of the content vanishes from the search results entirely while a scraper site's version continues to rank well.

Google today released a new Scraper Report form where webmasters can report scraper sites that have copied their content, by providing Google with the source URL (where the content was taken from), the URL of the scraper site where the content is being republished or repurposed, and the keywords for which the scraper site is ranking.

Google is also asking webmasters to verify that their site follows the webmaster guidelines before submitting, although chances are pretty good that webmasters who find the Scraper Report form are already aware of the Google webmaster guidelines and how to check for penalties in their Google Webmaster Tools accounts.

Does this mean that scraper sites are becoming more of a problem now than in the past? Not necessarily, although that could be part of the reason.

Sometimes scraper sites aren't necessarily ranking for the top money keywords, but they're prevalent enough to clutter up the search results beyond the top 10 or so, and they can be a lot more common on long-tail search results and once you go past page 1 or 2 of Google. And the only way to get scraper content out of Google's search results has been to file a DMCA takedown request.

Google isn't saying exactly what they're doing with this data. Is it being used as an easy way for webmasters to get scraper sites out of the index without having to file a DMCA? Are they using it to refine their algorithms to better determine which page is the original content and which is the scraper's copy? Google doesn't say, although I suspect it is being used to improve the algorithm by seeing how and why scrapers are ranking.

This absolutely has the mark of the kinds of projects the spam team takes on. Back in August, Matt Cutts also asked for examples of small websites that weren't ranking well despite being high quality, although that request specifically included a disclaimer saying that those submissions wouldn't affect rankings.

It's great that Google is choosing to look at scraper sites again, because they have been a persistent annoyance for webmasters for so many years, even if they aren't necessarily ranking high.
This isn't the first time Google has asked for help with scrapers, and Google also tried to reduce the number of scrapers algorithmically in 2011.
Hopefully we will see a change in how scrapers are handled in a future update of Google's search algorithm.
