Google’s Duplicate Content Penalty
One of the foundations of any sound content marketing strategy is producing new and original content that provides a quality user experience and helps your website rank.
But domains often contain duplicate content, typically the same page reachable through multiple URLs. Examples include an original content page and its printer-friendly copy, or session IDs and URL parameters that create duplicate copies of the same webpage.
There are many reasons why duplicate content can pop up; maybe you republished a guest post you produced for a big blog site on your own site to garner more recognition. Or maybe someone else published your content. Many people across the internet believe that Google punishes duplicate content and that it can hurt the rankings of your webpages or cause your domain to cease to be indexed.
The assumption is that even if Google has already indexed your original content page, a duplicate that pops up later can hurt the original page’s rankings. In reality, this only occurs in rare circumstances, and most of the paranoia is the result of a pervasive myth…
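To make the URL-parameter case concrete, here is a minimal Python sketch that groups URLs which differ only by content-neutral parameters. The parameter names in IGNORED_PARAMS and the example.com URLs are assumptions chosen for illustration; the right list depends on your own site, and this is not how Google itself detects duplicates.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameters that change the URL but not the page content.
IGNORED_PARAMS = {"sessionid", "sid", "print", "utm_source", "utm_medium", "utm_campaign"}


def normalize_url(url: str) -> str:
    """Return `url` with content-neutral parameters and trailing slash removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()  # parameter order should not create a "new" page
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), ""))


def group_duplicates(urls):
    """Map each normalized URL to the list of raw URLs that collapse onto it."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize_url(url), []).append(url)
    return groups
```

Running group_duplicates over a crawl export or server log gives a quick list of pages that exist under more than one address, which are the candidates for the canonicalization steps discussed later.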
The Duplicate Content Myth
Google’s Matt Cutts estimates that about 25-30% of the internet is duplicate content. According to Google’s Search Console help documentation, “duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin.”
A quick search about this very topic will point you to a number of websites outlining why the Google duplicate content penalty is a myth. Google tends to rank websites that produce good, factual content higher, even if some of that content is appreciably similar to content found elsewhere.
Google has stated numerous times that it does not penalize a website or web domain for duplicate content unless that content is spammy or intentionally attempts to manipulate search rankings. Yet many website administrators still hold on to this pervasive fear. As FDR stated, “the only thing we have to fear is fear itself.”
Say your web domain contains multiple URLs with the same content. Google will cluster these URLs and consolidate its internal signals, such as the links pointing to URLs within the cluster, to pick one URL to index and rank. It’s generally bad practice to block crawling and indexing of specific webpages within this cluster, as doing so can disrupt the signals Google uses to choose which URL to index.
Google also seeks to index original content pages and most likely will for your web domain. Regardless, there are still a number of steps you can take to ensure that the correct webpage is canonicalized and submitted to the Google rankings process.
SEO Canonical Practices
By placing a 301 (permanent) redirect on duplicate content pages, you can send both users and search engine spiders to the original landing page for a piece of content. On Apache servers this can be accomplished through the .htaccess file, and it helps concentrate relevance on a single webpage within the rankings process.
The rel="canonical" tag is a link element in the HTML head of your webpage that tells search engines that all links and ranking signals should be credited to the particular URL you’ve canonicalized. If you fear larger blogs using your content, or if you publish your content elsewhere, this is a good way to tell search engines that a particular URL is the original source of the content.
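On an Apache server the redirect itself would normally live in .htaccess, but the underlying HTTP exchange is the same everywhere: the server answers with status 301 and a Location header naming the original page. The sketch below shows that exchange as a tiny Python WSGI application. The paths in REDIRECTS are hypothetical, and this is an illustration of the mechanism rather than a replacement for your server’s own redirect configuration.

```python
# Hypothetical mapping from duplicate paths to the original page.
REDIRECTS = {
    "/post/print": "/post",
    "/guest-post-copy": "/post",
}


def app(environ, start_response):
    """Minimal WSGI app: answer 301 for known duplicates, 200 otherwise."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 tells browsers and crawlers the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"original content"]
```

Because a 301 is permanent, it is best reserved for duplicates you never want visitors to land on; a printer-friendly page that people still need to reach is usually better served by a canonical tag.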
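In practice the canonical tag takes the form of a link element with rel="canonical" and an href pointing at the original URL, placed inside the page head. As a rough way to verify that a page declares the canonical URL you expect, the following Python sketch pulls that href out of a page’s HTML using only the standard library. The class name, function name, and example.com URL are made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, so split before checking.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in `html`, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Example: a printer-friendly copy that points back to the original page.
PAGE = """<html><head>
<title>Printer-friendly copy</title>
<link rel="canonical" href="https://example.com/post">
</head><body>...</body></html>"""
```

Calling find_canonical(PAGE) returns the original URL, while a page with no canonical link returns None, which makes it easy to spot duplicates that still lack the tag.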
Use Google Authorship
Google Authorship allows you to digitally sign your name to a piece of content that you have authored. It will not affect indexing, but if you are worried about plagiarism it can be a useful way to establish yourself as the original author.
Adjust URL Parameters in Google’s Webmaster Tools
By adjusting URL parameter settings and setting a preferred domain, you can tell Google directly which webpages you want indexed and entered into the rankings process.
Common Duplicate Content Questions
Should I worry about scrapers duplicating my content to outrank my website?
Most people need not fear scrapers at all, as they will neither help nor hurt your rankings. Scraper sites that duplicate your content and links hold no authority, and you might even get a referral visit from them. Google once launched a Scraper Report Tool, but it no longer accepts submissions.
Does non-original content hurt my rankings?
Many websites across the web constantly repost published articles strictly for user experience. These posts will not rank, but they will also not hurt your web domain’s credibility.
What if I syndicate my content to other websites?
Even if content appears across multiple domains, Google will still pick one particular URL to show in search results, which gives searchers a more diversified experience. After all, nobody wants to see multiple copies of the same article in a Google search. You can simply ask other websites that syndicate your material to place a noindex meta tag on their copy, which prevents search engines from ranking that website over your own, or you can take one of the steps outlined above.
Google does not punish non-deceptive duplicate content, and most of the time it will sort through duplicates to pick the URL it thinks is most relevant. Duplicate content is mainly an issue affecting user experience. If you find that another website has plagiarized your material, you should file a takedown request under the Digital Millennium Copyright Act.
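If you do syndicate, it can be worth checking that partner sites actually followed through. The Python sketch below inspects a syndicated copy’s HTML and reports whether it carries a robots noindex meta tag, a canonical link back to your original URL, or neither. The function name, the three status labels, and the example.com URL used when calling it are assumptions made for this illustration.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collects the robots noindex directive and canonical link, if present."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = {name: (value or "") for name, value in attrs}
        if tag == "meta" and attributes.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            directives = [d.strip().lower() for d in attributes.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True
        elif tag == "link" and "canonical" in attributes.get("rel", "").lower().split():
            if self.canonical is None:
                self.canonical = attributes.get("href")


def syndication_status(html: str, original_url: str) -> str:
    """Return "noindex", "canonical", or "unprotected" for a syndicated copy."""
    signals = HeadSignals()
    signals.feed(html)
    if signals.noindex:
        return "noindex"
    if signals.canonical == original_url:
        return "canonical"
    return "unprotected"
```

A result of "unprotected" does not mean the copy will outrank you; it only flags pages where neither signal is present, so you know which partners to follow up with.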