The SEO industry has been plagued for years by a lack of consistency with SEO terms and definitions. One of the most prevalent inaccurate terms we hear is “duplicate content penalty.” While duplicate content is not something you should strive for on your website, there’s no search engine penalty for having it.
Duplicate content has been and always will be a natural part of the Web. It’s nothing to be afraid of. If your site has some dupe content for whatever reason, you don’t have to lose sleep every night worrying about the wrath of the Google gods. They’re not going to shoot lightning bolts at your site from the sky, nor are they going to banish your entire website from ever showing up for relevant searches.
They are simply going to filter out the dupes.
The search engines want to index and show to their users (the searchers) as much unique content as algorithmically possible. That’s their job, and they do it quite well considering what they have to work with: spammers using invisible or irrelevant content, technically challenged websites that crawlers can’t easily find, copycat scraper sites that exist only to obtain AdSense clicks, and a whole host of other such nonsense.
There’s no doubt that duplicate content is a problem for search engines. If a searcher is looking for a particular type of product or service and is presented with pages and pages of results that provide the same basic information, then the engine has failed to do its job properly. In order to supply users with a variety of information on their search query, search engines have created duplicate content “filters” (not penalties) that attempt to weed out the information they already know about. Certainly, if your page is one of those that is filtered, it may very well feel like a penalty to you, but it’s not. It’s a filter.
Penalties Are for Spammers
Search engine penalties are reserved for pages and sites that are purposely trying to trick the search engines in one form or another. Penalties can be meted out algorithmically when obvious deceptions exist on a page, or they can be personally handed out by a search engineer who discovers the hanky-panky through spam reports and other means. To many people’s surprise, penalties rarely happen to the average website. Sites that receive a true penalty typically know exactly what they did to deserve it. If they don’t, they haven’t been paying attention.
Honestly, the search engines are not out to get you. If you have a page on your site that sells red hats and another very similar page selling blue hats, you aren’t going to find your site banished off the face of Google. The worst thing that will happen is that only the red hat page may show up in the search results instead of both pages showing up. If you need both to show up in the search engines, then you’ll need to make them substantially unique.
Suffice it to say that just about any content that is easily created without much human intervention (i.e., automated) is not a great candidate for organic SEO purposes.
Article Reprints
Another duplicate-content issue that many are concerned about is the republishing of online articles. Reprinting someone’s article on your site is not going to cause a penalty. While you probably don’t want every article on your site to be a reprint of someone else’s, if the reprints are helpful to your site visitors and your overall mission, then it’s not a problem for the search engines.
If your own bylined articles are getting published elsewhere, that’s a good thing. You don’t need to provide a different version to other sites or refuse to let them be republished at all. The more sites that host your article, the more chances you have to build your credibility as well as to gain links back to your site through a short bio at the end of the article. In many cases, Google doesn’t even filter out duplicate articles in searches, but even if it eventually shows only one version, that’s still okay.
Inadvertent Multiple URLs for the Same Content
Where duplicate content CAN be a problem is when a website shows essentially the same page, but on numerous URLs. WordPress blogs often fall victim to this when multiple tags or categories are chosen to label any one blog post. The blog software then creates numerous URLs for the same article, depending on which category or tag a user clicked to view it. While this type of duplicate content won’t cause a search engine penalty, it will often split the overall link popularity of the article across those URLs, so no single URL gets the full benefit of the links pointing to it.
Any backend system or CMS that creates numerous URLs for any one piece of content can indeed be a problem for search engines, because it makes their spiders do more work. It’s silly to have the spider finding the same information over and over again, when you’d rather have it finding other, unique information to index. This type of unintended duplicate content should definitely be cleaned up, either through 301 redirects or by using the canonical link element (rel=canonical).
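To make those two cleanup options a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the domain www.example.com, the category and tag paths, and the helper names redirect_for and canonical_link are all made up. On a real site this is usually handled in the web server configuration or by the CMS itself, not by hand-written code like this.

```python
from html import escape

# Placeholder domain, used only for illustration.
SITE = "https://www.example.com"

# Hypothetical map of duplicate paths to the one preferred (canonical) path.
CANONICAL_PATHS = {
    "/category/seo/duplicate-content-myths/": "/duplicate-content-myths/",
    "/tag/google/duplicate-content-myths/": "/duplicate-content-myths/",
}


def redirect_for(path):
    """Option 1: return (301, preferred URL) for a known duplicate path.

    Returns None when the path is not a known duplicate, meaning the
    page should simply be served as usual.
    """
    target = CANONICAL_PATHS.get(path)
    if target is None:
        return None
    return 301, SITE + target


def canonical_link(path):
    """Option 2: build the canonical link element for a page's <head>.

    Duplicate paths point at the preferred URL; any other path
    points at itself.
    """
    target = CANONICAL_PATHS.get(path, path)
    return '<link rel="canonical" href="%s">' % escape(SITE + target, quote=True)


if __name__ == "__main__":
    print(redirect_for("/category/seo/duplicate-content-myths/"))
    print(canonical_link("/tag/google/duplicate-content-myths/"))
```

The redirect sends both visitors and spiders to the one preferred URL, while the canonical link element leaves the extra URLs reachable and simply tells the engines which version you would like indexed.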
When it comes to duplicate content, the search engines are not penalizing you or thinking that you’re a spammer; they’re simply trying to show some variety in their search results pages and don’t want to waste time indexing content they already have in their databases.