Types of duplicate content
Duplicate content includes copied content that appears in exactly the same form on multiple web pages, but it can also include extremely similar content. For example, if the same content has simply been rearranged or displayed differently on a page, it is still considered a duplicate of the original.
Duplicate content falls into two types: external and internal. External duplicate content is content that appears on different websites. It can include all types of content, including text as well as graphics.
Internal duplicate content is content that is repeated within the same website or domain. It can include:
- The use of multiple addresses, such as http://www. and http://
- Print or PDF versions of a website page
- A new link structure when the old one is still in place
- Using a new domain without deleting the old one
- Utilizing different country domains
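Several of the internal cases above come down to the same page being reachable at more than one URL. As a minimal sketch, using only Python's standard library and hypothetical example URLs, common variants (scheme, www prefix, trailing slash) can be normalized to one form so duplicates can be grouped:

```python
from urllib.parse import urlparse, urlunparse

def normalize(url: str) -> str:
    """Reduce common URL variants to one canonical form
    so pages that are really the same can be grouped."""
    parts = urlparse(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]                      # drop the www. prefix
    path = parts.path.rstrip("/") or "/"     # drop a trailing slash
    return urlunparse(("https", host, path, "", "", ""))

# Hypothetical variants of one page (not real URLs):
urls = [
    "http://www.example.com/page/",
    "https://example.com/page",
    "http://example.com/page",
]
print({normalize(u) for u in urls})  # all three collapse to one URL
```

A site audit can apply the same idea to a crawl of its own URLs: any group that collapses to one normalized form is a candidate for redirects or canonical tags.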
Duplicate content can occur on purpose in both instances. For example, a writer may give multiple websites permission to publish their content, and a print version of a website makes it easier for users to print content without all the extra images that are common on a webpage.
However, content can be duplicated without permission as well, which is why search engines, like Google, pay close attention to repeated content.
How it affects SEO
In the past, any duplicate content could pose a risk to a site's position in search rankings. Today's algorithms are more sophisticated and can differentiate between different kinds of repeated content, although the outcome still depends on the particular content in question.
Search engines can struggle with duplicate content when:
- It is unclear which version should be included in their indices, and which versions should be excluded.
- The link metrics of the page are unclear.
- It is unclear which version is the original, or which is the best version to rank highest on the results page.
It is every search engine’s job to provide the most relevant list of results to each user. Algorithms are therefore designed to detect and penalize maliciously duplicated content, as well as content that provides a poor user experience.
This can have important SEO consequences for specific webpages. If Google or another search engine determines that the content on a website is not valuable, that a more valuable version of the page exists elsewhere, or that the content has been maliciously stolen and reposted, the site can drop significantly in the rankings or be removed from the index altogether.
Even if no malicious activity is detected, websites can still drop in the search results page rankings. For example, if multiple versions of the same content are found on the web, Google will determine which one is the best to list first on the search results page. Every subsequent duplicate will be listed later, diluting the SEO effectiveness of the duplicates.
Linking problems can also dilute results: when inbound links point to several different versions of the same content instead of consolidating on one page, the ranking signals are split between them.
Discovering duplicate content
Website content that is copied and pasted onto a webpage is obviously duplicate, but other kinds of repeated content can be harder to discover. Examples include a piece of content written so closely on another source that it is a little too similar, or an outdated site structure that has not been removed, leaving old pages still indexed and searchable.
To make sure duplicate content is not present on a webpage, search for exact phrases from the text (in quotation marks) on Google and see whether any other results appear.
Duplicate content checkers can be helpful, as can hiring a web developer to dig into the code behind a website and determine whether lines of code, domains, or other information need to be deleted.
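For content that is a little too similar rather than identical, a rough similarity score can help flag candidates for manual review. A minimal sketch of this idea, using Python's standard-library difflib with made-up example sentences (this is an illustration of the technique, not how any particular duplicate checker works):

```python
import difflib

def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1]: 1.0 means the texts are identical."""
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Hypothetical example: a rewrite that tracks the original too closely.
original = "Duplicate content appears the same way on multiple pages."
rewrite = "Duplicate content appears the same way across many pages."
score = similarity(original, rewrite)
print(round(score, 2))  # a high score suggests near-duplicate text
```

Any threshold for "too similar" is a judgment call; in practice a high score simply means the pair deserves a human look.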
Fixing and preventing duplicate content
It is easier to prevent duplicate content than it is to fix it later. Syndicate content carefully, never copy and paste content between pages, and clean up the site's linking structure every time the website is updated. Unique, high-value content is always better than repeated content.
If duplicate content is discovered using one of the methods listed above, Google has many recommendations for ensuring that it does not affect search engine rankings. For example, a travel site with different pages for two different, but nearby, cities that contain the same content could cause problems with Google’s algorithm. Instead, combine the pages into one page, or create unique content for each page.
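One of Google's core recommendations is the rel="canonical" link element, which tells search engines which URL is the preferred version of a page. A minimal sketch, using Python's standard-library HTML parser and a made-up page snippet, that reads the canonical URL out of a page's markup:

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collects the href of a <link rel="canonical"> tag, if present."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "link":
            a = dict(attrs)
            if (a.get("rel") or "").lower() == "canonical":
                self.canonical = a.get("href")

# Hypothetical page markup (not a real site):
page = (
    '<html><head>'
    '<link rel="canonical" href="https://example.com/page">'
    '</head><body>...</body></html>'
)
finder = CanonicalFinder()
finder.feed(page)
print(finder.canonical)  # https://example.com/page
```

A simple audit can run a check like this over every URL variant of a page and confirm they all declare the same preferred version.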