Duplicate Content Identification Tools
-
Does anyone have a recommendation for a good tool that can identify which elements on a page are duplicated content? I use Moz Analytics to determine which pages have the duplicated content on them, but it doesn't say which pieces of text or on-page elements are in fact considered to be duplicate.
Thanks Moz Community in advance!
-
Copyscape.com will tell you if you have duplicate content. If you have a big site with loads of pages, I'd buy credits, because the free version only lets you check a few pages per day (I can't remember what the limit is). With the paid version you can upload your XML sitemap(s) and it'll check all the pages in those files. The report will then highlight the bits of copy that are duplicated.
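If you just want a rough local check of which passages two pages share, something like the following might help. This is only a minimal sketch using Python's standard difflib module, not how Copyscape or Moz actually work. It assumes you have already extracted the visible text of both pages; the function name shared_passages and the min_words threshold are made up for illustration.

```python
from difflib import SequenceMatcher


def shared_passages(text_a, text_b, min_words=8):
    """Return runs of at least min_words consecutive words found in both texts.

    min_words is an arbitrary illustrative threshold, not a value
    used by any search engine or SEO tool.
    """
    words_a = text_a.split()
    words_b = text_b.split()
    # autojunk=False stops difflib from ignoring very common words on long pages.
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    passages = []
    for block in matcher.get_matching_blocks():
        if block.size >= min_words:
            passages.append(" ".join(words_a[block.a:block.a + block.size]))
    return passages


if __name__ == "__main__":
    page_a = ("Welcome to our shop. We ship worldwide within five business "
              "days and offer free returns. Red shoes on sale.")
    page_b = ("About us. We ship worldwide within five business days and "
              "offer free returns. Contact us today.")
    for passage in shared_passages(page_a, page_b):
        print(passage)
```

Comparing word by word rather than character by character keeps the matches readable, but it will miss passages that were lightly reworded.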
-
I use CopyScape, but it's more of a plagiarism tool than an actual duplicate content identifier. I say that because a few identical lines of text on a page don't mean Google will remove it from the SERPs. Generally, the duplicated text has to make up a substantial portion of a webpage for the page to be considered duplicate content.
I would first dig into Moz Analytics and see WHY you are generating duplicate content before I would worry about what part of the page is duplicate.
- Have you set canonicals on your pages?
- Does your site produce session IDs?
- Do you have pagination?
- Are you copying and pasting text from page to page to fill up your site?
Google has said time and time again that duplicate content issues rarely result in a penalty. It is more about Google knowing which page it should rank and which page it should not. Take a look at why you are getting the duplicate content issue, and then we can help you resolve it or give advice on what to do next.
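For the first two questions above (canonicals and session IDs), a quick script can save some manual checking. This is a rough sketch with the Python standard library only; it assumes you already have the page HTML and URL in hand. The SESSION_PARAMS names are my guesses at common session parameters, so swap in whatever your platform actually uses.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs

# Assumed parameter names; adjust to match your own site or platform.
SESSION_PARAMS = {"sid", "sessionid", "session_id", "phpsessid", "jsessionid"}


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def session_params(url):
    """Return the query parameters in the URL that look like session IDs."""
    query = parse_qs(urlparse(url).query, keep_blank_values=True)
    return sorted(name for name in query if name.lower() in SESSION_PARAMS)


if __name__ == "__main__":
    html = '<head><link rel="canonical" href="https://example.com/shoes"></head>'
    print(find_canonical(html))
    print(session_params("https://example.com/shoes?PHPSESSID=abc123&color=red"))
```

If several flagged URLs differ only by one of these parameters and have no canonical set, that is a likely source of the warnings.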
-
Yes, I agree that CopyScape is better for plagiarism. I am also reviewing the canonical tags we have in place for these pages. I am trying to look at the flagged pages from a few different angles to understand why they are being marked with 'duplicate content' warnings on our analytics platform, so that we can create a process of checks for any future warnings.
-
Here are some guidelines from Google Webmasters Help on duplicate content, with tips for resolving issues.
-
Thank you. These steps are a part of our process.