Duplicate content reported for totally different pages
-
Hi,
The Moz report is showing just over 21,500 duplicate page issues on our site, which is more or less every page we have. However, when I look at the pages it flags as duplicates, they are totally different. For example, it could report that a news page from 2009 is the same as a product page we have just added, even though the two have no relation when you read the content or view the page.
What sort of thing could it be picking up as duplicate content? I assume it must be something in the HTML for the site rather than the actual page content, as there is no crossover at all between the pages highlighted. The only issue I can currently identify is that the menu for the mobile version of the site has a huge number of internal links, which I will cut down. If the tool looks purely at the HTML, this could be seen as duplicate content, but shouldn't it be clever enough to tell what is content and what is site structure?
Thanks,
-
Hi Stephen!
Thanks for writing in with a great question! Campaigns flag pages as duplicate content once they are 90% similar. The comparison includes all the source code on the page, not just the viewable text. So if a URL's code is at least 90% similar to another URL's, this warning will appear. Although the pages in question may appear different on the front end, they count as duplicates based on this percentage.
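To see how two unrelated pages can end up over that line, here is a rough Python sketch. It is not our actual algorithm: the `similarity` function (built on the standard library's `difflib.SequenceMatcher`), the `render_page` template, and the 100-link menu are all assumptions made up for illustration. Only the 90% figure comes from the explanation above.

```python
from difflib import SequenceMatcher

# 90% threshold from the reply above; the metric itself is an assumption.
THRESHOLD = 0.90


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (a stand-in metric, not Moz's)."""
    # autojunk=False stops difflib from ignoring very common characters.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


# Hypothetical shared template: a large navigation menu on every page.
MENU = "".join(
    f'<li><a href="/category/{i}">Category {i}</a></li>' for i in range(100)
)


def render_page(body_text: str) -> str:
    """Wrap unique body text in the shared (hypothetical) site template."""
    return (
        "<html><body><nav><ul>" + MENU + "</ul></nav>"
        "<main><p>" + body_text + "</p></main></body></html>"
    )


news_text = "We opened a new warehouse in spring 2009."
product_text = "Our latest blue widget ships next week."

# Compare the whole source, then only the unique body text.
full_source = similarity(render_page(news_text), render_page(product_text))
main_content = similarity(news_text, product_text)

print(f"full source:  {full_source:.2f} (flagged: {full_source >= THRESHOLD})")
print(f"main content: {main_content:.2f} (flagged: {main_content >= THRESHOLD})")
```

In this toy setup the shared menu makes up nearly all of each page's source, so the full-source score lands above the threshold even though the body text has little in common. Trimming the menu or adding more unique text per page shrinks the shared portion and pulls the score down.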
We don't know what standard Google uses, but it's safe to say they are a bit more sophisticated than us - so you might be okay in this regard as long as you have a couple hundred words of unique text per page. Google won't say how much duplicate content is too much, so we like to be better safe than sorry.

Hope this helps!