Duplicate Content Question SEOmoz tool vs Google results
-
You should definitely put a canonical tag on your product pages and categories just in case. Who knows, it might even help you rank better for your category.
One thing I've realized after using SEOmoz for a bit is that the data you receive is an estimate, and not all of it calls for urgent changes. I see a lot of people wondering why OSE isn't indexing a link, but that doesn't mean Google isn't. SEOmoz is more for reference.
So just because SEOmoz is reporting duplicate content, it's not necessarily affecting you; treat it as a precaution. In my opinion you should canonicalize the inner pages just to be sure.
I think it is worth trying just because you might rank better for the main category.
-
Isn't the risk in applying a canonical that we could lose searchers looking for the exact filter size?
-
From my experience and my knowledge that is not the case. (Correct me if I'm wrong, anyone.)
Canonical is not a noindex. Canonical is saying what you PREFER, not 'do not index this page at all'. It just means that specific page is the original and will have more authority than the other links. But the sizes will still be indexed in Google search.
If you do a noindex on those pages, THEN you have a problem where the search user doesn't find the specific size they are looking for.
You can try it and see how it goes in 24-72 hours.
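To make the canonical vs. noindex distinction concrete, here is a minimal Python sketch that reads a page's markup and reports which of the two signals it carries. The `example.com` URLs, the sample pages, and the `HeadSignals` and `describe` names are all made up for illustration; this only inspects tags and says nothing about how any search engine will actually treat the page.

```python
from html.parser import HTMLParser


class HeadSignals(HTMLParser):
    """Collect the canonical URL and robots directives found in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel:
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.robots = [d.strip().lower() for d in content.split(",") if d.strip()]


def describe(html):
    """Return a one-line summary of the indexing signal on a page."""
    parser = HeadSignals()
    parser.feed(html)
    if "noindex" in parser.robots:
        return "noindex: asks engines to keep this page out of results"
    if parser.canonical:
        return f"canonical hint: prefers {parser.canonical}"
    return "no canonical or noindex signal"


# Hypothetical size page that points at its parent category as the preferred URL.
SIZE_PAGE_CANONICAL = (
    '<html><head><link rel="canonical" href="https://example.com/filters/">'
    "</head><body><p>16x20x1 filters</p></body></html>"
)

# The same hypothetical page using noindex instead.
SIZE_PAGE_NOINDEX = (
    '<html><head><meta name="robots" content="noindex, follow">'
    "</head><body><p>16x20x1 filters</p></body></html>"
)

if __name__ == "__main__":
    print(describe(SIZE_PAGE_CANONICAL))
    print(describe(SIZE_PAGE_NOINDEX))
```

Running `describe` over your own size pages before and after the change is a quick way to confirm you added a canonical and did not add a noindex by accident.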
-
I noticed this issue, too, and asked the SEOmoz helpdesk what was going on. They said that on their tool, the threshold for duplicate content is 95% of the HTML. So if you have a lot of HTML that isn't actually your meaty content, the pages may be flagged as duplicate even if you have a couple hundred words of unique meaty content.
Google seems to be smarter about duplicate content on your own site. My guess is that it's checking the HTML other than the navigation and such. It might even be stripping the markup and just looking at words that are in headings and paragraphs. So it would only consider it a duplicate if the meaty content was extremely or exactly similar to other meaty content on your site.
Unfortunately, I had to do quite a bit of filtering to find the "true duplicate" content on my site -- pages that actually had the same meaty content, because someone threw in some boilerplate copy long ago to multiple category pages.
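For anyone who wants to do the same filtering, here is a rough Python sketch of the idea: score two pages once on raw HTML and once on just the text inside headings and paragraphs. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure; that is an assumption, not the method SEOmoz or Google actually uses, and the sample pages and function names are invented for the example.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Keep only the text found inside headings and paragraphs."""

    CONTENT_TAGS = {"h1", "h2", "h3", "p"}

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.CONTENT_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.CONTENT_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


def content_text(html):
    """Strip markup and navigation, returning heading and paragraph text."""
    parser = TextExtractor()
    parser.feed(html)
    return " ".join(parser.chunks)


def similarity(a, b):
    # autojunk=False keeps the ratio stable on long strings.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def compare(page_a, page_b):
    """Score two pages on raw HTML and on extracted content text."""
    return {
        "html": similarity(page_a, page_b),
        "content": similarity(content_text(page_a), content_text(page_b)),
    }


# Hypothetical pages: identical navigation, different "meaty" paragraph.
NAV = "".join(f'<li><a href="/cat/{i}">Category {i}</a></li>' for i in range(40))
PAGE_A = f"<html><body><ul>{NAV}</ul><p>Pleated furnace filters sized 16x20x1 for home units.</p></body></html>"
PAGE_B = f"<html><body><ul>{NAV}</ul><p>Washable range hood screens made from aluminium mesh.</p></body></html>"

if __name__ == "__main__":
    scores = compare(PAGE_A, PAGE_B)
    print(f"HTML similarity:    {scores['html']:.2f}")
    print(f"Content similarity: {scores['content']:.2f}")
```

Pairs that score high on HTML but low on content are the template-heavy false alarms; pairs that score high on both are the ones worth rewriting.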
-
And this is why there is a Q&A forum. I learned that recently which is why it's nice to have webmaster tools set up.
-
Thanks @William, that is how I understand it too. A canonical tells the search engine the preferred page for duplicate content. Engines prefer 301s, and canonicals are a strong suggestion to the crawlers. I didn't think that meant they would index the size pages, but I see you may be right.
-
Hi Rob,
Just to clarify, SEOmoz flags your content as duplicate if it finds 95% HTML similarity. You can use an online tool to compare pages yourself. I like this one:
http://www.webconfs.com/similar-page-checker.php
Google obviously uses a more sophisticated method than Moz, but it's still a good warning because pages without much unique content - even if they aren't true duplicates - often have a difficult time ranking for their targeted keywords.
Rob, for your specific examples, the content isn't an exact duplicate, but it's what we would call a "near-duplicate". These are pages that are very close to one another in theme and content. Having too many of these can seriously hurt your site's ability to rank. Dr. Pete goes into this in quite a bit of detail in his epic Duplicate Content post.
The problem on these pages is that the duplicate content is all above the fold - the primary spot Google cares about. You might consider:
- Moving your unique individual products to the top, and moving the repeating "boilerplate" intro below the fold
- Better yet, writing unique intros to each page. My guess is this would give you the most ranking boost for your buck.
So it's hard to say if this duplication is hurting you in a big way, but it might be - and it might be worth taking a look to make these pages more unique.
Hope this helps! Best of luck with your SEO.