Techniques to fix eCommerce faceted navigation
-
Hi everyone,
I've read a lot about different techniques to fix duplicate content problems caused by eCommerce faceted navigation (e.g. redundant URL combinations of colors, sizes, etc.). From what I've seen, suggested methods include using AJAX or JavaScript so the links work for users but can't be crawled by bots.
I was wondering if this technique would work instead?
If we detect that the user is a robot, instead of displaying a link, we simply display its anchor text.
So what would be, for a human:
COLOR
<li><a href="red">red</a></li>
<li><a href="blue">blue</a></li>
Would be, for a robot:
COLOR
<li>red</li>
<li>blue</li>
Any reason I shouldn't do this?
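Roughly, the server-side logic I have in mind would be something like this (a hypothetical sketch — the bot list, function name, and URLs are made up for illustration):

```python
import re

# Hypothetical list of crawler user-agent substrings; a real
# implementation would need a properly maintained bot list.
BOT_RE = re.compile(r"googlebot|bingbot|yandex|baiduspider", re.IGNORECASE)

def render_facet_option(user_agent, label, href):
    """Return an <li> with a working link for humans, bare text for bots."""
    if BOT_RE.search(user_agent or ""):
        return f"<li>{label}</li>"
    return f'<li><a href="{href}">{label}</a></li>'

# A human browser gets a working link...
print(render_facet_option("Mozilla/5.0 (Windows NT 10.0)", "red", "red"))
# ...while a crawler gets only the anchor text.
print(render_facet_option("Mozilla/5.0 (compatible; Googlebot/2.1)", "red", "red"))
```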
Thanks!
*** edit
Another reason to fix this is crawl budget, since robots can waste their time going through every possible combination of facets. This is also something I'm looking to fix.
-
That would be cloaking, best not do that
A canonical tag would be best; that's what they are for
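For example (hypothetical URLs), each faceted URL would point back at its parent category page:

```html
<!-- served on a faceted URL such as /shirts?color=red -->
<link rel="canonical" href="https://www.example.com/shirts" />
```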
-
But is it really cloaking? We wouldn't be showing different content. Just disabling links. This article describes a technique that's more akin to cloaking and justifies it because of "intent": http://www.seomoz.org/ugc/dealing-with-faceted-navigation-a-case-study.
The problem with canonical is that the robots will still waste crawl budget going through all the combinations of facets we have. We have hundreds of categories with complex products with 10+ facets with 10+ options each...
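One thing we've considered for the crawl-budget side is disallowing parameterized facet URLs in robots.txt — something like this, with hypothetical parameter names (note the * wildcard is supported by Google and Bing but isn't part of the original robots.txt standard):

```
# robots.txt
User-agent: *
Disallow: /*?color=
Disallow: /*&color=
Disallow: /*?size=
Disallow: /*&size=
```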
-
I've been browsing sites looking at what the big players are doing.
Homedepot.com seems to be doing exactly this: if you go to a category page and click a facet to narrow the results, the page is refreshed via AJAX.
If you go to the same page with a Googlebot user agent, even with JavaScript enabled, clicking the checkbox does nothing!
Is this cloaking? Why is this legit?
-
I share Alan's hesitation - it could look like cloaking, especially if a bot is making the call. If the pages aren't indexed yet, you could just "nofollow" the links - it sends the same signal transparently.
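The nofollow version would just be (hypothetical URL):

```html
<li><a href="/shirts?color=red" rel="nofollow">red</a></li>
```

Both humans and bots see the same link; you're just telling crawlers not to follow it.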
Home Depot is probably pulling it off with the AJAX/JS implementation, which is a bit harder for Google to parse. They also have massive authority and a strong link profile, so they can squeak the small stuff by. You might not be so lucky. In general, it's best to stick to standard practices and not get too tricky.