Product search URLs with parameters and pagination issues - how should I deal with them?
-
Hello Mozzers - I am looking at a site that generates parameterised URLs (sadly unavoidable in the case of this website, with the resources they have available - none for redevelopment). They deal with the URLs that include parameters via robots.txt - e.g. Disallow: /red-wines/?
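For anyone following along, the rule in question would look something like this (the path is just the example from this thread; as I understand Google's matching, the ? is treated literally, so this blocks crawling of any URL beginning /red-wines/? while leaving the bare category page crawlable):

```text
# robots.txt - sketch of the directive described above
User-agent: *
# Prefix match: blocks /red-wines/?region=rhone&minprice=10 etc.,
# but not the bare /red-wines/ category page itself
Disallow: /red-wines/?
```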
Beyond that, they use rel=canonical on every paginated parameter page (such as https://wine****.com/red-wines/?region=rhone&minprice=10&pIndex=2) in search results.
I have never used this method on paginated "product results" pages. Surely this is an incorrect use of canonical, because these parameter pages are not simply duplicates of the main /red-wines/ page? Perhaps they are using it as a safeguard in case the robots.txt directive isn't followed - as sometimes it isn't - to guard against the indexing of some of the parameter pages?
I note that Rand Fishkin has commented against "a rel=canonical directive on paginated results pointing back to the top page in an attempt to flow link juice to that URL", because "you'll either misdirect the engines into thinking you have only a single page of results or convince them that your directives aren't worth following (as they find clearly unique content on those pages)." Yet I see this time and again on ecommerce sites, on paginated results - any idea why?
Now the way I'd deal with this is:
Meta robots tags on the parameter pages I don't want indexed (noindex, nofollow - though since this is not duplicate content, perhaps I should use follow rather than nofollow?)
Use rel="next" and rel="prev" links on paginated pages - that should be enough.
Look forward to feedback and thanks in advance, Luke
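As a sketch of what I'm proposing, the head of one of those parameter pages might carry something like this (the URL is the hypothetical example above; whether to use follow or nofollow is exactly the question I'm asking):

```html
<!-- Sketch for /red-wines/?region=rhone&minprice=10&pIndex=2 -->
<!-- noindex keeps the filtered page out of the index;
     "follow" would still let crawlers pass through its links -->
<meta name="robots" content="noindex, follow">

<!-- prev/next describe this page's position in its own series -->
<link rel="prev" href="https://wine****.com/red-wines/?region=rhone&minprice=10&pIndex=1">
<link rel="next" href="https://wine****.com/red-wines/?region=rhone&minprice=10&pIndex=3">
```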
-
Luke,
Here's what I'd recommend doing:
- Lose the canonical tags; that's not the appropriate way to handle pagination
- Remove the disallow in the robots.txt file
- Add rel next/prev tags if you can; since parameter'd URLs are not separate pages, some CMSs are awkward about adding tags to only certain parameter versions of a URL
- Configure those parameters in Search Console (the last item under the Crawl menu) - you can specify each parameter on the site and its purpose. You might find that some of these have already been detected by Google; you can go in and edit those. You should configure your filtering parameters as well.
- You don't want to noindex these pages, for the same reason that you might not be able to add rel next/prev. You could risk that noindex tag applying to the root version of the URL instead of just the parameter version.
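To make the first and third bullets concrete, here's a rough sketch of the difference (URLs are the examples from the question above; treat this as illustrative, not a drop-in):

```html
<!-- What the site does now on page 2: canonicalising the whole series
     to the category root, which tells engines there's only one page -->
<link rel="canonical" href="https://wine****.com/red-wines/">

<!-- A more conventional setup: prev/next declare the sequence, and any
     canonical is self-referencing rather than pointing at the root -->
<link rel="canonical" href="https://wine****.com/red-wines/?pIndex=2">
<link rel="prev" href="https://wine****.com/red-wines/">
<link rel="next" href="https://wine****.com/red-wines/?pIndex=3">
```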
Google has gotten really good at identifying types of duplicate content due to things like paginated parameters, so they don't generally ding you for this kind of dupe.
-
To touch on your question about whether you should follow or nofollow links: if the pages in question could help with crawling in any fashion at all - even if they are useless for their own sake, but can be purposeful for other pages in terms of crawling and internal PageRank distribution - then I would "follow" them. Only if they are utterly useless for other pages too, and show up excessively throughout a crawl of the site, would I "nofollow" them. Ideally, these URLs wouldn't be found at all, as they dilute internal PageRank.
-
thanks for the feedback - it is Umbraco.
-
Yes I agree totally - some wise words of caution - thanks.
-
I've been having endless conversations about this over the last few days and in conclusion I agree with everything you say - thanks for your excellent advice. On this particular site next/prev was not set up correctly, so I'm working on that right now.
-
Hi Logan,
I've seen your responses on several threads now on pagination and they are spot on, so I wanted to ask you my question. We're an eCommerce site and we're using the rel=next and rel=prev tags to avoid duplicate content issues. We've gotten rid of a lot of duplicate issues in the past this way, but we recently changed our site. We now have the option to view 60 or 180 items at a time on a landing page, which is causing more duplicate content issues.
For example, page 2 of the 180-item view is similar to page 4 of the 60-item view (URL examples below). Each view version has its own rel=next and prev tags. Wondering what we can do to get rid of this issue besides just removing the 180- and 60-item view options.
https://www.example.com/gifts/for-the-couple?view=all&n=180&p=2
https://www.example.com/gifts/for-the-couple?view=all&n=60&p=4
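For example, a quick sketch of why those two URLs overlap (item numbering is hypothetical and just assumes a fixed sort order):

```python
def items_on_page(page: int, per_page: int) -> range:
    """Zero-based item indices shown on a 1-based results page."""
    start = (page - 1) * per_page
    return range(start, start + per_page)

# n=180, p=2 covers items 180-359 of the category...
big_view = items_on_page(2, 180)
# ...while n=60, p=4 covers items 180-239: the same products again.
small_view = items_on_page(4, 60)
assert set(small_view) <= set(big_view)
```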
Thoughts, ideas or suggestions are welcome. Thanks!
-
Hi Zack,
Have you configured your parameters in Search Console? Looks like you've got your prev/next tags nailed down, so there's not much else you need to do. It's evident to search engines that these types of dupes are not spammy in nature, so you're not running a risk of getting dinged.