The Moz Q&A Forum

    No indexing url including query string with Robots txt

    Technical SEO Issues
    • HMK-NL

      Dear all,

      How can I block URLs/pages with query strings, like page.html?dir=asc&order=name, with robots.txt?

      Thanks!

      • Matthew_Edgar

        Hi,

        Here is an article explaining how to do this in robots.txt:
        http://sanzon.wordpress.com/2008/04/29/advanced-usage-of-robotstxt-w-querystrings/

        Depending on what you are trying to do, it might also be worth investigating parameter handling in Google Webmaster Tools:
        http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1235687

        Thanks,
        Matthew

        • HMK-NL

          Dear all,

          Thanks for responding. Suppose I have pages like:

          1. www.sub.domain.com/collection.html, which exists and which I want indexed, and

          2. www.sub.domain.com/collection.html?dir=desc&order=color, which I don't want indexed.

          Is this the way to do it in robots.txt?

          Disallow: /collection/?*

          Thanks!

          • cprasad

            Hi,

            Robots.txt works mainly with two directives: User-agent: and Disallow:

            User-agent: the name of the robot the rules apply to

            Disallow: the URL, folder, or URL pattern you want to block.

            As you asked in your question, you need to block URLs that match a condition. But remember that robots.txt can have serious consequences if you do not use it correctly.

            In your question, you wanted to block URLs/pages with query strings like page.html?dir=asc&order=name

            so you have to use the following:

            User-agent: *

            Disallow: /*?

            The above will block all URLs containing a question mark (?) for all search robots. It will not block only page.html?dir=asc&order=name; it will also block comments.html?dir=asc&order=name

            So use it carefully.
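
To see what a rule catches before deploying it, here is a minimal Python sketch of the wildcard matching that Google documents for robots.txt paths: * matches any run of characters, a trailing $ marks the end of the URL, and everything else (including ?) is a literal prefix match. The function name rule_matches is made up for illustration, and other crawlers may interpret patterns differently.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Hypothetical helper: does a robots.txt path pattern match this URL path?

    Assumes Google-style matching: '*' is a wildcard, a trailing '$' anchors
    the end of the URL, and every other character (including '?') is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start, so an unanchored rule is a prefix match.
    return re.match(regex, path, flags=re.DOTALL) is not None

urls = [
    "/collection.html",
    "/collection.html?dir=desc&order=color",
    "/comments.html?dir=asc&order=name",
]
for rule in ("/*?", "/collection/?*"):
    blocked = [u for u in urls if rule_matches(rule, u)]
    print(rule, "->", blocked)
```

Under these assumptions, /*? blocks both query-string URLs and leaves /collection.html alone, while /collection/?* matches none of them, because the ? is treated as a literal character and the pattern only matches paths that begin with /collection/?.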

            I hope this is what you were looking for. If you need more help, just ask.

            Regards

            Prasad

            • Matthew_Edgar @HMK-NL

              Hey,

              Should that second URL be www.sub.domain.com/collection/adresboeken.html?whatever=something? If so, then by using /collection/?* you are trying to say that anything within /collection/ with a query string should not be indexed. Note that in the wildcard matching Google documents for robots.txt, only * and $ are special and ? is a literal character, so the pattern that expresses this would be /collection/*? instead. If adresboeken.html always has a query string, it may not get indexed.

              The other option I'd consider before using robots.txt is telling Google to ignore dir=desc&order=color in Google Webmaster Tools parameter handling. This is the best way to handle query string issues. (Assuming you are trying to influence Google. Clearly Google Webmaster Tools won't affect Bing!)

              Another idea is to set a canonical URL on /collection/adresboeken.html referencing /collection/adresboeken.html without the query string. This tells the search engines that the query strings do not make a unique URL. (adresboeken.html?dir=desc&order=color is the same as adresboeken.html?dir=desc&order=price is the same as adresboeken.html?dir=asc&order=color is the same as adresboeken.html, and so on).
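
As a rough illustration of the canonical idea, the Python sketch below strips the query string from each sorted variant and builds the link element that would go in the page's head. The helper name canonical_link and the http://www.sub.domain.com host are placeholders, not something taken from the site in question.

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_link(url: str) -> str:
    """Hypothetical helper: build a rel="canonical" tag that drops the query string."""
    parts = urlsplit(url)
    # Keep scheme, host and path; discard the query string and fragment.
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{clean}" />'

variants = [
    "http://www.sub.domain.com/collection/adresboeken.html?dir=desc&order=color",
    "http://www.sub.domain.com/collection/adresboeken.html?dir=asc&order=price",
    "http://www.sub.domain.com/collection/adresboeken.html",
]
for url in variants:
    print(canonical_link(url))
```

All three variants produce the same tag, which is the point: every sort order declares the plain adresboeken.html URL as the one to index.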

              I hope that helps. Thanks,
              Matthew

              • kyleNeedham

                You could always just use rel="canonical", which would be much better than completely blocking all URL parameters.

                • HMK-NL

                  Dear all, what is the best option? And are the options below good?

                  A: Disallow

                  • sort-order (Only URLs with value = asc)

                  "A single URL may contain many parameters for each of which you can specify settings. More restrictive settings override less restrictive settings. For example, here are three parameters and their settings"

                  Source: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1235687

                  B:

                  User-agent: Googlebot
                  Disallow: /*.=name$

                  for example www.sub.domain.com/collection.html?dir=desc&order=name

                  Source: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449
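
Option B can be checked against its example URL with a short script. The Python sketch below assumes the wildcard rules Google documents for robots.txt: * is a wildcard, a trailing $ marks the end of the URL, and all other characters, including the dot, are literal. The function rule_matches and the alternative pattern /*order=name$ are illustrative assumptions, not rules taken from the linked help pages.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Hypothetical helper implementing the assumed matching rules."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Literal pieces are escaped; each '*' becomes '.*'.
    regex = ".*".join(re.escape(piece) for piece in body.split("*"))
    if anchored:
        regex += r"\Z"  # '$' in the rule means "URL ends here"
    return re.match(regex, path, flags=re.DOTALL) is not None

path = "/collection.html?dir=desc&order=name"

# Rule exactly as written in option B: the '.' is a literal dot.
print(rule_matches("/*.=name$", path))
# Assumed alternative that spells out the parameter name.
print(rule_matches("/*order=name$", path))
# The '$' anchor keeps the rule from matching when other text follows.
print(rule_matches("/*order=name$", path + "&page=2"))
```

Under these assumptions the first check comes back False, the second True, and the third False. In other words, the rule as written would not catch the example URL, and an anchored rule stops matching as soon as another parameter is appended after order=name.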

                  Thanks!

