The Moz Q&A Forum

    What does Disallow: /french-wines/?* actually do - robots.txt

    Intermediate & Advanced SEO
    • McTaggart

      Hello Mozzers - Just wondering what this robots.txt instruction means: Disallow: /french-wines/?*

      Does it stop Googlebot crawling and indexing URLs in that "French Wines" folder - specifically the URLs that include a question mark?

      Would it stop the crawling of deeper folders - e.g. /french-wines/rhone-region/ that include a question mark in their URL?

      I think this has been done to block URLs containing query strings.

      Thanks, Luke

      • LoganRay

        Hi Luke,

        You are correct that this was done to block URLs with parameters. However, since there's no wildcard (the asterisk) before the folder name or between the folder name and the question mark, the URL would have to start with /french-wines/? to match. This disallow is really only preventing crawling of the single URL www.yoursite.com/french-wines/ with any parameters appended, so deeper folders such as /french-wines/rhone-region/ are not covered.
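To make the matching concrete, here is a rough sketch in Python. It assumes Google-style wildcard semantics, where `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else (including `?`) is a literal matched from the start of the path. The helper names `robots_pattern_to_regex` and `is_disallowed` and the sample query strings are made up for illustration, not part of any library.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    Assumption: '*' matches any run of characters, a trailing '$' anchors
    the end of the URL, and every other character is a literal.
    """
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    # Escape each literal piece, then rejoin the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored_end else ""))


def is_disallowed(pattern: str, path_and_query: str) -> bool:
    """Return True if the path (plus query string) matches the pattern."""
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None


rule = "/french-wines/?*"
samples = [
    "/french-wines/?colour=red",           # True: starts with /french-wines/?
    "/french-wines/",                      # False: no question mark
    "/french-wines/rhone-region/?page=2",  # False: 'r' follows the folder, not '?'
    "/french-wines/rhone-region/",         # False
]
for path in samples:
    print(path, is_disallowed(rule, path))
```

Only the first sample is caught, which lines up with the reading above: the rule covers the folder URL itself with parameters and nothing deeper.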

        • McTaggart @LoganRay

          Thanks Logan - I was just reading: Disallow: /*? # block any URL that includes a ? (and thus a query string) - do you know why the ? comes before the * in this case?

          • LoganRay @McTaggart

            Disallow: /*?

            This disallow literally says to crawlers 'if a URL starts with a slash (all URLs) and has a parameter, don't crawl it'. The * is a wildcard that says anything between / and ? is applicable to the disallow.
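As a quick illustration of that wildcard, here is a minimal sketch that treats `*` as "any run of characters" and matches from the start of the path. The `matches` helper and the sample paths are hypothetical, and the sketch ignores the `$` end anchor for brevity.

```python
import re


def matches(rule: str, url_path: str) -> bool:
    """Check a robots.txt-style rule against a path, treating '*' as a wildcard.

    Assumption: matching starts at the beginning of the path and every
    character other than '*' is a literal (so '?' means a real question mark).
    """
    regex = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.match(regex, url_path) is not None


rule = "/*?"
print(matches(rule, "/?s=wine"))                            # True
print(matches(rule, "/french-wines/?colour=red"))           # True
print(matches(rule, "/french-wines/rhone-region/?page=2"))  # True
print(matches(rule, "/french-wines/rhone-region/"))         # False: no '?'
```

Because the `*` sits between the slash and the question mark, any depth of folders can fill the gap, which is why this rule catches parameterised URLs site-wide.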

            It's very easy to disallow the wrong thing, especially with regard to parameters. For this reason, I always do these 2 things rather than using robots.txt:

            1. Set the purpose of each parameter in Search Console - Go to Crawl > URL Parameters to configure for your site
            2. Self-referring canonicals - most people disallow URLs with parameters in robots.txt to prevent indexing, but this only prevents crawling. A self-referring canonical pointing to the root level of that URL (the version without parameters) will help prevent indexing of URLs with parameters.
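For the second point, a minimal sketch of deriving the canonical target could look like the following. It assumes that every parameter on the page is safe to drop, which will not hold for parameters that change the content. The function names and the example URL are illustrative only. Note also that search engines treat the canonical as a hint rather than a directive, and a crawler can only see it if the URL is not blocked in robots.txt.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str) -> str:
    """Drop the query string and fragment, keeping scheme, host and path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element to place in the <head> of the page."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


print(canonical_link_tag("https://www.yoursite.com/french-wines/?colour=red&page=2"))
# <link rel="canonical" href="https://www.yoursite.com/french-wines/">
```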

            Hope that's helpful!

            • McTaggart @LoganRay

              Thanks again Logan.

              What would Disallow: /?* do? That is what the site I am looking at has implemented. Perhaps it works both ways around?

              I imagine it's easy to disallow the wrong thing or possibly not disallow the right thing. Ugh.

              • LoganRay @McTaggart

                Disallow: /?* is the same thing as Disallow: /? because the trailing asterisk is a wildcard that adds nothing to the match. Both of those disallows prevent any URL that begins with /? from being crawled.
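One way to convince yourself of the equivalence is to run both rules over the same handful of paths. The sketch below assumes prefix matching with `*` as a wildcard; the `blocked` helper and the sample paths are hypothetical.

```python
import re


def blocked(rule: str, url_path: str) -> bool:
    """Prefix-match a robots.txt-style rule, treating '*' as any characters."""
    regex = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.match(regex, url_path) is not None


paths = [
    "/?s=wine",                   # homepage with a parameter
    "/",                          # homepage without parameters
    "/french-wines/?colour=red",  # folder URL with a parameter
]

with_star = [blocked("/?*", p) for p in paths]
without_star = [blocked("/?", p) for p in paths]

print(with_star)     # [True, False, False]
print(without_star)  # [True, False, False]
```

Both rules only catch the homepage with parameters appended. Parameterised URLs inside folders are not matched by either one.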

                And yes, it is incredibly easy to disallow the wrong thing! The robots.txt tester in Search Console (under the Crawl menu) is very helpful for figuring out what a disallow will catch and what it will let by. I highly recommend testing any new disallows there before releasing them into the wild.

                • McTaggart @LoganRay

                  Thanks Logan for your help with this - much appreciated. Really helpful!

                  • LoganRay @McTaggart

                    Glad to help, Luke!
