The Moz Q&A Forum

    Robots.txt, does it need preceding directory structure?

    Intermediate & Advanced SEO
    • Milian

      Do you need the entire preceding path in robots.txt for it to match?

      For example:

      I know that if I add Disallow: /fish to robots.txt, it will block:

      /fish
      /fish.html
      /fish/salmon.html
      /fishheads
      /fishheads/yummy.html
      /fish.php?id=anything

      But would it also block these?

      en/fish
      en/fish.html
      en/fish/salmon.html
      en/fishheads
      en/fishheads/yummy.html
      en/fish.php?id=anything

      (Examples taken from the Robots.txt Specifications.) I'm hoping it actually won't match, as that would make writing this particular robots.txt much easier!

      Basically, I want to block many URLs that start with BTS-, such as:

      http://www.example.com/BTS-something
      http://www.example.com/BTS-somethingelse
      http://www.example.com/BTS-thingybob

      But I have other pages in subfolders, which also contain BTS-, that I do not want blocked, such as:

      http://www.example.com/somesubfolder/BTS-thingy
      http://www.example.com/anothersubfolder/BTS-otherthingy

      Thanks for listening

      • PinpointDesigns

        You're right that **Disallow: /fish** in the robots.txt file blocks all of those initial URLs. Rules are matched from the start of the URL path, so it would not block the URLs under /en/. If you wanted to block the equivalent URLs inside the /en/ folder, you would need to add Disallow: /en/fish
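
The prefix behaviour described above can be sketched in a few lines of Python. This is only an illustration: `is_blocked` is a hypothetical helper, not part of any library, and it models a single plain Disallow rule with no wildcards and no Allow lines.

```python
def is_blocked(path: str, disallow_prefix: str) -> bool:
    """Return True if a plain Disallow rule (no wildcards) matches the path.

    Matching is anchored at the start of the URL path, which always
    begins with "/". An empty Disallow value blocks nothing.
    """
    if not disallow_prefix:
        return False
    if not path.startswith("/"):
        path = "/" + path
    return path.startswith(disallow_prefix)


# Paths from the question, checked against "Disallow: /fish".
for p in ["/fish", "/fish.html", "/fishheads/yummy.html",
          "/en/fish", "/en/fish/salmon.html"]:
    print(p, is_blocked(p, "/fish"))
```

Under these assumptions, the /en/ paths come back as not blocked, because "/en/fish" does not start with "/fish".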

        You could use a wildcard in the robots.txt file to do something along the lines of Disallow: /BTS-*

        This _should_ work, but it's always worth checking with a tool to make sure it's implemented correctly. Distilled published a post a while back about a JS bookmarklet that tests whether a page is blocked by robots.txt, which can be found here - http://www.distilled.net/blog/seo/js-bookmarklet-for-checking-if-a-page-is-blocked-by-robots-txt/

        You could also use the 'Blocked URLs' tool in Google Webmaster Tools (GWT) to confirm that the pages are blocked once you've implemented the rule.
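
As a rough local check alongside those tools, Python's standard-library `urllib.robotparser` can evaluate a draft rule against a list of URLs. This is a sketch under assumptions: the rule `Disallow: /BTS-` is my guess at what the final file would contain, and this parser only does plain prefix matching without wildcard support, so it may not agree with Googlebot on wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Draft robots.txt content (an assumed rule, based on the URLs in the question).
rules = """\
User-agent: *
Disallow: /BTS-
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

urls = [
    "http://www.example.com/BTS-something",
    "http://www.example.com/BTS-thingybob",
    "http://www.example.com/somesubfolder/BTS-thingy",
    "http://www.example.com/anothersubfolder/BTS-otherthingy",
]

# can_fetch() returns True when the URL is allowed, False when it is blocked.
results = {url: parser.can_fetch("*", url) for url in urls}
for url, allowed in results.items():
    print("allowed" if allowed else "blocked", url)
```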

        Hope this helps!

        • Milian

          Yes, this is what I thought, but I wanted a second opinion.

          I wouldn't actually need a wildcard after BTS- though, as leaving the rule open-ended is the same as using a trailing wildcard. From the specification:

          /fish*    Equivalent to "/fish" -- the trailing wildcard is ignored.

          https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt

          Thanks for the link, I'll take a look.
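
The equivalence quoted above can be seen by translating a rule into a regular expression. The following is a minimal sketch, assuming the Google-style pattern syntax where `*` matches any run of characters and a trailing `$` anchors the end of the URL; `rule_to_regex` and `matches` are hypothetical helper names, not a real crawler's implementation.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a compiled regex.

    "*" becomes ".*", a trailing "$" anchors the end of the path,
    and every other character is matched literally.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def matches(rule: str, path: str) -> bool:
    # re.match anchors at the start of the path, like a robots.txt rule.
    return rule_to_regex(rule).match(path) is not None


paths = ["/BTS-something", "/BTS-", "/somesubfolder/BTS-thingy", "/bts-lower"]
for p in paths:
    # Both columns agree for every path: the trailing "*" adds nothing.
    print(p, matches("/BTS-", p), matches("/BTS-*", p))
```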
