The Moz Q&A Forum


    Robots.txt

    Technical SEO Issues
    • AL123al

      Hello,

      My client has a robots.txt file which says this:

      User-agent: *
      
      Crawl-delay: 2
      
      I ran it through a robots.txt checker, which said the file must have a **Disallow** directive. So should it say this:
      
      

      User-agent: *

      Disallow:

      Crawl-delay: 2

      What effect (if any) would not having a Disallow directive have?

      Thanks

      • MichaelC-15022

        I'd be really, REALLY careful about a disallow statement like that: you run the risk of disallowing your entire website.

        FYI, I'm not sure putting a crawl delay in your robots.txt file is the right answer. I saw an example a week or so ago where Google (I think, but maybe it was Bing) explicitly said it had ignored the crawl delay in the robots.txt. I would set the crawl rate in Webmaster Tools instead. It's hard to find, but it's there 🙂

        • in Webmaster Tools, select the site you want to set the crawl rate for
        • click the Gear icon in the upper right
        • you'll see the option there to set the crawl rate
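        As an aside, you can see how a parser actually reads that directive with Python's standard-library `urllib.robotparser`. This is just an illustrative sketch: the client's file is reproduced inline as a string, and `example.com` is a stand-in URL.

        ```python
        from urllib.robotparser import RobotFileParser

        # The client's robots.txt, reproduced inline as a string for illustration.
        robots_txt = """\
        User-agent: *
        Disallow:
        Crawl-delay: 2
        """

        rp = RobotFileParser()
        rp.parse(robots_txt.splitlines())
        rp.modified()  # mark the file as "read" so queries answer from the parsed rules

        print(rp.crawl_delay("*"))                            # 2
        print(rp.can_fetch("*", "https://example.com/page"))  # True
        ```

        Note that this only shows what a compliant parser extracts from the file; as mentioned above, the major search engines may ignore Crawl-delay entirely.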
        • ThompsonPaul

          Your second version is correct, AL123al - the robots protocol does require you to include a Disallow statement in order to be correctly configured, even if its value is left blank to indicate that the full site may be crawled.

          I really question the wisdom of having a crawl delay in place though. What's the reason for doing so? I never want anything to get in the way of the search crawlers "doing their thing" as effectively as possible.

          It's also rather strange to add a crawl delay without blocking any of the non-essential sections of the site. Usually a crawl delay is in place to reduce the resources consumed by crawlers (it's vastly better to improve the site's efficiency or get stronger hosting), but delaying the crawl for the whole site, rather than first saving resources by blocking the non-essential areas, is pretty heavy-handed.

          Does that make sense?

          Paul

          • AL123al

            Thanks to both of you. I will recommend that the robots.txt is changed to:

            User-agent: *
            Disallow:

            in order to configure it correctly, and leave out the crawl delay.

            Caroline

            • MichaelC-15022

              Caroline,

              REMOVE THE DISALLOW LINE.

              I am concerned that that line will match all URLs on the site, and disallow the ENTIRE site.

              Michael.

              • ThompsonPaul @MichaelC-15022

                Michael - you are _incorrect_, I'm afraid! You need to read up on the specifics of the robots exclusion protocol.

                A blank Disallow directive absolutely does NOT match all URLs on the site. In order to match all URLs on the site, the configuration would have to be:

                User-agent: *
                Disallow: /
                

                Note the slash denoting the root of the site. If the field after disallow: is blank, that specifically means no URLs should be blocked. To quote www.robotstxt.org:

                Any empty value, indicates that all URLs can be retrieved. At least one Disallow field needs to be present in a record.

                The second part of that statement is equally important. For a record to be valid, it must include at least one user agent declaration and at least one disallow statement. If you want the file to not block any URLs, you must include the disallow: statement, but leave its value empty.

                For more proof of this, here's the exact example, also from robotstxt.org:

                To allow all robots complete access
                User-agent: *
                Disallow:
                

                (or just create an empty "/robots.txt" file, or don't use one at all)

                The main reason for including a robots.txt which doesn't block anything is to help clean up a server's error logs. With no robots.txt in place, an error will be inserted into the logs every time a crawler visits and can't find the file, bloating the logs and obscuring the real errors that might be present. A blank file may lead someone to believe that the robots.txt just hasn't been configured, leading to unnecessary confusion. So a file configured as above is preferable even if no blocking is desired.
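                The distinction above is easy to verify with Python's standard-library `urllib.robotparser`. A small sketch, with both files inlined as strings and `example.com` as a stand-in URL:

                ```python
                from urllib.robotparser import RobotFileParser

                def allowed(robots_txt: str, url: str) -> bool:
                    """Parse a robots.txt string and ask whether any agent may fetch url."""
                    rp = RobotFileParser()
                    rp.parse(robots_txt.splitlines())
                    rp.modified()  # mark as "read" so can_fetch() answers from the parsed rules
                    return rp.can_fetch("*", url)

                # Empty Disallow value: no URLs are blocked.
                print(allowed("User-agent: *\nDisallow:\n",
                              "https://example.com/page"))   # True

                # "Disallow: /" matches the site root, so every URL is blocked.
                print(allowed("User-agent: *\nDisallow: /\n",
                              "https://example.com/page"))   # False
                ```

                The parser confirms the protocol's wording: it is the slash, not the presence of the Disallow line, that blocks the site.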

                Hope that clears things up?

                Paul

                [edited to replace line breaks in the code examples that were stripped out by Moz text editor]

                • MichaelC-15022 @ThompsonPaul

                  Oops, good catch Paul, you're correct!
