The Moz Q&A Forum

    Robots.txt Blocking - Best Practices

    Intermediate & Advanced SEO
    • ReunionMarketing

      Hi All,

      We have a web provider who is not willing to remove the wildcard rule blocking all agents from crawling our client's site (User-agent: *, Disallow: /). They have other lines allowing certain bots to crawl the site, but we're wondering if the client is missing out on organic traffic because of this blanket block. It's also a pain because we're unable to set up Moz Pro, potentially because of that first rule.

      We've researched and haven't found a ton of best practices regarding blocking all bots, then allowing certain ones. What do you think is a best practice for these files?

      Thanks!

      User-agent: *
      Disallow: /
      
      User-agent: Googlebot
      Disallow:
      Crawl-delay: 5
      
      User-agent: Yahoo-slurp
      Disallow: 
      
      User-agent: bingbot
      Disallow:
      
      User-agent: rogerbot
      Disallow:
      
      User-agent: *
      Crawl-delay: 5
      Disallow: /new_vehicle_detail.asp
      Disallow: /new_vehicle_compare.asp
      Disallow: /news_article.asp
      Disallow: /new_model_detail_print.asp
      Disallow: /used_bikes/
      Disallow: /default.asp?page=xCompareModels
      Disallow: /fiche_section_detail.asp
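For what it's worth, you can check how a standards-following parser resolves these groups. The sketch below uses Python's built-in urllib.robotparser on a trimmed copy of the file above (the second "User-agent: *" group and two of the bot groups are omitted, and example.com stands in for the client's domain). It shows that a crawler with its own named group, like Googlebot, follows only that group and ignores the blanket block, while any bot without a named group is shut out entirely.

```python
from urllib import robotparser

# Trimmed version of the robots.txt in question. urllib.robotparser
# honors only the first "User-agent: *" group, so the second one
# from the original file is left out here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Disallow:
Crawl-delay: 5

User-agent: bingbot
Disallow:
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Named bots match their own group (empty Disallow = allow everything),
# so the wildcard "Disallow: /" never applies to them.
print(rp.can_fetch("Googlebot", "https://example.com/used_bikes/"))  # True
print(rp.can_fetch("bingbot", "https://example.com/"))               # True

# Any bot without a named group falls back to the "*" group and is blocked.
print(rp.can_fetch("SomeOtherBot", "https://example.com/"))          # False

# The parser reads the Crawl-delay value, though Google itself ignores it.
print(rp.crawl_delay("Googlebot"))                                   # 5
```

So the named bots do get in; the cost of this setup is every crawler that isn't explicitly listed, including tools like Moz's rogerbot if its group were ever dropped.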
      
      
      • DmitriiK

        Hi.

        Super weird setup - that's for sure.

        User-agent: *
        Disallow: /

        Strictly speaking, a crawler obeys only the most specific user-agent group that matches it, so the named bots (Googlebot, bingbot, etc.) can still get in - but every other bot is blocked off. How in the world are they ranking?

        https://moz.com/blog/controlling-search-engine-crawlers-for-better-indexation-and-rankings-whiteboard-friday

        Watch that video; it has good ideas on controlling bots and crawlers, and you can treat it as best practice. And yes, what they have now is ridiculous.

        https://moz.com/community/q/should-we-use-google-s-crawl-delay-setting

        Here is a Q&A about crawl delays. As far as I know, Google ignores the Crawl-delay directive anyway, and there is little to gain from it.

        Hope this helps.

        • DmitriiK @DmitriiK

          Here is another video from Matt - https://www.youtube.com/watch?v=I2giR-WKUfY

          Lots of good points there too.

          • GreenStone

            In general, I definitely wouldn't recommend the way the web provider is handling this.

            • Disallowing all while adding exceptions should never be the norm. Best practice is generally the reverse: allow everyone to crawl, then add exceptions.
              • It makes a lot more sense to give crawlers full access, add crawl delays for non-Google crawlers, and disallow only the specific paths that should stay out:
                Disallow: /new_vehicle_detail.asp
                Disallow: /new_vehicle_compare.asp
                Disallow: /news_article.asp
                Disallow: /new_model_detail_print.asp
                Disallow: /used_bikes/
                Disallow: /default.asp?page=xCompareModels
                Disallow: /fiche_section_detail.asp
            • The Crawl-delay: 5 under the Googlebot group does you no good, as Google does not obey that directive; Google's crawl rate can only be adjusted in Search Console.
            • You can test what is visible to Googlebot with Search Console's robots.txt testing tool, in order to verify what it can access.
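Put together, that advice would look roughly like the sketch below. This is an illustration, not a drop-in file - the disallowed paths are simply the ones from the original question, and since Google ignores Crawl-delay, a single wildcard group covers everyone (Google skips the delay line, other crawlers honor it):

```
User-agent: *
Crawl-delay: 5
Disallow: /new_vehicle_detail.asp
Disallow: /new_vehicle_compare.asp
Disallow: /news_article.asp
Disallow: /new_model_detail_print.asp
Disallow: /used_bikes/
Disallow: /default.asp?page=xCompareModels
Disallow: /fiche_section_detail.asp
```

Everything not listed is crawlable by default, which is the "allow all, then add exceptions" approach.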
            • Martijn_Scheijbeler @GreenStone

              Completely agree - I really wouldn't want to host my stuff with a company that can't figure out what the best practices really are ;-). GreenStone has laid out very well why you shouldn't set up your robots.txt the way it is right now.

              • ReunionMarketing @DmitriiK

                Thanks, Dmitrii, for your response! From our research we've seen similar recommendations, and it helps to have more evidence to back them up. Hopefully these guys will give in a bit!

                • ReunionMarketing @GreenStone

                  Thanks for taking the time to respond in depth, GreenStone. We appreciate the advice and have passed your response along to the web hosting company, along with a frustrated email explaining that they're not adhering to anyone's best practices. Hopefully this will convince them!
