The Moz Q&A Forum

    Why doesn't Moz crawler follow robots.txt?

    • Tylerj
      Tylerj last edited by

      It is crawling the entire site, and there is stuff we do not want it to. Please advise.

      • moz_support
        moz_support last edited by

        Hi there! Moz's crawler, rogerbot, does follow robots.txt. When he's not following it, it's usually because the robots.txt file is formatted improperly. Learn more about formatting your file here: https://moz.com/learn/seo/robotstxt

        For more information on Roger, including how to block him, head here: https://moz.com/help/guides/moz-procedures/what-is-rogerbot

        And if you want to test your formatting, try the Robots Checker here: https://support.google.com/webmasters/answer/6062598

        If you're still unable to determine why rogerbot is crawling your site, feel free to write in to help@moz.com!

        • Vijay-Gaur
          Vijay-Gaur last edited by

          All major search engines, as well as Moz's crawler Rogerbot and the Internet Archive, respect robots.txt, the file defined by the standard "robots exclusion protocol" for communicating with web crawlers and other web robots.

          If you wish to exclude specific content from all search engines, you can use the following sample code as a reference for blocking specific directories.

          User-agent: *
          Disallow: /cgi-bin/
          Disallow: /tmp/
          Disallow: /junk/

          However, if you want to block only Moz's Rogerbot from crawling specific sections of your website, you can use the following reference code to block specific areas / directories of your website from rogerbot:

          User-agent: Rogerbot
          Disallow: /cgi-bin/
          Disallow: /tmp/
          Disallow: /junk/
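
One way to sanity-check rules like the two samples above before deploying them is Python's standard `urllib.robotparser`. This is only a minimal sketch: the `example.com` URLs and the `is_allowed` helper are made up for illustration, and Python's parser is not Rogerbot's own, so the two may treat edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The Rogerbot-specific sample from the post above.
ROBOTS_TXT = """\
User-agent: Rogerbot
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under `robots_txt`.

    Hypothetical helper; it parses the rules from a string instead of
    fetching them over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder domain, not taken from the thread.
    for agent, url in [
        ("rogerbot", "https://example.com/tmp/cache.html"),
        ("rogerbot", "https://example.com/products/"),
        ("Googlebot", "https://example.com/tmp/cache.html"),
    ]:
        print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Because the sample names only Rogerbot, other crawlers are not restricted by it. To block everyone, the `User-agent: *` group is the one to use.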

          I hope this helps. If you have specific questions, please feel free to respond and I will be happy to answer them.

          Regards,

          Vijay

          • Andy-Halliday
            Andy-Halliday last edited by

            Hi

            I have to agree with the above: Rogerbot does obey the robots.txt file. Bing, by contrast, frequently ignores robots.txt, although it is getting better.

            I've analysed quite a few server logs over the years and Roger has always obeyed the file. It's usually a mistake in the robots.txt file.

            There is an option to test your robots.txt file in GSC (Google Search Console). This tests whether Google will crawl the page, but Roger usually follows the same instructions as Google.

            However, if you are still pretty certain that Roger is ignoring robots.txt, please DM me your server logs and your website and I will take a look and analyse them for you (free, of course).
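
For anyone who wants to run a first pass over their own logs, the check described above can be roughly sketched in Python. Everything specific here is an assumption: the Apache/Nginx "combined" log format, the disallowed prefixes (borrowed from the sample robots.txt earlier in the thread), the `rogerbot_violations` helper name, and the made-up sample log line.

```python
import re

# Assumes the "combined" access log format:
# ip - - [time] "METHOD path HTTP/x" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Prefixes your robots.txt disallows (assumed; taken from the sample rules).
DISALLOWED_PREFIXES = ("/cgi-bin/", "/tmp/", "/junk/")


def rogerbot_violations(lines):
    """Return paths under a disallowed prefix that were requested by rogerbot."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # not a request line in the assumed format
        if "rogerbot" not in match.group("agent").lower():
            continue  # some other client
        if match.group("path").startswith(DISALLOWED_PREFIXES):
            hits.append(match.group("path"))
    return hits


if __name__ == "__main__":
    # Made-up log line for illustration only.
    sample = [
        '203.0.113.5 - - [10/Oct/2020:13:55:36 +0000] '
        '"GET /tmp/cache.html HTTP/1.1" 200 2326 "-" "rogerbot/1.0"',
    ]
    print(rogerbot_violations(sample))
```

An empty result suggests the crawler is honouring those rules, so the reported pages are probably not covered by a Disallow line at all.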

            Thanks

            Andy

            • Tylerj
              Tylerj @moz_support last edited by

              So I made a mistake: it isn't the robots.txt that is the issue. I am getting hit with a ton of duplicate content penalties, so I figured that was it. The problem is that I have pages with rel=canonical tags that it is ignoring. Does Roger not read those?

              • Vijay-Gaur
                Vijay-Gaur @Tylerj last edited by

                Hi There,

                Rel=canonical tags tell robots which of several duplicate pages should actually be indexed.

                For SEOs, canonicalization refers to individual web pages that can be loaded from multiple URLs. This is a problem because when multiple pages have the same content but different URLs, links that are intended to go to the same page get split up among multiple URLs. This means that the popularity of the pages gets split up. Unfortunately for web developers, this happens far too often because the default settings for web servers create this problem.

                https://moz.com/learn/seo/canonicalization

                I feel you have not used it correctly. Check the above article and see if it helps.

                Thanks,

                Vijay

                • Andy-Halliday
                  Andy-Halliday @Tylerj last edited by

                  rel=canonical is not really a robots instruction. It helps with duplicate content: where you have the same or similar pages, you're telling search engines which page is the preferred one.

                  If you don't want pages crawled, you have to tell search engines in the robots.txt file.
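
Since a canonical tag lives inside the page itself, a crawler has to fetch the page before it can see the tag. To confirm what a given page actually declares, here is a minimal sketch using Python's standard `html.parser`. The `CanonicalFinder` and `find_canonical` names and the sample markup are hypothetical, and the sketch returns only the first canonical link it finds.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_tokens:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in `html`, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    # Made-up markup standing in for a filtered product listing page.
    page = (
        "<html><head>"
        '<link rel="canonical" href="https://example.com/shoes">'
        "</head><body>Filtered listing</body></html>"
    )
    print(find_canonical(page))
```

Running this against the filtered URLs that are being reported shows whether each one really points at the preferred page.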

                  • Tylerj
                    Tylerj @Vijay-Gaur last edited by

                    It has been used correctly. The site is a Magento site and they have it built in. There are a lot of filters for products so it uses rel=canonical to tell Google which to index.

                    • Andy-Halliday
                      Andy-Halliday @Tylerj last edited by

                      Yes, but rel=canonical doesn't tell them which pages not to crawl, only which pages not to index.

                      • Tylerj
                        Tylerj @Andy-Halliday last edited by

                        Which I am ok with, but why am I getting duplicate content?

                        © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.