The Moz Q&A Forum


    Rogerbot Ignoring Robots.txt?

    • kellydallen
      kellydallen last edited by

      Hi guys,

      We're trying to block Rogerbot from spending 8000-9000 of our 10000 pages per week for our site crawl on our zillions of PhotoGallery.asp pages. Unfortunately our e-commerce CMS isn't tremendously flexible so the only way we believe we can block rogerbot is in our robots.txt file.

      Rogerbot keeps crawling all these PhotoGallery.asp pages so it's making our crawl diagnostics really useless.

      I've contacted the SEOMoz support staff and they claim the problem is on our side. This is the robots.txt we are using:

      User-agent: rogerbot

      Disallow:/PhotoGallery.asp

      Disallow:/pindex.asp

      Disallow:/help.asp

      Disallow:/kb.asp

      Disallow:/ReviewNew.asp

      User-agent: *

      Disallow:/cgi-bin/

      Disallow:/myaccount.asp

      Disallow:/WishList.asp

      Disallow:/CFreeDiamondSearch.asp

      Disallow:/DiamondDetails.asp

      Disallow:/ShoppingCart.asp

      Disallow:/one-page-checkout.asp

      Sitemap: http://store.jrdunn.com/sitemap.xml

      For some reason the WYSIWYG editor is inserting extra blank lines, but the actual file is single-spaced.

      Any suggestions? The only other thing I thought of trying is something like "Disallow:/PhotoGallery.asp*" with a wildcard.
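One way to sanity-check rules like these offline is to feed the single-spaced file to a parser and ask it about specific URLs. The sketch below uses Python's standard `urllib.robotparser` as a stand-in; it is not Rogerbot's parser, so it only shows how one common implementation reads the file. The `?id=1` query string and the lowercase path are hypothetical test inputs, not URLs taken from the site.

```python
from urllib.robotparser import RobotFileParser

# Single-spaced version of the rules quoted in the post above.
ROBOTS_TXT = """\
User-agent: rogerbot
Disallow:/PhotoGallery.asp
Disallow:/pindex.asp
Disallow:/help.asp
Disallow:/kb.asp
Disallow:/ReviewNew.asp
User-agent: *
Disallow:/cgi-bin/
Disallow:/myaccount.asp
Disallow:/WishList.asp
Disallow:/CFreeDiamondSearch.asp
Disallow:/DiamondDetails.asp
Disallow:/ShoppingCart.asp
Disallow:/one-page-checkout.asp
Sitemap: http://store.jrdunn.com/sitemap.xml
"""


def allowed(agent: str, path: str) -> bool:
    """Return True if Python's parser lets `agent` fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Hypothetical paths chosen to probe prefix and case handling.
    for path in (
        "/PhotoGallery.asp",
        "/PhotoGallery.asp?id=1",
        "/photogallery.asp",
        "/cgi-bin/test",
    ):
        print(f"rogerbot -> {path}: allowed={allowed('rogerbot', path)}")
```

In this parser the two `/PhotoGallery.asp` paths come back blocked for rogerbot, while the lowercase variant is allowed because path matching is case-sensitive. It also treats the `rogerbot` group as a replacement for the `*` group rather than an addition, so `/cgi-bin/` is not blocked for rogerbot here. Whether Rogerbot behaves the same way is not something this check can confirm.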

      • Malarowski
        Malarowski last edited by

        Try

        Disallow: /PhotoGallery.asp

        I put wild cards all over usually just to be sure and had no issues so far.
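On the wildcard question: under the matching rules in the Robots Exclusion Protocol (RFC 9309), a Disallow path is already a prefix match, `*` matches any run of characters, and a trailing `$` anchors the end. The sketch below is a small illustrative matcher written for this thread (the function name `rule_matches` is made up here), and it assumes the crawler follows those semantics, which is not confirmed for Rogerbot.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Match a robots.txt path pattern against a URL path.

    Prefix match by default; '*' matches any characters;
    a trailing '$' requires the path to end there.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' for each wildcard.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, giving prefix semantics.
    return re.match(regex, path) is not None


if __name__ == "__main__":
    path = "/PhotoGallery.asp?id=7"  # hypothetical gallery URL
    for pattern in ("/PhotoGallery.asp", "/PhotoGallery.asp*", "/PhotoGallery.asp$"):
        print(f"{pattern!r} matches {path!r}: {rule_matches(pattern, path)}")
```

With prefix semantics the plain rule and the trailing-wildcard rule match the same URLs, so adding `*` at the end should be harmless but redundant for a crawler that supports wildcards. Parsers without wildcard support may treat the `*` as a literal character, in which case the wildcard version can match less than the plain one.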

        • kellydallen
          kellydallen @Malarowski last edited by

          Thanks so much for the tip. Unfortunately still unsuccessful. (shrug)

          • Cyrus-Shepard
            Cyrus-Shepard last edited by

            Hi Kelly,

            Thanks for letting us know. Could be a couple of things right off the bat. Is this your exact robots.txt file? If so, it's missing some formatting, like proper spacing, to be perfectly compliant. You can run a check of your robots.txt file at several places.

            http://tool.motoricerca.info/robots-checker.phtml

            http://www.searchenginepromotionhelp.com/m/robots-text-tester/robots-checker.php

            http://www.sxw.org.uk/computing/robots/check.html

            Also, it's generally a good idea to put specific inclusions towards the bottom, so I might flip the order and put the rogerbot directives last and the User-agent: * first.

            Hope this helps. Let us know if any of this points in the right direction.
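To see whether group order changes the outcome, the same rules can be parsed in both orders and compared. This is a sketch using Python's standard `urllib.robotparser`, which is only a stand-in for a real crawler; the two-rule file is a trimmed-down version of the one in the question, and the helper name `can_fetch` is made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Trimmed-down groups based on the file in the original question.
ROGERBOT_GROUP = ["User-agent: rogerbot", "Disallow: /PhotoGallery.asp"]
WILDCARD_GROUP = ["User-agent: *", "Disallow: /cgi-bin/"]


def can_fetch(lines, agent, path):
    """Parse the given robots.txt lines and ask about one agent/path pair."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser.can_fetch(agent, path)


# Same two groups, separated by a blank line, in both possible orders.
specific_first = ROGERBOT_GROUP + [""] + WILDCARD_GROUP
specific_last = WILDCARD_GROUP + [""] + ROGERBOT_GROUP

if __name__ == "__main__":
    for name, lines in (("rogerbot first", specific_first),
                        ("rogerbot last", specific_last)):
        blocked = not can_fetch(lines, "rogerbot", "/PhotoGallery.asp")
        print(f"{name}: PhotoGallery.asp blocked for rogerbot = {blocked}")
```

In this parser both orderings give the same answer, because the group naming the crawler is preferred over the `*` group wherever it appears. Other parsers may differ, so trying both orders on the live site is still a reasonable test.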

            • kellydallen
              kellydallen @Cyrus-Shepard last edited by

              Thanks Cyrus,

              No, for some reason the editor double-spaced the file when I pasted. Other than that, it's the same though.

              Yes, I actually tried ordering the exclusions both ways. Neither works.

              The robots.txt checkers report no errors. I had actually checked them before posting.

              Before I posted this, I was pretty convinced the problem wasn't in our robots.txt, but the SEOmoz support staff says, essentially, "We don't think the problem is with Rogerbot, so it must be in your robots.txt file, but we can't look at that, so if by some chance your robots.txt file is fine, then there's nothing we can do for you because we're just going to assume the problem is on your side."

              I figured, with everything I've already tried, and if the fabulous SEOMoz community can't come up with a solution, that'll be the best I can do.
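The double-spacing mentioned above is worth ruling out on the file as actually served, because some parsers treat a blank line as the end of a group. The sketch below shows this with Python's standard `urllib.robotparser`, as found in the CPython versions this was written against; it says nothing definite about how Rogerbot handles blank lines. The three-line rule set is a shortened copy of the rogerbot group, and `blocks_gallery` is a name made up for this example.

```python
from urllib.robotparser import RobotFileParser

# Shortened copy of the rogerbot group from the original question.
RULES = [
    "User-agent: rogerbot",
    "Disallow: /PhotoGallery.asp",
    "Disallow: /pindex.asp",
]


def blocks_gallery(lines) -> bool:
    """Return True if rogerbot is blocked from /PhotoGallery.asp."""
    parser = RobotFileParser()
    parser.parse(lines)
    return not parser.can_fetch("rogerbot", "/PhotoGallery.asp")


single_spaced = RULES
# Insert an empty line after every rule, as the forum editor did.
double_spaced = [line for rule in RULES for line in (rule, "")]

if __name__ == "__main__":
    print("single-spaced blocks gallery:", blocks_gallery(single_spaced))
    print("double-spaced blocks gallery:", blocks_gallery(double_spaced))
```

In this parser the blank line after `User-agent` discards the group, so the double-spaced version blocks nothing. If the live file really is single-spaced this does not apply, but fetching the file with a plain HTTP client and inspecting the raw bytes (including line endings) is a cheap way to confirm it.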

              • Mihas07
                Mihas07 last edited by

                I have just encountered an interesting thing about Moz Link Search and its bot: if you search for domains linking to Google.com, you get a list of about 900,000 domains, among which I was surprised to find webcache.googleusercontent.com.

                See the proof below in the attached screenshot.

                At the same time, the webcache.googleusercontent.com robots policy is as shown in the second attachment.

                In my opinion, there is only one possible explanation: the Moz bot does ignore robots.txt files...

                [Attachments: e9f7db874c, 87ce35be1c]
