The Moz Q&A Forum


    Moz "Crawl Diagnostics" doesn't respect robots.txt

    • Vitalized

      Hello, I've just had a new website crawled by the Moz bot.  It's come back with thousands of errors saying things like:

      • Duplicate content
      • Overly dynamic URLs
      • Duplicate Page Titles

      The duplicate content & URLs it's found are all blocked in the robots.txt so why am I seeing these errors?
      Here's an example of some of the robots.txt that blocks things like dynamic URLs and directories (which Moz bot ignored):

      Disallow: /?mode=
      Disallow: /?limit=
      Disallow: /?dir=
      Disallow: /?p=*&
      Disallow: /?SID=
      Disallow: /reviews/
      Disallow: /home/

      Many thanks for any info on this issue.

      • MattAntonino

        Is the / actually in the URL at that spot? Or is your link more like http://www.example.com/abcd?p=147?

        If you give an example of a full URL that includes one of your blocked dynamic URLs, we can take a better look. If your robots.txt is set up correctly, the crawler shouldn't find that stuff, but give us more info if you're able.

        • helgeolaussen

          If you have an "index,(no)follow" meta on those pages I think they will be crawled even though you have them blocked in robots.txt. So by adding "noindex" on those pages it might work as you want it to.

          • ChiarynMiranda

            Hey Si,

            Thanks for writing in. It doesn't look like we have an overarching issue with our crawler ignoring robots.txt files, so I did some research in Google Webmaster Tools. It looks like most crawlers require an asterisk in the disallow directive to recognize that all pages of a dynamic URL are being disallowed. The "Pattern Matching" section of this resource should give you more information about setting up robots.txt with the correct disallow directives to block those pages: http://support.google.com/webmasters/bin/answer.py?hl=en&answer=156449

            If you add the asterisk to the disallow directive and are still seeing these pages crawled, please email your campaign information to our support desk at help@moz.com so our engineers can look into this more directly.

            I hope this helps.

            Chiaryn
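
To make the asterisk point concrete, here is a rough Python sketch of wildcard-style robots.txt matching: a rule is matched from the start of the URL path, `*` stands for any run of characters, and a trailing `$` anchors the end. The helper names `rule_to_regex` and `is_blocked` are made up for illustration, and the sample paths are hypothetical. This is not Moz's crawler logic, only an approximation of the pattern matching described in the resource linked above.

```python
import re


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Turn a Disallow path into a regex (hypothetical helper).

    Assumes wildcard-style matching: '*' matches any characters and a
    trailing '$' anchors the end of the URL path.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(url_path: str, disallow_rules: list[str]) -> bool:
    """Return True if any rule matches from the start of url_path."""
    # re.match anchors at the beginning, mirroring prefix matching.
    return any(rule_to_regex(rule).match(url_path) for rule in disallow_rules)


# Rules copied from the original question.
original_rules = [
    "/?mode=", "/?limit=", "/?dir=", "/?p=*&", "/?SID=", "/reviews/", "/home/",
]

# Hypothetical variant with a leading wildcard before the query string.
wildcard_rules = ["/*?mode=", "/*?limit=", "/*?dir=", "/*?p=*&", "/*?SID="]

print(is_blocked("/?mode=list", original_rules))          # query on the root URL
print(is_blocked("/abcd?p=147&dir=asc", original_rules))  # query on a deeper path
print(is_blocked("/abcd?p=147&dir=asc", wildcard_rules))
```

Under these assumptions, a rule such as `/?mode=` only covers query strings attached directly to the root URL, which ties back to MattAntonino's question about whether the `/` really sits at that spot. A path like `/abcd?p=147&dir=asc` is only caught once the rule starts with `/*`.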

            • Christy-Correll

              Hi Si, has this issue been resolved?

              © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.