The Moz Q&A Forum

    Standard Syntax in robots.txt doesn't prevent Moz bot from crawling

    Getting Started
    • btreloar

      A client is getting many false positive site crawl errors for things like duplicate titles and duplicate content on pages that include /tag/ in the URL. An example is https://needquest.com/place_tag/autism-spectrum-disorder/page/4/

      To resolve this we have set up a disallow statement in the robots.txt file that says
      Disallow: /page/

      For some reason this appears not to work, as the site crawl errors continue to list pages like this. Does anyone understand why that would be and what we need to do to properly disallow crawling these pages?

      • tawnycase

        Hey there!

        Tawny from Moz's Help Team here.

        Adding a disallow directive for /tag/ won't help with the example URL you've provided, because that URL doesn't have /tag/ as a segment of its path. To block us from seeing content like the URL you listed, you'd need a disallow directive for /place_tag/.

        If you include that disallow directive, that should stop us from seeing duplicate content on pages with /place_tag/ in the URL. 🙂

        Hope that helps! If you've still got questions, feel free to shoot us a note over at help@moz.com and we'll do our best to sort things out with you.

        • btreloar

          Sorry, Tawny ... I did go back and correct my question. We did apply Disallow: /page/ to address this issue. The /place_tag/ segment is found in many pages we DO want crawled and indexed ... here we only want to disallow the paginated pages (page 2, page 3, page 4, etc.).

          (We also disallowed /tag/, /category/, and a few other common issues that generate false positives in the site crawl.)

          • btreloar

            Any reason the Disallow: /page/ isn't preventing URLs like
            https://needquest.com/place_tag/autism-spectrum-disorder/page/4/
            from generating duplicate descriptions and title errors in our site crawl? It was my hope that those pages wouldn't be crawled at all.
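For anyone following along, here is a minimal Python sketch of one possible reason a rule like `Disallow: /page/` does not cover this URL. Under the Robots Exclusion Protocol (RFC 9309), a path rule is matched from the start of the URL path, so `/page/` matches `/page/4/` but not `/place_tag/autism-spectrum-disorder/page/4/`. The sketch uses Python's standard-library `urllib.robotparser`, which does plain prefix matching, plus a small hypothetical helper, `pattern_matches`, that imitates the `*` and `$` wildcards described in RFC 9309. Whether rogerbot honors a wildcard rule such as `/*/page/` is an assumption here, not something confirmed in this thread.

```python
import re
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

URL = "https://needquest.com/place_tag/autism-spectrum-disorder/page/4/"


def allowed(rules: str, url: str, agent: str = "rogerbot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


def pattern_matches(pattern: str, path: str) -> bool:
    """Hypothetical helper: match a rule path using '*' and '$' wildcards."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' stands for any run of characters; everything else is literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start of the path, giving prefix semantics.
    return re.match(regex, path) is not None


rules = "User-agent: *\nDisallow: /page/\n"
path = urlsplit(URL).path

# The plain prefix rule blocks /page/4/ but not the nested URL.
print(allowed(rules, "https://needquest.com/page/4/"))
print(allowed(rules, URL))

# A wildcard pattern would reach the nested /page/ segment.
print(pattern_matches("/page/", path))
print(pattern_matches("/*/page/", path))
```

If the crawler in question supports wildcards, a rule along the lines of `Disallow: /*/page/` would be the thing to test. Treat that as a suggestion to verify rather than a confirmed fix.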

            • tawnycase @btreloar

              I'm not seeing that URL coming up with Duplicate Title or Duplicate Content issues — when I search by that URL I see no Content issues at that URL. I do see that URL in the All Crawled Pages section, but I can't find it bringing up Content issues in the app.

              That said, I took a look at your robots.txt file, and I think this could be a result of having an Allow command before the rest of the Disallow commands. I think possibly if you put that Allow command at the end of the block of Disallow commands, rogerbot would see the disallow for /page/ and stop crawling those URLs.
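To illustrate the ordering idea, here is a small Python sketch. The actual robots.txt is not shown in this thread, so both rule sets below are hypothetical. Python's standard-library `urllib.robotparser` applies the first rule that matches, so it is sensitive to order. RFC 9309 instead says the most specific (longest) matching rule should win regardless of order. Which behavior rogerbot follows is not confirmed here.

```python
from urllib.robotparser import RobotFileParser


def can_fetch_with(rules: str, url: str, agent: str = "rogerbot") -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical rule sets; the real file is not quoted in the thread.
allow_first = "User-agent: *\nAllow: /\nDisallow: /page/\n"
allow_last = "User-agent: *\nDisallow: /page/\nAllow: /\n"

top_level = "https://needquest.com/page/4/"
nested = "https://needquest.com/place_tag/autism-spectrum-disorder/page/4/"

# In a first-match parser, a broad Allow listed first shadows the Disallow.
print(can_fetch_with(allow_first, top_level))
print(can_fetch_with(allow_last, top_level))

# The nested URL stays fetchable either way, because /page/ only
# matches at the start of the path.
print(can_fetch_with(allow_first, nested))
print(can_fetch_with(allow_last, nested))
```

In this sketch, moving the Allow line changes the result only for URLs whose path begins with /page/. The nested URL is unaffected by the reordering.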

              If you're still running into trouble, I would suggest writing in to us at help@moz.com so we can take a closer look at the Campaign and what could be going on there.

              • btreloar

                Thanks, Tawny,

                If you look at Duplicate titles, check the first one (https://needquest.com/place_tag/autism-spectrum-disorder/). All the URLs with a duplicate title have /page/ in them. I will suggest they move the Allow statement and see if that helps.

                © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.