The Moz Q&A Forum


    Block Moz (or any other robot) from crawling pages with specific URLs

    • Blacktie

      Hello!

      Moz reports that my site has around 380 pages with duplicate content. Most of them come from dynamically generated URLs that carry specific parameters. I have sorted this out for Google in Webmaster Tools (now Google Search Console) by blocking the pages with these parameters. However, Moz is still reporting the same number of duplicate content pages, and I know that to stop it I must use robots.txt. The catch is that I don't want to block every page, just the pages with specific parameters. Among these 380 pages there are some with no parameters (or different parameters) that I still need to take care of. Basically, I need to clean up this list so I can use the feature properly in the future.

      I have read through Moz forums and found a few topics related to this, but there is no clear answer on how to block only pages with specific URLs. Therefore, I have done my research and come up with these lines for robots.txt:

      User-agent: dotbot
      Disallow: /*numberOfStars=0

      User-agent: rogerbot
      Disallow: /*numberOfStars=0

      My questions:

      1. Are the above lines correct, and will they block Moz (dotbot and rogerbot) from crawling only pages that have the numberOfStars=0 parameter in their URLs, leaving other pages intact?

      2. Do I need an empty line between the two groups, that is, between "Disallow: /*numberOfStars=0" and "User-agent: rogerbot"? Or does it not matter?

      I think this would help many people as there is no clear answer on how to block crawling only pages with specific URLs. Moreover, this should be valid for any robot out there.

      Thank you for your help!

      • Andy.Drinkwater

        Hi,

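        What you have there will work absolutely fine, and there is no need to leave blank lines between the groups. Keep the leading wildcard, because a rule is matched from the start of the URL path and the parameter appears later in the URL:

        Disallow: /*numberOfStars=0

        There is also no need to add a wildcard at the end if nothing more follows the parameter value.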
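
To see why the position of the wildcard matters, here is a minimal Python sketch of robots.txt path matching as described in RFC 9309 and Google's robots.txt documentation. It assumes `*` matches any run of characters, a trailing `$` anchors the end of the URL, and a rule is otherwise compared from the start of the path. The function name `rule_matches` and the sample URLs are made up for illustration, and this is not Moz's crawler code, so individual bots may behave differently.

```python
import re

def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path (query included)."""
    # A trailing "$" anchors the pattern to the end of the URL.
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with "match anything".
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors only at the start, so a rule behaves like a prefix.
    return re.match(regex, path) is not None

# Hypothetical URLs, for illustration only.
print(rule_matches("/*numberOfStars=0", "/hotels?numberOfStars=0"))   # True
print(rule_matches("/*numberOfStars=0*", "/hotels?numberOfStars=0"))  # True
print(rule_matches("/numberOfStars=0", "/hotels?numberOfStars=0"))    # False
print(rule_matches("/*numberOfStars=0", "/hotels?page=2"))            # False
```

Under these assumptions the trailing `*` changes nothing, because every rule already matches as a prefix, while a pattern without the leading `*` only matches paths that literally begin with `/numberOfStars=0`.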

        The best way to check what works, before you add it to the live site, is to use the robots.txt testing tool in Search Console (Webmaster Tools). Add in the lines above and then check that none of your other pages are blocked. They won't be, but it's a great way to test before going live.
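
Alongside the Search Console tester, a quick local dry run can show which URLs a draft file would block for a given crawler. The sketch below is a simplified reading of robots.txt grouping: a User-agent line that follows a rule starts a new group, blank lines are ignored, and only Disallow rules with `*` wildcards are handled. The names `parse_groups` and `is_blocked` and the sample URLs are hypothetical, and real crawlers may differ in the details.

```python
import re

# Draft rules from the question, written without a blank line between groups.
ROBOTS_TXT = """\
User-agent: dotbot
Disallow: /*numberOfStars=0
User-agent: rogerbot
Disallow: /*numberOfStars=0
"""

def parse_groups(text: str) -> dict:
    """Map each lowercase user-agent token to its list of Disallow patterns."""
    groups = {}
    current = []       # agents named by the group currently being read
    in_rules = False   # True once the current group has seen a rule line
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or ":" not in line:
            continue                          # blank lines do not end a group
        field, value = (s.strip() for s in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # a rule was seen: new group
                current, in_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif field == "disallow":
            in_rules = True
            if value:                         # an empty Disallow blocks nothing
                for agent in current:
                    groups[agent].append(value)
    return groups

def is_blocked(groups: dict, agent: str, path: str) -> bool:
    """True if any Disallow pattern listed for the agent matches the path."""
    # Simplification: no Allow rules, no "$" anchor, no fallback to "User-agent: *".
    for pattern in groups.get(agent.lower(), []):
        regex = ".*".join(re.escape(p) for p in pattern.split("*"))
        if re.match(regex, path):
            return True
    return False

groups = parse_groups(ROBOTS_TXT)
# Hypothetical URLs, for illustration only.
for path in ["/hotels?numberOfStars=0", "/hotels?numberOfStars=3", "/about"]:
    print(path, "blocked" if is_blocked(groups, "rogerbot", path) else "allowed")
```

Parsing the same text with a blank line inserted before the second User-agent line yields the same groups in this sketch, which is consistent with the advice that the blank line is optional.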

        I hope this helps 🙂

        -Andy

        • Blacktie

          Hello!

          Thanks a lot for your feedback and for clearing this up! It worked well.

          The robots.txt tester is a good tip!

          Thanks!
