The Moz Q&A Forum


    Can URLs blocked with robots.txt hurt your site?

    Intermediate & Advanced SEO
    • nicole.healthline

      We have about 20 testing environments that contain duplicates of our indexed content. All of them are blocked by robots.txt, and they appear in Google's index as blocked by robots.txt. Can they still count against us or hurt us?

      I know the best practice to permanently remove these would be to use the noindex tag, but I'm wondering whether they can still hurt us if we leave them the way they are.

      • MichaelYork

        I don't believe they are going to hurt you. It is more of a warning: if you were trying to have these pages indexed, they can't currently be accessed. Since you don't want them indexed in this case, I don't believe you are suffering because of it.

        • MattAntonino

          I've seen people say that in "rare" cases, URLs blocked by robots.txt will be shown as search results, but I can't imagine that happening when they are duplicates of your content.

          Robots.txt tells a search engine not to crawl a directory, but if another resource links to it, the engine may know the URL exists without knowing its content. It won't know whether the page is noindex because it doesn't crawl it, but since it knows the URL exists, it could, rarely, return it. Your indexed original would be the better result, so that is what will be returned, and your test sites should not be...

          As far as hurting your site, no way. The one exception would be a page that WAS allowed, is a duplicate, is now NOT allowed, and hasn't been recrawled. Even in that case, I can't imagine it would hurt you that much. I wouldn't worry about it.

          (Also, noindex doesn't matter on these pages, at least to Google. Google will see the robots.txt block first and will not crawl the page. Until it crawls the page, it doesn't matter whether the page has one word or 300 directives; Google will never see them. So noindex really wouldn't help unless a page had already slipped through.)
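
To make the robots.txt-versus-noindex point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. It only simulates a crawler that obeys robots.txt and is not a model of how Google actually works; the robots.txt content, the `test1.example.com` URL, the page HTML, and the helper names `may_crawl` and `directives_seen` are all assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a test environment: every crawler, every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

# Hypothetical test-environment page that carries a noindex directive.
PAGE_URL = "https://test1.example.com/article.html"
PAGE_HTML = '<html><head><meta name="robots" content="noindex"></head></html>'


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def directives_seen(robots_txt: str, user_agent: str, url: str, html: str) -> list:
    """Simulate a polite crawler: on-page directives are read only if the fetch is allowed."""
    if not may_crawl(robots_txt, user_agent, url):
        return []  # the page is never fetched, so its noindex is never read
    # Naive substring check, good enough for this illustration only.
    return ["noindex"] if "noindex" in html.lower() else []


print(directives_seen(ROBOTS_TXT, "Googlebot", PAGE_URL, PAGE_HTML))  # crawl blocked
print(directives_seen("", "Googlebot", PAGE_URL, PAGE_HTML))          # no robots.txt rules
```

With the blocking robots.txt the simulated crawler sees no directives at all; with an empty robots.txt it fetches the page and finds the noindex. That is why a noindex tag only takes effect on pages the crawler is allowed to fetch.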

          • workzentre

            90% not. First of all, check whether Google has indexed them. If not, your robots.txt should do the job. However, I would reinforce that by making sure those URLs are out of your sitemap file and that your robots.txt disallows apply to all user agents (*), not just Google, for example.

            Google's duplicate-content policies are tough, but Google will always respect simple policies such as robots.txt.

            I had a case in the past where a customer had a dedicated IP, and Google somehow found it, so you could see both the domain's pages and the IP's pages, with identical content. We simply added an .htaccess rule to point the IP requests to the domain, and even though the situation had been like that for a long time, it doesn't seem to have affected them. In theory Google penalizes duplicate content, but not in cases like this; it is a matter of behavior.
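
The two checks suggested above (disallows that cover every user agent, and no blocked URLs left in the sitemap) could be sketched roughly like this with Python's standard library. Everything specific here is an assumption for illustration: the `/staging/` path, the `example.com` URLs, the agent name `AnyBot`, and the helper names. Test environments on separate hostnames would each have their own robots.txt to check.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical rule sets: one targets only Googlebot, the other every crawler.
GOOGLE_ONLY = "User-agent: Googlebot\nDisallow: /staging/\n"
ALL_AGENTS = "User-agent: *\nDisallow: /staging/\n"

# Hypothetical sitemap that still lists a staging URL.
SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/page.html</loc></url>
  <url><loc>https://www.example.com/staging/page.html</loc></url>
</urlset>
"""


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def agents_still_allowed(robots_txt: str, url: str, agents: list) -> list:
    """List the user agents that robots_txt does NOT block from url."""
    return [agent for agent in agents if may_crawl(robots_txt, agent, url)]


def blocked_urls_in_sitemap(sitemap_xml: str, robots_txt: str, agent: str = "AnyBot") -> list:
    """List sitemap URLs that robots_txt disallows for a generic crawler."""
    root = ET.fromstring(sitemap_xml)
    urls = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]
    return [url for url in urls if not may_crawl(robots_txt, agent, url)]


STAGING_URL = "https://www.example.com/staging/page.html"
AGENTS = ["Googlebot", "Bingbot", "AnyBot"]

print(agents_still_allowed(GOOGLE_ONLY, STAGING_URL, AGENTS))  # crawlers the rule misses
print(agents_still_allowed(ALL_AGENTS, STAGING_URL, AGENTS))   # none should remain
print(blocked_urls_in_sitemap(SITEMAP, ALL_AGENTS))            # URLs to drop from the sitemap
```

A Googlebot-only rule leaves the other crawlers free to fetch the staging URL, while the `*` rule blocks all three. Any URL returned by `blocked_urls_in_sitemap` is one you are asking crawlers to visit and forbidding them to fetch at the same time.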

            Regards!

            © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.