The Moz Q&A Forum

    Massive URL blockage by robots.txt

    Intermediate & Advanced SEO
    • moneywise_test

      Hello people,

      In May there was a dramatic increase in URLs blocked by robots.txt, even though we don't have that many URLs or crawl errors. You can view the attachment to see how it went up. The thing is, the company hasn't touched the robots.txt file since 2012. What might be causing the problem? Can this result in any penalties? Can indexation be lowered because of this?

      • IM_Learner

        Check your robots.txt file. Are there entries that block crawling? If you can share the URL, that would be helpful.

        Regards

        • Chris.Menke

          Le Fras,

          You don't have to change the robots.txt file for Google to report that more URLs are being blocked by it. The robots.txt file tells the search engines not to crawl the given URLs, but they may still keep those URLs in the index and display them in the search results.

          So the search engines do know about the URLs that are being blocked, and the blocked count can rise as you add pages to your site that fall under the existing robots.txt rules.
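
To make that concrete, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt rules, the `example.com` URLs, and the `count_blocked` helper are all made up for illustration; they are not taken from the poster's site. It only shows that the same unchanged rules can block more URLs once the site adds pages under a disallowed path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content that never changes between the two checks.
ROBOTS_TXT = """\
User-agent: *
Disallow: /catalog/sort/
"""

def count_blocked(urls, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Count how many of the given URLs the rules forbid the agent to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return sum(1 for url in urls if not parser.can_fetch(agent, url))

# Hypothetical URL inventory before and after new pages were added.
before = [
    "https://example.com/",
    "https://example.com/catalog/sort/price",
]
after = before + [
    "https://example.com/catalog/sort/color",
    "https://example.com/catalog/sort/size",
]

print(count_blocked(before), count_blocked(after))
```

The robots.txt text is identical in both calls; only the list of URLs grew.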

          • CleverPhD

            Even though there are fewer pages indexed than blocked, you still have a significant increase in indexed pages as well. That is a good thing! You technically have more pages indexed than before. It looks like you possibly relaunched the site or something? More pages blocked could be an indexing problem, or it might be a good thing. It all depends on which pages are being blocked.

            If you relaunched the site and used this great new whiz-bang CMS that created an online catalog that gave your users 54 ways to sort your product catalog, then the number of "pages" could increase with each sort. Just imagine: sort your widgets by color, or by size, or by price, or by price and size, or by size and color, or by color and price. You get the idea. Very quickly you have a bunch of duplicate versions of a single page. If your SEO was on his or her toes, they would account for this using a canonical approach, or possibly a meta noindex, or a change to robots.txt, etc. That would be good, as you are not going to confuse Google with all the different versions of the same page.
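
As a rough illustration of how quickly sort options multiply URLs, and how a canonical approach collapses them again, here is a small Python sketch. The `sort` query parameter, the three sort keys, and the `example.com` URL are assumptions for the example, not details from the poster's CMS.

```python
from itertools import combinations
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical sort keys offered by the catalog.
SORT_KEYS = ("color", "size", "price")

def sort_variants(base_url, keys=SORT_KEYS):
    """Build one URL per non-empty combination of sort keys."""
    variants = []
    for r in range(1, len(keys) + 1):
        for combo in combinations(keys, r):
            query = urlencode({"sort": ",".join(combo)})
            variants.append(f"{base_url}?{query}")
    return variants

def canonical_url(url, drop_params=("sort",)):
    """Strip presentation-only parameters to get the URL to declare as canonical."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in drop_params]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

variants = sort_variants("https://example.com/widgets")
canonicals = {canonical_url(v) for v in variants}
print(len(variants), "sorted URLs map to", len(canonicals), "canonical URL")
```

Three sort keys already give seven crawlable variants of one page here, all pointing back to a single canonical URL.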

            Ultimately, Shailendra has the approach that you need to take. Look in robots.txt, and look at the code on your pages. What happened around 5/26/2013? All of those things need to be looked at to try to answer your question.
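
For the "look at the code on your pages" part, a minimal sketch like the one below can pull out the two tags mentioned above (meta robots and rel=canonical) using only Python's standard `html.parser`. The sample HTML and the `indexing_hints` helper are invented for illustration; a real audit would fetch your own pages and likely check more than these two tags.

```python
from html.parser import HTMLParser

class IndexingHints(HTMLParser):
    """Collect the meta robots content and the canonical link href, if present."""

    def __init__(self):
        super().__init__()
        self.meta_robots = None
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.meta_robots = attrs.get("content")
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            self.canonical = attrs.get("href")

def indexing_hints(html):
    """Return (meta_robots, canonical) found in the given HTML string."""
    parser = IndexingHints()
    parser.feed(html)
    return parser.meta_robots, parser.canonical

# Hypothetical page source for a sorted catalog view.
SAMPLE = (
    '<html><head>'
    '<meta name="robots" content="noindex, follow">'
    '<link rel="canonical" href="https://example.com/widgets">'
    '</head><body></body></html>'
)

print(indexing_hints(SAMPLE))
```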
