The Moz Q&A Forum

    Google is indexing blocked content in robots.txt

    • elisainteractive

      Hi, Google is indexing some URLs that I don't want indexed, and it is also indexing the same URLs under https. These URLs are blocked in robots.txt. I've tried to block them through Google Webmaster Tools, but Google won't let me because the URLs are https. The robots.txt file is correct, so what can I do to keep this content out of the index?

      • EastEssence22

        It seems you added or modified the robots.txt file only recently. Wait for some time, say 15 days.
        Also check the syntax of your robots.txt.

        Regards,

        • elisainteractive

          Thank you, but that is not the problem. The robots.txt file has been in place for a long time.

          • CleverPhD

            This will sound backwards, but it works.

            1. Add the meta noindex tag to all pages you want out of the index.

            2. Take those same pages out of the robots.txt and allow them to be crawled.

            The meta noindex tag tells Google to remove the page from the index. It is preferred over using robots.txt.

            http://moz.com/learn/seo/robotstxt

            The robots.txt file blocks Google from crawling the page, but the URL can still show up in the index if other pages link to the page you are trying to remove.

            http://www.youtube.com/watch?v=KBdEwpRQRD0

            You have to allow Google to crawl the pages (by taking them out of the robots.txt) so it can read the noindex meta tags that then tell Google to take them out of the index.
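To make the order of operations concrete, here is a minimal Python sketch of the check described above. It only uses the standard library. The function names, the example.com URL, and the sample robots.txt and HTML strings are all made up for illustration, and Python's robots.txt parser is only an approximation of how Googlebot reads the file.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html):
    """True if the page HTML carries a meta robots noindex directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


def can_crawl(robots_txt, url, user_agent="Googlebot"):
    """True if the given robots.txt text lets user_agent fetch url."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules.can_fetch(user_agent, url)


def removal_status(robots_txt, url, html):
    """Hypothetical helper: is this page set up so noindex can take effect?"""
    if not can_crawl(robots_txt, url):
        return "blocked: crawler cannot read the page, so noindex is never seen"
    if has_noindex(html):
        return "ok: page is crawlable and carries noindex"
    return "crawlable but missing noindex"


# Illustrative inputs (assumptions, not taken from the original poster's site).
BLOCKING = "User-agent: *\nDisallow: /private/\n"
OPEN = "User-agent: *\nDisallow:\n"
PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
URL = "https://www.example.com/private/page.html"

print(removal_status(BLOCKING, URL, PAGE))
print(removal_status(OPEN, URL, PAGE))
```

With the blocking robots.txt the helper reports the page as blocked even though the noindex tag is present, which is the situation in the original question. Only the second call, where the Disallow rule has been lifted, reports the setup as ok.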

            • bjs2010

              I think you will find that the URLs in Google's index were either:

              1. Indexed before the robots.txt disallow was put in place. Check the Google SERP and click "Cached" to see the date.
              2. Heavily linked to by other external domains.
              3. Both of the above.

              @cleverphd has a great solution. Follow that.

