The Moz Q&A Forum

    Can I rely on just robots.txt

    Technical SEO Issues
    • spiralsites

      We have a test version of a client's website on a separate server before it goes onto the live server.

      Something in the test site's code has somehow managed to get Google to index the test site, which isn't great!

      Would simply adding a robots.txt file to the root of the test site that blocks everything be good enough, or will I also have to put the noindex and nofollow meta tags on all pages of the test site?

      • irvingw

        You cannot rely on robots.txt alone. You also need to add the meta noindex tag to the pages to ensure that they will not get indexed.

        • spiralsites

          Cheers, I thought as much.

          • ThompsonPaul

            You're actually up against a bit of a sticky wicket here, SS. You do need the noindex, nofollow meta tags on each page, as Irving mentions.

            HOWEVER! If you also add a robots.txt directive blocking crawlers from the site, the search crawlers will not crawl your pages and therefore will never see the noindex meta tag that tells them to remove the incorrectly indexed pages from their index.
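
To illustrate the point, here is a minimal Python sketch using the standard library's `urllib.robotparser`. It assumes a blanket `Disallow: /` rule and a hypothetical dev URL (`https://dev.example.com/about.html`). Real crawlers apply their own parsing rules, so treat this as an approximation of how a well-behaved bot decides whether it may fetch a page at all.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that blocks every crawler from the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def crawler_may_fetch(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical dev-site URL, used only for illustration.
DEV_URL = "https://dev.example.com/about.html"

# With the block-all file in place, the page may not be fetched.
print(crawler_may_fetch(BLOCK_ALL, "Googlebot", DEV_URL))

# With an empty robots.txt, nothing is disallowed.
print(crawler_may_fetch("", "Googlebot", DEV_URL))
```

With the block-all file, a compliant crawler is told not to fetch the page, so any noindex meta tag inside that page goes unread.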

            My recommendation is for a belt & suspenders approach.

            • Implement the meta noindex, nofollow tags throughout the dev site, but do NOT implement the robots.txt exclusion yet. Wait a day or two until the pages get recrawled and the bots discover the noindex meta tags.
            • Use the Remove URL tools in both Google and Bing Webmaster Tools to request removal of all the dev pages you are aware have been indexed.
            • Then add the exclusion directive to the robots.txt file to keep the crawlers out from then on (leaving the noindex, nofollow tags in place).
            • Check back in the SERPs periodically to make sure no other dev pages have been indexed. If they have, do another manual removal request.
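
One way to confirm the first step before moving on is to check that each dev page really carries the tag. The following is a rough sketch using Python's built-in `html.parser`. The sample markup and the helper name `robots_directives` are made up for illustration, and it only looks at `<meta name="robots">` and `<meta name="googlebot">` tags, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from robots-related meta tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when an attribute has no value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical page markup for illustration.
sample = (
    "<html><head>"
    '<meta name="robots" content="noindex, nofollow">'
    "</head><body>Dev page</body></html>"
)
found = robots_directives(sample)
print("noindex" in found, "nofollow" in found)
```

Running a check like this across the dev pages before adding the robots.txt exclusion helps catch templates that were missed.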

            Does that make sense?

            Paul

            P.S. As a last measure, run an inbound link check on the dev pages that got indexed to find out which external pages are linking to them. Get those inbound links removed ASAP so the search engines aren't getting any signals to index the dev site. A final option would be to simply password-protect the directory the dev site is in. A little less convenient, but guaranteed to keep the crawlers out.
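
For the password-protection option, the details depend on the web server in use. As an illustration only, here is a sketch of HTTP Basic authentication with Python's standard `http.server`. The credentials, the realm name, and the `ProtectedHandler` class are hypothetical, and a real dev site would more likely rely on the web server's own access controls over HTTPS.

```python
import base64
import hmac
from http.server import BaseHTTPRequestHandler

# Hypothetical credentials, for illustration only.
DEV_USER = "dev"
DEV_PASSWORD = "change-me"


def is_authorized(auth_header):
    """Return True if the header carries the expected Basic credentials."""
    if not auth_header or not auth_header.startswith("Basic "):
        return False
    try:
        raw = base64.b64decode(auth_header[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    expected = f"{DEV_USER}:{DEV_PASSWORD}"
    # Constant-time comparison of the supplied and expected credentials.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))


class ProtectedHandler(BaseHTTPRequestHandler):
    """Answers 401 unless the request carries valid credentials."""

    def do_GET(self):
        if not is_authorized(self.headers.get("Authorization")):
            self.send_response(401)
            self.send_header("WWW-Authenticate", 'Basic realm="dev site"')
            self.end_headers()
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.end_headers()
        self.wfile.write(b"dev site content\n")
```

Any request without valid credentials gets a 401 response, so crawlers never see the page content.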

            • spiralsites

              That's a great help, cheers.

              Where's the best place to do an inbound link check?

              • ThompsonPaul

                You can do the inbound link check right here using SEOMoz's Open Site Explorer tool to check for links to the dev site, whether it's in a subdomain, subfolder or a separate site.

                Good luck!

                Paul
