The Moz Q&A Forum



    Is the robots meta tag more reliable than robots.txt at preventing indexing by Google?

    Intermediate & Advanced SEO
    • McTaggart

      What's your experience of using the robots meta tag vs. robots.txt as a standalone solution to prevent Google from indexing?

      I am pretty sure the robots meta tag is more reliable. Going on my own experience, I have never had any problems with robots meta tags, but plenty with robots.txt as a standalone solution.

      Thanks in advance, Luke

      • LoganRay

        Hi Luke,

        It's a pretty common misconception that robots.txt will prevent indexing. Its only purpose is actually to prevent crawling; anything disallowed in there is still up for indexing if it's linked to elsewhere. If you want something deindexed, your best bet is the robots meta tag, but make sure you allow crawling of those URLs so search engine bots get an opportunity to see the tag.
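        In practice that means the page carries a tag like the one below in its head, while the URL itself stays crawlable (a minimal illustrative sketch, not a specific page from this thread):

        ```html
        <!DOCTYPE html>
        <html>
          <head>
            <!-- Keeps the page out of search results. Crawlers must be able
                 to fetch this page to see the tag, so the URL must NOT be
                 disallowed in robots.txt. -->
            <meta name="robots" content="noindex">
            <title>Example page</title>
          </head>
          <body>…</body>
        </html>
        ```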

        • GWMSEO

          If you've recently added the "noindex" meta tag, get the page fetched in Google Search Console (formerly Google Webmaster Tools). Google can't act on the tag until it has seen it.

          • BlueprintMarketing

            Hi Luke,

            To exclude individual pages from search engine indices, the noindex meta tag is actually superior to robots.txt. The X-Robots-Tag HTTP header is more powerful still, but harder to implement.

            Block all web crawlers from all content:
            User-agent: *
            Disallow: /
            

            Using the robots.txt file, you can tell a spider where it cannot go on your site. You cannot tell a search engine which URLs it must not show in the search results. This means that disallowing a search engine from crawling a URL – called "blocking" it – does not mean that URL will not show up in the search results. If the search engine finds enough links to that URL, it will include it; it will just not know what's on that page.

            If you want to reliably block a page from showing up in the search results, you need to use a meta robots noindex tag. That means the search engine has to be able to crawl that page and find the noindex tag, so the page should not be blocked by robots.txt.

            In a nutshell, a robots.txt file tells search engines not to crawl a particular page, file or directory of your website.

            Using it helps both you and search engines such as Google: by not providing access to certain unimportant areas of your website, you save crawl budget and reduce load on your server.

            Please note that using the robots.txt file to hide your entire website from search engines is definitely not recommended.
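            The crawling-vs-indexing distinction is easy to demonstrate with Python's standard-library robots.txt parser: a blanket Disallow stops fetching for every bot, but the robots.txt protocol simply has no "noindex" concept (illustrative sketch):

            ```python
            from urllib import robotparser

            # Parse the "block everything" robots.txt from above, without
            # fetching it from a live site.
            rp = robotparser.RobotFileParser()
            rp.parse(["User-agent: *", "Disallow: /"])

            # Crawling is disallowed for every URL and every bot...
            print(rp.can_fetch("Googlebot", "https://example.com/secret.html"))  # False

            # ...but nothing here keeps the URL out of the results: if other
            # pages link to it, a search engine can still list the bare URL.
            ```
            
            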

            See the full screenshot: http://i.imgur.com/MM7hM4g.png

            
            <!DOCTYPE html>
            <html><head>
            <meta name="robots" content="noindex" />
            (…)
            </head>
            <body>(…)</body>
            </html>


            The robots meta tag in the above example instructs all search engines not to show the page in search results. The value of the name attribute (robots) specifies that the directive applies to all crawlers. To address a specific crawler, replace the robots value of the name attribute with the name of the crawler that you are addressing. Specific crawlers are also known as user-agents (a crawler uses its user-agent to request a page). Google's standard web crawler has the user-agent name Googlebot. To prevent only Googlebot from indexing your page, update the tag as follows:

            <meta name="googlebot" content="noindex" />

            This tag now instructs Google (but no other search engines) not to show this page in its web search results. Both the name and content attributes are case-insensitive.

            Search engines may have different crawlers for different properties or purposes. See the complete list of Google's crawlers. For example, to show a page in Google's web search results, but not in Google News, use the following meta tag:

            <meta name="googlebot-news" content="noindex" />

            If you need to specify multiple crawlers individually, it's okay to use multiple robots meta tags:

            <meta name="googlebot" content="noindex" />
            <meta name="googlebot-news" content="nosnippet" />

            If competing directives are encountered by our crawlers, we will use the most restrictive directive we find.

            This basically means that if you want to really hide something from the search engines, and thus from people using search, robots.txt won't suffice.

            Indexer directives

            Indexer directives are directives that are set on a per-page and/or per-element basis. Up until July 2007, there were two directives: the microformat rel="nofollow", which means that the link should not pass authority/PageRank, and the Meta Robots tag.

            With the Meta Robots tag, you can really prevent search engines from showing pages you want to keep out of the search results. The same result can be achieved with the X-Robots-Tag HTTP header. As described earlier, the X-Robots-Tag gives you more flexibility by also allowing you to control how specific file(types) are indexed.
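            As a concrete example of per-filetype control, a commonly cited Apache snippet (requires mod_headers; the .pdf pattern is illustrative, adjust to your own files) applies noindex to every PDF on a site – something a meta tag cannot do, since PDFs have no HTML head:

            ```apache
            <FilesMatch "\.pdf$">
              Header set X-Robots-Tag "noindex, nofollow"
            </FilesMatch>
            ```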

            Example uses of the X-Robots-Tag

            Using the X-Robots-Tag HTTP header

            The X-Robots-Tag can be used as an element of the HTTP header response for a given URL. Any directive that can be used in a robots meta tag can also be specified as an X-Robots-Tag. Here's an example of an HTTP response with an X-Robots-Tag instructing crawlers not to index a page:

            HTTP/1.1 200 OK
            Date: Tue, 25 May 2010 21:42:43 GMT
            (…)
            X-Robots-Tag: noindex
            (…)
            
            

            Multiple X-Robots-Tag headers can be combined within the HTTP response, or you can specify a comma-separated list of directives. Here's an example of an HTTP header response which has a noarchive X-Robots-Tag combined with an unavailable_after X-Robots-Tag.

            HTTP/1.1 200 OK
            Date: Tue, 25 May 2010 21:42:43 GMT
            (…)
            X-Robots-Tag: noarchive
            X-Robots-Tag: unavailable_after: 25 Jun 2010 15:00:00 PST
            (…)
            
            

            The X-Robots-Tag may optionally specify a user-agent before the directives. For instance, the following set of X-Robots-Tag HTTP headers can be used to conditionally allow showing of a page in search results for different search engines:

            HTTP/1.1 200 OK
            Date: Tue, 25 May 2010 21:42:43 GMT
            (…)
            X-Robots-Tag: googlebot: nofollow
            X-Robots-Tag: otherbot: noindex, nofollow
            (…)
            
            

            Directives specified without a user-agent are valid for all crawlers. Neither the directive names nor their values are case-sensitive.
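            To make the user-agent scoping concrete, here is a small Python sketch (not any official parser; the directive list is an assumption based on the values documented above) that splits one X-Robots-Tag header value into its optional user-agent scope and its directives:

            ```python
            # Directive names that can appear in an X-Robots-Tag value.
            DIRECTIVES = {"all", "none", "noindex", "nofollow", "noarchive",
                          "nosnippet", "noimageindex", "unavailable_after"}

            def parse_x_robots_tag(value):
                """Return (user_agent, [directives]) for one header value.

                A leading token that is not itself a directive name is treated
                as a user-agent scope; otherwise the value applies to all
                crawlers ("*"). Note unavailable_after also contains a colon,
                which is why we check against the directive names first.
                """
                head, sep, tail = value.partition(":")
                if sep and head.strip().lower() not in DIRECTIVES:
                    agent, rest = head.strip().lower(), tail
                else:
                    agent, rest = "*", value
                directives = [d.strip() for d in rest.split(",") if d.strip()]
                return agent, directives

            print(parse_x_robots_tag("googlebot: nofollow"))
            print(parse_x_robots_tag("noarchive"))
            ```
            
            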

            • https://moz.com/learn/seo/robotstxt
            • https://yoast.com/ultimate-guide-robots-txt/
            • https://moz.com/blog/the-wonderful-world-of-seo-metatags
            • https://yoast.com/x-robots-tag-play/
            • https://www.searchenginejournal.com/x-robots-tag-simple-alternate-robots-txt-meta-tag/67138/
            • https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag

            I hope this helps,

            Tom


            • BlueprintMarketing

              Test for what works for your site.

              Use the tools below:

              1. https://www.deepcrawl.com/ (will give you one free full crawl)
              2. https://www.screamingfrog.co.uk/seo-spider/ (free up to 500 URLs)
              3. http://urlprofiler.com/ (14-day free trial)
              • https://www.deepcrawl.com/blog/best-practice/noindex-disallow-nofollow/
              • https://www.screamingfrog.co.uk/seo-spider/user-guide/general/#robots-txt
              • https://www.deepcrawl.com/blog/best-practice/noindex-and-google/

              There's much more info here: https://www.deepcrawl.com/blog/tag/robots-txt/

              Thomas

                • Bobbi_Tschumper (replying to BlueprintMarketing)

                  Hi there,

                  Regarding the X-Robots-Tag: we have had a couple of sites that were disallowed in robots.txt still get their PDF, Doc, etc. files indexed. I understand the reasoning for this. I would like to remove the disallow in robots.txt and use the X-Robots-Tag to noindex all pages as well as PDF, Doc files, etc. This is for an nginx configuration. Does anyone know what the X-Robots-Tag directive would look like in this case?
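                  A sketch of what that could look like in nginx (untested; `add_header` is standard nginx syntax, but the file-extension pattern is illustrative and should be adjusted to your site). Remember to remove the robots.txt disallow first, or crawlers will never receive the header:

                  ```nginx
                  # Send a noindex header for document downloads; "always" makes
                  # nginx add the header on error responses too.
                  location ~* \.(pdf|docx?|xlsx?|pptx?)$ {
                      add_header X-Robots-Tag "noindex, nofollow" always;
                  }

                  # Or, to noindex every response the server sends, place this
                  # at the server (or http) level instead:
                  # add_header X-Robots-Tag "noindex" always;
                  ```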
