The Moz Q&A Forum

    Robots.txt and canonical tag

    • seoug_2005

      The SEOmoz post at http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts says:

      If you have a robots.txt disallow in place for a page, the canonical tag will never be seen.

      Does this mean that if a page is disallowed by robots.txt, spiders do NOT read its HTML code at all?

      • Daylan

        That's correct in most cases.

        It works like this: a robot wants to visit a website URL, say http://www.example.com/welcome.html. Before it does so, it first checks http://www.example.com/robots.txt and finds:

        User-agent: *
        Disallow: /

        The "User-agent: *" means this section applies to all robots. The "Disallow: /" tells the robot that it should not visit any pages on the site.
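The check described above can be sketched with Python's standard-library `urllib.robotparser`. This is only an illustration of how a well-behaved robot might evaluate the rules, not how any particular search engine implements it; the helper name `is_allowed` and the user agent `ExampleBot` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content from the example above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url.

    Hypothetical helper: the rules are parsed from a string here, whereas a
    real robot would download /robots.txt from the site first.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.example.com/welcome.html"
    # "Disallow: /" under "User-agent: *" blocks every path for every robot.
    print(is_allowed(ROBOTS_TXT, "ExampleBot", url))
```

A robot that respects the result stops here and never requests welcome.html, so nothing inside that page's HTML is read.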

        Robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities and email address harvesters used by spammers will pay no attention to it.

        More information is available here:

        http://www.robotstxt.org/

        • seoug_2005

          Thanks, Daylan, for your quick response. I just wanted a second opinion confirming that the canonical tag will never be seen if a page is disallowed.

          • RyanKent

            Daylan offered a great answer, but I would like to add one exception. When crawlers from the major search engines visit your site, they will honor your robots.txt file. Sometimes, however, they follow links from other sites to an article on your site, and during that particular visit they will not see the robots.txt file and will index your page.

            This is one of the reasons why your robots.txt file should be used as sparingly as possible, and why, when it is used, you should have a backup in place, such as the canonical or noindex tag on the page.
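To make the relationship between robots.txt and on-page tags concrete, here is a minimal sketch, assuming a crawler that obeys robots.txt. It reads the canonical link and the robots meta tag only when fetching is allowed. The names `HeadTagParser`, `read_directives`, `PAGE_HTML`, and `ExampleBot` are hypothetical, and the HTML is passed in as a string rather than downloaded.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical page carrying both on-page directives.
PAGE_HTML = (
    '<html><head>'
    '<link rel="canonical" href="http://www.example.com/article">'
    '<meta name="robots" content="noindex, follow">'
    '</head><body>Article text</body></html>'
)


class HeadTagParser(HTMLParser):
    """Collects the rel=canonical URL and any noindex robots meta directive."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            self.noindex = "noindex" in (attr.get("content") or "").lower()


def read_directives(robots_txt, user_agent, url, html):
    """Return (canonical, noindex), or None when robots.txt forbids the fetch."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    if not rules.can_fetch(user_agent, url):
        return None  # the page is never requested, so its tags are never read
    parser = HeadTagParser()
    parser.feed(html)
    return parser.canonical, parser.noindex


if __name__ == "__main__":
    url = "http://www.example.com/article?ref=feed"
    print(read_directives("User-agent: *\nDisallow: /\n", "ExampleBot", url, PAGE_HTML))
    print(read_directives("User-agent: *\nDisallow: /private/\n", "ExampleBot", url, PAGE_HTML))
```

In this sketch the tags only take effect on a visit where the page itself is actually fetched; on a visit where robots.txt is honored, the function returns before any HTML is parsed.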

            • seoug_2005

              If spiders follow links to an article on my site, will they read the contents then? If the canonical tag is on the article page itself, will the canonical tag be seen?

              • RyanKent

                What we know is that there have been many cases in which a page blocked in robots.txt has appeared in search results. The explanation provided is that robots.txt blocks crawlers during normal site visits, but not necessarily on visits where they are following links from other sites.

                • seoug_2005

                  Thanks, Ryan, for explaining things very clearly.

                  © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.