The Moz Q&A Forum

    Confused about robots.txt

    Technical SEO Issues
    • Netpace

      There is a lot of conflicting and unclear information about robots.txt out there. Even after visiting the official robots website, I can't work out the best way to use it. For example, this is my current robots.txt:

      User-agent: *
      Disallow: javascript.js
      Disallow: /images/
      Disallow: /embedconfig
      Disallow: /playerconfig
      Disallow: /spotlightmedia
      Disallow: /EventVideos
      Disallow: /playEpisode
      
      Allow: /
      
      Sitemap: http://www.example.tv/sitemapindex.xml
      Sitemap: http://www.example.tv/sitemapindex-videos.xml
      Sitemap: http://www.example.tv/news-sitemap.xml
      

      Is this correct and recommended? If so, why do I see a list of 200 or so links blocked by robots.txt when I check Google Webmaster Tools?

      Help someone, anyone! Can't seem to understand this robotic business!

      Regards,

      • Entrusteddev

        Hi,

        Allow: / isn't part of the original robots.txt standard, and you don't need it here: anything that isn't disallowed is allowed by default.

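        Other than that, one rule needs fixing: Disallow: javascript.js is missing its leading slash and should read Disallow: /javascript.js. The rest looks good. Perhaps the 200 or so blocked links were indexed before the robots.txt was last updated with the disallows?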
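
One thing worth checking in rules like the ones posted above is that every Allow or Disallow path starts with a slash, which `Disallow: javascript.js` does not. Below is a minimal Python sketch of such a check, not an official validator. The helper name `rules_missing_leading_slash` is made up for illustration, and the only rule it assumes is that a non-empty Allow or Disallow value should begin with `/`.

```python
ROBOTS_TXT = """\
User-agent: *
Disallow: javascript.js
Disallow: /images/
Disallow: /embedconfig
Allow: /
"""


def rules_missing_leading_slash(robots_txt):
    """Return (line_number, line) pairs for Allow/Disallow values not starting with '/'."""
    problems = []
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        # An empty value is legal (an empty Disallow blocks nothing), so skip it.
        if field in ("allow", "disallow") and value and not value.startswith("/"):
            problems.append((number, line))
    return problems


if __name__ == "__main__":
    for number, line in rules_missing_leading_slash(ROBOTS_TXT):
        print(f"line {number}: {line!r} should start its path with '/'")
```

Run against the sample rules, the only line it flags is `Disallow: javascript.js`.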

        Regards

        Aran

        • irvingw

          I would also recommend going to the Site configuration > Crawler access page in Google Webmaster Tools and testing a range of your site's URLs to make sure robots can access them. Test every unique URL format on your site, such as search results pages, product pages, and category pages. I use this tool whenever I change the robots.txt.
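
The same kind of test can be scripted offline. The sketch below uses Python's standard `urllib.robotparser` to check one sample URL per URL format against a set of rules. The sample paths and the helper name `check_urls` are hypothetical. Python's parser is not Googlebot's, so treat the output as a sanity check rather than a replacement for the Webmaster Tools tester.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /images/
Disallow: /embedconfig
Disallow: /playEpisode
"""

# Hypothetical sample URLs, one per URL format on the site.
SAMPLE_URLS = [
    "http://www.example.tv/",
    "http://www.example.tv/shows/some-show",
    "http://www.example.tv/images/logo.png",
    "http://www.example.tv/playEpisode?id=1",
]


def check_urls(rules, urls, user_agent="*"):
    """Map each URL to True if the rules let user_agent fetch it."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}


if __name__ == "__main__":
    for url, allowed in check_urls(RULES, SAMPLE_URLS).items():
        print("allowed" if allowed else "BLOCKED", url)
```

Keeping the sample list in version control makes it easy to rerun the check whenever the robots.txt changes.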

          • crvw

            Google may still index pages excluded by robots.txt if other pages link to them, either internally or externally.

            For best results, use meta noindex to tell search engines not to show the page in results, and meta nofollow to tell robots not to follow any links on the page. A crawler can only see these tags if it is allowed to fetch the page, so a page you want noindexed should not also be disallowed in robots.txt.

            Webmaster Tools Help: Using meta tags to block access to your site

            You can also address googlebot explicitly in the meta tag, as opposed to just robots. If you use both a robots.txt and meta robots tags and their directives conflict, Googlebot will follow the most restrictive one.
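
To see what these tags look like and to confirm a page actually serves them, here is a rough Python sketch using the standard `html.parser` module. The sample page, the class name `RobotsMetaParser`, and the helper `robots_meta_directives` are invented for illustration. The sketch only reads meta tags named `robots` or `googlebot` and does not model how any search engine interprets them.

```python
from html.parser import HTMLParser

# Invented sample page: one generic robots tag and one aimed at googlebot only.
SAMPLE_PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
<meta name="googlebot" content="noindex">
</head><body></body></html>"""


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attributes.get("content") or ""
            tokens = {t.strip().lower() for t in content.split(",") if t.strip()}
            self.directives.setdefault(name, set()).update(tokens)


def robots_meta_directives(html):
    """Return a dict mapping meta name to the set of directives found."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    for name, tokens in robots_meta_directives(SAMPLE_PAGE).items():
        print(name, sorted(tokens))
```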
