The Moz Q&A Forum

    Robots file set up

    • mcwork

      The robots file looks like it has been set up in a very messy way.
      I understand that # comments out a line. Does this mean the sitemap would
      not be picked up?
      Disallow: /js/ - should this be allowed instead, with something like /*.js$?
      Disallow: /media/wysiwyg/ - this seems to be causing alerts in Webmaster Tools, as it cannot access
      the images within.
      Can anyone help me clean this up, please?


      #Sitemap: https://examplesite.com/sitemap.xml

      # Crawlers Setup

      User-agent: *
      Crawl-delay: 10

      # Allowable Index
      # Mind that Allow is not an official standard

      Allow: /index.php/blog/
      Allow: /catalog/seo_sitemap/category/

      Allow: /catalogsearch/result/

      Allow: /media/catalog/

      # Directories

      Disallow: /404/
      Disallow: /app/
      Disallow: /cgi-bin/
      Disallow: /downloader/
      Disallow: /errors/
      Disallow: /includes/
      Disallow: /js/
      Disallow: /lib/
      Disallow: /magento/

      Disallow: /media/

      Disallow: /media/captcha/

      Disallow: /media/catalog/

      #Disallow: /media/css/
      #Disallow: /media/css_secure/
      Disallow: /media/customer/
      Disallow: /media/dhl/
      Disallow: /media/downloadable/
      Disallow: /media/import/
      #Disallow: /media/js/
      Disallow: /media/pdf/
      Disallow: /media/sales/
      Disallow: /media/tmp/
      Disallow: /media/wysiwyg/
      Disallow: /media/xmlconnect/
      Disallow: /pkginfo/
      Disallow: /report/
      Disallow: /scripts/
      Disallow: /shell/
      #Disallow: /skin/
      Disallow: /stats/
      Disallow: /var/

      # Paths (clean URLs)

      Disallow: /index.php/
      Disallow: /catalog/product_compare/
      Disallow: /catalog/category/view/
      Disallow: /catalog/product/view/
      Disallow: /catalog/product/gallery/
      Disallow: */catalog/product/upload/
      Disallow: /catalogsearch/
      Disallow: /checkout/
      Disallow: /control/
      Disallow: /contacts/
      Disallow: /customer/
      Disallow: /customize/
      Disallow: /newsletter/
      Disallow: /poll/
      Disallow: /review/
      Disallow: /sendfriend/
      Disallow: /tag/
      Disallow: /wishlist/

      # Files

      Disallow: /cron.php
      Disallow: /cron.sh
      Disallow: /error_log
      Disallow: /install.php
      Disallow: /LICENSE.html
      Disallow: /LICENSE.txt
      Disallow: /LICENSE_AFL.txt
      Disallow: /STATUS.txt
      Disallow: /get.php # Magento 1.5+

      # Paths (no clean URLs)

      #Disallow: /*.js$
      #Disallow: /*.css$
      Disallow: /*.php$
      Disallow: /*?SID=
      Disallow: /rss*
      Disallow: /*PHPSESSID

      Disallow: /:
      Disallow: /:*

      User-agent: Fatbot
      Disallow: /

      User-agent: TwengaBot-2.0
      Disallow: /

      • rjonesx. 0

        Looks like your intuitions are pretty good! I would remove the # before the Sitemap line, as you have indicated. I would also remove the Disallow: /js/ line, as Google needs access to JavaScript these days and will throw a fit if it can't get it. I wouldn't worry about the wysiwyg directory if it only has images that you don't care about ranking.
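
If you want to sanity-check those two changes before deploying, here is a minimal sketch using Python's standard-library `urllib.robotparser`. It assumes Python 3.8 or later (for `site_maps()`), uses a trimmed copy of the file from the question, and the script URL `js/app.js` is a made-up example. Note that this parser only does plain prefix matching and does not understand wildcard rules such as `/*.js$`, so treat it as a rough check rather than a model of how Google evaluates the file.

```python
from urllib.robotparser import RobotFileParser


def load(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Trimmed copy of the file from the question (sitemap commented out).
current = """\
#Sitemap: https://examplesite.com/sitemap.xml
User-agent: *
Disallow: /js/
Disallow: /media/wysiwyg/
"""

# Same rules with the suggested changes: sitemap uncommented, /js/ rule removed.
revised = """\
Sitemap: https://examplesite.com/sitemap.xml
User-agent: *
Disallow: /media/wysiwyg/
"""

# Hypothetical script URL, used only for illustration.
js_url = "https://examplesite.com/js/app.js"

for name, text in (("current", current), ("revised", revised)):
    parser = load(text)
    print(name, "sitemaps:", parser.site_maps())
    print(name, "can fetch JS:", parser.can_fetch("Googlebot", js_url))
```

With the commented-out line the parser should report no sitemap at all, and the script URL should be blocked; with the revised file both should come back as expected.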

        • ecommercebc

          To add to this, I'd also recommend having a look around in /lib/ just to make sure you aren't blocking important JavaScript and CSS files (I've been bitten by this!).

          More guidance here: https://developers.google.com/webmasters/mobile-sites/mobile-seo/common-mistakes/blocked-resources?hl=en
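
For the /lib/ check, a rough sketch of how wildcard rules like `/*.js$` interact with a directory rule may help. It follows the matching behaviour documented by Google and RFC 9309 as I understand it: `*` matches any run of characters, a trailing `$` anchors the end of the URL, the longest matching pattern wins, and Allow wins a tie. The helper names, the asset paths, and the two Allow rules are assumptions added for illustration; they are not in the original file, and this is not a full robots.txt parser.

```python
import re


def to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters; everything else is taken literally.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if not pattern or not to_regex(pattern).match(path):
            continue
        is_allow = directive.lower() == "allow"
        if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
            best_len, allowed = len(pattern), is_allow
    return allowed


rules = [
    ("Disallow", "/lib/"),
    ("Disallow", "/media/wysiwyg/"),
    ("Allow", "/*.js$"),   # assumed addition, not in the original file
    ("Allow", "/*.css$"),  # assumed addition, not in the original file
]

# Hypothetical asset paths to audit.
assets = [
    "/lib/jquery/jquery.js",
    "/lib/styles/print.css",
    "/lib/app.js?v=2",
    "/lib/README.txt",
    "/media/wysiwyg/banner.jpg",
]

for path in assets:
    print(path, "allowed" if is_allowed(path, rules) else "blocked")
```

One thing this makes visible: because of the `$` anchor, a script requested with a query string (such as `/lib/app.js?v=2`) no longer matches `/*.js$` and stays blocked by `Disallow: /lib/`, so simply removing the directory rule can be the safer option.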
