The Moz Q&A Forum


    Robots.txt and Magento

Technical SEO Issues
• EcomLkwd

Hi,

I am working on getting my robots.txt up and running, and I'm having lots of problems with the robots.txt my developers generated: www.plasticplace.com/robots.txt

I ran the robots.txt through a syntax-checking tool (http://www.sxw.org.uk/computing/robots/check.html). This is what the tool came back with: http://www.dcs.ed.ac.uk/cgi/sxw/parserobots.pl?site=plasticplace.com There seem to be many errors in the file.

Additionally, I looked at our robots.txt in Webmaster Tools (WMT), which said the crawl was postponed because the robots.txt is inaccessible. What does that mean?

      A few questions:

1. Is there a need for all the lines of code that have the "#" before them? I don't think they're necessary, but correct me if I'm wrong.

2. Furthermore, why are we blocking so many things on our website? The robots can't get past anything that requires a password to access anyhow, but again, correct me if I'm wrong.

3. Why can't it just look like this:

      User-agent: *

      Disallow: /onepagecheckout/

      Disallow: /checkout/cart/

      I do understand that Magento has certain folders that you don't want crawled, but is this necessary and why are there so many errors?
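On question 1: lines starting with "#" are comments and are ignored by crawlers, so they are optional. One quick way to sanity-check the short file proposed above is Python's built-in robots.txt parser; the test URLs below are only illustrative:

```python
from urllib import robotparser

# The short robots.txt proposed above; "#" lines are comments and are
# ignored by crawlers, so removing them changes nothing functionally.
rules = """\
# Comment lines like this one have no effect on crawling
User-agent: *
Disallow: /onepagecheckout/
Disallow: /checkout/cart/
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Checkout pages are blocked; everything else stays crawlable.
print(rp.can_fetch("*", "https://www.plasticplace.com/onepagecheckout/"))   # False
print(rp.can_fetch("*", "https://www.plasticplace.com/some-category.html")) # True
```

Note this only confirms the syntax behaves as intended; it says nothing about whether those are the right paths to block.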

• James77

I assume this is a robots.txt that was automatically created by Magento, or has it been created by a developer?

I ran it through a tool and it showed 1 error and 10 warnings, so I would say you definitely need to do something about it.

The reason for all those disallows is to try to stop search engines indexing those pages (whether they would even find them to index if the disallows were not there is debatable).

What you could do is set up robots.txt as you have suggested and then stop the search engines indexing the directories or pages you don't want via the appropriate webmaster tools.

I don't like listing a lot of "don't index" paths in robots.txt, as it is pretty much telling any hacker or nasty spider where your weak points may be.
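The usual alternative to advertising sensitive paths in robots.txt is to let bots fetch those pages but serve them a noindex robots meta tag or an X-Robots-Tag response header, which only a bot that actually requests the page ever sees. A minimal sketch of the server-side decision, with hypothetical path prefixes:

```python
# Hypothetical prefixes for pages to keep out of the index; nothing here
# needs to be advertised in a public robots.txt file.
NOINDEX_PREFIXES = ("/onepagecheckout/", "/checkout/cart/")

def robots_header(path):
    """Return an X-Robots-Tag value for a path, or None if it may be indexed."""
    if path.startswith(NOINDEX_PREFIXES):
        # Crawlers may fetch the page but are told not to index or follow it.
        return "noindex, nofollow"
    return None

print(robots_header("/checkout/cart/"))   # noindex, nofollow
print(robots_header("/trash-bags.html"))  # None
```

The trade-off: the pages still get crawled (and consume crawl budget), but the directory names never appear in a public file.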

• Felip3

"3. Why can't it just look like this:"

Yes, it would generate a lot of duplicate-content issues. For example, your robots.txt has the following line:

Disallow: /catalog/category/view/ -> That's the "real" category URL; you can access any category in Magento by /catalog/category/view/id or by the "pretty" URL.
Because you disallow the "real" URL, only the pretty URL will be visible to search engines.

The same rule applies to many other parts of the robots.txt.
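Felip3's point can be demonstrated with Python's stdlib parser: with the /catalog/category/view/ disallow in place, the "real" URL is blocked while the "pretty" URL stays crawlable (the example domain and category ID are made up):

```python
from urllib import robotparser

# Magento exposes the same category at two URLs (illustrative examples):
real_url   = "https://www.example.com/catalog/category/view/id/12"
pretty_url = "https://www.example.com/trash-bags.html"

rules = """\
User-agent: *
Disallow: /catalog/category/view/
"""
rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Only the "pretty" URL stays crawlable, so only one copy can be indexed.
print(rp.can_fetch("*", real_url))    # False
print(rp.can_fetch("*", pretty_url))  # True
```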
          
• EcomLkwd @James77

            My developer said they custom configured it to block the files they needed according to Magento.

Do you think I can simply make it look like this:

            User-agent: *

            Disallow: /onepagecheckout/

            Disallow: /checkout/cart/

            and then disable it in WMT?

• EcomLkwd @Felip3

I am a bit confused. Are you saying that, technically, my Magento site has two different URLs that can both be indexed: one with a (messy) URL and another with a vanity URL? This would create major duplicate-content issues! The robots.txt would not solve such a complex issue.

              Am I missing something?

• Felip3 @EcomLkwd

Yes, your short robots.txt idea would create a huge problem.

In your Magento admin, if you click the menu Catalog > URL Rewrite Management, you will see the Magento feature that creates all the "pretty" URLs; on that page there is a table. If you take a value from the Target Path column and paste it after your site domain, for example domain.com/value_in_target_path...

You'll see that the page loads fine. You don't want Google to rank those pages under the "messy" URL, and that's why you need all that stuff in your robots.txt.
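A related safeguard here is the rel=canonical link tag (Magento also ships a canonical link meta tag option in its catalog SEO settings, if your version supports it): when the "messy" Target Path URL declares the "pretty" URL as canonical, search engines consolidate the duplicates even if robots.txt misses a pattern. A small stdlib sketch for checking which canonical a page declares; the HTML here is a made-up example, not fetched from the site:

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag seen."""
    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical" and self.canonical is None:
            self.canonical = a.get("href")

# Made-up HTML standing in for a page served at a "messy" catalog URL:
page = ('<html><head>'
        '<link rel="canonical" href="https://www.example.com/trash-bags.html">'
        '</head><body>category page</body></html>')

finder = CanonicalFinder()
finder.feed(page)
print(finder.canonical)  # https://www.example.com/trash-bags.html
```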
