The Moz Q&A Forum

    Help with robots.txt on Magento

    Behavior & Demographics
    • OptimizedGroup

      Hi everybody,

      I need your help fixing some HTML and crawl errors that Magento generates on my client's website, www.casabiancheria.it

      I'm getting duplicate meta information because there are a lot of URLs such as

      • /stampe-romagnole/tovaglie-con-tovaglioli/colore/beige,marrone,giallo,lilla/show/all.html

      • /stampe-romagnole/tovaglie-con-tovaglioli/colore/beige,marrone,lilla/show/all.html

      These are generated by the /colore/ filter, so they carry duplicate content and duplicate meta information.

      I enabled canonical tags in Magento, but this hasn't fixed the problem yet.

      The sitemap contains only one link per product, so the canonicals seem to be working, but both Google Webmaster Tools and SEOmoz are reporting duplicate content and duplicate meta information errors.

      I would like to solve these problems by using robots.txt to exclude every URL that contains a filter parameter, such as /colore/, /price/, /dimensions/, etc. (see the attachment).

      I tried several ways of excluding these links in robots.txt, but none of them worked.

      My current robots.txt is below. Can someone help me write this file correctly so that it finally excludes all the URLs generated by Magento's filters?

      Finally, is it worth excluding Magento's images as well? (See the final lines of the robots.txt below.)

      Thank you very much for your help!

      Alberto

      User-agent: *
      Disallow: /CVS
      Disallow: /*.svn$
      Disallow: /*.idea$
      Disallow: /*.sql$
      Disallow: /*.tgz$
      Disallow: /w1nL1f3L0g1c/
      Disallow: /app/
      Disallow: /downloader/
      Disallow: /errors/
      Disallow: /includes/
      Disallow: /lib/
      Disallow: /pkginfo/
      Disallow: /shell/
      Disallow: /var/
      Disallow: /404/
      Disallow: /cgi-bin/
      Disallow: /magento/
      Disallow: /report/
      Disallow: /scripts/
      Disallow: /shell/
      Disallow: /skin/
      Disallow: /stats/
      Disallow: /api.php
      Disallow: /cron.php
      Disallow: /cron.sh
      Disallow: /error_log
      Disallow: /get.php
      Disallow: /install.php
      Disallow: /LICENSE.html
      Disallow: /LICENSE.txt
      Disallow: /LICENSE_AFL.txt
      Disallow: /README.txt
      Disallow: /RELEASE_NOTES.txt
      Disallow: /*?dir*
      Disallow: /*?dir=desc
      Disallow: /*?dir=asc
      Disallow: /*?limit=all
      Disallow: /*?mode*
      Disallow: /index.php/
      Disallow: /*?SID=
      Disallow: /checkout/
      Disallow: /onestepcheckout/
      Disallow: /customer/
      Disallow: /customer/account/
      Disallow: /customer/account/login/
      Disallow: /catalogsearch/
      Disallow: /catalog/product_compare/
      Disallow: /catalog/category/view/
      Disallow: /catalog/product/view/
      Disallow: /cgi-bin/
      Disallow: /cleanup.php
      Disallow: /apc.php
      Disallow: /memcache.php
      Disallow: /phpinfo.php
      Disallow: /control/
      Disallow: /customize/
      Disallow: /newsletter/
      Disallow: /poll/
      Disallow: /review/
      Disallow: /sendfriend/
      Disallow: /tag/
      Disallow: /wishlist/
      Disallow: /catalog/product/gallery/
      Disallow: /*?*
      Disallow: /*/colore/
      Disallow: /*/price/
      Disallow: /*/misura/
      Disallow: /*/marca/
      Disallow: /*/sort-by/
      Disallow: /*/combinazione/
      Disallow: /*/seleziona-colore/
      Disallow: /colore/
      Disallow: /price/
      Disallow: /misura/
      Disallow: /marca/
      Disallow: /sort-by/
      Disallow: /combinazione/
      Disallow: /seleziona-colore/
      Disallow: /*colore/
      Disallow: /*price/
      Disallow: /*misura/
      Disallow: /*marca/
      Disallow: /*sort-by/
      Disallow: /*combinazione/
      Disallow: /*seleziona-colore/


      • LynnPatchett

        Hi,

        If the duplicate-content URLs are already in Google's index, excluding them in robots.txt will not remove them; it will only stop Googlebot from crawling them again. You could add a bit of conditional logic to your head.phtml template file that checks for the relevant URL part and outputs a noindex,follow meta tag on the pages you don't want indexed. This is a more reliable way to make sure they are removed and stay out of the index in the future (be sure to test first!).
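
The check described above could look roughly like the sketch below. It is written in Python only to illustrate the logic; in Magento the same test would live in PHP inside head.phtml. The names `FILTER_SEGMENTS`, `is_filtered_url` and `robots_meta_for` are hypothetical, and the segment list is simply copied from the robots.txt in the question, so adjust it to the filters actually in use. Keep in mind that Googlebot has to be able to crawl a page to see the tag, so the same URLs should not be blocked in robots.txt while you wait for them to drop out of the index.

```python
from urllib.parse import urlsplit

# Assumption: filter names taken from the Disallow lines in the question.
FILTER_SEGMENTS = {
    "colore", "price", "misura", "marca",
    "sort-by", "combinazione", "seleziona-colore",
}


def is_filtered_url(url: str) -> bool:
    """Return True if any path segment of the URL is exactly a filter name."""
    path = urlsplit(url).path
    segments = [segment for segment in path.split("/") if segment]
    return any(segment in FILTER_SEGMENTS for segment in segments)


def robots_meta_for(url: str) -> str:
    """Build the robots meta tag: noindex,follow on filtered pages only."""
    content = "noindex,follow" if is_filtered_url(url) else "index,follow"
    return f'<meta name="robots" content="{content}">'


if __name__ == "__main__":
    for example in (
        "/stampe-romagnole/tovaglie-con-tovaglioli/colore/beige,marrone,lilla/show/all.html",
        "/stampe-romagnole/tovaglie-con-tovaglioli.html",
    ):
        print(example, "->", robots_meta_for(example))
```

Matching whole path segments rather than substrings is a deliberate choice here: a category or product whose name merely contains "colore" or "price" keeps its index,follow tag.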
