The Moz Q&A Forum

    Robots.txt & Disallow: /*? Question!

    Intermediate & Advanced SEO
    • vetofunk
      vetofunk last edited by

      Hi,

      I have a site where they have:

      Disallow: /*?

      Problem is we need the following indexed:

      ?utm_source=google_shopping

      What would the best solution be? I have read:

      User-agent: *
      Allow: ?utm_source=google_shopping
      Disallow: /*?

      Any ideas?

      • effectdigital
        effectdigital last edited by

        With this kind of thing, it's really better to pick the specific parameters (or parameter combinations) that you'd like to exclude, e.g.:

        User-agent: *
        Disallow: /shop/product/&size=*
        Disallow: */shop/product/*?size=*
        Disallow: /stockists?product=*

        ^ I took the above from a robots.txt file I've been working on, since these particular pages don't yet have 'pretty' URLs with unique content on them. That will change very soon, and the blocks will then be lifted.
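To make the wildcard patterns above a little more concrete, here is a minimal Python sketch of how such a pattern can be read. It assumes Google-style matching (`*` matches any run of characters, a trailing `$` anchors the end of the URL, everything else is literal, and a rule matches from the start of the path). The function names and the sample paths are made up for illustration; this is not how any particular crawler is implemented.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: Google-style wildcards, where '*' matches any run of
    characters and a trailing '$' anchors the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def matches(pattern: str, path: str) -> bool:
    """Return True if the path (including query string) matches the pattern."""
    return robots_pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    # Hypothetical paths, checked against the patterns quoted above.
    print(matches("/stockists?product=*", "/stockists?product=12"))
    print(matches("*/shop/product/*?size=*", "/shop/product/shoe?size=9"))
    print(matches("/shop/product/&size=*", "/shop/product/shoe?size=9"))
```

Note that a pattern only blocks the URLs it actually matches, so it is worth running each real URL shape through a check like this before relying on a rule.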

        If you are really 100% sure that there's only one param which you want to let through, then you'd go with:

        User-agent: *
        Disallow: /*?
        Allow: /*?utm_source=google_shopping
        Allow: /*&utm_source=google_shopping*
        

        (or something pretty similar to that!)

        Before you set anything live, put together a list of URLs that represent the blocks (and allows) you want to achieve, then test them all with the robots.txt Tester (in Search Console).
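As a rough local pre-check for that kind of URL list, the sketch below evaluates a set of Allow/Disallow rules against sample paths. It assumes Google's documented precedence (the longest matching pattern wins, and Allow wins a tie) and Google-style `*` and `$` wildcards. The rule set is the "block every query string except `utm_source=google_shopping`" idea from this thread written with explicit wildcards, and the sample paths are hypothetical. It is only an approximation, so the Search Console tester remains the thing to trust.

```python
import re
from typing import List, Tuple

Rule = Tuple[str, str]  # (directive, pattern), e.g. ("disallow", "/*?")


def _to_regex(pattern: str) -> "re.Pattern[str]":
    """Compile a robots.txt pattern, assuming Google-style '*' and '$'."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(rules: List[Rule], path: str) -> bool:
    """Return True if the path may be crawled under the given rules.

    Assumption: the longest matching pattern wins; on equal length,
    Allow beats Disallow; if no rule matches, the path is allowed.
    """
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if _to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


# Illustrative rule set for "User-agent: *".
RULES: List[Rule] = [
    ("disallow", "/*?"),
    ("allow", "/*?utm_source=google_shopping"),
    ("allow", "/*&utm_source=google_shopping"),
]

if __name__ == "__main__":
    # Hypothetical paths representing the blocks and allows we want.
    for path in [
        "/product/blue-shoe",
        "/product/blue-shoe?size=9",
        "/product/blue-shoe?utm_source=google_shopping",
        "/product/blue-shoe?size=9&utm_source=google_shopping",
    ]:
        print(path, "->", "allowed" if is_allowed(RULES, path) else "blocked")
```

The order of the rules does not matter under this precedence model, which is why the Allow lines can sit after the Disallow line.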

        • NickSamuel
          NickSamuel last edited by

          Hi Jeff,

          The robots.txt Tester mentioned above is definitely worth playing with and is the easiest route to achieving what you want.

          Another, more reactive, way of managing this is to look at the range of parameters Google has already crawled, which you can see in Search Console.

          For now this lives in the old Search Console, so log in and go to Crawl --> URL Parameters.

          If Googlebot has encountered any query parameters, it will list them. You'll then have the option to manage them or exclude them from the index.

          It can be a decent way of cleaning up a site with lots of indexed pages (1,000+), although please be sure to read this documentation before using it: https://support.google.com/webmasters/answer/6080548?hl=en
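If you already have a list of crawled URLs to hand (for example from a crawl export or server logs), a similar inventory of parameters can be approximated locally. The following is a minimal sketch using only the Python standard library; the function name and the example.com URLs are hypothetical, and it is not a substitute for what Search Console reports.

```python
from collections import Counter
from typing import Iterable
from urllib.parse import parse_qsl, urlsplit


def count_query_params(urls: Iterable[str]) -> Counter:
    """Count how often each query parameter name appears across the URLs."""
    counts: Counter = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so parameters like "?ref=" are still counted.
        for name, _value in parse_qsl(query, keep_blank_values=True):
            counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample of crawled URLs.
    sample = [
        "https://example.com/shoe?size=9&utm_source=google_shopping",
        "https://example.com/hat?utm_source=google_shopping",
        "https://example.com/about",
    ]
    for name, count in count_query_params(sample).most_common():
        print(name, count)
```

A list like this makes it easier to decide which parameters need a Disallow rule and which, like `utm_source`, need to stay crawlable.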

          • SAjad687
            SAjad687 last edited by

            User-agent: *
            Disallow: /cgi-bin/
            Disallow: /wp-admin/
            Disallow: /archives/
            Disallow: /*?*
            Allow: /comments/feed/
            Disallow: /refer/
            Disallow: /index.php
            Disallow: /wp-content/plugins/
            Allow: /wp-admin/admin-ajax.php
            
            User-agent: Mediapartners-Google*
            Allow: /
            
            User-agent: Googlebot-Image
            Allow: /wp-content/uploads/
            
            User-agent: Adsbot-Google
            Allow: /
            
            User-agent: Googlebot-Mobile
            Allow: /
            
            Sitemap: https://site.com/sitemap_index.xml
            
            Use this; it will help you.
            
            Regards
            [Saad](https://clicktestworld.com/)
            
            • Hoslaa
              Hoslaa @SAjad687 last edited by

              User-agent: *
              Disallow: /cgi-bin/
              Disallow: /wp-admin/
              Disallow: /archives/
              Disallow: /*?*
              Allow: /comments/feed/
              Disallow: /refer/
              Disallow: /index.php
              Disallow: /wp-content/plugins/
              Allow: /wp-admin/admin-ajax.php

              User-agent: Mediapartners-Google*
              Allow: /

              User-agent: Googlebot-Image
              Allow: /wp-content/uploads/

              User-agent: Adsbot-Google
              Allow: /

              User-agent: Googlebot-Mobile
              Allow: /

              Sitemap: https://site.com/sitemap_index.xml

              Will this work?
              Regards
              Sajad

                  • BabaBha0173
                    BabaBha0173 last edited by

                    User-agent: *
                    Disallow: /cgi-bin/
                    Disallow: /wp-admin/
                    Disallow: /archives/
                    Disallow: /*?*
                    Allow: /comments/feed/
                    Disallow: /refer/
                    Disallow: /index.php
                    Disallow: /wp-content/plugins/
                    Allow: /wp-admin/admin-ajax.php

                    User-agent: Mediapartners-Google*
                    Allow: /

                    User-agent: Googlebot-Image
                    Allow: /wp-content/uploads/

                    User-agent: Adsbot-Google
                    Allow: /

                    User-agent: Googlebot-Mobile
                    Allow: /

                    Sitemap: https://site.com/sitemap_index.xml

                    Use this; it should help solve your problem.

                    Regards

                    Chotapao
