The Moz Q&A Forum

    Best practices for robots.txt -- allow one page but not the others?

    Intermediate & Advanced SEO
    • nicole.healthline
      nicole.healthline last edited by

      So, we have a page, like domain.com/searchhere, but its results are being crawled (and shouldn't be). Result URLs look like domain.com/searchhere?query1. If I block /searchhere? in robots.txt, will it also block crawlers from the single page /searchhere? (I still want that page to be indexed.)

      What is the recommended best practice for this?

      • john4math
        john4math last edited by

        What you outlined sounds to me like it should work.  Disallowing /searchhere? shouldn't disallow the top-level search page at /searchhere, but should disallow all the search result pages with queries after the ?.
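
        The prefix behavior described above can be sketched in a few lines of Python. This is a minimal illustration, assuming plain prefix matching of Disallow values against the path plus query string; it ignores User-agent groups, Allow rules, and wildcards, and the helper names `disallow_rules` and `is_disallowed` are made up for this example. Individual crawlers may behave differently, so it is worth confirming with the crawler's own robots.txt testing tool.

```python
# Hypothetical robots.txt for the site in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /searchhere?
"""


def disallow_rules(robots_txt: str) -> list[str]:
    """Collect non-empty Disallow values (simplified: ignores User-agent groups)."""
    rules = []
    for line in robots_txt.splitlines():
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            rules.append(value.strip())
    return rules


def is_disallowed(url_path: str, rules: list[str]) -> bool:
    """True if url_path starts with any rule (plain prefix match, no wildcards)."""
    return any(url_path.startswith(rule) for rule in rules)


rules = disallow_rules(ROBOTS_TXT)
for path in ["/searchhere", "/searchhere?query1", "/searchhere/help"]:
    print(path, "->", "blocked" if is_disallowed(path, rules) else "allowed")
```

        Under these assumptions, /searchhere stays allowed because it is shorter than the rule and does not contain the trailing ?, while /searchhere?query1 is blocked.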

        • RyanKent
          RyanKent last edited by

          The best practice would be to add the noindex tag to the search result pages but not the /searchhere page.

          Typically speaking, the best robots.txt file is a blank one. The file should only be used as a last resort with respect to blocking content.
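
          As a rough sketch of the noindex approach, the page template could emit a robots meta tag only when the request URL carries a query string. The function name `robots_meta_tag` and the rule "any query string means a result page" are assumptions for illustration; a real site would use whatever condition identifies its own result pages.

```python
from urllib.parse import urlsplit

NOINDEX_TAG = '<meta name="robots" content="noindex">'


def robots_meta_tag(url: str) -> str:
    """Return a noindex meta tag for search result URLs, else an empty string.

    Assumption: a URL with a non-empty query string is a search result page.
    """
    return NOINDEX_TAG if urlsplit(url).query else ""


# The bare search page gets no tag; the result page gets noindex.
print(repr(robots_meta_tag("http://domain.com/searchhere")))
print(repr(robots_meta_tag("http://domain.com/searchhere?query1")))
```

          The returned string would go inside the page's head element. A crawler has to fetch a page to see this tag, so it would not take effect on URLs that robots.txt also disallows.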

          • anhvietprotocol
            anhvietprotocol last edited by

            What about using the "Canonical URL" tag?

            You can put this code: in /searchhere? page.

            • nicole.healthline
              nicole.healthline @john4math last edited by

              Thank you. Are you sure about that?

              • nicole.healthline
                nicole.healthline @RyanKent last edited by

                Hi Ryan,

                Wouldn't that cause issues with crawl efficiency?

                Also, webmaster guidelines say "Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don't add much value for users coming from search engines."

                • nicole.healthline
                  nicole.healthline @RyanKent last edited by

                  http://support.google.com/webmasters/bin/answer.py?hl=en&answer=35769

                  • RyanKent
                    RyanKent @nicole.healthline last edited by

                    Hi Michelle,

                    The concept of crawl efficiency is highly misunderstood. Are all your site's pages being indexed? Are new content and changes indexed in a timely manner? If so, that would indicate your site is being crawled efficiently.

                    Regarding the link you shared, you are on the right track but need to dig a bit deeper. On the page you shared, find the discussion related to robots.txt. There is a link which will lead you to the following page:

                    https://developers.google.com/webmasters/control-crawl-index/docs/faq#h01

                    There you will find a more detailed explanation along with several examples of when not to use robots.txt.

                    robots.txt: Use it if crawling of your content is causing issues on your server. For example, you may want to disallow crawling of infinite calendar scripts. You should not use the robots.txt to block private content (use server-side authentication instead), or handle canonicalization (see our Help Center). If you must be certain that a URL is not indexed, use the robots meta tag or X-Robots-Tag HTTP header instead.
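
                    For the X-Robots-Tag HTTP header mentioned in the quote, a minimal sketch as a plain WSGI app might look like the following. The app, the helper `response_headers`, and the rule "any query string means a result page" are assumptions for illustration, not how any particular site does it.

```python
def app(environ, start_response):
    """Tiny WSGI app: send X-Robots-Tag: noindex when a query string is present."""
    headers = [("Content-Type", "text/html; charset=utf-8")]
    if environ.get("QUERY_STRING"):
        # Assumption: a non-empty query string marks a search result page.
        headers.append(("X-Robots-Tag", "noindex"))
    start_response("200 OK", headers)
    return [b"<html><body>search page</body></html>"]


def response_headers(path: str, query: str = "") -> dict:
    """Call the app directly with a minimal environ and return its headers."""
    environ = {"REQUEST_METHOD": "GET", "PATH_INFO": path, "QUERY_STRING": query}
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["headers"] = dict(headers)

    app(environ, start_response)
    return captured["headers"]


print(response_headers("/searchhere"))
print(response_headers("/searchhere", "query1"))
```

                    The header has the same effect as the robots meta tag but also works for non-HTML responses, which is one reason to prefer it when the response body is hard to change.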

                    SEOmoz offers a great guide on this topic as well: http://www.seomoz.org/learn-seo/robotstxt

                    If you desire to go beyond the basic Google and SEOmoz explanation and learn more about this topic, my favorite article related to robots.txt, written by Lindsay, can be found here: http://www.seomoz.org/blog/serious-robotstxt-misuse-high-impact-solutions

                    • john4math
                      john4math @nicole.healthline last edited by

                      Yeah, but Ryan's answer is the best one if you can go that route. 🙂

                      • nicole.healthline
                        nicole.healthline last edited by

                        Thanks for the links and help.

                        How does SEOmoz keep search results from being indexed? They don't block search results with robots.txt, and it doesn't appear that they add the noindex tag to the search result pages (ex: view-source:http://www.seomoz.org/pages/search_results#stq=testing&stp=1).

                        • nicole.healthline
                          nicole.healthline @nicole.healthline last edited by

                          And, because Google can currently crawl these search result pages, there are a number of soft 404 pages popping up. Would adding a noindex tag to these pages fix the issue?

                          • RyanKent
                            RyanKent @nicole.healthline last edited by

                            If Google is viewing the search result pages as soft 404s, then yes, adding the noindex tag should resolve the problem.

                            • RyanKent
                              RyanKent @nicole.healthline last edited by

                              SEOmoz used to use Google Search for the site. I am confident Google has a solid method for keeping their own results clean.

                              It appears SEOmoz recently changed their search widget. If you examine the URL you shared, notice none of the search results actually appear in the HTML of the page. For example, load the view-source URL and perform a find (CTRL+F) for "testing" which is the subject of the search. There are no results. Since the results are not in the page's HTML, they would not get indexed.
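
                              The manual check above can be sketched in Python. Note that in the shared URL the search term sits after the #, which makes it a URL fragment rather than a query string, and fragments are handled in the browser instead of being sent to the server. The `sample_html` string below is a made-up stand-in for the page source, not the actual SEOmoz markup.

```python
from urllib.parse import urlsplit


def term_in_raw_html(raw_html: str, term: str) -> bool:
    """Case-insensitive check, similar to CTRL+F on the view-source page."""
    return term.lower() in raw_html.lower()


url = "http://www.seomoz.org/pages/search_results#stq=testing&stp=1"
parts = urlsplit(url)
print("query:", repr(parts.query))        # empty: nothing for the server to act on
print("fragment:", repr(parts.fragment))  # the search term lives here

# Hypothetical page source: results are filled in later by a script.
sample_html = (
    '<html><body><div id="results"></div>'
    '<script src="search.js"></script></body></html>'
)
print(term_in_raw_html(sample_html, "testing"))
```

                              If the term is absent from the raw HTML, as in this stand-in, a crawler that only reads the delivered HTML has no result text to index.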
