The Moz Q&A Forum

    Robot.txt pattern matching

    • STPseo

      Hola fellow SEO peoples!

      Site: http://www.sierratradingpost.com

      robots.txt: http://www.sierratradingpost.com/robots.txt

      Please see the following line: Disallow: /keycodebypid~*

      We are trying to block URLs like this:

      http://www.sierratradingpost.com/keycodebypid~8855/for-the-home~d~3/kitchen~d~24/

      but we still find them in the Google index.

      1. We are not sure whether we need to tell the robot explicitly to use pattern matching.

      2. We are not sure whether the format is correct. Should we use Disallow: /keycodebypid*/ or /*keycodebypid/ or even /*keycodebypid~/?

      What is even more confusing is that the meta robots tag on these pages says "noindex", yet they still show up: <meta name="robots" content="noindex, follow, noarchive" />

      Thank you!

      • john4math

        Here's a good SEOMoz post about this: http://www.seomoz.org/blog/robot-access-indexation-restriction-techniques-avoiding-conflicts. What's most likely happening is that the disallow in robots.txt is preventing the bots from crawling the page, so they never get to see the meta noindex tag. A disallow in robots.txt blocks crawling, not indexing: if people link to one of these pages externally, the URL can still appear in search results.

        The robots.txt syntax you're using now looks correct to me for what you're trying to do.
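
To make the pattern question concrete, here is a minimal sketch of wildcard matching in the style Google documents for robots.txt: `*` matches any run of characters, a trailing `$` anchors the end of the path, and every rule is otherwise a prefix match. The helper names `rule_to_regex` and `is_blocked` are made up for illustration, and this is an approximation of the matching rules, not Googlebot's actual code. It tries the current rule and the three alternatives from the question against the example URL path.

```python
import re


def rule_to_regex(rule_path: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any sequence of characters, a trailing
    '$' anchors the end of the path, everything else is literal.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    pattern = ".*".join(re.escape(piece) for piece in rule_path.split("*"))
    if anchored:
        pattern += r"\Z"
    return re.compile(pattern)


def is_blocked(rule_path: str, url_path: str) -> bool:
    """Return True if the Disallow pattern matches the URL path as a prefix."""
    # re.match anchors at the start of the path, giving prefix semantics.
    return rule_to_regex(rule_path).match(url_path) is not None


URL_PATH = "/keycodebypid~8855/for-the-home~d~3/kitchen~d~24/"

for rule in (
    "/keycodebypid~*",   # the rule currently in robots.txt
    "/keycodebypid~",    # same rule without the trailing wildcard
    "/keycodebypid*/",
    "/*keycodebypid/",
    "/*keycodebypid~/",
):
    print(f"Disallow: {rule:<20} blocks URL: {is_blocked(rule, URL_PATH)}")
```

Under these assumptions the first three patterns match the example URL and the last two do not, because the path never contains `keycodebypid/` or `keycodebypid~/`. The trailing `*` in the current rule is harmless but redundant, since rules already match as prefixes.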

        • cfguti

          Hi,

          So you have both the robots.txt rule and the meta tag. I think the meta tag is the better option (http://www.seomoz.org/learn-seo/robotstxt).

          Do you have Webmaster Tools set up for your site? You can test your robots.txt file there (http://www.google.com/support/webmasters/bin/answer.py?answer=156449).

          • cfguti @john4math

            Well done John!!!

            😉

            • STPseo @john4math

              Great point! I will remember that. However, I have both the disallow line in the robots.txt file and the noindex meta tag, yet Google shows 3000 of them!?

              http://www.google.com/search?q=site%3Awww.sierratradingpost.com+keycodebypid

              • john4math @STPseo

                Somehow Google is finding these pages, but you're disallowing Googlebot from reading them, so it doesn't know anything about the meta noindex tag on each page. If you have meta noindex tags on all of these pages, you can remove the line in your robots.txt that prevents bots from reading them, and as Google recrawls the pages, it should drop them from its SERPs.
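
The conflict described above can be sketched as a small decision function. This is a deliberately simplified model, not how Google actually decides what to index: the function names and the `has_external_links` flag are hypothetical, and the only logic it encodes is that a crawler which is disallowed never fetches the HTML, so it never reads the meta robots tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def meta_robots(html: str) -> set:
    """Return the set of meta robots directives found in the HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def can_appear_in_index(disallowed: bool, html: str, has_external_links: bool) -> bool:
    """Simplified model of the robots.txt vs. meta noindex interaction."""
    if disallowed:
        # The page is never fetched, so its noindex tag is never seen.
        # The bare URL can still be listed if it is discovered through links.
        return has_external_links
    # The page is crawlable, so the meta robots tag is read and honoured.
    return "noindex" not in meta_robots(html)


PAGE = '<html><head><meta name="robots" content="noindex, follow, noarchive" /></head></html>'

print(can_appear_in_index(disallowed=True, html=PAGE, has_external_links=True))
print(can_appear_in_index(disallowed=False, html=PAGE, has_external_links=True))
```

In this model the first call returns True (blocked from crawling, so the noindex is invisible) and the second returns False (crawlable, so the noindex takes effect), which is why removing the Disallow line is the suggested fix.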

                • STPseo

                  John, the article was a real eye-opener! Thanks again!

                  • SEOSHARK

                    OK, not sure whether this was already shared: here is Matt Cutts talking about this same subject.

                    www.youtube.com/watch?v=I2giR-WKUfY
