The Moz Q&A Forum

    Are robots.txt wildcards still valid? If so, what is the proper syntax for setting this up?

    Technical SEO Issues
    • mkhGT

      I've got several URLs that I need to disallow in my robots.txt file. For example, I have several documents that I don't want indexed and filters that are getting flagged as duplicate content. Rather than typing in thousands of URLs, I was hoping that wildcards were still valid.

      • Adam_Cochran

        Yup, wildcard syntax is indeed still valid. However, I can only confirm that the big three (Google, Yahoo, and Bing) actively observe it; other, secondary search engines may not.

        In your case you are probably looking for syntax along the lines of:

        User-agent: *
        Disallow: /*.pdf$

        This tells every user agent not to crawl any URL that ends in .pdf. The $ ties the match to the end of the URL, so a URL such as /guide.pdf.txt or /guide.pdf?page=2 would not be blocked in this case.
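To make the effect of * and $ concrete, here is a minimal Python sketch that translates a robots.txt path pattern into a regular expression and tests a few paths against it. The helper names (`robots_pattern_to_regex`, `is_blocked`) and the sample paths are made up for illustration, and the matching is a simplified approximation of how the major crawlers are documented to behave, not a copy of any search engine's actual implementation.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Only two special characters are handled (a simplifying assumption):
    '*' matches any run of characters, and a trailing '$' anchors the
    match to the end of the URL.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(pattern: str, path: str) -> bool:
    # re.match anchors at the start of the path, like a robots.txt rule.
    return robots_pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    samples = ["/files/guide.pdf", "/files/guide.pdf.txt", "/files/guide.pdf?page=2"]
    for path in samples:
        with_anchor = is_blocked("/*.pdf$", path)
        without_anchor = is_blocked("/*.pdf", path)
        print(f"{path}: '/*.pdf$' blocks={with_anchor}, '/*.pdf' blocks={without_anchor}")
```

With the $ in place only the first sample path is matched; without it, all three are, because the pattern is then free to match the start of a longer URL.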

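        Keep an eye on how you block them. Rules match from the start of the URL path, so leaving off a trailing slash (Disallow: /docs instead of Disallow: /docs/) blocks everything that begins with that string, such as /docs.html, rather than just the directory. Likewise, not appending the end-of-URL symbol ($) means a pattern can match longer URLs throughout a directory rather than just the filename you had in mind.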
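The trailing-slash pitfall can be illustrated with a short Python sketch. It assumes the simple case of a Disallow rule with no wildcards, which behaves as a plain prefix match on the URL path; the function name `blocked_by_prefix` and the example paths are hypothetical.

```python
def blocked_by_prefix(rule: str, path: str) -> bool:
    """Return True if a wildcard-free Disallow rule would match the path.

    Assumption: without '*' or '$', a rule matches any path that starts
    with the rule text.
    """
    return path.startswith(rule)


if __name__ == "__main__":
    paths = ["/docs/", "/docs/manual.html", "/docs.html", "/docs-archive/old.html"]
    for path in paths:
        no_slash = blocked_by_prefix("/docs", path)
        with_slash = blocked_by_prefix("/docs/", path)
        print(f"{path}: 'Disallow: /docs' blocks={no_slash}, 'Disallow: /docs/' blocks={with_slash}")
```

In this sketch, Disallow: /docs catches all four paths, while Disallow: /docs/ catches only the two inside the directory.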

        Also keep in mind that if you are using URL rewriting, this may play into how you need to block things, since the rules are matched against the URLs that crawlers actually request. You may also want to remember that disallowing access in robots.txt does NOT prevent search engines from indexing the data; it is up to them whether they honor the request. So if it is very important to keep a file away from search engines, robots.txt may not be the way to do it.
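One commonly documented alternative for keeping a file out of the index is to send a noindex directive with the response, for example through the X-Robots-Tag HTTP header. The sketch below only shows the decision of which header to attach; the helper name `extra_headers` is hypothetical, and how you wire it into your server or framework is left open. Note that a crawler has to be able to fetch the URL to see the header, so this approach assumes the URL is not also disallowed in robots.txt.

```python
def extra_headers(path: str) -> dict:
    """Return extra response headers for a requested path (hypothetical helper)."""
    if path.lower().endswith(".pdf"):
        # Ask search engines that support X-Robots-Tag not to index this response.
        return {"X-Robots-Tag": "noindex"}
    return {}


if __name__ == "__main__":
    for path in ["/files/guide.pdf", "/index.html"]:
        print(path, extra_headers(path))
```

Like robots.txt, this is still a request to well-behaved crawlers; files that must stay private need real access control such as authentication.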

        • DarinPirkey, replying to @Adam_Cochran

          Great job. I just wanted to add this from Google Webmasters:

          http://googlewebmastercentral.blogspot.com/2008/06/improving-on-robots-exclusion-protocol.html

          and this from Google Developers:

          https://developers.google.com/webmasters/control-crawl-index/docs/robots_txt
