The Moz Q&A Forum

    XML and Disallow

    Intermediate & Advanced SEO
    • DRSearchEngOpt

      I was just curious about any potential side effects of a client using a catch-all approach: a spider generates their XML sitemap, and robots.txt then disallows some of the directories listed in that sitemap.

      For example:
      The XML sitemap contains 500 URLs
      50 of those URLs contain /dirw/
      I don't want anything under /dirw/ indexed because those pages are fairly useless: no content, one image.

      They use the robots.txt file to "Disallow: /dirw/"

      Let's say they do this for maybe 3 separate directories, making up roughly 30% of the URLs in the XML sitemap.

      I am advising that they redo the sitemaps, because that shouldn't be too difficult, but I am curious about the actual ramifications, if there are any, beyond "it isn't a clear and concise signal to the search engines and should therefore be made one."

      Thanks!

      • RyanPurkey

        For syntax I think you'll want:

        User-agent: *
        Disallow: /dirw/

        If the content of /dirw/ isn't worthwhile to the engines, then it should be fine to disallow. It's important to note, though, that Google asks that CSS and JavaScript not be disallowed. Run the site through their PageSpeed tool to see how the current setup affects that interaction. Cheers!
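To sanity-check a rule like the one above before deploying it, something along these lines could work. This is only a minimal sketch using Python's standard `urllib.robotparser`; the `example.com` URLs and the `is_blocked` helper are made up for illustration, and the standard-library parser does plain prefix matching, so it will not necessarily agree with Googlebot on every edge case (wildcards, for instance).

```python
from urllib.robotparser import RobotFileParser

# The two-line robots.txt from the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dirw/
"""


def is_blocked(url, robots_txt=ROBOTS_TXT, user_agent="*"):
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs, used only to exercise the rule.
    for url in (
        "https://example.com/dirw/page.html",
        "https://example.com/products/",
        "https://example.com/dirw",
    ):
        print(url, "blocked" if is_blocked(url) else "allowed")
```

Note that the rule is a path prefix: `/dirw/page.html` matches it, while `/dirw` without the trailing slash does not.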

        • DirkC

          Hi Thomas,

          I don't think there is technically a problem with adding URLs to a sitemap and then blocking some of them with robots.txt.

          I wouldn't do it, however, and I would give the same advice as you did: regenerate the sitemap without this content. The main reason is that it goes against the main goals of a sitemap: helping bots crawl your site and providing valuable metadata (https://support.google.com/webmasters/answer/156184?hl=en). Another advantage of a sitemap is that Google shows the percentage of URLs in each sitemap that are indexed. From that perspective, URLs that are blocked by robots.txt have no use in a sitemap. Normally Webmaster Tools will report errors to let you know that there are issues with the sitemap.

          If you take it one step further, Google could consider you a bit of a lousy webmaster if you keep these URLs in the sitemap. I'm not sure this is the case, but since it can be corrected so easily, I wouldn't take the risk (even if it's a very minor one).

          There are crawlers (like Screaming Frog) that can generate sitemaps while respecting the robots.txt directives. In my opinion, this would be a better option.
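If a crawler with that option isn't available, the same idea can be roughed out by hand: take the crawled URL list, drop everything robots.txt disallows, and write the sitemap from what is left. The sketch below is an illustration, not how any particular crawler does it. The `build_sitemap` function and the `example.com` URLs are hypothetical, and it relies on Python's standard `urllib.robotparser`, which does simple prefix matching.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, robots_txt, user_agent="*"):
    """Return (allowed_urls, sitemap_xml), omitting URLs robots.txt blocks."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    # Keep only the URLs the given user agent is allowed to fetch.
    allowed = [u for u in urls if parser.can_fetch(user_agent, u)]

    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for u in allowed:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = u
    return allowed, ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /dirw/\n"
    crawled = [  # hypothetical crawl output
        "https://example.com/",
        "https://example.com/dirw/image-1.html",
        "https://example.com/products/widget",
    ]
    kept, xml = build_sitemap(crawled, robots)
    print(f"kept {len(kept)} of {len(crawled)} URLs")
    print(xml)
```

Comparing `len(kept)` with the size of the original list also gives a quick measure of how much of the current sitemap is blocked.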

          rgds,

          Dirk
