The Moz Q&A Forum

    Block subdomain directory in robots.txt

    Intermediate & Advanced SEO
    • gamesecure
      gamesecure last edited by

      Instead of blocking an entire subdomain (fr.sitegeek.com) with robots.txt, we would like to block a single directory (fr.sitegeek.com/blog).
      'fr.sitegeek.com/blog' and 'www.sitegeek.com/blog' contain the same articles in one language; only the labels are changed for the 'fr' version, and we suspect that this duplicate content causes problems for SEO. We would like the 'www.sitegeek.com/blog' articles to be crawled and indexed, but not 'fr.sitegeek.com/blog'.

      So, how can we block a single subdomain directory (fr.sitegeek.com/blog) with robots.txt?

      This applies only to the blog directory of the 'fr' version; all other directories and pages of the 'fr' version should still be crawled and indexed.

      Thanks,
      Rajiv

      • DirkC
        DirkC last edited by

        The easiest way would be to put a robots.txt file in the root of your subdomain and block access for search engines:

        User-agent: Googlebot
        Disallow: /

        If your subdomain and the main domain share the same root, this option is not possible. In that case, rather than working with robots.txt, I would add a canonical on each page pointing to the main domain, or block all pages in the header (for example with a noindex directive, if this is technically possible).
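To illustrate the canonical option, here is a minimal Python sketch that builds the tag for a page on the 'fr' subdomain. It assumes every 'fr' page has an equivalent page at the same path on www.sitegeek.com; the function name `canonical_tag` and the article path are made up for the example and do not come from the site's actual code.

```python
from html import escape
from urllib.parse import urlsplit

# Assumption: each page on the 'fr' subdomain has an equivalent page
# at the same path on the main domain.
MAIN_HOST = "www.sitegeek.com"


def canonical_tag(page_url, main_host=MAIN_HOST):
    """Return a <link rel="canonical"> tag pointing at the main-domain URL."""
    parts = urlsplit(page_url)
    # Swap only the host; scheme, path and query stay unchanged.
    canonical_url = parts._replace(netloc=main_host).geturl()
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url, quote=True))


# Hypothetical article URL, used only for illustration.
print(canonical_tag("https://fr.sitegeek.com/blog/some-article"))
```

The resulting tag would go in the `<head>` of the 'fr' page. Unlike a robots.txt block, this still lets crawlers fetch the page so they can see the canonical.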

        You could also check these similar questions: http://moz.com/community/q/block-an-entire-subdomain-with-robots-txt and http://moz.com/community/q/blocking-subdomain-from-google-crawl-and-index - but the answers given are the same as the options above.

        Apart from the technical question, given the fact that only the labels are translated, these pages make little sense for human users. It would probably make more sense to link to the normal (English) version of the blog and put "(en anglais)" next to the link.

        rgds,

        Dirk

        • N1ghteyes
          N1ghteyes last edited by

          Just to add to this, if your subdomain has more than /blog on it, and you only want to block /blog, change Dirk's robots.txt to:

          User-agent: Googlebot
          Disallow: /blog

          or, to block more than just Google:

          User-agent: *
          Disallow: /blog
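If you want to sanity-check a rule like this before deploying it, a rough sketch using Python's standard `urllib.robotparser` could look like the following. The helper name `is_allowed` and the test paths are only for illustration, and Python's parser is a simple prefix matcher, so treat it as an approximation of how a real crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The rules as they would be served from https://fr.sitegeek.com/robots.txt.
# robots.txt applies per host, so www.sitegeek.com is not affected by this file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blog
"""


def is_allowed(url, user_agent="Googlebot"):
    """Return True if the rules above let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical paths, used only for illustration.
print(is_allowed("https://fr.sitegeek.com/blog/some-article"))
print(is_allowed("https://fr.sitegeek.com/category/shared-hosting"))
print(is_allowed("https://fr.sitegeek.com/blogging-tips"))
```

The first and third checks come back blocked and the second allowed: `Disallow: /blog` is a prefix match, so it also catches a path such as /blogging-tips. Using `Disallow: /blog/` would narrow the rule to the directory itself.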

          • gamesecure
            gamesecure @DirkC last edited by

            Thanks Dirk,

            We will fix the issue as you suggested.

            Could you explain more about duplicate content if we post the same articles on both the 'FR' and 'EN' versions?

            Thanks,

            Rajiv

            • DirkC
              DirkC @gamesecure last edited by

              Hi Rajiv,

              If you post the same content on both the FR & EN versions:

              • if both are written in English (or mainly written in English), the best option would be to have a canonical pointing to the EN version.
                Example: https://fr.sitegeek.com/category/shared-hosting - most of the content is in English, so in this case I would point a canonical to the EN version.

              • if the FR version is in French, you can use the hreflang tag. You can use this tool to generate the tags, check here for common mistakes, and double-check the final result here.
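For the hreflang case, a small Python sketch of what the generated tags might look like is shown below. It assumes the EN equivalent of the example page lives at the same path on www.sitegeek.com, which is a guess; the function name `hreflang_tags` is also made up for the example.

```python
from html import escape


def hreflang_tags(alternates):
    """Build one <link rel="alternate"> tag per language version.

    alternates maps a language code to the URL of that version.
    """
    return [
        '<link rel="alternate" hreflang="{}" href="{}">'.format(
            escape(lang, quote=True), escape(url, quote=True)
        )
        for lang, url in alternates.items()
    ]


# Assumption: the EN page sits at the same path on the main domain.
versions = {
    "en": "https://www.sitegeek.com/category/shared-hosting",
    "fr": "https://fr.sitegeek.com/category/shared-hosting",
}

for tag in hreflang_tags(versions):
    print(tag)
```

The same full set of tags goes in the `<head>` of both the EN and the FR page, so each version references itself and the other one.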

              Just some remarks:

              • partially translated pages offer little value for users - so it's best to fully translate them or only refer to the EN version

              • I have a strong impression that the FR version was machine-translated from the EN version (e.g. French sites never use 'Maison' to link to the homepage; they use 'Accueil'). Be aware that Google is perfectly capable of detecting auto-translated pages and considers this bad practice (check this video of Matt Cutts, starting at 1:50). So you might want to invest in proper translation or proofreading by a native French speaker.

              rgds

              Dirk
