The Moz Q&A Forum

    Can I Block https URLs using Host directive in robots.txt?

    Technical SEO Issues
    • TJC.co.uk

      Hello Moz Community,

      Recently, I have found that Googlebot has started crawling the HTTPS URLs of my website, which is increasing the number of duplicate pages on our site.

      Instead of creating a separate robots.txt file for the HTTPS version of my website, can I use the Host directive in robots.txt to tell Googlebot which version of the website is the original?

      Host: http://www.example.com

      I was wondering whether this method will work and signal to Googlebot that the HTTPS URLs are a mirror of this website.

      Thanks for all of the great responses!

      Regards,
      Ramendra

      • LoganRay

        Hi Ramendra,

        Based on what you said, it sounds like both versions of your site exist and are indexed, and you want to mitigate your duplicate content risk. If that's accurate, here are my recommendations on this:

        1. Robots.txt on an HTTP site cannot be used to prevent crawling or indexing of HTTPS URLs.
        2. Google crawls HTTPS by default, so if your site is fully secure, you need to redirect HTTP URLs to their HTTPS twins. This can be done with a single redirect rule in .htaccess; you don't need one-to-one redirects.
        3. In addition to your HTTP>HTTPS redirects, you should also use canonical tags to signal your preferred version to search engines.
        4. Your HTTPS site should have its own robots.txt file.
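
Points 1 and 4 above come down to robots.txt being looked up per scheme and host. As a rough illustration only, here is a minimal Python sketch; the helper name `robots_txt_url` and the example.com URLs are made up for this example and are not part of any crawler's API:

```python
from urllib.parse import urlsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt location that applies to page_url.

    Hypothetical helper for illustration: the file is resolved from the
    scheme and host of the page itself, so the HTTP and HTTPS versions
    of a site each get their own robots.txt.
    """
    parts = urlsplit(page_url)
    # netloc keeps any explicit port, which also separates one site from another.
    return f"{parts.scheme}://{parts.netloc}/robots.txt"


if __name__ == "__main__":
    for url in (
        "http://www.example.com/product/123",
        "https://www.example.com/product/123",
        "http://example.com/product/123",
    ):
        print(url, "->", robots_txt_url(url))
```

Each of the three URLs maps to a different robots.txt file, which is why rules placed in the HTTP file have no effect on how the HTTPS URLs are crawled.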
        • TJC.co.uk

          Thanks Logan,

          I have read somewhere that, when there are a number of mirror sites, the Host directive in the robots.txt file can tell Googlebot which is the original version of the website. So I was wondering whether we can prevent crawling/indexing of HTTPS URLs by using the Host directive in the robots.txt of the HTTP site.

          We are using an ecommerce SaaS platform for our website, where we have only one robots.txt file, which we can use for the HTTP site.

          Is there any other way to prevent indexing/crawling of HTTPS URLs?

          Regards,
          Ramendra

          • LoganRay

            Hi Ramendra,

            To my knowledge, you can only provide directives in the robots.txt file for the domain on which it lives. This goes for both http/https and www/non-www versions of domains. This is why it's important to handle all preferred-domain formatting with redirects that point to your canonicalized version. So if you want http://www to be indexed, all other versions should redirect to that.

            There might be a workaround of some sort, but honestly, the redirection towards your preferred version that I described above is the direction you should take. Then you can manage one robots.txt file, and your indexing will align better with what you want.
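
To make the "all other versions redirect to the preferred one" idea concrete, here is a small Python sketch of the mapping a single redirect rule would perform. It is only an illustration under assumptions: the function name `redirect_target`, the defaults (http and www.example.com, matching the http://www example above), and the sample URLs are all hypothetical, and on a real site this logic would normally live in the server or platform redirect settings rather than in Python:

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def redirect_target(
    url: str,
    scheme: str = "http",           # assumed preferred scheme
    host: str = "www.example.com",  # assumed preferred host
) -> Optional[str]:
    """Return the URL a request should be 301-redirected to.

    Returns None when the URL is already on the preferred scheme and host.
    Path, query string, and fragment are carried over unchanged, so one
    rule covers every page without one-to-one redirects.
    """
    parts = urlsplit(url)
    if parts.scheme == scheme and parts.netloc.lower() == host:
        return None  # already the preferred version, nothing to do
    return urlunsplit((scheme, host, parts.path or "/", parts.query, parts.fragment))


if __name__ == "__main__":
    for url in (
        "https://www.example.com/shoes?color=red",
        "http://example.com/shoes",
        "http://www.example.com/shoes",
    ):
        print(url, "->", redirect_target(url))
```

Passing `scheme="https"` flips the direction, which corresponds to the HTTP-to-HTTPS setup recommended earlier in the thread.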
