The Moz Q&A Forum

    How do you disallow HTTPS?

    Technical SEO Issues
• WebsiteConsultants

I currently have a site (startuploans.org) that runs everything over http. Recently we decided to build an online application to process loan apps, and for that one section we configured SSL (https://www.startuploans.org/secure/).

If I go to the HTTPS URL for any of my other pages they show up too. I was going to just 301 everything from https, but because the secure section is in a subdirectory I can't.

Canonical URLs won't work either, because the application is a totally different system and its pages are generated in an odd manner.

It's really just one page that needs to be disallowed.

Is there any way to disallow all HTTPS requests in robots.txt while keeping all the HTTP requests working as normal?

• RobertFisher

        Hello Rick,

First caveat: I am not sure what you want to accomplish. Do you want the visitor to leave https:// once the application is done? If that is it, then while I am not sure I will be able to help, I want to clarify the issue.

Currently, you have one page on https:, your loan application page at https://startuploans.org/secure/site/step1 (I did not get a step two in my test, but the next page was https://startuploans.org/secure/step3). You want a person to finish the application and then no longer be on https when they return to the site?

I am not a coder per se, but I wonder: if you change the target on the menu link so the secure pages open in a new window, there would be no option to go back. Once finished, step 3 could offer a "close to secure my information" option, leaving the visitor on the page they were on before going to the application.

Now, if none of this was what you wanted, I owe you a beer.

• WebsiteConsultants @RobertFisher

Nope... thanks though 🙂 Code is no problem for us; it's just a technical question. Here is what I want:

I want to restrict robots from the HTTPS (secure) version of my site while leaving the HTTP (unsecure) version perfectly normal and accessible to bots.

Basically, is the code below the best way to do it? Is there a simpler way? To my knowledge robots.txt doesn't support protocols, so something like Disallow: https://...yada yada won't work.

RewriteEngine on
# Serve a separate robots file when the request comes in on the SSL port (443)
RewriteCond %{SERVER_PORT} ^443$
RewriteRule ^robots\.txt$ robots_ssl.txt [L]

• WebsiteConsultants @RobertFisher

I should have added that the code above goes in the .htaccess file. It serves a different robots.txt (robots_ssl.txt) when the request arrives on port 443 (secure), and the normal robots.txt on any other port.

Is there any easier way? I feel like one misstep on this could block bots from my whole site.
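For reference, a minimal sketch of what the two files served by that rule could contain (robots_ssl.txt is simply the filename assumed by the rewrite rule):

```
# robots.txt (served on plain HTTP): allow normal crawling
User-agent: *
Allow: /

# robots_ssl.txt (served on port 443): block all crawling
User-agent: *
Disallow: /
```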

• Francisco_Meza @WebsiteConsultants

Why not just NOINDEX / NOFOLLOW the page? What is the reason behind this? Do you want Google not to index your https pages? Duplicate content? All checkouts use https.

• AlanBleiweiss @WebsiteConsultants

I agree. Best practice is to block the entire folder from indexing.

• ShaMenz

                  Hi Rick,

                  If you wish to use the robots.txt method to disallow all or part of your site's https protocol, you simply need to load two separate robots.txt files.

                  The http and https protocols are basically viewed by bots as if they were two completely separate root domains (which I guess you already know as you have mentioned the fact that port 443 is used for the secure protocol).

                  Google's advice is that to use this method, you should have a separate robots.txt file for each protocol with code as follows:

For your http protocol (http://www.startuploans.org/robots.txt):

                  User-agent: *
                  Allow: /

For the https protocol (https://www.startuploans.org/robots.txt):

                  User-agent: *
                  Disallow: /

However, blocking crawlers with robots.txt is not the most reliable method for excluding pages from search engines, because a page can still be indexed if it is found via a link from another page. Basically, robots.txt is the sign on the front door that says "Please stay out of our house"; it is never seen by the people who come in through the back door or climb in a window!

The most reliable method of excluding pages is to add the noindex meta tag, as suggested by MagentoWebDeveloper and Alan. When a bot encounters the noindex meta tag, it signals the search engine to de-index the page, and there is no further problem. 🙂

                  I would generally use noindex, follow rather than noindex, nofollow as the nofollow tag will stop the flow of link value through your site. In most cases, as long as the noindex is in place, there is no reason to be worried about the links on the pages being followed.
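A minimal example of the tag described above, placed in the <head> of each page on the secure protocol:

```html
<!-- Ask search engines to drop this page from the index,
     while still following (and passing value through) its links -->
<meta name="robots" content="noindex, follow">
```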

You should NEVER use both methods at the same time: if robots.txt blocks a page, crawlers can never fetch it to see the noindex tag, so the page can remain in the index.

                  Hope that helps,

                  Sha

• WebsiteConsultants @ShaMenz

Perfect. This is the answer I was looking for. I will just use the meta tag globally on HTTPS... BUT what about the fact that my entire site is duplicated in HTTPS?

It's all good for the /secure/ part, but what about my WordPress install? How do I handle that? Maybe my best option is to just load two different robots.txt files...
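If you do end up serving two robots.txt files, a variant of the .htaccess rule posted earlier that tests the HTTPS flag instead of the port number would cover the whole site, WordPress pages included (a sketch, reusing the assumed robots_ssl.txt filename):

```
RewriteEngine on
# %{HTTPS} is "on" for any SSL request, whatever port it arrives on
RewriteCond %{HTTPS} on
RewriteRule ^robots\.txt$ robots_ssl.txt [L]
```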

• ShaMenz @WebsiteConsultants

                      Hi Rick,

                      Your first thought was correct. If you apply the noindex meta tag to every page in the secure part of the site, then all of those pages will be de-indexed and you will have no duplicate content problem.

For WordPress, you just need to install a plugin that allows you to edit and apply page elements and meta tags. My preference is Yoast SEO; a plugin search from your dashboard will find it.

                      Hope that helps,

                      Sha

                      © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.