The Moz Q&A Forum

    Duplicate content and http and https

    • hawkvt1
      hawkvt1 last edited by

      Within my Moz crawl report, I have a ton of duplicate content caused by identical pages being available at both http and https URLs.

      For example:

      http://www.bigcompany.com/accomodations

      https://www.bigcompany.com/accomodations

      The strange thing is that 99% of these URLs are not sensitive in nature and do not require any security features. No credit card information, booking, or carts. The web developer cannot explain where these extra URLs came from or provide any further information.

      Advice or suggestions are welcome!  How do I solve this issue?

      THANKS MOZZERS
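
For anyone who wants to confirm the same pattern in their own crawl export, here is a minimal sketch, assuming the crawl is available as a plain list of URLs. The function name `find_protocol_duplicates` and the `/contact` URL are made up for illustration; only the two `accomodations` URLs come from the question.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def find_protocol_duplicates(urls):
    """Return pages that were crawled under both http and https."""
    schemes_by_page = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        # Key on everything except the scheme; hostnames are case-insensitive.
        key = (parts.netloc.lower(), parts.path or "/", parts.query)
        schemes_by_page[key].add(parts.scheme.lower())
    # Keep only the pages seen under both protocols.
    return sorted(
        key for key, schemes in schemes_by_page.items()
        if {"http", "https"} <= schemes
    )


crawled = [
    "http://www.bigcompany.com/accomodations",
    "https://www.bigcompany.com/accomodations",
    "http://www.bigcompany.com/contact",  # hypothetical page, http only
]
duplicates = find_protocol_duplicates(crawled)
print(duplicates)
```

Only the `accomodations` page is reported, because it is the only one present under both protocols.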

      • JamesNorquay
        JamesNorquay last edited by

        You could implement the canonical tag onto the HTTP version of the website.
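
As a rough illustration of the canonical tag idea, the sketch below builds the tag that both protocol versions of a page could emit. The helper `canonical_link_tag` and its `preferred_scheme` parameter are assumptions made for this example, not part of any SEO tool; pass `"https"` instead if that is the version you want indexed.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url, preferred_scheme="http"):
    """Build a canonical tag that points at the preferred-protocol URL."""
    parts = urlsplit(url)
    # Swap only the scheme; host, path and query stay the same, fragment is dropped.
    canonical = urlunsplit(
        (preferred_scheme, parts.netloc, parts.path or "/", parts.query, "")
    )
    return '<link rel="canonical" href="{}">'.format(escape(canonical, quote=True))


tag = canonical_link_tag("https://www.bigcompany.com/accomodations")
print(tag)  # <link rel="canonical" href="http://www.bigcompany.com/accomodations">
```

The same tag would go in the `<head>` of both the http and https copies, so crawlers see one consistent signal.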

        Another problem I noticed from a quick look at this website is that all your title tags are the same, with the brand term at the front. This is not advisable; you want to put your generic terms first and the brand term at the end of the title.

        I would look at getting an SEO audit done to fix the issues with the website.

        • mediabase
          mediabase last edited by

          Hello Hawkvt1. First of all, because the protocols (http/https) are different, the two versions are considered separate sites, so there's a good chance of being penalized for duplicate content. If a search engine discovers two identical pages, it will generally keep the page it saw first and ignore the others. The solutions are described below:

          Solutions:

          1. Be smart about the site structure: to keep the engines from crawling and indexing HTTPS pages, structure the website so that HTTPS pages are only accessible through a form submission (log-in, sign-up, or payment pages). The common mistake is making these pages available via a standard link (this happens when you are not aware that the secure version of the site is being crawled and indexed).
          2. Use a robots.txt file to control which pages will be crawled and indexed.
          3. Use an .htaccess file. Here's how to do this:
          4. Create a file named robots_ssl.txt in your root.
          5. Add the following code to your .htaccess:
             RewriteEngine On
             RewriteCond %{SERVER_PORT} ^443$
             RewriteRule ^robots\.txt$ robots_ssl.txt [L]
          6. Remove yourdomain.com:443 from the webmaster tools if the pages have already been crawled.
          7. For dynamic pages like PHP, try:
             <?php
             if ($_SERVER["SERVER_PORT"] == 443) {
                 echo '<meta name="robots" content="noindex,nofollow">';
             }
             ?>
          8. Dramatic solution (may not always be possible): 301 redirect the HTTPS pages to the HTTP pages, in the hope that the link juice will transfer over.
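
The port check behind steps 4 to 7 can be sketched outside of Apache and PHP as well. This is only an illustration of the decision logic: the function names are hypothetical, and the contents of `ROBOTS_SSL` are an assumption, since the post does not say what goes into robots_ssl.txt (a block-everything rule is the usual intent).

```python
# Assumed file contents; the original post does not list them.
ROBOTS_HTTP = "User-agent: *\nDisallow:\n"    # plain site: allow everything
ROBOTS_SSL = "User-agent: *\nDisallow: /\n"   # secure site: block everything


def robots_txt_for_port(port):
    """Pick the robots.txt body to serve, mirroring the .htaccess rewrite."""
    return ROBOTS_SSL if int(port) == 443 else ROBOTS_HTTP


def robots_meta_for_port(port):
    """Return the noindex meta tag on port 443, mirroring the PHP snippet."""
    if int(port) == 443:
        return '<meta name="robots" content="noindex,nofollow">'
    return ""


print(robots_txt_for_port(443))
print(repr(robots_meta_for_port(80)))
```

Both helpers key on the server port only, so they share the limitation of the original approach: a site serving HTTPS on a non-standard port would not be caught.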

          For more information please refer to this link :

          http://www.seomoz.org/ugc/solving-duplicate-content-issues-with-http-and-https

          I'm sure this will solve your problem.

          • Kotkov
            Kotkov last edited by

            Harald, where did you get "robots_ssl.txt"?

            • GKLA
              GKLA last edited by

              I would check your server for an https folder.

              Add a robots.txt file in the root of the https folder:

              User-agent: *
              Disallow: /

              My guess is that the spider is following a link somewhere within your site that links to an https:// URL. The spider is then re-indexing the entire site using https://

              My 2 cents, for what it's worth.
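
To sanity-check what those two robots.txt lines do, Python's standard `urllib.robotparser` can evaluate them without touching the live site. A small sketch, where the crawler name `ExampleBot` is a placeholder and the URL is the example from the original question:

```python
from urllib.robotparser import RobotFileParser

# The rules suggested for the https folder.
block_all = RobotFileParser()
block_all.parse(["User-agent: *", "Disallow: /"])

# For contrast: an empty Disallow value allows everything.
allow_all = RobotFileParser()
allow_all.parse(["User-agent: *", "Disallow:"])

url = "https://www.bigcompany.com/accomodations"
blocked = not block_all.can_fetch("ExampleBot", url)
allowed = allow_all.can_fetch("ExampleBot", url)
print(blocked, allowed)  # True True
```

Keep in mind that robots.txt only stops compliant crawlers from fetching the pages; it does not by itself consolidate links that already point at the https URLs.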

              • mediabase
                mediabase @Kotkov last edited by

                Hi Serge, I learned about "robots_ssl.txt" from this website: http://www.seoworkers.com/seo-articles-tutorials/robots-and-https.html

                • Dr-Pete
                  Dr-Pete last edited by

                  I think Harald and James covered the bases here, but a couple of comments on Harald's reply:

                  (1) Definitely check this. A common cause of indexed https: pages is that a secure section of your site is being crawled (like a shopping cart), and you're using relative navigation links - when a crawler or visitor hits the nav link from a secure page, the relative link picks up the https: protocol. In most cases, you may want to NOINDEX secure pages. Shopping carts and checkout pages have no business in the search index, IMO.

                  (2)-(5) I believe this does work, but it's very tricky, so please be careful. If anyone has linked to the https: pages, you'll lose the link-juice this way (you'll just cut those pages off). I honestly don't think it's a good choice for most sites.

                  (8) I actually believe the 301-redirect is simpler in most cases.

                  As James said, sitewide canonical tags (or on the affected pages, if they're isolated) will also work.
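
The relative-link behaviour described in point (1) is easy to reproduce with the standard URL resolution rules. In this sketch the `/cart` page is a hypothetical secure page; the `accomodations` path is the example from the original question.

```python
from urllib.parse import urljoin

secure_page = "https://www.bigcompany.com/cart"  # hypothetical secure page
relative_nav_link = "/accomodations"
absolute_nav_link = "http://www.bigcompany.com/accomodations"

# A relative link inherits the protocol of the page it appears on.
resolved_relative = urljoin(secure_page, relative_nav_link)
# An absolute link keeps its own protocol no matter where it appears.
resolved_absolute = urljoin(secure_page, absolute_nav_link)

print(resolved_relative)  # https://www.bigcompany.com/accomodations
print(resolved_absolute)  # http://www.bigcompany.com/accomodations
```

This is why one crawlable secure page with relative navigation can be enough to expose an https copy of the whole site.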

                  • hawkvt1
                    hawkvt1 @Dr-Pete last edited by

                    I would personally like to thank everyone that responded with an answer.  Man O Man, the best part of belonging to SEOMOZ is the community forum.  It's incredibly valuable, being able to ask a question and reach out to such talent as all of you.

                    If anyone ever gets up to Killington or Okemo skiing, the beer is on me!  I live right between both ski areas, about 8 miles to either mountain..

                    Thanks again.

                    • hawkvt1
                      hawkvt1 @JamesNorquay last edited by

                      Thanks James..

                      Sorry, I was using Big Company as an example and just being generic.

                      The real URL if interested is www.hawkresort.com

                      • randfish
                        randfish @hawkvt1 last edited by

                        Thanks dude! If I make it to Vermont, I might look you up 🙂

                        • hawkvt1
                          hawkvt1 @hawkvt1 last edited by

                          Anytime Rand!  I only have two simple rules:

                          1.  Talking business on ski days is not allowed

                          2.  Entry into Vermont requires a pound of Seattle's best French roast coffee.  In return, you receive some fantastic Vermont maple syrup.

                          Simple rules to live by LOL

                          Thanks again for all of your help...

                          Peter

                          • GTGshops
                            GTGshops @Dr-Pete last edited by

                            Hi,

                            I'm still having problems with redirecting. I only have 1 duplicate page with https and http that I want to redirect, but it's the homepage.

                            I want to redirect: https://www.domain.com to http://www.domain.com

                            But keep the rest of the pages the same (half http and the other half https).

                            How do I do this?
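
One way to reason about a homepage-only redirect is to spell out the rule before translating it into server config. The sketch below is an assumption about what such a rule could look like, not a tested .htaccess recipe; `homepage_redirect` is a hypothetical helper and `/checkout` is a made-up inner page.

```python
from urllib.parse import urlsplit, urlunsplit


def homepage_redirect(url):
    """Return (301, target) for the https homepage only, else None."""
    parts = urlsplit(url)
    # Treat "" and "/" as the homepage, and only when there is no query string.
    is_homepage = parts.path in ("", "/") and not parts.query
    if parts.scheme == "https" and is_homepage:
        return 301, urlunsplit(("http", parts.netloc, "/", "", ""))
    return None  # every other URL is left alone


print(homepage_redirect("https://www.domain.com"))
print(homepage_redirect("https://www.domain.com/checkout"))
```

The key point is that the rule has to match the root path exactly rather than a prefix, otherwise every https page would be redirected along with the homepage.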

                            • ajiabs
                              ajiabs last edited by

                              I hate to respond to a 3-year-old thread, but does this solution need to be updated?

                              Has the answer changed now that Google is favoring https for most pages? Does Google still consider http and https as two different sites? If so, which one should be suppressed: http or https?

                              Aji

                              • Dr-Pete
                                Dr-Pete @ajiabs last edited by

                                If Google detects both http: and https: versions, they've started to automatically pick the https: version, but that's not consistent yet. In general, I think it's still important to set strong canonicalization signals. Google still separates your http: and https: sites in Google Search Console, too, so even they haven't quite made up their minds.

                                In general, Google is pushing sites toward https:, but that's a somewhat complex decision that depends on more than just SEO. If you're using https: and the https: URLs are indexed, then you should treat those as canonical and suppress the http: URLs, in most cases.

                                • GrowthLedge
                                  GrowthLedge last edited by

                                  I'm reading this response and this is happening on my site as well. How did this happen in the first place? I have duplicate content because of https and http copies of all my web pages. If I type https://www.mywebsite.com I can't get to my site. Could this be coming from my hosting company? I've set up my site to simply be http://www.mywebsite.com. I'm a little worried about changing my robots.txt, and I would love to know how this happened in the first place.

                                  • Dr-Pete
                                    Dr-Pete @GrowthLedge last edited by

                                    Hard to tell without knowing the site, but it's possible there are external links to "https" versions of the pages. At this point, Google is going to increase the pressure to secure sites, and later this year Chrome will start warning users about all non-secure pages, so it may be worth making the move.

                                    © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.