The Moz Q&A Forum

    Duplicate Page content | What to do?

    On-Page / Site Optimization
    • Kalitenko2014
      Kalitenko2014 last edited by

      Hello Guys,

      I have some duplicate pages detected by Moz. Most of the URLs come from a user registration process, so they all look like this:

      www.exemple.com/user/login?destination=node/125%23comment-form

      What should I do? Should I add this to robots.txt? If so, how? What's the command to add in Google Webmaster Tools?

      Thanks in advance!

      Pedro Pereira

      • RafalJ
        RafalJ last edited by

        Add this line to your robots.txt to prevent Google from indexing these pages:

        Disallow: /*login?
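As a rough illustration of what this rule covers, here is a minimal Python sketch of Google-style pattern matching, where `*` matches any run of characters and a trailing `$` anchors the end of the URL. The helper names `robots_pattern_to_regex` and `is_disallowed` are hypothetical, and this is a simplification rather than Google's actual matcher. Note also that a `Disallow` line only takes effect inside a `User-agent` group in robots.txt.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a Google-style robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally, starting at the beginning of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path_and_query: str, pattern: str) -> bool:
    """Return True if the path (including query string) matches the pattern."""
    return robots_pattern_to_regex(pattern).match(path_and_query) is not None


rule = "/*login?"
print(is_disallowed("/user/login?destination=node/125%23comment-form", rule))  # True
print(is_disallowed("/user/login", rule))  # False: no '?' after 'login'
print(is_disallowed("/node/125", rule))    # False
```

So the rule targets only the parameterised login URLs, not the bare `/user/login` page.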

        • KevinBudzynski
          KevinBudzynski last edited by

          In GWT: Crawl => URL Parameters => Configure URL Parameters => Add Parameter

          Make sure you know what you are doing as it's easy to mess up and have BIG issues.

          • webmethod
            webmethod last edited by

            Just adding this to robots.txt will not stop the pages being indexed:

            Disallow: /*login?

            It just means Google won't crawl those pages or follow the links on them.

            I would do one of the following:

            1. Add noindex to the pages. PR will still be passed to them, but they will no longer appear in SERPs.

            2. Add a canonical on the page to: "www.exemple.com/user/login"
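The two options above could be sketched in Python roughly as follows. This is only an illustration: the function names are hypothetical, the `https://` scheme is an assumption (the original URL has none), and the canonical target is derived simply by dropping the query string.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def noindex_tag() -> str:
    """Option 1: a robots meta tag asking search engines not to index the page."""
    return '<meta name="robots" content="noindex">'


def canonical_tag(url: str) -> str:
    """Option 2: a canonical link pointing at the URL without its query string."""
    parts = urlsplit(url)
    # Keep scheme, host and path; drop the query string and fragment.
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(clean, quote=True)}">'


# Scheme assumed for illustration only.
url = "https://www.exemple.com/user/login?destination=node/125%23comment-form"
print(noindex_tag())
print(canonical_tag(url))
```

Either tag belongs in the `<head>` of the page that renders the login URLs.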

            You're never going to try to get these pages to rank, so although it's worth fixing, I wouldn't lose too much sleep over the impact of having duplicate content on registration pages (unless there are hundreds of them!).

            Regards,

            George

            • RafalJ
              RafalJ @webmethod last edited by

              George,

              I went to check with Google to make sure I am correct and I am!

              "While Google won't crawl or index the content blocked by robots.txt, we might still find and index information about disallowed URLs from other places on the web." Source: https://support.google.com/webmasters/answer/6062608?hl=en

              Yes, he can fix these problems on-page, but disallowing the URLs in robots.txt will work fine too!

              • webmethod
                webmethod @RafalJ last edited by

                Hi Rafal,

                The key part of that statement is "we might still find and index information about disallowed URLs...". If you read the next sentence it says: "As a result, the URL address and, potentially, other publicly available information such as anchor text in links to the site can still appear in Google search results".

                If you look at moz.com/robots.txt you'll see an entry for:

                Disallow: /pages/search_results*

                But if you search this on Google:

                site:moz.com/pages/search_results

                You'll find there are 20 results in the index.

                I used to agree with you, until I found out the hard way that if Google finds a link, regardless of whether the URL is blocked in robots.txt, it can put it in the index. It will remain there until you remove the crawl restriction and noindex it, or remove it from the index using Webmaster Tools.

                George

                • RafalJ
                  RafalJ @webmethod last edited by

                  Hi George,

                  Thanks for this, it's very interesting... the URLs do appear in search results but their descriptions are blocked(!)

                  Did you try configuring URL parameters in WMT as a solution?

                  • webmethod
                    webmethod @RafalJ last edited by

                    Yes, it's the worst possible scenario: they basically get trapped in the SERPs. Google won't crawl them again until you allow crawling. You then set noindex (to remove them from the SERPs), and afterwards add nofollow,noindex to keep them out of the SERPs and to stop Google following any links on them.

                    Configuring URL parameters is, again, just a directive regarding the crawl and, to the best of my knowledge, doesn't affect indexing status.

                    In my experience, noindex is bulletproof but nofollow / robots.txt is very often misunderstood and can lead to a lot of problems as a result. Some SEOs think they can be clever in crafting the flow of PageRank through a site. The unsurprising reality is that Google just does what it wants.
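The "trapped in SERPs" situation described above can be sketched as a toy model in Python. This is a deliberate simplification of the behaviour described in this thread, not a description of how Google actually works; `PageState` and `after_next_crawl` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class PageState:
    blocked_by_robots: bool  # URL matches a Disallow rule in robots.txt
    has_noindex: bool        # page carries a noindex directive
    indexed: bool            # URL currently appears in the index


def after_next_crawl(state: PageState) -> PageState:
    """Toy model: what happens to the index status on the next crawl attempt."""
    if state.blocked_by_robots:
        # The page is never fetched, so the noindex is never seen:
        # whatever the index status was, it stays that way.
        return state
    if state.has_noindex:
        # The page is fetched, the noindex is seen, and the URL is dropped.
        return PageState(False, True, False)
    # Crawlable with no noindex: assumed here to be eligible for indexing.
    return PageState(False, False, True)


trapped = PageState(blocked_by_robots=True, has_noindex=True, indexed=True)
print(after_next_crawl(trapped).indexed)    # True: still in the index

unblocked = PageState(blocked_by_robots=False, has_noindex=True, indexed=True)
print(after_next_crawl(unblocked).indexed)  # False: noindex was seen
```

The point of the model is the ordering: the crawl block has to come off before the noindex can have any effect.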

                    George

                    • Kalitenko2014
                      Kalitenko2014 @webmethod last edited by

                      Hello,

                      Thanks for your response. I have learned more, which is great 🙂

                      My question is: should I add just a noindex to that page, or a noindex, nofollow?

                      Thanks!

                      • webmethod
                        webmethod @Kalitenko2014 last edited by

                        1. If you add just noindex, Google will crawl the page and drop it from the index, but it will also crawl the links on that page and potentially index them too. It basically passes equity to the links on the page.

                        2. If you add nofollow, noindex, Google will crawl the page and drop it from the index, but it will not crawl the links on that page, so no equity will be passed to them. As already established, Google may still put these links in the index, but it will display the standard "blocked" message for the page description.

                        If the links are internal, there's no harm in them being followed unless you're opening up the crawl to expose tons of duplicate content that isn't canonicalised.

                        noindex is often used with nofollow, but sometimes this is simply due to a misunderstanding of what impact they each have.
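To check which of the two cases above a given page falls into, a minimal Python sketch could read the robots meta tag from the HTML. The class and function names here are hypothetical, and the sketch only looks at `<meta name="robots">`; it ignores the `X-Robots-Tag` HTTP header and crawler-specific meta tags.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated and case-insensitive.
            self.directives |= {
                d.strip().lower() for d in content.split(",") if d.strip()
            }


def summarize(html: str) -> dict:
    """Report whether the page may be indexed and whether its links are followed."""
    parser = RobotsMetaParser()
    parser.feed(html)
    d = parser.directives
    # "none" is shorthand for "noindex, nofollow".
    return {
        "indexable": "noindex" not in d and "none" not in d,
        "links_followed": "nofollow" not in d and "none" not in d,
    }


print(summarize('<head><meta name="robots" content="noindex"></head>'))
print(summarize('<head><meta name="robots" content="noindex, nofollow"></head>'))
```

The first page is dropped from the index while its links are still followed; the second is dropped and its links are not followed.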

                        George

                        • carlystemmer
                          carlystemmer @webmethod last edited by

                          Hi George,

                          I am having a similar issue with my site, and was looking for a quick clarification.

                          We have thousands of "member" pages that were created as part of registration, and they are appearing as duplicate content. When you say add noindex and a canonical, is this something that needs to be done to every individual page, or is there something that can be done that would apply to the thousands of pages at once?

                          Here are a couple of examples of what the pages look like:

                          http://loyalty360.org/me/members/8003

                          http://loyalty360.org/me/members/4641

                          Thank you!

                          • webmethod
                            webmethod @carlystemmer last edited by

                            Hi Carly,

                            It needs to be done to each of the pages. In most cases, this is just a minor change to a single page template. Someone might tell you that you can add an entry to robots.txt to solve the problem, but that won't remove them from the index.
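As a rough sketch of why a single template change covers every page, here is a hypothetical Python example. The template, the `render_member_head` function and the page title are assumptions for illustration; the real site's templating system is unknown.

```python
from string import Template

# One shared template for every member page. Changing the robots value here
# changes it for all pages rendered from this template.
MEMBER_HEAD = Template(
    "<head>\n"
    "  <title>Member $member_id</title>\n"
    '  <meta name="robots" content="$robots">\n'
    "</head>"
)


def render_member_head(member_id: int, robots: str = "noindex") -> str:
    """Render the <head> for one member page from the shared template."""
    return MEMBER_HEAD.substitute(member_id=member_id, robots=robots)


# The member ids from the example URLs in this thread.
for member_id in (8003, 4641):
    print(render_member_head(member_id))
```

Each page still carries its own tag, but the tag is written once in the template rather than thousands of times by hand.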

                            Looking at the links you provided, I'm not convinced you should deindex them all - as these are member profile pages which might have some value in terms of driving organic traffic and having unique content on them. That said I'm not party to how your site works, so this is just an observation.

                            Hope that helps,

                            George
