The Moz Q&A Forum

    Does Google see this as duplicate content?

    • 94501

      I'm working on a site that has too many pages in Google's index, as shown by a simple count from a site search (example):

      site:http://www.mozquestionexample.com

      I ended up getting a full list of these pages, and it includes pages that have supposedly been excluded from the index via GWT URL parameters and/or canonicalization.

      For instance, the list of indexed pages shows:

      1. http://www.mozquestionexample.com/cool-stuff

      2. http://www.mozquestionexample.com/cool-stuff?page=2

      3. http://www.mozquestionexample.com?page=3

      4. http://www.mozquestionexample.com?mq_source=q-and-a

      5. http://www.mozquestionexample.com?type=productss&sort=1date

      Example #1 above is the one true page for search and the one that all the canonicals reference.

      Examples #2 and #3 shouldn't be in the index because the canonical points to url #1.

      Example #4 shouldn't be in the index, because it's just a source code that, again, doesn't change the page, and the canonical points to #1.

      Example #5 shouldn't be in the index because it's excluded in parameters as not affecting page content and the canonical is in place.

      Should I worry about these multiple URLs for the same page and, if so, what should I do about it?

      Thanks... Darcy

      • Ray-pp

        Hi 94501,

        "Example #1 above is the one true page for search and the one that all the canonicals reference."

        If the pages are properly canonicalized, then Example #1 will receive nearly all of the authority from the pages that name this URL in their canonical tag.

        I.e., Examples #2 and #3 will pass authority to Example #1.

        "Examples #2 and #3 shouldn't be in the index because the canonical points to url #1."

        Setting a canonical tag doesn't guarantee that a page will not be indexed. To do that, you'd need to add a 'noindex' tag to the page.
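The difference between the two signals can be checked per page. The following is a minimal sketch, not a definitive audit tool: it reads a page's HTML and reports the canonical target and whether a robots 'noindex' directive is present. The sample markup and the names `IndexingSignals` and `indexing_signals` are made up for illustration.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Collects the rel=canonical target and any robots meta directives."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = []

    def handle_starttag(self, tag, attrs):
        # Attribute values can be None (e.g. a bare attribute), so normalize them.
        attrs = {name: (value or "") for name, value in attrs}
        if tag == "link" and "canonical" in attrs.get("rel", "").lower().split():
            self.canonical = attrs.get("href")
        elif tag == "meta" and attrs.get("name", "").lower() == "robots":
            content = attrs.get("content", "")
            self.robots += [d.strip().lower() for d in content.split(",")]


def indexing_signals(html):
    """Return the canonical URL (or None) and whether 'noindex' is set."""
    parser = IndexingSignals()
    parser.feed(html)
    return {"canonical": parser.canonical, "noindex": "noindex" in parser.robots}


# Hypothetical <head> for Example #2: canonicalized, but with no noindex directive.
page_two = """
<html><head>
<link rel="canonical" href="http://www.mozquestionexample.com/cool-stuff">
</head><body></body></html>
"""

signals = indexing_signals(page_two)
print(signals)
```

A page that reports a canonical but no 'noindex' is only giving Google a hint about the preferred URL, which matches the behavior described above.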

        Google chooses whether or not to index these pages, and in many situations you want them indexed. For example: a user searches for 'product X', and product X resides on the third page of your category. Since Google has this page indexed (although the canonical points to the main page), it makes sense to show the page that contains the product the user was searching for.

        "Example #4 shouldn't be in the index, because it's just a source code that, again, doesn't change the page, and the canonical points to #1."

        To make sure it is not indexed, you would need to add a 'noindex' tag and/or make sure the parameters are set in GWMT to ignore these pages.

        But again, if the canonical is set properly, then the authority passes to the main page, and having this page indexed may not have a negative impact.

        "Example #5 shouldn't be in the index because it's excluded in parameters as not affecting page content and the canonical is in place."

        How long ago was the parameter setting applied in GWMT? Sometimes it takes a couple of weeks to deindex pages that were already indexed by Google.

        • 94501 @Ray-pp

          Hi Ray,

          Thanks for the response. To answer your question, the URL parameters have been set for months, if not years.

          I wouldn't know how to set a noindex on a URL with a different source code, because it really isn't a whole new URL, just different tracking. I'd be setting a noindex on the Example #1 page itself, and that would not be good.

          So, should I just not worry about it then?

          Thanks... Darcy

          • Everett @94501

            Darcy,

            Blocking URLs in the robots.txt file will not remove them from the index if Google has already found them, nor will it prevent them from being added if Google finds links to them, such as internal navigation links or external backlinks. If this is your issue, you'll probably see something like this in the SERPs for those pages:

            "We cannot display the content because our crawlers are being blocked by this site's robots.txt file" or something like that.

            Here's a good discussion about it on WMW.
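To illustrate the point about robots.txt, here is a small sketch using Python's standard `urllib.robotparser`. The rules are hypothetical and only reuse a path from the examples in this thread. What the parser answers is whether a crawler may fetch a URL; it says nothing about whether that URL is, or stays, in the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the path is borrowed from the thread's examples.
rules = """
User-agent: *
Disallow: /cool-stuff
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

blocked = "http://www.mozquestionexample.com/cool-stuff?page=2"
allowed = "http://www.mozquestionexample.com/other-page"

# can_fetch() is a crawl permission check only. A blocked URL can still be
# listed in the index if it was found earlier or is linked from elsewhere.
print("may crawl blocked URL:", parser.can_fetch("Googlebot", blocked))
print("may crawl allowed URL:", parser.can_fetch("Googlebot", allowed))
```

Note that a blocked page also cannot be re-crawled, so a crawler never gets to see a canonical or 'noindex' tag placed on it.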

            If you have parameters set up in GWT and are using a rel canonical tag that points Google to the non-parameter version of the URL, you probably don't need to block Googlebot. I would only block them if I thought crawl budget was an issue, for example if your log files show Google continuing to crawl these pages, or if you potentially have millions of these types of pages.
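As a rough sketch of what "the non-parameter version of the URL" means in practice, the function below strips a set of ignored parameters from a URL to produce the address a canonical tag would point to. The parameter list is an assumption taken from the examples in this thread, and `canonical_url` is a made-up helper name, not part of any Google or Moz tool.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed list of parameters that the site treats as not changing the page,
# based on the example URLs in this thread.
IGNORED_PARAMS = {"page", "mq_source", "type", "sort"}


def canonical_url(url, ignored=IGNORED_PARAMS):
    """Drop ignored query parameters (and any fragment) from a URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


indexed = [
    "http://www.mozquestionexample.com/cool-stuff",
    "http://www.mozquestionexample.com/cool-stuff?page=2",
]

for url in indexed:
    print(url, "->", canonical_url(url))
```

Running a full list of indexed URLs through a function like this makes it easy to see which ones collapse to the same canonical address.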

            © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.