The Moz Q&A Forum

    Duplicate without user-selected canonical excluded

    Intermediate & Advanced SEO
    • dailynaukri

      We have PDF files uploaded to the WordPress media library and used on our website. Because these PDFs duplicate content from the original publishers, we have marked the links to these PDF URLs as nofollow. The PDF URLs are also disallowed in robots.txt.

      Now Google Search Console shows these URLs as Excluded, with the status "Duplicate without user-selected canonical".

      As far as we can tell, we cannot put a canonical tag in a PDF to point to the original PDF source.

      If we embed a PDF viewer in our website and fetch the PDFs by passing the URLs of the original publisher, would Google still read the PDFs as text and create a duplicate content issue again? Also, when a PDF expires and is removed by the publisher, the viewer would hit a 404 error.

      If we send our users to the third-party website instead, that would add to our bounce rate.

      What is the appropriate way to handle duplicate PDFs?

      Thanks

      • Dalerio-Consulting

        If the PDFs are duplicated within your own site, the best solution is to link to a single copy of each document from every page that uses it. Then you can delete the duplicate copies and 301 redirect them to the original.

        If the PDFs are duplicates from another site, then disallowing them in robots.txt should stop them from being marked as duplicates, as the crawler will not be able to access them at all. It will just take some time for Google Search Console to update.

        If, however, you want to add canonical tags to the PDF documents (or other non-HTML documents), you can add them to the HTTP header through the .htaccess file. You can find a tutorial on how to do that in this article.

        Daniel Rika - Dalerio Consulting
        https://dalerioconsulting.com/
        info@dalerioconsulting.com
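
A rough sketch of the HTTP-header approach mentioned above, for illustration only. It builds the `Link: <...>; rel="canonical"` header value and renders an Apache `.htaccess` stanza for a single file. The file name and publisher URL are hypothetical placeholders, and the stanza assumes an Apache server with mod_headers enabled. Note also that a crawler can only read this header if it is allowed to fetch the URL, so it is unlikely to help while the PDF is disallowed in robots.txt.

```python
def canonical_link_header(canonical_url: str) -> str:
    """Return the value of an HTTP Link header declaring a canonical URL."""
    return f'<{canonical_url}>; rel="canonical"'


def htaccess_rule(filename: str, canonical_url: str) -> str:
    """Render an Apache .htaccess stanza (assumes mod_headers) for one file."""
    # Inner double quotes must be escaped inside the quoted directive value.
    value = canonical_link_header(canonical_url).replace('"', '\\"')
    return (
        f'<Files "{filename}">\n'
        f'  Header add Link "{value}"\n'
        f'</Files>'
    )


# Hypothetical example: our copy is notice.pdf, the original lives elsewhere.
print(htaccess_rule("notice.pdf", "https://publisher.example/docs/notice.pdf"))
```

The printed stanza can be pasted into `.htaccess`; one stanza per PDF keeps each file pointed at its own original.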

        • dailynaukri

          Hello Daniel,

          The PDFs are duplicates from another site.

          The thing is that we have already disallowed the PDFs in the robots.txt file.

          Here is what happened: we have a set of pages (let's call them content pages) which we had disallowed in robots.txt because they had thin content. Those pages link to their respective third-party PDFs, and those links are marked as nofollow. The PDFs are also disallowed in robots.txt.

          A few days ago, we improved our content pages and removed them from robots.txt so that they can be indexed. The PDFs are still disallowed. Despite being disallowed, the PDF URLs are now reported as "Duplicate without user-selected canonical."

          I hope that is clear. Any further insights, please?
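
The setup described above (content pages crawlable, PDFs disallowed) can be sanity-checked with a small sketch using Python's standard `urllib.robotparser`. The robots.txt rules and URLs below are hypothetical stand-ins, not the poster's actual configuration. Python's parser does simple path-prefix matching and does not implement Google's wildcard extensions, so this only approximates how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: PDFs live under one uploads folder that is disallowed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-content/uploads/pdfs/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


pdf_url = "https://example.com/wp-content/uploads/pdfs/notice.pdf"
page_url = "https://example.com/notices/sample-notice/"

print(is_crawlable(ROBOTS_TXT, pdf_url))   # False: the PDF path is disallowed
print(is_crawlable(ROBOTS_TXT, page_url))  # True: the content page is crawlable
```

A disallow rule only blocks crawling; it does not by itself tell you how a URL is reported in Search Console.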

          • Dalerio-Consulting (in reply to @dailynaukri)

            As the PDF pages are marked as duplicates and not the PDF files, you should check which page has the duplicate content, and take the needed measures (canonical tags or a 301 redirect) from the lower-ranking page to the higher-ranking page. Alternatively, you can edit the content so that it is no longer duplicate.

            If I had a link to the site and duplicate pages, I would be able to give you a more detailed response.

            Daniel Rika - Dalerio Consulting
            https://dalerioconsulting.com/
            info@dalerioconsulting.com

            • dailynaukri

              Sorry, I meant the PDF files only.

              • abtechgroup

                From what I have read, so much of the web is duplicate content that it really doesn't matter if the PDF is on other sites; let Google figure it out. (For example, every car dealer for a given brand has a PDF of the same model brochure on its site.) No big deal. Visitors will land on your site through other relevant searches, so the duplicate PDF doesn't matter. Just my take. Adrian

                © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.