The Moz Q&A Forum

    PDF on financial site that duplicates ~50% of site content

    Intermediate & Advanced SEO
    • danatanseo
      danatanseo last edited by

      As long as you have rel=canonical tags properly in place, you don't need to worry about the PDF causing duplicate content problems. That way, any original content should be picked up and any duplicate can be attributed to your existing Web pages. Hope that's helpful!

      Dana

      • 540SEO
        540SEO @danatanseo last edited by

        I'm not sure which page I would mark as canonical, since the PDF contains content from several different pages on the site. I don't think it's possible to assign different rel=canonical tags to separate portions of a PDF, is it?

        • danatanseo
          danatanseo @540SEO last edited by

          Hi Keith,

          I'm sorry, I should have clarified. The rel=canonical tags would be on your Web pages, not the PDF (they are irrelevant in a PDF document). Then Google will attribute your Web page as the original source of the content and will understand that the PDF just contains bits of content from those pages. In this instance I would include a rel=canonical tag on every page of your site, just to cover your bases. Hope that helps!

          Dana

          • 540SEO
            540SEO @540SEO last edited by

            I thought the idea was to put rel=canonical on the duplicated page, to signal that "hey, this page may look like duplicate content, but please refer to this canonical URL"?

            It looks like there is a rel=canonical option for PDFs. I guess the question is: which page on the site should be the canonical one?

            http://support.google.com/webmasters/bin/answer.py?hl=en&answer=139394

            Indicate the canonical version of a URL by responding with the Link rel="canonical" HTTP header. Adding rel="canonical" to the head section of a page is useful for HTML content, but it can't be used for PDFs and other file types indexed by Google Web Search. In these cases you can indicate a canonical URL by responding with the Link rel="canonical" HTTP header, like this (note that to use this option, you'll need to be able to configure your server):

            Link: <http://www.example.com/downloads/white-paper.pdf>; rel="canonical"
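For anyone who wants to confirm that a server really sends the header quoted above, here is a minimal Python sketch. The function names `canonical_link_header` and `canonical_from_link_header` are made up for illustration and are not part of any Google or Moz tool. It builds the header value in the quoted format and pulls the canonical URL back out of a Link header string. The parsing is deliberately simple and assumes a well-formed header.

```python
import re
from typing import Optional, Tuple


def canonical_link_header(canonical_url: str) -> Tuple[str, str]:
    """Return the (name, value) pair for a rel="canonical" Link header."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


def canonical_from_link_header(value: str) -> Optional[str]:
    """Return the canonical URL found in a Link header value, or None."""
    # A Link header may list several links separated by commas, so
    # look at each <url>; param; param group in turn.
    for match in re.finditer(r'<([^>]*)>\s*((?:;\s*[^,;]+)*)', value):
        url, params = match.group(1), match.group(2)
        # Accept rel=canonical, rel="canonical", or a quoted list
        # of relation types that includes canonical.
        if re.search(r'rel\s*=\s*"?[^";]*\bcanonical\b', params, re.IGNORECASE):
            return url
    return None


if __name__ == "__main__":
    name, value = canonical_link_header(
        "http://www.example.com/downloads/white-paper.pdf"
    )
    print(f"{name}: {value}")
    print(canonical_from_link_header(value))
```

To check a live file, you could pass the `Link` value from a HEAD response for the PDF into `canonical_from_link_header` and compare the result with the page you intended.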

            • Valarlf
              Valarlf last edited by

              I think the right approach here is to send rel=canonical in the HTTP header for the PDF: http://googlewebmastercentral.blogspot.com/2011/06/supporting-relcanonical-http-headers.html

              • Valarlf
                Valarlf @540SEO last edited by

                If you are using Apache, you can put it in your .htaccess file in this form:

                <FilesMatch "my-file.pdf">
                Header set Link '<http://misite/my-file.html>; rel="canonical"'
                </FilesMatch>

                • 540SEO
                  540SEO @540SEO last edited by

                  Thanks. Anybody want to weigh in on where to rel=canonical to? Home page?

                  • Valarlf
                    Valarlf @540SEO last edited by

                    Personally, I think it would be better not to have it indexed at all, but if it must be indexed, the site root (the home page) seems like a good option.

                    • dmccarthy
                      dmccarthy @540SEO last edited by

                      You could send a noindex HTTP header (X-Robots-Tag: noindex) for the PDF rather than rel=canonical.
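If you go the noindex route, the usual way to express it for a non-HTML file is the `X-Robots-Tag` response header. Below is a small Python sketch for checking whether a set of response headers asks search engines not to index the file. The function name `is_noindexed` is hypothetical. It treats `none` as implying `noindex`, and it does not handle values that carry a user-agent prefix such as `googlebot: noindex`.

```python
from typing import Mapping


def is_noindexed(headers: Mapping[str, str]) -> bool:
    """Return True if the headers contain an X-Robots-Tag noindex directive."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() != "x-robots-tag":
            continue
        # Directives are comma-separated, e.g. "noindex, nofollow".
        directives = [d.strip().lower() for d in value.split(",")]
        if "noindex" in directives or "none" in directives:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical headers for a PDF response.
    print(is_noindexed({"Content-Type": "application/pdf",
                        "X-Robots-Tag": "noindex"}))
```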

                      • EGOL
                        EGOL @540SEO last edited by

                        This is what we have done with PDFs: assign rel="canonical" in .htaccess.

                        We did this with a few hundred files, and it took Google a LONG time to find and credit them.
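With a few hundred files, writing the .htaccess entries by hand gets tedious. Here is a rough Python sketch that generates one block per PDF from a mapping of file name to canonical URL. The function name `htaccess_canonical_blocks` and the example file names and URLs are made up for illustration. It assumes Apache with mod_headers enabled, and it uses `<Files>` with a literal file name so no regex escaping is needed.

```python
from typing import Mapping

# One block per PDF; <Files> matches a literal file name.
BLOCK_TEMPLATE = """<Files "{name}">
    Header set Link '<{url}>; rel="canonical"'
</Files>"""


def htaccess_canonical_blocks(mapping: Mapping[str, str]) -> str:
    """Build .htaccess text that sets a canonical Link header per PDF."""
    blocks = [
        BLOCK_TEMPLATE.format(name=pdf_name, url=canonical_url)
        # Sort by file name so the output is stable between runs.
        for pdf_name, canonical_url in sorted(mapping.items())
    ]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical PDF names and the pages they should be credited to.
    print(htaccess_canonical_blocks({
        "white-paper.pdf": "http://www.example.com/white-paper.html",
        "annual-report.pdf": "http://www.example.com/annual-report.html",
    }))
```

Paste the output into the .htaccess file for the directory that holds the PDFs, then spot-check a few responses to make sure the header is actually being sent.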
