The Moz Q&A Forum

    Thinking about not indexing PDFs on a product page

    • Bio-RadAbs

      Our product pages generate a PDF version of the page in a different layout. This is done for two reasons: it's the standard across similar industries, and it helps customers print the page when working with the product.

      So there is a use for the customer, but what about search? I've thought about this a lot, and my thinking is: why index the PDF at all? Only allow the HTML page to be indexed. The PDF files are on a subdomain, so I can easily noindex them. The way I see it, I'm reducing duplicate content.

      On the flip side, the PDF is hosted on a subdomain, so the PDF appearing when an HTML page doesn't is another way of gaining real estate. If it appears alongside the HTML page, that's more real estate coverage.

      Has anyone else done this? My knowledge tells me this could be a good thing; it might even stop backlinks from being generated to the PDF and lead to more backlinks to the HTML page.

      Can PDFs exist solely as a form of data that is accessible once you are on the page and not relevant to search engines? I find them a bane when they are on a subdomain.
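For readers weighing the noindex option described above, here is a minimal sketch of the idea, assuming the PDFs are served with an `X-Robots-Tag: noindex` response header. The host name `static.domain.com` is borrowed from later in the thread, and the function name `extra_headers` is purely illustrative; a real setup would usually do this in the web server configuration rather than in Python.

```python
from urllib.parse import urlparse

# Assumed host name for the PDF subdomain (taken from the thread's example).
PDF_HOST = "static.domain.com"


def extra_headers(url: str) -> dict[str, str]:
    """Return extra response headers to send for a given URL.

    PDFs on the PDF subdomain get "X-Robots-Tag: noindex", which asks
    search engines not to index the file. Every other URL gets nothing extra.
    """
    parts = urlparse(url)
    is_pdf = parts.path.lower().endswith(".pdf")
    if parts.hostname == PDF_HOST and is_pdf:
        return {"X-Robots-Tag": "noindex"}
    return {}


if __name__ == "__main__":
    print(extra_headers("https://static.domain.com/docs/product-123.pdf"))
    print(extra_headers("https://www.domain.com/product-123"))
```

Note that a crawler has to be able to fetch the PDF to see this header, so the files should not also be blocked in robots.txt.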

      • ColeLusby

        "I'm reducing duplicate content" - Google cannot crawl PDFs, but it does index them and show them in search results.

        So let me ask you - why do you not want them indexed?

        I say let them be indexed.

        Cole

        • HashtagHustler

          Morning,

          To my knowledge, Google isn't able to open a PDF. You could always present users with the option of downloading a PDF. Any tech website I have been to generally offers it as a download or opens it in another window.

          I don't know why it would automatically present a PDF, although, I probably don't work in the same industry! Ha!

          The other question I have is: are you getting duplicate content warnings? Are the PDFs currently being indexed? If so, how well are they being indexed? Google can read an open PDF, or a PDF that automatically displays, but some are easier to read than others depending on the settings of the PDF.

          http://www.searchenginejournal.com/8-tips-to-make-your-pdf-page-seo-friendly-by/59975/

          Another option is to use rel=canonical tags?

          Hope this helps!

            • Andy.Drinkwater

              "The way I see it, I'm reducing duplicate content."

              Anything you can do that helps with this is a good move - nothing wrong with a little tidying up.

              "the PDF appearing when a HTML page doesn't, is another way of gaining real estate"

              Do you currently have this happen? PDFs can actually outrank HTML pages on occasion - they aren't Google's preferred media type, but like any page, it's all about the content.

              -Andy

              • Bio-RadAbs @ColeLusby

                Thanks for the replies

                Cole - Google indexed our PDFs though. I tested this by searching for "site:domain.com search term" and then "site:static.domain.com search term".

                Result:

                site:static.domain.com search term

                Google showed me the PDF document that is available for download from the HTML page that ranks highly for that search term.
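The test described above can be repeated for several hosts and terms by building the queries programmatically. This is only a sketch: `site_query` and `search_url` are hypothetical helper names, the hosts are the placeholders used in the post, and it simply assembles a search URL to open in a browser.

```python
from urllib.parse import urlencode


def site_query(host: str, term: str) -> str:
    """Build a "site:" query restricted to one host, e.g. the PDF subdomain."""
    return f"site:{host} {term}"


def search_url(host: str, term: str) -> str:
    """Build a Google search URL for the site-restricted query."""
    return "https://www.google.com/search?" + urlencode({"q": site_query(host, term)})


if __name__ == "__main__":
    # Compare what is indexed on the main domain and on the PDF subdomain.
    for host in ("domain.com", "static.domain.com"):
        print(search_url(host, "search term"))
```

Running the same term against both hosts shows whether the PDF and the HTML page are each indexed for it.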

                So Google is indexing both the PDF and the HTML. To answer your question as to why I don't want them indexed: my thinking was that if the PDF appears and someone links to it, I'd rather that backlink went to the HTML page. The PDFs are hosted on my subdomain, and I don't want the subdomain to get the rank. In the back of my head, I'm also wondering whether my PDF and HTML are competing with each other.

                • Bio-RadAbs @HashtagHustler

                  Yeah, we offer the same. The user is able to download the PDF or have it open in a new window. I haven't seen Google automatically present my PDF, and so far my searches have shown my HTML page, but my question to Cole remains: could Google be comparing the PDF and HTML page with each other? What if, for some search, it prefers to show the PDF higher than the HTML page?

                  On your next question, I don't get duplicate content warnings for the PDFs. I believe the PDFs are indeed being indexed, as the text is readable. How well are they being indexed? I've got close to 22,000 search results for my subdomain, so yeah, they are indexed.

                  I do have rel=canonical tags on the HTML page, but I can't put one on the PDF, as it's a file and not a page.

                  • Bio-RadAbs @Andy.Drinkwater

                    I don't see my PDFs show up for a search term when my HTML pages are being displayed.

                    However, there was a situation where a PDF was displayed, so I created an HTML page from it and set up redirects from the PDF to the HTML page. I followed that up by re-uploading the PDF at a new URL and offering it for download. That way I transferred the rank juice to the HTML page.
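The redirect step described above can be sketched as a simple lookup, assuming a permanent (301) redirect was used; the post does not say which status code it was. The paths, URLs, and the `resolve` function are hypothetical placeholders for illustration only.

```python
# Hypothetical mapping from retired PDF paths to the HTML pages that replaced them.
REDIRECTS = {
    "/pdfs/product-123.pdf": "https://www.domain.com/products/product-123",
}


def resolve(path: str) -> tuple[int, dict[str, str]]:
    """Return (status code, headers) for a requested path.

    Old PDF paths answer with a 301 and a Location header pointing at the
    HTML page, so links to the old PDF URL lead to the HTML page instead.
    Any other path is served normally (200, no extra headers).
    """
    target = REDIRECTS.get(path)
    if target is None:
        return 200, {}
    return 301, {"Location": target}


if __name__ == "__main__":
    print(resolve("/pdfs/product-123.pdf"))
    print(resolve("/downloads/product-123-v2.pdf"))
```

The re-uploaded PDF lives at a path that is not in the mapping, so it is still served as a normal download.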

                    In a nutshell, no, I don't see my PDFs outranking my HTML pages, but I do know my PDFs are indexed, and I don't know if they show up for a different search term.

                    I guess my main question is: would not indexing them open up the chance for more backlinks to the HTML page rather than the PDF? Google also wouldn't have to decide which to display, the HTML page or the PDF, given that both have the same content.

                    Maybe I'm overthinking, and the straight answer is that if an HTML page exists, Google won't give preference to the PDF, but if there is no HTML page, the PDF is shown.

                    • EGOL

                      If you link to a PDF, some of your power flows into it. If someone else links to a PDF, some of his power flows into it.

                      PDFs accumulate backlinks and accumulate PageRank. You should assign these valuable assets to real pages.

                      So, if you have PDFs that are duplicates of webpages, then you should use rel=canonical via htaccess to attribute them to their matching webpages. If you don't do that, then your assets are being squandered.
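Because a PDF has no HTML head to hold a tag, the canonical has to travel as an HTTP `Link` response header, which is what an htaccess rule would set. The sketch below only shows how the header value could be derived for each PDF. It assumes, hypothetically, that a PDF at `static.domain.com/.../name.pdf` matches an HTML page at `https://www.domain.com/products/name`; the real URL scheme is not given in the thread, and `canonical_link_header` is an illustrative name.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed hosts and URL layout; the thread does not describe the real scheme.
PDF_HOST = "static.domain.com"
HTML_BASE = "https://www.domain.com/products/"


def canonical_link_header(pdf_url: str) -> Optional[str]:
    """Return the Link header value that points a PDF at its HTML page.

    Returns None for URLs that are not PDFs on the PDF subdomain.
    """
    parts = urlparse(pdf_url)
    if parts.hostname != PDF_HOST or not parts.path.lower().endswith(".pdf"):
        return None
    # Use the file name without its ".pdf" extension as the page slug.
    slug = parts.path.rsplit("/", 1)[-1][:-4]
    return f'<{HTML_BASE}{slug}>; rel="canonical"'


if __name__ == "__main__":
    value = canonical_link_header("https://static.domain.com/pdfs/product-123.pdf")
    print(f"Link: {value}")
```

The server would then send the returned value in a `Link:` header on the PDF response, so the PDF stays downloadable while ranking signals are attributed to the HTML page.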

                      • Bio-RadAbs @EGOL

                        Thanks EGOL, I didn't think about using rel=canonical in htaccess. Great idea.
