The Moz Q&A Forum


    Does google scrape links from PDF files? do these links pass link juice?

    • adriandg

      Title is pretty much the whole question.

      • oznappies

        Have a look at this article: http://searchenginewatch.com/article/2067225/Google-Does-PDF-Other-Changes It explains some of the doc library search for PDF files. Google's own statement is here: http://googleblog.blogspot.com/2008/10/picture-of-thousand-words.html

        • MarieHaynes

          Google definitely does index the contents of pdf files.  I found this out the hard way as I had a real estate pdf on my site that I wanted to have listed in the index, but I didn't know that the contents would be crawled.  The pdf contained some listings that I was not legally allowed to advertise on my site.  (It was legal for me to give someone a report with the listings in it though).

          When another realtor was searching for their own listing, my pdf came up.  I got in trouble.  I'm ok now though.  🙂

          • adriandg, replying to @oznappies

            Hmmm, although I thought you had answered my question, I actually feel that you have not. Yes, the links you provided state that Google scrapes PDFs and even OCRs them to get a better idea of what is in them, but I don't see anywhere that they mention crawling the URLs they find in these PDF documents.

            • adriandg, replying to @MarieHaynes

              Yes, but do they crawl the links they find in these documents, or do they just index their contents?

              • adriandg

                This person seems to think no: http://www.google.fr/support/forum/p/Webmasters/thread?tid=14c5fe970fe84361&hl=en

                But I'm not sure how much I can trust a random comment from a random source. Any evidence for either argument?

                EDIT: And this person seems to think they do pass link juice: http://www.whydowork.com/blog/link-building/274/

                Could a mod remove the "marked as answered" status? I don't think I am able to remove it, and the question isn't really answered.

                • oznappies

                  Yes, it does, according to Google's tech spec: http://code.google.com/apis/searchappliance/documentation/50/admin_crawl/Introduction.html

                  It specifically states: 'It follows HTML links in PDF files, Word documents, and Shockwave documents'. Google's own API docs carry more weight than a comment in a forum. If they are licensing this out as an application, it would suggest that the same technology is available in the main engine, as does Dunamis's comment about a listing in a PDF document being found in search results.

                  You can test for yourself by publishing a PDF with a link to an info page that is not linked from anywhere else. Include the PDF in your sitemap but not the test page, and check whether the test page shows up in Google's index (site:yoursite.com) the next time it crawls.
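As a rough illustration of the sitemap step in that test, here is a minimal Python sketch that lists the PDF but leaves the test page out. The URLs and the `build_sitemap` helper are made up for the example (they are not from any Google or Moz tool), and this only covers generating the file, not checking the index afterwards.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls, excluded):
    """Return sitemap XML listing every URL in `urls` that is not in `excluded`."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        if url in excluded:
            continue  # deliberately keep the test page out of the sitemap
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical URLs: the PDF that contains the link, and the page it links to.
PDF_URL = "https://yoursite.com/docs/link-test.pdf"
TEST_PAGE_URL = "https://yoursite.com/pdf-link-test-page.html"

sitemap_xml = build_sitemap(
    urls=["https://yoursite.com/", PDF_URL, TEST_PAGE_URL],
    excluded={TEST_PAGE_URL},
)
print(sitemap_xml)
```

If the test page later appears under site:yoursite.com, the link in the PDF is the most likely way it was discovered, assuming nothing else links to it.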

                  This interview with Matt Cutts also gives some insight: http://www.stonetemple.com/articles/interview-matt-cutts-012510.shtml

                  Eric Enge:  What about PDF files?

                  Matt Cutts: We absolutely do process PDF files.  I am not going to talk about whether links in PDF files pass PageRank.  But, a good way to think about PDFs is that they are kind of like Flash in that they aren't a file format that's inherent and native to the web, but they can be very useful.  In the same way that we try to find useful content within a Flash file, we try to find the useful content within a PDF file.  At the same time, users don't always like being sent to a PDF. If you can make your content in a Web-Native format, such as pure HTML, that's often a little more useful to users than just a pure PDF file.

                  • Cristina.Andrei

                    I ran a test, and it seems that yes, links from PDFs count for ranking.

                    The test is on my Romanian blog http://seogan.ro/link-building-pdf-urile-o-sursa-de-linkuri-test

                    You can find an English translation here: http://www.seogan.com/pdf-link-building

                    Hope it helps.

                    © 2021 - 2026 SEOMoz, Inc., a Ziff Davis company. All rights reserved. Moz is a registered trademark of SEOMoz, Inc.