Convert keyword-rich PDFs to web pages (text & images)
-
SteriPEN is a portable water purifier that kills viruses, protozoa, E. coli, etc.
Because of the product's technical and safety requirements, our website has a lot of documentation on testing, organisms affected, and more. These documents are in PDF form and can often be found through Google search (and through links on specific pages).
Because these documents are rich in keywords about the microbes SteriPEN kills, does it make sense to convert the PDFs into HTML text and images?
I was then thinking of writing a blog post AND adding key links to these documents (as HTML) on important landing pages.
Would removing the PDFs be harmful? I don't have a clue as to the cost/benefit.
-
Google can read PDFs and returns them in search results, but some users might prefer to view an HTML version. Also, it looks like images in PDFs are not indexed, according to the second link below.
Regarding duplicate content, Google says (second link below):
Q: Is it considered duplicate content if I have a copy of my pages in both HTML and PDF?
A: Whenever possible, we recommend serving a single copy of your content. If this isn’t possible, make sure you indicate your preferred version by, for example, including the preferred URL in your Sitemap or by specifying the canonical version in the HTML or in the HTTP headers of the PDF resource. For more tips, read our Help Center article about canonicalization.
These will be of interest to you:
http://www.google.com/support/forum/p/Webmasters/thread?tid=4472512a5515686b&hl=en&fid=4472512a5515686b00047d6de91c24fa&hltp=2
http://googlewebmastercentral.blogspot.com/2011/09/pdfs-in-google-search-results.html
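If you keep both the PDF and an HTML version, the "HTTP headers of the PDF resource" option in Google's answer means sending a `Link` header with `rel="canonical"` along with the PDF. Here is a rough Python sketch of building such headers. The `CANONICAL_HTML` mapping, the file paths, and the example.com URLs are made up for illustration; how you actually attach the headers depends on your web server or framework.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from PDF paths on the site to the preferred HTML version.
CANONICAL_HTML: Dict[str, str] = {
    "/docs/microbiological-testing.pdf":
        "https://www.example.com/docs/microbiological-testing.html",
}


def pdf_response_headers(pdf_path: str) -> List[Tuple[str, str]]:
    """Return the response headers to send with a PDF.

    If an HTML version is registered for this PDF, add a Link header
    that names the HTML page as the canonical URL.
    """
    headers = [("Content-Type", "application/pdf")]
    html_url = CANONICAL_HTML.get(pdf_path)
    if html_url is not None:
        headers.append(("Link", f'<{html_url}>; rel="canonical"'))
    return headers


if __name__ == "__main__":
    for name, value in pdf_response_headers("/docs/microbiological-testing.pdf"):
        print(f"{name}: {value}")
```

PDFs without an HTML counterpart get no `Link` header here, so they stay eligible to appear in search results on their own.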