Duplicate pages or note? Variations just due to language changes?
-
I have some pages marked as duplicates, so I want to do what I can to solve the issues concerned.
One issue concerns duplicates where the page content is indeed the same except for the language that the content is offered in.
The URL for example of the documentation page of the site, in English is as follows:
http://www.domain.com/support/documentationWe then have the same content in German, French, Russian using the following URLs.
http://www.domain.com/de/support/documentation
http://www.domain.com/fr/support/documentation
http://www.domain.com/ru/support/documentationEach page has links to PDFs which are all in fact in English so the links to the docs are the same. Moz is flagging up all these pages as being duplicate content (which it is when translated back into English, but is not if you just consider that they are using completely different languages!)
Has anyone any thoughts on how to solve this? Or is this something not to worry about / disregard?
Many thanks
Simon
-
Hi Simon,
Okay so crawlers can crawl PDF's unless they are encrypted / encoded. However since they link to the PDF that shouldn't be the issue.Ref: googleblog
How much content are on these pages? I ask because when there is thin content you may find that the template itself is causing the duplication problem, unless of course you are using different templates for each language as well.
Take for example a page that reads.
en: The woman eats frozen fruit daily.
de: Die Frau isst gefrorenes Gemüse jegen tag.
es: La mujer come las verduras congeladas diariaNow surround each of those pages with a header content, footer content, right / left column content same images same alt tags and the deviation of content is so small it is not noticed.
-
Hi Simon. Don has given you some good guidance. Here's a recent Moz Dev Blog post on the subject: https://moz.com/devblog/near-duplicate-detection/. Note their images explaining much of what Don described. Two pages having enough shared phrases (because of the header, footer, nav, etc) can trigger the duplicate warning. While the latter part of the dev blog post certainly gets technical, it should explain why you might be getting duplicate content warnings even further if that's your bent.
Since each tool is a bit different you can also check your pages with other tools, such as: http://www.webconfs.com. Cheers!
-
This post is deleted! -
Don - thank you so much for responding
I think you may well have identified the issue too! The documentation page is the same in each case - has a table with e.g. 4 columns and 20 or so rows - and I guess much of the structural content of the pages, irrespective of the linguistic variations of the text shown on screen, is the same.
Will look closer at this.
Assuming this is the case - the next logical questions would be: Does it matter in terms of SEO? Or is it a kind of 'false positive' which can be noted but ignored? What could I do about it anyway? I guess the answer is implied in your answer above: change the template for each language?
Allied to this, is the fact that since the site is 'growing' with multiple language versions, the problem seen with this sample page will potentially be replicated all over the site. Again, the big question is about the effect on SEO. Web pages are scoring well for brand terms and other important words, and while there are new phrases and words to focus on, I am unsure whether correcting these is to prevent strict penalties or simply to make already-decent rankings as good as they can be.
Thanks in advance for any further points you care to make.
-
Ryan - thank you too for taking the time to respond.
Had a quick peek at the blog you noted - going to go back and read it v-e-r-y slowly!
Ditto, many thanks re the webconfs link - plenty of fun tools to try out there. Am sure it will all deepen my learning / confusion!Thanks again!