Duplicate page content
-
Hi guys, the feedback from my campaign suggests I have too much duplicate page content. I’ve had a look at the CSV file, but it isn’t clear which pages on my site have the duplicate content. Can anyone tell me which columns I need to refer to on the sheet to find this information?
Also, if the content is only slightly different, will Google still consider it to be duplicate? I look forward to hearing from you.
-
I really don't know which columns in the CSV hold that, but you can see those pages on the campaigns page.
Anyway, if the content is only slightly different, you could consider using noindex on the pages that could cause conflicts. For example, this happens quite often on blogs, where duplicate content gets reported on tag/category pages. In those cases it's considered good practice to noindex the tag pages.
Just my 2 cents

-
When you download your crawl diagnostics as a CSV, column A is "URL", column L is the true/false column for "Duplicate Page Content", and column AF ("duplicate_page_content") contains the URLs of the duplicates of the URL in column A.
To look at duplicate content, I sort by column L, delete all of the false rows (because they don't have duplicate content), then I delete all of the columns except column A (URL) and column AF (duplicate_page_content), save the spreadsheet as "yyyymmdd-duplicate-content" and work from that. (Easier to see what you are doing without all the other data in the way.)
Also note that column AF "duplicate_page_content" can have more than one URL in it if you have multiple versions of the same content. In this case I use Excel's "Text to Columns" function (under "Data" in the ribbon) to put each URL into its own column so I can deal with them individually.
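If you'd rather not do this by hand each time, the same filtering can be roughly sketched in Python. This is only a sketch under a few assumptions of mine: the columns sit at the positions described above (A, L and AF), the flag cell reads "true" in some capitalisation, and multiple URLs in column AF are separated by commas. Check your own export before trusting any of that. The function name `duplicate_pairs` is made up for the example.

```python
import csv

# Zero-based positions of the spreadsheet columns described above.
URL_COL = 0        # column A: the crawled URL
DUP_FLAG_COL = 11  # column L: true/false flag for duplicate page content
DUP_URLS_COL = 31  # column AF: URLs that duplicate the URL in column A


def duplicate_pairs(csv_file, separator=","):
    """Yield (url, [duplicate URLs]) for every row flagged as a duplicate.

    `separator` is an assumption: it is the character that splits
    multiple URLs inside the column AF cell.
    """
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) <= DUP_URLS_COL:
            continue  # row is too short to contain column AF
        if row[DUP_FLAG_COL].strip().lower() != "true":
            continue  # same as deleting the "false" rows
        duplicates = [u.strip() for u in row[DUP_URLS_COL].split(separator) if u.strip()]
        yield row[URL_COL], duplicates
```

To use it, open your export with `open("your-export.csv", newline="")` (the file name is a placeholder), pass the file object to `duplicate_pairs`, and loop over the results to print each URL next to its duplicates.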
And yes, if there are just small differences Google is likely to see pages as duplicates.