Duplicate content pages
-
Crawl Diagnostics Summary shows around 15,000 duplicate content errors for one of my projects. It lists each page along with how many duplicate pages exist for it, but I don't have a way of seeing the actual duplicate page URLs for a specific page without clicking on each page link and checking manually, which will take forever.
When I export the list as a CSV, the duplicate_page_content column doesn't show any data.
Can anyone please advise on this?
Thanks
-
If the duplicates are attached to specific query strings (after the URL it looks like this: /?alwer.ei.we), you can block the string(s) in your robots.txt file.
Let's say there are 100 duplicates that start with "/?osifos.sdjvnksdj": block out "?osifos" in your robots.txt.
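To illustrate the suggestion above, here is a minimal robots.txt sketch. The "osifos" prefix is taken from the hypothetical example in this thread, not from a real site, and note that the `*` wildcard is an extension honored by Googlebot and most major crawlers rather than part of the original robots.txt standard:

```
User-agent: *
# Block any URL whose query string begins with "osifos",
# e.g. /?osifos.sdjvnksdj (hypothetical prefix from the example above)
Disallow: /*?osifos
```

Test the rule against a few of your duplicate URLs in Google Search Console's robots.txt tester before deploying it, since an overly broad pattern can block legitimate pages.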
-
Sorry if my English was not clear; it's not my first language. My issue is that I can't get the list of duplicate URLs for my site...
-
Hey there!
Thanks for writing in.
I downloaded the CSV from your Travel Pack campaign. It looks like all of the duplicate content pages are in the CSV that I exported; I found them by sorting the rows in Excel. Here is a good guide on how to get started with sorting in Excel: http://office.microsoft.com/en-us/excel-help/sort-data-in-a-range-or-table-HP010073947.aspx
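If Excel is unwieldy at 15,000 rows, the same sort-and-filter step can be scripted. This is a minimal sketch using Python's standard csv module; the column names ("url", "duplicate_page_content") and the "|"-separated format of the duplicates column are assumptions based on this thread, so check them against your actual export:

```python
import csv
import io

# Hypothetical sample mimicking the exported crawl CSV. Replace the
# StringIO with open("crawl_export.csv") for a real file, and adjust
# the column names if your export differs.
sample = """url,duplicate_page_content
/page-c,/page-c?sort=asc
/page-b,
/page-a,/page-a?ref=1|/page-a?ref=2
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Keep only rows that actually list duplicates, then sort by URL --
# the same effect as filtering and sorting the column in Excel.
dupes = sorted(
    (r for r in rows if r["duplicate_page_content"]),
    key=lambda r: r["url"],
)

for r in dupes:
    # Split the assumed "|"-separated list into individual duplicate URLs.
    print(r["url"], "->", r["duplicate_page_content"].split("|"))
```

This prints each page next to its list of duplicate URLs, so you can scan the whole set at once instead of clicking through page by page.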
Thanks!
Nick