Duplicate Page Content
-
There are over 300 pages on our client's site with duplicate page content. Before we embark on a programming solution to this with canonical tags, our developers are requesting the list of originating sites/links/sources for these odd URLs.
How can we find a list of the originating URLs? If you can provide a list of originating sources, that would be helpful.
For example, the following pages are showing (as a sample) as duplicate content:
www.crittenton.com/Video/View.aspx?id=87&VideoID=11
www.crittenton.com/Video/View.aspx?id=87&VideoID=12
"How did you get all those duplicate URLs? I have tried to Google the "contact us", "news", and "video" pages. I didn't get all those duplicate pages. The page id=87 on most of the duplicate pages is not supposed to be there. I was wondering how the visitors got to all those duplicate pages. Please advise."
Note, the CMS does not create this type of hybrid URL. We are as curious as you are as to where/why/how these are being created. Thanks.
-
There are several common causes of duplicate pages:
1. www resolution (the domain with www and the domain without www): search engines see them as two websites and consider the content duplicate.
2. Parameters passed between pages with the GET method, e.g. "?pageId=12&userid=566". Here you can use canonical links, exclude the parameters in Webmaster Tools, or use robots.txt to block access to these pages.
3. Filters, search queries, combo boxes, and drop-down menus: these components use the GET method if the form does not specify one.
4. I looked at your site and you have many Telerik.Web.UI.RadComboBox controls there -> here is your problem. See how you manage their parameters, how the forms are validated, and how the page is loaded. In general, a canonical tag may be your solution :)
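
To make points 1 and 2 a bit more concrete, here is a rough sketch in Python of how a canonical URL could be derived from the variants. It is only an illustration, not how your CMS works: the preferred host, the http scheme, and the idea that VideoID is the only parameter that changes the content are all assumptions of mine (based on the note that id=87 is not supposed to be there). The function names are made up for the example, and the real site would do this in ASP.NET.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumptions for illustration only: which host is preferred and which
# query parameters actually change the page content.
PREFERRED_HOST = "www.crittenton.com"
BARE_HOST = "crittenton.com"
CONTENT_PARAMS = {"VideoID"}


def canonical_url(url):
    """Return a single canonical form for URL variants of the same page."""
    # Treat scheme-less URLs (as pasted in the question) as http.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)

    # Point 1: fold the non-www host into the preferred www host.
    host = parts.netloc.lower()
    if host == BARE_HOST:
        host = PREFERRED_HOST

    # Point 2: keep only parameters that select content, in a stable order.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key in CONTENT_PARAMS
    )
    return urlunsplit((parts.scheme, host, parts.path, urlencode(kept), ""))


def canonical_tag(url):
    """Build the <link rel="canonical"> element for the page's <head>."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s" />' % href


if __name__ == "__main__":
    print(canonical_tag("www.crittenton.com/Video/View.aspx?id=87&VideoID=11"))
    print(canonical_tag("http://crittenton.com/Video/View.aspx?VideoID=11&id=87"))
```

Both calls at the bottom should produce the same tag, which is the point: every variant of a page declares one URL, so search engines can fold the duplicates together. Note that VideoID=11 and VideoID=12 still get different canonical URLs here, since they are presumably different videos.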