Canonical Problems
-
Hi guys, there is so much info out there about canonical issues and 301 redirects that I'm not sure what to do about my problem.
Google Webmaster Tools says I have over 2000 duplicate page titles. Google is showing most of my pages in duplicate or triplicate URL format.
Example:
/store/LOVE_OIL_CARIBBEAN_ROSE/
/store/LOVE_OIL_CARIBBEAN_ROSE
/store/love_oil_caribbean_rose/
I'm using X-Cart Pro as my cart. When I look at the source code, I see each one has a rel=canonical tag with the exact URLs you see above. Can someone give me an example of a redirect that I can put in my .htaccess file that would work site wide? I obviously can't go through and 301 this on a page-by-page basis. It would take a year.
Thank you, Tim
-
Hi Tim.
A few suggestions:
1. ALWAYS use lower case in URLs. 100% always, never break this rule.
Many of us are used to working with Windows, where case does not matter in file names: the file system treats the upper-case and lower-case versions of a letter as equivalent. On a Linux server (which is what most sites run on), a lower-case and an upper-case letter are two distinct characters.
2. Decide how your web page URL structure will appear and be consistent. I prefer to use a trailing slash "/" to indicate a folder which contains additional pages, and no trailing slash to indicate a web page (i.e. you can't drill down any deeper). Note that mysite.com/page1 and mysite.com/page1/ are two distinct URLs.
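To make both points concrete, here is a minimal Python sketch (Python is just a convenient choice, not something used on the site) that groups URL paths differing only by letter case or a trailing slash. The function name `duplicate_groups` and the lower-case, no-slash grouping key are assumptions made for illustration.

```python
from collections import defaultdict

def duplicate_groups(paths):
    """Group URL paths that differ only by letter case or a trailing slash."""
    groups = defaultdict(list)
    for path in paths:
        # Hypothetical grouping key: lower case, no trailing slash.
        key = path.lower().rstrip("/") or "/"
        groups[key].append(path)
    # Keep only keys that more than one URL maps to.
    return {key: urls for key, urls in groups.items() if len(urls) > 1}

paths = [
    "/store/LOVE_OIL_CARIBBEAN_ROSE/",
    "/store/LOVE_OIL_CARIBBEAN_ROSE",
    "/store/love_oil_caribbean_rose/",
]
# Compared exactly, as a case-sensitive server does, these are three URLs.
assert len(set(paths)) == 3
# Ignoring case and the trailing slash, they are one page.
groups = duplicate_groups(paths)
```

Running something like this over a list of crawled URLs would show how many of the reported duplicates come from case and slash variants alone.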
Can someone give me an example of a redirect that I can put in my .htaccess file that would work site wide?
You are looking for two regular expressions, one per rewrite rule. I am not a regex expert, but one rule should remove the trailing slash from any web page URL and 301 it to its no-slash equivalent. The second, I believe, uses the NC flag so that the match disregards case.
Going forward, try to follow rules #1 and #2 when creating URLs.
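Before touching .htaccess, it may help to write down exactly what the two redirects are supposed to do. The Python sketch below does that; it is not a mod_rewrite rule. It assumes the canonical form is lower case with no trailing slash, which matches the suggestions above but may not be what X-Cart itself expects. The function name `redirect_target` is hypothetical, and query strings are ignored.

```python
def redirect_target(path):
    """Return the path to 301 to, or None if the path is already canonical."""
    # Case rule: the redirect target is the lower-case form of the path.
    canonical = path.lower()
    # Slash rule: drop a trailing slash, but leave the site root "/" alone.
    if len(canonical) > 1 and canonical.endswith("/"):
        canonical = canonical.rstrip("/") or "/"
    return None if canonical == path else canonical

# Tim's three example URLs should all end up at one address.
examples = [
    "/store/LOVE_OIL_CARIBBEAN_ROSE/",
    "/store/LOVE_OIL_CARIBBEAN_ROSE",
    "/store/love_oil_caribbean_rose/",
]
targets = [redirect_target(p) for p in examples]
```

A table of input and expected-target pairs like this can be checked against the live site after any rule change. One caveat on the Apache side: the NC flag only makes a pattern match regardless of case; it does not lower-case the URL. Lower-casing in Apache is commonly done with a `RewriteMap` based on the internal `tolower` function, and `RewriteMap` has to be declared in the server or virtual-host configuration, not in .htaccess, so the host may need to be involved for that part.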
-
Thanks Ryan. Funny, I never realized that about Windows vs. Linux. That is very helpful. Now I just need the proper regex expressions. We have tried several in the past and have taken the site down in the process. I think it's funny that with all of Google's technology they still can't determine that a page is a page.
-
I am sure you can find the expressions via a Google search, but even easier, you can ask your host to do it for you. Most small sites use managed hosting, and most hosts are willing to help. Give them a call or open a help ticket and ask.