What would cause these ⠃︲蝞韤諫䴴SPপ� emblems in my urls?
-
In Search Console I am getting errors under "Other". It is showing URLs that have this format:
https://www.site.com/Item/654321~SURE⠃︲蝞韤諫䴴SPপ�.htm
When clicked it shows 蝞韤諫䴴SPপ� instead of the % stuff.
As you can see, this is an item page, and the normal item page pulls up fine with no issues. Search Console doesn't show this URL as linked from anywhere. Why would Google pull this URL? It doesn't exist anywhere on the site. It is a custom ASP.NET site. This started happening in mid-May, but we didn't make any changes then.
-
Hello,
I believe that in URLs a % sign followed by two hexadecimal digits is translated into a single character. For instance, %20 is a space and %21 is a !. W3Schools has a guide here: http://www.w3schools.com/tags/ref_urlencode.asp.
I don't know why it would translate into Chinese characters, but that may give you a place to start your investigation.
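To illustrate the decoding described above, here is a minimal Python sketch using the standard library's urllib.parse module. The sample strings are made up for illustration and are not taken from the site in question.

```python
from urllib.parse import quote, unquote

# %20 and %21 decode to a space and an exclamation mark.
space = unquote("%20")
bang = unquote("%21")

# Non-ASCII characters are encoded as one %XX escape per UTF-8 byte,
# so a single character can expand into several escapes.
encoded = quote("é")

# Encoding and then decoding with the same character set returns the input.
round_trip = unquote(quote("蝞韤"))

print(repr(space), repr(bang), encoded, round_trip)
```

If the escapes in a URL are not valid UTF-8, or are decoded with a different character set than the one used to encode them, the result can be unexpected characters like the ones in the question.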
Hope this helps. Cheers,
Luke
-
What I want to know is why Google is finding these pages at all. I can normally look at "Linked from" and find any problems. But if these are not linked from anywhere, why is Google finding them?
-
It could be due to many reasons. One would need to know the domain in order to do further analysis on the issue.
-
They are percent-encoded URLs. For example, Google will turn every " into %22 and every space in the URL into %20. You can learn more about them here: http://www.w3schools.com/tags/ref_urlencode.asp . And here is a useful tool for encoding and decoding URLs: http://meyerweb.com/eric/tools/dencoder/ .
What you need to do is have the developer "escape out" or "rewrite" all non-alphanumeric characters in the URL. You'll also have to 301 redirect the old URLs to the new, search-engine-friendly ones, which should not contain characters that get automatically encoded, like parentheses, commas, tildes and plus signs.
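As a rough sketch of the rewrite step, assuming the cleaned URL segment should contain only ASCII letters and digits separated by hyphens: the function names clean_slug and redirect_target are hypothetical, and since the site is ASP.NET this Python only illustrates the logic, not the actual implementation.

```python
import re
from typing import Optional
from urllib.parse import unquote


def clean_slug(raw: str) -> str:
    """Decode percent-escapes, then replace every run of characters
    other than ASCII letters and digits with a single hyphen."""
    decoded = unquote(raw)
    slug = re.sub(r"[^A-Za-z0-9]+", "-", decoded)
    return slug.strip("-")


def redirect_target(old_slug: str) -> Optional[str]:
    """Return the cleaned slug to 301 redirect to, or None if the
    old slug is already clean and needs no redirect."""
    new_slug = clean_slug(old_slug)
    return None if new_slug == old_slug else new_slug


print(redirect_target("654321~SURE"))
print(redirect_target("Blue%20Widget%20(Large),%20v2+"))
print(redirect_target("654321-SURE"))
```

The server would then answer a request for the old segment with a 301 pointing at the cleaned one, and serve the page directly when redirect_target returns None.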