SEOMOZ Crawler Unicode bug
-
For the last couple of weeks, the SEOMOZ crawler has been crawling only my homepage and getting 4xx errors for most of the URLs.
The crawler has no issues with English URLs, only with the Unicode (Hebrew) ones.
This is what I see in the CSV export for the crawl (one sample):
http://www.funstuff.co.il/׳ž׳¡׳׳‘׳×-׳¨׳•׳•׳§׳•׳× 404 text/html; charset=utf-8
You can see that the URL is gibberish.
Please help.
-
Hey Asaf,
Thanks for writing in.
We have a known issue where our crawler doesn't parse Hebrew correctly, and it has caused problems in the past. The issues have been intermittent, but they can affect the data you see. Sorry about that. Our engineers have been working on a fix for the Hebrew character set, so stay tuned.
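As a rough illustration of this class of bug, here is a minimal Python sketch of how a Hebrew URL path can turn into gibberish when its UTF-8 bytes are read with the wrong encoding. It is not the crawler's actual code. The path segment `slug` is a made-up example, and Windows-1255 as the wrong encoding is only an assumption.

```python
from urllib.parse import quote, unquote

# Hypothetical Hebrew path segment, used only for illustration.
slug = "שלום"

# A URL path travels over the wire as percent-encoded UTF-8 bytes.
utf8_bytes = slug.encode("utf-8")
encoded = quote(slug)  # quote() uses UTF-8 by default

# Reading the same bytes with the wrong codec produces gibberish.
# Windows-1255 is an assumption here; errors="replace" substitutes
# U+FFFD for any byte that codec does not define.
garbled = utf8_bytes.decode("cp1255", errors="replace")

# Decoding the percent-encoded form as UTF-8 recovers the original text.
restored = unquote(encoded)

print(encoded)            # %D7%A9%D7%9C%D7%95%D7%9D
print(restored == slug)   # True
```

A request for the percent-encoded form should reach the same page as the Hebrew URL, while a request built from the garbled string points at a path that does not exist, which would be consistent with the 404 in the export.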
Best,