Robots.txt question
-
Hello,
What do the following directives mean?
User-agent: *
Allow: /
Does it mean that we are blocking all spiders? Is Allow supported in robots.txt?
Thanks
-
That robots.txt should be fine.
But you should also add your XML sitemap to the robots.txt file, example:
User-Agent: *
Allow: /
Sitemap: http://www.website.com/sitemap.xml
-
Any particular reason for doing so?
-
There's more information about robots.txt from SEOmoz at http://www.seomoz.org/learn-seo/robotstxt
SEOmoz and the robots.txt site suggest the following for allowing robots to see everything and for listing your sitemap:
User-agent: *
Disallow:
Sitemap: http://www.example.com/none-standard-location/sitemap.xml
-
That's not really necessary unless there are URLs or directories you're disallowing after the Allow in your robots.txt. Allow is a directive supported by the major search engines, but search engines assume they're allowed to crawl everything they find unless you specifically disallow it in your robots.txt.
The following is universally accepted by bots and essentially means the same thing as what I think you're trying to say, allowing bots to crawl everything:
User-agent: *
Disallow:
There's a sample use of the Allow directive on the Wikipedia robots.txt page here.
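Where Allow earns its keep is in carving an exception out of a broader Disallow. If you want to sanity-check how a given robots.txt will be read, here's a minimal sketch using Python's urllib.robotparser, with made-up /private/ paths purely for illustration (note that this parser applies rules in file order, so the more specific Allow line goes first):

from urllib import robotparser

# Hypothetical rules: block /private/ but allow one file inside it.
# urllib.robotparser applies rules in the order they appear, so the
# more specific Allow line is placed before the broader Disallow.
rules = """\
User-agent: *
Allow: /private/public-report.html
Disallow: /private/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("*", "http://www.example.com/private/secret.html"))         # False
print(rp.can_fetch("*", "http://www.example.com/private/public-report.html"))  # True
print(rp.can_fetch("*", "http://www.example.com/index.html"))                   # True (nothing disallows it)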
-
I was assuming that by including / after Allow we were blocking the spiders, and I also thought that Allow was not supported by search engines.
Thanks for the clarifications. A better approach would be
User-Agent: *
Allow:
right?
The best one of course is
User-agent: *
Disallow:
-
It's a good idea to have an XML sitemap and make sure the search engines know where it is. It's part of the protocol that they will look in the robots.txt file for the location of your sitemap.
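If you're on Python 3.8 or newer, you can also confirm that parsers pick the sitemap up from robots.txt. A small sketch, reusing the placeholder www.website.com address from the earlier example:

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("http://www.website.com/robots.txt")  # placeholder domain from the example above
rp.read()  # fetches and parses the live robots.txt

# site_maps() (Python 3.8+) returns the Sitemap URLs listed, or None if there aren't any.
print(rp.site_maps())  # e.g. ['http://www.website.com/sitemap.xml']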