Load-balanced site
-
Our client's ecommerce site loads from 3 different servers using load balancing.
abc.com: IP: 222.222.222
abc.com: IP: 111.111.111
For testing purposes, 111.111.111 also points to beta.abc.com.
Now Google is crawling beta.abc.com.
If we block beta.abc.com using robots.txt, it will block Googlebot from abc.com too, since beta.abc.com is really abc.com.
I know it's confusing, but I've been trying to figure it out. Of course I can ask my dev to remove beta.abc.com, make it separate code, and block it using .htaccess.
-
Maybe I'm not understanding, but if the server can differentiate between beta.abc.com and abc.com, subdomains can have different robots.txt files. You should be able to serve a file at http://beta.abc.com/robots.txt that disallows everything:

User-agent: *
Disallow: /

and a different robots.txt file at http://abc.com/robots.txt. If you do so, it shouldn't block Googlebot's access to abc.com.
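Since both hostnames are served by the same codebase, one way to serve different robots.txt files per host is a rewrite rule. A minimal sketch, assuming Apache with mod_rewrite enabled (the filename robots-beta.txt is a hypothetical example):

```apache
# Hypothetical .htaccess sketch: when the request arrives on the beta
# hostname, internally serve robots-beta.txt (which disallows everything)
# instead of the normal robots.txt, leaving abc.com's robots.txt untouched.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^beta\.abc\.com$ [NC]
RewriteRule ^robots\.txt$ /robots-beta.txt [L]
```

robots-beta.txt would then contain the disallow-everything rules shown above, while the live robots.txt stays as-is.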
-
Our solution for this is to use HTTP authentication, so our dev sites require a simple password to access.
Here is an example: http://dev.zeta-commerce.com/
This keeps the bots out and avoids the risk of a blocking robots.txt file being accidentally released to the live site.
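The setup above can be sketched in an .htaccess file. This assumes Apache with Basic auth modules available; the realm name and .htpasswd path are placeholder examples:

```apache
# Hypothetical sketch: password-protect the entire dev site with HTTP
# Basic authentication so neither crawlers nor the public can reach it.
AuthType Basic
AuthName "Development site"
AuthUserFile /path/to/.htpasswd
Require valid-user
```

The .htpasswd file would be created with the htpasswd utility. Because this lives only in the dev site's configuration, there is no robots.txt difference to accidentally deploy to production.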