Indexing folders in Google
-
Hi,
You can use robots.txt to block unwanted subdomains and files from being indexed.
For example, if I have a website at www.website.com and a subdomain such as sub.website.com, the robots.txt file in that subdomain's root would read:
User-agent: *
Disallow: /

This will stop the subdomain sub.website.com from being indexed.
-
Hi mmdemadi,
If you want to prevent the whole subdomain from being indexed, there are a few actions you can take:
- Prevent Google from crawling it via a robots.txt disallow rule. Remember that this only prevents Google from seeing your content; if there are any links pointing to that subdomain, some pages might still get indexed.
- Add a robots meta tag with noindex, or an X-Robots-Tag: noindex HTTP header. This tells Google not to index the pages, although they will still be crawled. Here is what Google says about noindex.
- If that information should not be public, hide it behind a login. Google states that this is the simplest and most effective method.
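For reference, the noindex directive from the second option can be set either in the page markup or in the HTTP response; a minimal sketch (the subdomain name is just an example):

```html
<!-- In the <head> of any page on sub.website.com you want kept out of the index -->
<meta name="robots" content="noindex">
```

The HTTP-header equivalent, useful for non-HTML files like PDFs, is to have the server send `X-Robots-Tag: noindex` in the response.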
Keep in mind that these actions work differently. The easiest is blocking in robots.txt, but that won't remove what's already been indexed.
If you want to block the whole subdomain and also remove pages that are already indexed, I'd suggest these steps:
- Don't block in robots.txt what you want to remove
- Add robots noindex tag and, if there are just a few, use the URL removal tool in Google Search Console
- Wait a few days and check that those pages don't appear in search
- Block that directory/subdomain in robots.txt
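As a sketch of those steps (using sub.website.com as a placeholder name): first serve the noindex tag so Google can crawl the pages and see it, and only afterwards add the disallow rule.

```html
<!-- Steps 1-3: on each page of sub.website.com, while it is still crawlable -->
<meta name="robots" content="noindex">
```

```
# Step 4, only after the pages have dropped out of search results:
# robots.txt at the root of sub.website.com
User-agent: *
Disallow: /
```

The order matters: if you block crawling first, Google can never see the noindex tag, and the already-indexed pages can linger in the results.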
Some further information:
Robots.txt FAQ - Google Search Console Help
Robots Meta Directives - Moz Learning Center
Robots.txt best practices - Moz Learning Center

Hope it helps.
Best of luck.
Gaston -
A 301 redirect is the best way to handle those subdomains.
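If you go the redirect route, the 301 can be set at the server level. A minimal sketch for Apache, assuming mod_rewrite is enabled and the hypothetical subdomain should forward to the main domain:

```apache
# .htaccess in the subdomain's document root (sub.website.com is a placeholder)
RewriteEngine On
RewriteCond %{HTTP_HOST} ^sub\.website\.com$ [NC]
RewriteRule ^(.*)$ https://www.website.com/$1 [R=301,L]
```

Note that a 301 only makes sense when the subdomain's content has a counterpart on the target site; otherwise noindex or a login wall (as described above) is the better fit.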