Robot files
-
Say Google has seen that I have specific links from one of my sites (A) to another one of my sites (B), and I didn't want it to.
Can I put a robots.txt file in the directory of site (A) that blocks Google's access, to help the next time it's crawled?
Or is it too late now that the links have been seen?
-
Hi Carl,
I need a bit more information in order to answer this question.
Why would you not want Google to see/follow those links?
Do you want Google to index site (B) at all?
Your next action completely depends on the above.
-
Why put in links that you don't want Google to see? If you don't want them to pass any juice, make them nofollow.
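For anyone wanting to check which links still pass juice, here is a minimal Python sketch that lists links to a given host that lack rel="nofollow". The host name site-b.example and the sample markup are placeholders, not taken from the actual sites.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class FollowedLinkFinder(HTMLParser):
    """Collect links to target_host that do not carry rel="nofollow"."""

    def __init__(self, target_host):
        super().__init__()
        self.target_host = target_host
        self.followed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href") or ""
        # rel can hold several space-separated tokens, e.g. "nofollow noopener"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if urlparse(href).hostname == self.target_host and "nofollow" not in rel_tokens:
            self.followed.append(href)


def find_followed_links(html, target_host):
    """Return hrefs pointing at target_host that are not marked nofollow."""
    finder = FollowedLinkFinder(target_host)
    finder.feed(html)
    return finder.followed


# Hypothetical markup standing in for a page on site (A).
sample = """
<a href="https://site-b.example/about">About</a>
<a href="https://site-b.example/contact" rel="nofollow">Contact</a>
"""
print(find_followed_links(sample, "site-b.example"))
```

Any link it reports is one a crawler is free to follow, so that is where the rel="nofollow" attribute would go.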
-
Agree with both of the above. But to answer your question regarding whether or not it's too late: once Google has crawled the page, there's the potential for indexation. If you'd like to prevent Google from indexing the page, I'd recommend adding a noindex meta tag to the page. Then, once the page has been removed from the index, you can add it to the robots.txt file to prevent Google and other bots from crawling it in the future.
But the question still stands: why don't you want Google to crawl the page/link? If you're concerned about penalties due to perceived manipulation, I'd just add a nofollow to the specific link in question.
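The order matters because a crawler has to fetch the page to see the noindex tag. As a rough way to confirm the tag is actually in the markup before blocking the page in robots.txt, something like the following Python sketch could be used; the sample HTML strings are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Gather the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() != "robots":
            return
        content = attr_map.get("content") or ""
        # content is a comma-separated list such as "noindex, nofollow"
        for token in content.split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def has_noindex(html):
    """Return True if the page carries a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


print(has_noindex('<head><meta name="robots" content="noindex, nofollow"></head>'))
print(has_noindex("<head><title>No robots tag here</title></head>"))
```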
Hope that helps!
-Trung
-
Thanks for your help guys.
Site A is a development version of Site B, and it has links going out to my personal website (Site C) for credentials. Because Site A isn't populated, I didn't want these credential links to be seen as pointless: Site A will never be updated or have good content on it, as it's for development purposes only.
If this information spurs any more ideas I'd be glad to hear them; otherwise I will probably do some, if not all, of the above, making the links from Site A nofollow at a minimum.
-
Hi Carl,
As Site A is a development site for Site B, you will not want Google to index this or anyone to have access to this (I'm assuming), as this will have a detrimental impact on Site B for search purposes. Here is the action I propose you take:
Site A:
- Block search indexing using meta tags: To prevent most search engine web crawlers from indexing a page on your site, place the following meta tag into the <head> section of your page: <meta name="robots" content="noindex">. Ensure that this is done across the whole site.
- robots.txt file: Set up a robots.txt file for Site A and set it to disallow crawling of the whole site.
- Password protection: Set up password protection for the website.
In order to ensure that Site A is not found or indexed by any search engines, it's best practice to do all 3 of the above.
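As a rough illustration of the robots.txt step, the sketch below shows a disallow-all file and uses Python's standard urllib.robotparser to confirm that it blocks a crawler. The URL site-a.example is a placeholder for Site A, and the file contents are an assumed typical disallow-all setup rather than anything copied from a real site.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_blocked(robots_txt, user_agent, url):
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


# Placeholder URL standing in for any page on Site A.
print(is_blocked(ROBOTS_TXT, "Googlebot", "https://site-a.example/any/page.html"))
```

Keep in mind that robots.txt only asks well-behaved crawlers not to fetch pages, which is why password protection is on the list as well.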
Let me know if you have any questions.
-
Great stuff SilverDoor!
I will do this ASAP
Thanks