Quickest way to deindex a large number of pages
-
Our site was recently hacked by spammers, who posted fake content and brought down our servers. It took us a few months to figure out what was going on and fix the issue. However, it turns out that Google has indexed 26K+ spammy pages, and we've lost PageRank and search engine rankings as a result.
What is the best and fastest way to get these pages out of Google's index?
-
Disallow the pages in robots.txt
Add a noindex meta tag to these pages
Ask Google to remove the URLs from its index via a Webmaster Tools (WMT) URL removal request
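
For the noindex option, the tag in question is `<meta name="robots" content="noindex">` in the page's head. Here is a minimal Python sketch for checking whether a page's HTML already carries it. The names `RobotsMetaFinder` and `has_noindex` are made up for illustration, and it only looks at the meta tag, not at HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Records whether a <meta name="robots"> tag containing noindex was seen."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        # Attribute values can be None (e.g. <meta name>), so fall back to "".
        name = (attributes.get("name") or "").lower()
        content = (attributes.get("content") or "").lower()
        if name == "robots" and "noindex" in content:
            self.noindex = True


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex


tagged_page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
plain_page = "<html><head><title>About us</title></head></html>"
print(has_noindex(tagged_page), has_noindex(plain_page))
```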
-
Yup. Just wanted to add that if these pages all live in a particular directory, you can deindex the entire directory with a single request in the URL removal tool.
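
To see whether the spam is concentrated in a few directories, you could group the URLs by their top-level directory. This is only a rough sketch: the example URLs are invented, `count_by_top_directory` is a hypothetical helper, and you would still need to confirm a directory holds no legitimate pages before removing all of it.

```python
from collections import Counter
from urllib.parse import urlsplit


def count_by_top_directory(urls):
    """Count URLs per top-level directory, e.g. '/forum/'."""
    counts = Counter()
    for url in urls:
        path = urlsplit(url).path
        segments = [s for s in path.split("/") if s]
        # A page directly under the root has no directory; group it under "/".
        in_directory = len(segments) > 1 or (segments and path.endswith("/"))
        directory = f"/{segments[0]}/" if in_directory else "/"
        counts[directory] += 1
    return counts


# Invented example URLs; substitute your own list of spam pages.
spam_urls = [
    "https://example.com/forum/cheap-pills-1",
    "https://example.com/forum/cheap-pills-2",
    "https://example.com/profiles/spammer",
    "https://example.com/about",
]
directory_counts = count_by_top_directory(spam_urls)
for directory, count in directory_counts.most_common():
    print(directory, count)
```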
-
Given that I'm sure you've already removed these pages from your site, there's no page left to add a noindex meta tag to.
Disallowing these pages in robots.txt in no way signals to the search engines that they should be removed from the index, just that they should no longer be crawled. Given that they're already indexed, blocking in robots.txt would potentially save some "crawl budget" but wouldn't do anything to remove them from the index.
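To illustrate that a Disallow rule only answers the question "may this URL be crawled?", here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt content and the URLs are made-up examples. Note that nothing in the result says anything about whether a URL is in the index.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt blocking an example spam directory.
robots_txt = """\
User-agent: *
Disallow: /forum/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# can_fetch() reports crawl permission only; it carries no indexing signal.
may_crawl_spam = parser.can_fetch("Googlebot", "https://example.com/forum/cheap-pills-1")
may_crawl_about = parser.can_fetch("Googlebot", "https://example.com/about")
print(may_crawl_spam, may_crawl_about)  # False True
```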
So submitting them to the URL Removal Tool, along with an explanation, would be by far the most effective option.
You'll also want to keep a very close watch on your penalty warnings within Webmaster Tools. If you get flagged, you'll want a complete history of the issue and the steps you've taken to address it in order to prepare a reinclusion request.
Lastly, don't forget to submit these same URLs to the Bing Webmaster Tools Block URLs tool. You may not get a massive amount of traffic from Bing, but there's no sense throwing it away, since you've already prepared the URL removal list anyway.
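If it helps, here is one way to tidy the URL list once so it can be reused for both tools. It's just a sketch: `build_removal_list` is a hypothetical helper, the input URLs are invented, and I'm assuming a plain one-URL-per-line list is what you want to work from.

```python
def build_removal_list(raw_urls):
    """Return unique, whitespace-trimmed URLs in first-seen order."""
    seen = set()
    cleaned = []
    for raw in raw_urls:
        url = raw.strip()
        # Skip blank lines and URLs that were already collected.
        if url and url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned


# Invented input, e.g. lines pulled from a log or a crawl export.
raw_urls = [
    "https://example.com/forum/cheap-pills-1\n",
    " https://example.com/forum/cheap-pills-1",
    "",
    "https://example.com/profiles/spammer",
]
removal_list = build_removal_list(raw_urls)
removal_text = "\n".join(removal_list)
print(removal_text)
```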
Hope that helps.
Paul