301 Question - issue
-
A while back we had a 'bleed' on one of our sites: pages from one site started to leak across to another, the second site started to rank for those same pages, and now we have hundreds of pages ranking for URLs that do not exist. It's hard to explain, so bear with me.
If you click the cached view in Google for one of the ranked pages, it shows you the main site. If you click the result as usual, you are taken to the affected site, but a 404 is shown because the page was never meant for that site.
We believe we fixed the 'bleed' and have set up 301s for all the affected pages, pointing to the home page of the site it affected. But these pages have not been removed from Google, which we thought a 301 would do. So we still have hundreds of pages being ranked, and they all redirect to the home page.
Why haven't these pages been removed?
-
It's probably just taking Google a while to process all the changes. Really, your 301s should point to the same content, not all go to the homepage. If you had pages showing on two sites, the pages do 'really' exist on one site but weren't supposed to exist on the other. Correct the 301s so that each URL on the affected site points to the exact same piece of content on the site where it was originally located (where it was supposed to be located).
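As a rough illustration of the one-to-one mapping described above, here is a minimal Python sketch. It assumes the leaked pages kept the same path on both sites, which may not be true in your case. The hostnames and the function names `redirect_target` and `redirect_response` are made up for the example; in practice you would express the same rule in whatever your web server or CMS uses for redirects.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical hostname: the site where the content was supposed to live.
ORIGINAL_HOST = "www.original-site.example"


def redirect_target(affected_url: str, original_host: str = ORIGINAL_HOST) -> str:
    """Map a URL on the affected site to the same path on the original site.

    Assumes the path and query string are identical on both sites.
    """
    parts = urlsplit(affected_url)
    scheme = parts.scheme or "https"
    # Keep path and query, swap the host, drop any fragment.
    return urlunsplit((scheme, original_host, parts.path, parts.query, ""))


def redirect_response(affected_url: str):
    """Return the status code and headers for a permanent redirect."""
    return 301, {"Location": redirect_target(affected_url)}


if __name__ == "__main__":
    status, headers = redirect_response(
        "https://www.affected-site.example/widgets/blue?ref=1"
    )
    print(status, headers["Location"])
```

The point is only that every bogus URL gets its own destination instead of all of them collapsing onto the homepage.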
If that fails, send the noindex directive in the X-Robots-Tag HTTP response header (not a noindex meta tag in the HTML) to tell Google not to index those URLs on the 'affected' website. In conjunction with that, change the status code of all bogus URLs on the 'affected' site to 410, which is a stronger signal than 404: 410 means GONE, not coming back, whereas 404 only says the page was not found and leaves open whether it will return.
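To make the fallback concrete, here is a small sketch of a Python WSGI app that answers the bogus URLs with a 410 status and an `X-Robots-Tag: noindex` header. The path list `BOGUS_PATHS` and the app name `gone_app` are hypothetical; a real setup would more likely do this in the web server configuration, and the list would come from your own logs or Search Console export.

```python
# Hypothetical list of paths that leaked onto the affected site.
BOGUS_PATHS = {"/widgets/blue", "/widgets/red"}


def gone_app(environ, start_response):
    """Minimal WSGI app: 410 plus noindex for bogus paths, 200 otherwise."""
    path = environ.get("PATH_INFO", "/")
    if path in BOGUS_PATHS:
        start_response(
            "410 Gone",
            [
                ("Content-Type", "text/plain; charset=utf-8"),
                # Directive sent in the HTTP header, not in the HTML.
                ("X-Robots-Tag", "noindex"),
            ],
        )
        return [b"Gone\n"]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"OK\n"]


if __name__ == "__main__":
    # Serve locally for a quick manual check with curl -I.
    from wsgiref.simple_server import make_server

    with make_server("127.0.0.1", 8000, gone_app) as server:
        server.serve_forever()
```

You can confirm what a crawler will see by requesting one of the bogus URLs and checking the status line and response headers.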