Noindex Pages indexed
-
I'm having a problem: Google is indexing my search results pages even though I've added the "noindex" meta tag. Is the best thing to block the robot from crawling those pages using robots.txt?
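For reference, the noindex tag mentioned above would normally sit in the head of each results page; a minimal sketch (the surrounding page is hypothetical):

```html
<!-- search results page: ask search engines not to index it -->
<head>
  <meta name="robots" content="noindex" />
</head>
```

Note that a crawler has to actually fetch the page to see this tag.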
-
robots.txt will block crawling. Just bear in mind a blocked URL can still be found via search if you look for it specifically.
User-agent: Googlebot
Disallow: /sampledir
-
I've added "Disallow: /search.php?" to the robots.txt file. Google seemed to be filling in the form and then adding the results to the index, so there are 100,000's of pages in the index that I don't want in there. Hopefully stopping the pages from being crawled will help.
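If you want to sanity-check the new rule before waiting on Google, Python's standard-library robotparser applies the same kind of prefix matching; a small sketch using the rule above (the paths are just examples):

```python
import urllib.robotparser

# Parse the robots.txt rules directly instead of fetching them from a site
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: Googlebot",
    "Disallow: /search.php?",
])

# Search-result URLs with query strings are blocked...
print(rp.can_fetch("Googlebot", "/search.php?seo"))  # False
# ...while normal pages remain crawlable
print(rp.can_fetch("Googlebot", "/index.html"))      # True
```

This only checks the Disallow logic locally; it says nothing about whether Google has actually re-read your robots.txt yet.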
-
Yep, that should stop them. Now you just need to give Google time to recrawl.
You can also check Webmaster Tools (WMT) to see whether the robots.txt is being read:
WMT > Health > Blocked URLs
-
Just be aware Google will still show the URLs in its results, but without descriptions. It still 'indexes' the URL without actually crawling the page.
-
Can you not just use <link rel="canonical" href="/search.php" /> in the head of all pages that serve the results?
Google will then index the original page people search from and ignore pages with query strings like:
- /search.php?seo
- /search.php?ppc
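In the head of each results page, that would look something like this (the domain is just an example):

```html
<!-- every /search.php?... variant points back at one canonical page -->
<head>
  <link rel="canonical" href="https://www.example.com/search.php" />
</head>
```

Unlike a robots.txt block, this still lets Google crawl the result pages, but consolidates them under the one URL.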
-
Thanks. That's exactly what I needed.