Questions
-
How Best to Handle Inherited 404s on Purchased Domain
Hi there, I'm assuming you have over 500k URLs if you're worrying about crawl efficiency. If you have fewer than that, please don't worry. Having 404s is completely fine, and Google will eventually lower its crawl frequency for those pages. Blocking them in robots.txt will cause Google to stop crawling them, but it will never remove them from the index. My advice here: don't block them in robots.txt. As Rajesh pointed out, you could turn those 404s into 410s to tell Google that they are gone forever. Yet Google has said that they treat 404s and 410s as the same. John Mueller said over a year ago that 4xx status codes don't incur crawl wastage. You can check it out in these Webmasters hangout notes - Deepcrawl. Hope it helps. Best of luck, Gaston
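If you do want to try the 410 route, here is a rough sketch of the idea as a tiny Python WSGI app. The paths in GONE_PATHS and the helper status_for are made up for illustration, and the app serves no live pages at all; on a real site you would more likely do this in your server or CMS config.

```python
from wsgiref.util import setup_testing_defaults

# Hypothetical list of inherited URLs that are permanently retired.
GONE_PATHS = {"/old-product", "/old-category/widget"}


def app(environ, start_response):
    """Answer 410 for retired paths and 404 for every other path."""
    path = environ.get("PATH_INFO", "/")
    if path in GONE_PATHS:
        status, body = "410 Gone", b"This page has been permanently removed.\n"
    else:
        status, body = "404 Not Found", b"Page not found.\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def status_for(path):
    """Call the app in-process and return the status line it sends."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = []
    app(environ, lambda status, headers: captured.append(status))
    return captured[0]
```

Calling status_for("/old-product") lets you check the status line without starting a server. Neither response involves robots.txt, so the URLs stay crawlable and Google can see that they are gone.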
Intermediate & Advanced SEO | | GastonRiera1 -
Page drops from index completely
I had a similar issue on a couple of WordPress freebie subdomains I made while conducting reputation management for clients. What ended up happening was that the site would index immediately and then, 24 hours later, be ghosted completely. It turns out I was submitting the news sitemap that WordPress automatically generated, and since I wasn't on their list of approved news sitemaps, I guess it just ripped everything out, as I'm sure the news sitemap and the regular one had the same pages listed, just with more detail on the news one. I doubt it's the exact same occurrence, but if you recently submitted a sitemap, I'd check it closely, as it has been known to trigger a similar problem, at least for me!
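For what it's worth, a quick way to see whether the sitemap you submitted is a news sitemap is to look for elements in the Google News sitemap namespace. This is only a sketch: the function name is_news_sitemap and the sample XML are made up for illustration, and it assumes you already have the sitemap contents as a string.

```python
import xml.etree.ElementTree as ET

# Namespace used by Google News sitemap extensions.
NEWS_NS = "http://www.google.com/schemas/sitemap-news/0.9"


def is_news_sitemap(xml_text):
    """Return True if any element in the sitemap uses the news namespace."""
    root = ET.fromstring(xml_text)
    prefix = "{" + NEWS_NS + "}"
    return any(el.tag.startswith(prefix) for el in root.iter())


# Hypothetical sample sitemaps, for illustration only.
REGULAR = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/page</loc></url>"
    "</urlset>"
)
NEWS = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" '
    'xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">'
    "<url><loc>https://example.com/page</loc>"
    "<news:news><news:title>Example</news:title></news:news></url>"
    "</urlset>"
)
```

If it comes back True for the file you submitted, that is the one I'd look at first.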
White Hat / Black Hat SEO | | TucsonAZWebDesign0 -
URL Too Long vs. 301 Redirect
Hi there, the general rule of thumb for URLs is that if the structure and layout make sense from a user perspective, then I wouldn't lose any sleep over this. A good guide to this can be found here (done by Mr Fishkin himself!) - https://moz.com/blog/15-seo-best-practices-for-structuring-urls I hope this helps!
Intermediate & Advanced SEO | | Corbec8880