We got an answer from JohnMu, Webmaster Trends Analyst at Google. The cause of the crawling is (as we found out) the filters, which allow infinite variations (one of the developers was asleep at the wheel); we will correct this. Disallowing them in robots.txt is advised as the quickest fix to stop the mega-crawling. Google will use this case for further research because of the disproportionate capacity usage. You're right, Google will initially crawl everything, but they don't want Googlebot's crawling to look like a "mini-DDoS attack".
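For anyone facing the same issue: a minimal sketch of what such a robots.txt fix could look like, assuming the runaway URLs are filter parameters like the `model=` and `ontwerper=` ones in the log line below (adjust the patterns to your own site before using):

```
User-agent: *
# Block crawling of any URL whose query string carries a filter parameter
Disallow: /*?*model=
Disallow: /*?*ontwerper=
```

Googlebot honours the `*` wildcard here, so a couple of lines like these keep the 500 normal pages crawlable while cutting off the infinite filter combinations.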
Posts made by Olaf
-
RE: Googlebot on steroids... Why?
-
RE: Googlebot on steroids... Why?
Thanks for your help!
I think you're probably right. The initial crawl must be complete if Google wants to put everything into the right perspective. But we manage and host more than 300 sites, including large A-brand sites, and even on those sites I had never seen volumes like this before.
The server logs also show the same number of requests last night (day five). I will keep you posted if this still continues after the weekend.
-
RE: Googlebot on steroids... Why?
Hmm, is that correct? I thought the amount of resources Google puts into crawling your (new) website also depends on its authority. 9 million URLs, for four days now... That seems like an awful lot for this small website...
-
Googlebot on steroids... Why?
We launched a new website (www.gelderlandgroep.com). The site contains 500 pages, but some pages (like https://www.gelderlandgroep.com/collectie/) contain filters, so there are a lot of possible URL parameters. Last week we noticed a tremendous amount of traffic (25 GB!) and CPU usage on the server.
2017-12-04 16:11:57 W3SVC66 IIS14 83.219.93.171 GET /collectie model=6511,6901,7780,7830,2105-illusion&ontwerper=henk-vos,foklab 443 - 66.249.76.153 HTTP/1.1 Mozilla/5.0+(Linux;+Android+6.0.1;+Nexus+5X+Build/MMB29P)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/41.0.2272.96+Mobile+Safari/537.36+(compatible;+Googlebot/2.1;++http://www.google.com/bot.html) - - www.gelderlandgroep.com 200 0 0 9445 501 312
We found out that "Googlebot" was firing many, many requests. First we did an nslookup on the IP address, and it does indeed seem to be Googlebot.
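The nslookup check can be scripted: Google's documented way to verify Googlebot is a reverse DNS lookup followed by a forward lookup that must point back to the same IP. A minimal Python sketch (the `reverse`/`forward` hooks are only there so the logic can be tried without network access):

```python
import socket

def is_real_googlebot(ip, reverse=None, forward=None):
    """Return True if `ip` passes the reverse+forward DNS check:
    the IP must reverse-resolve to a googlebot.com/google.com host,
    and that hostname must forward-resolve back to the same IP."""
    reverse = reverse or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward = forward or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse(ip)                      # reverse DNS (PTR record)
    except OSError:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        return ip in forward(host)              # forward DNS must match
    except OSError:
        return False
```

Called without the hooks (e.g. `is_real_googlebot("66.249.76.153")`) it performs real DNS lookups, which is exactly what the nslookup did by hand.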
Then we visited Google Search Console and I was really surprised... Googlebot on steroids? Googlebot requested 922,565 different URLs, trying every filter/parameter combination on the site. Why? The sitemap.xml contains 500 URLs... The authority of the site isn't very high, and there is no other signal that this is a special website... Why so many "Google resources"?
Of course we will exclude the parameters in Search Console, but I have never seen Googlebot activity like this for a small website before! Does anybody have a clue?
Regards, Olaf
-
RE: Street Address Not Appearing on Business Google+ Page
Did you check your listings on other local listing sites? Google probably wants to verify your NAP information against other local listings. You can use getlisted.org. Please make sure you accurately list the name, address and phone number of your business in all the local directories. More importantly, make sure that all the citations list your name, address and phone number in exactly the same way across all the directories (including your own website).
-
RE: Analytics: Goal Tracking
I don't think so; cross-domain tracking is sometimes very hard... I'm not sure, but what about using match type "Regular expression" with the goal "shop.gardio.*"? Will that work for you?
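A quick way to sanity-check such a regular-expression goal before saving it in Analytics is to run the pattern over a few sample page values. A small sketch (the pattern and page values are illustrative assumptions, not the actual configuration):

```python
import re

# Hypothetical goal pattern: any page on the shop subdomain.
# With cross-domain tracking the hostname is often prepended to the
# page path, which is what makes a pattern like this usable.
goal = re.compile(r"shop\.gardio\.")

pages = [
    "shop.gardio.com/checkout/thanks",   # should count as a goal
    "www.gardio.com/products",           # should not
]

matches = [bool(goal.search(p)) for p in pages]
print(matches)  # [True, False]
```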
If you just want to know the percentage of visitors who went to shop.domain.com, Enhanced Link Attribution may also be interesting: http://support.google.com/analytics/answer/2558867?hl=en&ref_topic=2558810
-
RE: URL Structure for Multilingual Site With Two Major Locations
We prefer the approach you suggest: use and also translate the menu names:
domain.com/location-1 – to target English visitors
domain.com/es/establecimiento-1 – to target Spanish visitors
-
RE: Analytics: Goal Tracking
Hi Sven,
This article will help you implement cross-(sub)domain tracking: https://developers.google.com/analytics/devguides/collection/gajs/gaTrackingSite
Once you have implemented this correctly, you can use one goal URL.
-
RE: Analytics: Goal Tracking
Maybe this article will help you: http://www.ericmobley.net/guide-to-tracking-multiple-subdomains-in-google-analytics/
-
RE: Why are some of page indexed and others not
When you're sure the page(s) are accessible (no noindex), there is unique text on the pages, and your PageRank is greater than 0, Google will index them sooner or later. A good backlink or some social media posts can help.
-
RE: Why are some of page indexed and others not
Google, for example, will not crawl every page of your site. In fact, they may become aware of pages that they choose not to crawl because those pages are not likely to be important enough to return in a search result. Because you have a new site, you need time and backlinks. If your content is unique and you get some strong, relevant backlinks, more pages will be indexed soon.
-
RE: Conversion Rate - site feedback
Nice clean design, but not always to-the-point:
- Use a pay-off in the header of the page (what can I find on this page?)
- Change the element "This week's featured footstools" and the black-and-white illustrations (just show your product!)
- Change the colour of your "Add to basket" button and make it more prominent
- Remove the red-and-white socks from the product detail pages (and all other pages except the homepage). They're a nice gimmick on the homepage, but on every page they may be irritating. Your products are beautiful, show them! Not the socks!
- Redesign your product detail pages: let the product be the most important part of the page, not a big purple header or the navigation on the left. Relocate "free delivery" etc. to a place above the fold.
- Remove the category navigation from your shopping basket page; just show the basket and service claims like free delivery.
- Change the colour of your "checkout" button and simplify the form; just visit large webshops for examples.
Good Luck!
-
RE: Adwords Keyword Research - Impressions, CTR
What's a top keyword? That's the question... It's not always the keyword with the most traffic.
You can use the AdWords Keyword Tool (for example) to find related search queries and their search volume. But even when there is a lot of traffic, that doesn't make it a 'good' keyword for you.
When you are running an AdWords campaign you can find search queries under the Keywords tab -> "See search terms". These search queries are also in Google Analytics, but with information about conversion, time on site, etc. So you can use AdWords + Google Analytics to find 'converting' search queries. After that you can start optimising (SEO) for those search queries.
See also http://www.seomoz.org/ugc/advanced-seo-keyword-research-tips-and-ideas-14216 for a lot of information.
-
RE: Where to link?
Hi there, visit the blog and you'll find everything you need to know. For example: http://www.seomoz.org/article/the-professional-guide-to-link-building-2011
-
RE: Adwords Keyword Research - Impressions, CTR
Do you mean you spent $300 to find out the search queries? Then you'd better check Google Analytics to see which keyword phrases brought you the best visitors (in terms of conversion).
-
RE: Google plus
I agree, but we also see a lot of results from Twitter contacts.
-
RE: Is purchasing domain names still relevant?
From an SEO point of view it won't work to just redirect a lot of different domains to your website, so for that reason you can skip them.
But when the domains are strong (short, older, relevant keywords without hyphens), you can use them for decent sites and (later on) link to your main website. Domains like that shouldn't expire or become available to your competitors. (And when they bring you a lot of type-in traffic, you should of course keep them too.)
-
RE: Websites on same c class IP address
Not necessarily. If you see different C blocks, you are usually talking about two different webhosts. So there is a chance that different sites from one owner are hosted in one C block.
Even a single IP address is not always used by one company. You can have your own IP address that leads only to your website, but providers can also share one IP address among different domains/websites from different companies.
To find out whether websites belong to the same owner, you need to check the registrar's database or tools like allwhois.com.