Pages are Indexed but not Cached by Google. Why?
-
Thanks. There are two main parts to Google "figuring out" your site. One: indexation. That's been solved. We know that you're getting indexed. Two: ranking. Your site being so new and young, it's going to need backlinks and network growth to experience dramatic ranking changes. If your menu isn't causing your site to be indexed poorly and your pages are being counted as unique enough, then you're ok now there as well. The next most important step is getting your domain trust and authority up.
-
Hi Massimiliano. I would disagree with myself if I were talking about your site too... ;^) But in this specific case, qjamba.com is a site that needs the fundamental quality of backlinks more so than it needs Teddy to write a bot that is constantly pinging Google in order to try and decipher the incremental on-site changes he's making. I'm speaking to his need to prioritize that aspect of optimization. Copying, on a normal, public-facing website, what an SEO does when creating a nonsensical site with gibberish words to test on-page optimization as purely as possible is a bad idea.
Obviously on-page optimization is important, but again, in this specific example, Teddy isn't even discussing his keyword rankings, rather he was looking to go down an on-site optimization path that might make him more and more frustrated instead of bringing about much more positive results. Cheers!
-
Some of my pages are on Google's page 2 or 3, and a few are on page 1, for certain search terms that don't have a lot of competition but that I know SOME people are using (they are in my logs). Those pages have virtually no backlinks. I want to boost the ones on page 2 or 3 to page 1 as quickly as possible because p1 is 10x or more better than p2. Time/Cost is an issue here: I can make changes overnight at no cost as opposed to blogging or paying someone to blog.
Because domain authority and usage take so long, it seems worth tweaking/testing NOW to try to boost certain pages from p2 or 3 to page 1 virtually overnight as opposed to waiting for months on end for usage to kick in. I don't know why Google would penalize me for moving a menu or adding content--basically for performing on-page SEO--so it would be nice to be able to figure out what tools (cached pages, site:www..., GWT, GA or otherwise) to look at to know if Google has re-indexed the new changes.
Of course, the biggest pages with the most common search terms probably HAVE to have plenty of backlinks and usage to get there, and I know that in the long run that's the way to success overall when there is high competition, but it just seems to me that on-page SEO is potentially very valuable when the competition is slimmer.
-
I think there's been a misunderstanding. I'm not writing a bot. I am talking about making programming changes and then submitting them to Google via the fetch tool to see how they affect my ranking as quickly as possible, instead of waiting for the next time Google crawls that page -- which could be weeks. I think the early reply may have given you a different impression. I want to speed up the indexing by fetching the pages in Google and then looking to see what the effect is. My whole reason for starting this thread was confusion over how to tell when a page was indexed, because of results I didn't expect with the cache and site:www... on Google.
-
Great! Well you have lots of insights here. It sounds like you're ready to test in the near term, and build up the domain in the long term. Good luck!
-
Well, I'm ready to test -- but still not quite sure how, since I don't know how to tell when Google has indexed the new content: sometimes it doesn't get cached and sometimes it disappears from the site:www... listing. I've read it only takes a couple of days after Google crawls the page, and can go with that, but was hoping there is a way to actually 'see' the evidence that it has been indexed.
So, while I've gotten some great input, I am somewhat unsatisfied because I'm not sure how to tell WHEN my content has really been put in the index so that the algorithm is updated for the newly crawled page.
-
Ah, that answer really varies per website. For example, if your site is a major news site, Google's indexation is extremely fast, measured in seconds not days. Even if you're not a news site, major sites (high domain authority) get crawled and indexed very rapidly. Since you're going to be testing your own changes you'll learn how long this takes for your particular site.
-
Massimiliano, thanks for your input. So you're one of them, huh?
Good points, the last thing I want to do is annoy users, yet I also want to track 'real' usage, so there is a conflict. I know it is impossible to block all that I don't want as there is always another trick to employ. I'll have to think about it more. Yeah, the cut and paste blocking is annoying to anyone that would want to do it. But none of my users should want to do it. My content is in low demand but I hate to make anything easier for potential competition, and some who might be interested won't know how to scrape. Anyway, thanks for your feedback on that too.
-
I'm sorry, but once I know they have crawled a page, shouldn't there be a way to know when it has also been indexed? I know I can get them to crawl a page immediately, or nearly so, by fetching it. But I can't tell about the indexing. Are you saying that after they crawl the page, the 'time to indexing the crawled page' can vary by site and there really is no way to know when it is in the new index? That is, if it shows as newly cached that doesn't mean it has been indexed too, or it can be indexed and not show up in a site:www... search, etc.?
-
Well, then I totally agree with you, Ryan, thanks for the answer. With a DA of 1, you are absolutely right.
-
First of all, I was just browsing and I got blocked as a bot, see below:
I would remove that cloaking.
Second, understanding your visitors' behavior is one of the most complex tasks; you don't know your users' behavior until you run a lot of tests, surveys and so on...
-
Yeup! Indexing time varies. You'll be able to tell the time between crawl and indexation by when Google shows your page version B in its cache after you made changes from A, so if the 'example.html' page is already in Google's index you'll see this:
You make changes on a page, example.html (version A is now version B)
Google crawls example.html (version B)
You check Google to see if example.html is version A or B in the cache
no?
no?
no?
no?
yes. That's how long it takes.
OR, you make a new page. It gets crawled. Checking if it's indexed... no, no, no, no, yes?! That's how long it takes.
Again, this time period varies and having a site with excellent domain strength and trust usually makes it a shorter time period. It also tends to influence how many pages Google decides to keep in its index or show to users. Pretty much everything gets better for a site the stronger its domain authority and trust are.
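To make the "no, no, no, no, yes" bookkeeping concrete, here is a minimal sketch in Python. It does not query Google at all: it assumes you note by hand the time of the crawl and the time and result of each manual check. The function name `time_to_index` and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


def time_to_index(
    crawled_at: datetime,
    checks: Iterable[Tuple[datetime, bool]],
) -> Optional[timedelta]:
    """Return the delay between the crawl and the first check that saw version B.

    `checks` holds (when_checked, saw_version_b) pairs recorded by hand each
    time you look at Google. Returns None if every check so far was a "no".
    """
    for checked_at, saw_version_b in sorted(checks):
        if saw_version_b and checked_at >= crawled_at:
            return checked_at - crawled_at
    return None


# Made-up example: crawled on day 1 at 09:00, first "yes" two days later.
crawl = datetime(2020, 1, 1, 9, 0)
log = [
    (datetime(2020, 1, 1, 12, 0), False),  # no?
    (datetime(2020, 1, 2, 9, 0), False),   # no?
    (datetime(2020, 1, 2, 18, 0), False),  # no?
    (datetime(2020, 1, 3, 9, 0), True),    # yes
]
print(time_to_index(crawl, log))
```

The result is only an upper bound: the page entered the index some time between the last "no" and the first "yes", so checking more often narrows the estimate.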
-
Thanks for sharing that. I was only kidding above, but obviously it's no joking matter when a user gets blocked like you did.
I just looked and see that it blocks when something/someone clicks 3 times within 30 seconds. EDIT: but that's only if it isn't keeping the session between clicks--see next post
-
THANK YOU!

-
Geez, I'm so pedantic sometimes. Just need to understand what this means:
<<OR, you make a new page. It gets crawled. *Checking if it's indexed*... no, no, no, no, yes?! That's how long it takes.>>
How do you do the bolded? site:www.site.com/thepage "my content change on the page" ?
And, you did say one can change and not the other yet the page really has been indexed, right?
-
The second example is talking about a new page that never existed before, i.e. new-example.html... So you created a wholly new page on your site. You see that it gets crawled, you go to Google to see if it gets indexed.
Again though, the lower your site's domain authority and trust, the higher the chance of that site getting pages indexed more slowly, de-indexed, and not showing up high in the rankings.
Remember the video I suggested earlier? You're sweating the computer and details and minutiae way too much at the expense of doing what would really move the needle for optimization at your site's stage (getting reputable links from other domains). Same goes with what you're doing with trying to block certain activity on your site. Normal user activity is getting messed up, as Massimiliano's and Travis's experiences show.
This is probably the best advice at this stage: instead of spending one more second on this Q/A thread and trying to see how many minutes transpired between your own changes and seeing them in Google, spend that time to go get 10 good links. No need to even thank me. I'll take the silence as your newly enlightened bliss.
-
Massimiliano,
Can you tell me your steps that led to that error? It looks like you went directly to www.qjamba.com/local-coupons/wildwood/mo/all and then you opened up a separate tab and went to www.qjamba.com and then either refreshed the home page or opened the home page again in another tab -- all within 30 seconds. That's the only way I have been able to reproduce this, because it looks for 3 searches without any existing session within 30 seconds by the same IP address, and the home page wipes out the session and cookies. Those are the URLs the db table shows that you went to, and in that order.
Normally a user stays in the same tab, so with the 2nd search will have a session -- but your ip had no session with each search. And, normally you can't go to the home page from a location page. So, I'm confused as to what you did if it wasn't like what I wrote above. If you didn't do this then I'm worried of a serious programming problem having to do with the php sessions getting dropped between pages.
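To be clear about the rule I'm describing, here is a minimal sketch of it in Python rather than the site's PHP. It is not the actual code: the class and method names are hypothetical, and only the numbers (3 hits, 30 seconds, same IP, no session) come from what I wrote above.

```python
from collections import defaultdict, deque
from typing import Deque, Dict

WINDOW_SECONDS = 30       # "within 30 seconds"
MAX_SESSIONLESS_HITS = 3  # "3 searches without any existing session"


class SessionlessHitTracker:
    """Flags an IP once it makes too many session-less requests in a window."""

    def __init__(self, window: float = WINDOW_SECONDS,
                 limit: int = MAX_SESSIONLESS_HITS) -> None:
        self.window = window
        self.limit = limit
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def should_block(self, ip: str, now: float, has_session: bool) -> bool:
        # Requests that carry a session are treated as normal browsing.
        if has_session:
            return False
        hits = self._hits[ip]
        hits.append(now)
        # Drop hits that have fallen out of the sliding window.
        while hits and now - hits[0] > self.window:
            hits.popleft()
        return len(hits) >= self.limit


# The path from the db table: location page, home page, home page again,
# each arriving without a session (times in seconds are made up).
tracker = SessionlessHitTracker()
results = [
    tracker.should_block("203.0.113.7", now=0, has_session=False),
    tracker.should_block("203.0.113.7", now=10, has_session=False),
    tracker.should_block("203.0.113.7", now=20, has_session=False),
]
print(results)
```

Under this rule a real person who loses their session three times in half a minute (new tabs, back button, edited URL) looks exactly like a bot, which is why I need to know the steps.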
I've put a lot of time into this website and a ton of testing of it too, and just went live a few months ago, so these kinds of problems are disheartening. Ironically, your experience is almost identical to that of Travis, except that in your case you must have moved a little faster since you got a different message. But it would REALLY help me to get some feedback from both of you confirming what I wrote or setting me straight if you did something different.
-
You are totally wrong guessing my path. You are going down a tunnel which doesn't have an exit. Personally I think, in this thread, you got some good advice about what you should focus on, so I would stop feeling dismayed, and confidently steer away from bad practices. Good luck.
-
Massimiliano, my guess of your path was the most logical conclusion based on the fact that I have 3 records of the urls you went to on my site, and showing that the program didn't keep any session variables between the 3 urls you came to. You first went to wildwood. Then you went to the home page. This implies that you either did that in a new tab, or you hit the back key, or you modified the url and removed the wildwood part to go to the home page, as opposed to clicking on something on the page. Telling me I'm wrong at least lets me know I may have a serious problem to fix, but you are mistaken to think that this is a robot problem. It is a php session variable problem, apparently, that none of my extensive (hundreds of hours) testing has ever had.
This is a serious problem unrelated to the OP, and about 100 times more important than the OP. I was hoping to get some help with it because it is very difficult to diagnose without feedback from those having the experience that you had with my site. However, that's my problem and I'll have to deal with it. I don't know if you just don't remember or aren't telling me because you think it is a robot problem, but if you do happen to recall the steps (or at least tell me it was all done in the same tab or you hit the back key) I'd appreciate whatever it is you can tell me. If I can't solve the problem it probably means I'll have to shut down my website, which I've put more than 4 years of my life into. Seriously.
Thanks for your various other responses though. Take care. Ted
-
I can't really argue with log files, in most instances. Unfortunately, I didn't export crawl data. I used to irrationally hoard that stuff, until I woke up one day and realized one of my drives was crammed full of spreadsheets I will never use again.
There may be some 'crawlability' issues, beyond the aggressive blocking practices. Though I managed to crawl 400+ URIs before timeouts, after I throttled the crawl rate back the next day. Screaming Frog is very impressive, but Googlebot it ain't, even though it performs roughly the same function. Though, given enough RAM, it won't balk at orders of magnitude more than the 400 or so URIs. (I've seen... things... ) And with default settings, Screaming Frog can easily handle tens of thousands of URIs before it hits its default RAM allocation limit.
It's more than likely worth your while to purchase an annual license at ~$150. That way, you get all the bells and whistles - though there is a stripped-down free version. There are other crawlers out there, but this one is the bee's knees. Plus you can run all kinds of theoretical crawl scenarios.
But moving along to the actual blocking, barring the crawler, I could foresee a number of legit use scenarios that would be comparable to my previous sessions. Planning night out > Pal sends link to site via whatever > Distracted by IM > Lose session in a sea of tabs > Search Google > Find Site > Phone call > Not Again... > Remember domain name > Blocked
Anyway, I just wanted to be sure that my IP isn't whitelisted, just unblocked. I could mess around all night trying to replicate it, without the crawling, just to find I 'could do no wrong'. XD
Otherwise it looks like this thread has become a contention of heuristics. I'm not trying to gang up on you here, but I would err on the side of plenty. Apt competition is difficult to overcome in obscurity. : )