Certainly an interesting question. It's becoming more and more evident that image recognition software (more specifically, subject recognition) is gaining traction at big names including Facebook and Google. The software (still in development) can recognize subjects, objects, settings, etc., to the point where it can "name" an image based on these factors. That, of course, is extremely relevant to this conversation.
That said, I disagree with the notion that incongruities between an image's name, alt text, or title and the recognized subject of that image will ever become a factor. I have two main reasons to suspect this will never become practice:
- Naming an image based directly on its contents has never been a recommended convention. Historically, naming an image has been more about the "message" or intended use of that image than about its direct, visual content. Pushing content creators to start doing this would be overly heavy-handed (yes, even for Google).
- The web would be utterly polluted by images with the exact same name. Since you brought up stock photography and its proliferation across the web, I'd counter that this is exactly why it won't happen. The number of images that, under this convention, would be named "man in suit at laptop" alone is staggering.
More to the point, Google and other curators prefer specificity, and recognition software can't accurately define more than the visual elements of an image, which often don't make up the bulk of a picture's meaning.
TL;DR version: Do I think what you're suggesting is possible? Absolutely. Do I think it will happen? No; this would go against naming conventions and Google's own desire for specificity.