I think this is very important to note and I agree.
Whether or not google search right now is as good as it could be should not be the main point of discussion.
We have to remember to acknowledge that the web google is indexing now differs drastically from the web it was indexing 20 years ago. Web pages are now less likely than ever to be freely accessible plain text put forth in good faith for public consumption. Google (in addition to dealing with big walled gardens designed explicitly to hide content from google) is trying to sift through basic spam, industrial scale SEO exploitation, and nation-state cyber warfare.
Bitching about google search being bad almost feels like yelling at the canary in the coal mine when it passes out.
The issue (at least for me) is that google is no longer actually searching for the thing I ask for, and it's being blatantly disrespectful of users who cared enough to learn how to actually use the search features.
Quick example from today? I did a literal two word search - gulp admzip - and while the results are okish, an increasing amount of space is taken up by results with this handy little blob at the bottom:
"Missing: gulp | Must include: gulp"
"Missing: admzip | Must include: admzip"
WTF are they smoking? I asked for two fucking words, and the top result doesn't include one of them. Then the second result doesn't include the other.
So then I add quotes around the phrase I want "gulp admzip" because I'd really only like to actually see results that include that EXACT phrase, and... drumroll... IT DOES IT FUCKING AGAIN: "Missing: gulp | Must include: gulp"
And that literally has nothing to do with the quality of the items it's searching, and everything to do with Google deciding what I meant - Clearly I meant the npmjs.com package adm-zip, because that item gets vastly more views than any of the real search results.
I couldn't have possibly meant to restrict the search to the actual fucking phrase I told it to search for, because there aren't that many results, and they don't get many views.
I strongly agree that "Missing", "Must include" has to be one of the worst hijacks of search functionality I've ever seen. Please just respect my search terms! Right now Google doesn't fully respect them even when I put them in double quotes. It takes far longer to scan my eyes across the results dump and then retroactively see that the results are absolutely not what I am looking for, than to just see that there aren't many results for my exact query.
Decreasing the feedback loop time is essential to modifying my query quickly so that I can eventually find what I am looking for.
The "Missing", "Must include" results have never, not in a single case, given me what I am looking for. If they had, I would have just taken that search term out myself.
This is the exact angle of discussion I'm saying we should try to avoid because this behavior from Google is an effect of an underlying problem with the entire web.
Most sites that include the exact phrase "gulp admzip" are empty spam of regurgitated word lists of every build tool or dev package designed to attract errant clicks that help boost ad metrics. That is why Google can't "just grep the entire internet".
Incidentally, I agree with Google here that whatever problem you're trying to solve is more likely to be an issue with either gulp or admzip and you will be better served by content specifically about one or the other.
We should avoid discussing the fact that the search engine is now hijacking my search to show me things it would prefer I have searched for?
You know what - I'd much rather just see the spam sites.
The spam sites are useful feedback that my search is either too generic, or there aren't many good hits.
Further, some of them aren't actually spam sites - I'm not afraid to click through 5 or even 6 pages of results, and I can usually visually distinguish obvious spam from content very quickly.
I can't do that if Google has removed my ability to actually filter the results to the relevant search terms, and just keeps showing me the freaking link to npm over and over again.
And the problem with the entire web is a result of incentives -- Google's algorithms don't discourage people from creating spam and clone sites. You might even say the algorithms encourage spam because they have been terrible for so long.
Have you considered that there may not actually be any good results for the exact phrase "gulp admzip"? Especially considering "admzip" is a misspelling as you admit?
That's meaningful feedback that my search needs to be improved.
Getting all spam back is actually ALSO GOOD! I can visually distinguish spam pretty quickly, and it's also meaningful feedback that my search needs to be improved.
Removing my ability to search for exact phrases is fucking BAD! I'd much rather get spam or nothing when I search for a directly quoted phrase, rather than google just start returning bullshit.
The problem is that the bullshit google returns is actually very hard to visually parse out - they're real sites that get lots of views, those views are just ENTIRELY unrelated to what I'm actually searching for. That's really hard to filter out quickly.
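To make the behavior being asked for concrete, here's a toy sketch of strict quoted-phrase matching. It assumes a tiny in-memory list of documents and naive whitespace tokenization; `contains_phrase` and `exact_search` are made-up names for illustration, not anything Google exposes.

```python
def contains_phrase(phrase, doc):
    """True if the words of `phrase` appear consecutively in `doc`."""
    p = phrase.lower().split()
    d = doc.lower().split()
    return any(d[i:i + len(p)] == p for i in range(len(d) - len(p) + 1))


def exact_search(phrase, docs):
    """Return only documents that contain the exact phrase, possibly none."""
    return [doc for doc in docs if contains_phrase(phrase, doc)]


docs = [
    "zipping a build with gulp admzip in one task",
    "adm-zip on npm: a zip library for node",
    "gulp is a streaming build tool",
]

hits = exact_search("gulp admzip", docs)       # one document matches
no_hits = exact_search("admzip gulp", docs)    # empty list: still useful feedback
```

An empty or tiny result list here is the "meaningful feedback" case: nothing gets silently swapped in for the phrase that was typed.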
It's entirely possible that a person is querying the exact variable name "admzip" and wants results that are only code snippets using the by-convention "AdmZip" variable name, with no concern for the package name "adm-zip". By the way, Google interprets hyphens as just whitespace, so searching for "adm-zip" is the equivalent of searching "adm zip".
I know this could be a case because I do this kind of programmatic search all the time, and in fact I remember specifically searching the web for "AdmZip" and not "adm-zip" a few years ago.
And if there are no good results for "AdmZip", that's fine! At least I can, at a glance, quickly know that there are not many code snippets across the web lying around with that conventional variable name.
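A rough sketch of why "AdmZip" and "adm-zip" end up being different searches, assuming a tokenizer that treats hyphens like whitespace as described above. The `tokenize` function is a hypothetical stand-in, not Google's actual tokenizer.

```python
import re


def tokenize(text):
    """Lowercase, then split on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


package_name = tokenize("adm-zip")    # hyphen acts as a separator
spaced_name = tokenize("adm zip")     # same tokens as the hyphenated form
variable_name = tokenize("AdmZip")    # stays a single token
```

Under that assumption the package name becomes two tokens while the variable name stays one, so a search for one has no reason to match pages that only contain the other.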
> I asked for two fucking words, and the top result doesn't include one of them
You are aware that it has always been like this? That's why the "+" operator even exists. Even 10 years ago or more (before Google+ stole the "+" operator), you could get back results that don't actually include your search terms and you'd have to do +gulp +admzip to force their inclusion.
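As a toy model of what a force-inclusion operator does (not a claim about how Google implemented it), the sketch below treats plain terms as optional and "+" terms as required. `parse_query` and `search` are hypothetical helpers over an in-memory list.

```python
def parse_query(query):
    """Split a query into required ('+term') and optional terms."""
    required, optional = [], []
    for word in query.lower().split():
        if word.startswith("+") and len(word) > 1:
            required.append(word[1:])
        else:
            optional.append(word)
    return required, optional


def search(query, docs):
    """Rank docs by number of matched terms; drop docs missing a required term."""
    required, optional = parse_query(query)
    results = []
    for doc in docs:
        words = set(doc.lower().split())
        if not all(term in words for term in required):
            continue
        hits = sum(term in words for term in required + optional)
        if hits:
            results.append((hits, doc))
    results.sort(key=lambda pair: pair[0], reverse=True)  # stable sort
    return [doc for _, doc in results]


docs = ["gulp build tool", "admzip zip library", "gulp plugin using admzip"]
loose = search("gulp admzip", docs)     # partial matches are still returned
strict = search("+gulp +admzip", docs)  # only docs containing both terms
```

In this model the plain query returns all three documents, including the ones missing a term, while the "+" form keeps only the document that contains both.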
Try it yourself, it does JACK SHIT - they still show me what they'd prefer I have searched for.
---
Edit - ok, I'm guessing you are, since on the second read I see you're joking about the google plus timing. I don't understand how that makes this better? Tooling is being stripped away in favor of showing me "popular search content" that google would prefer I see, and ads...
I'm not saying it makes anything better. I'm saying that Google has always been able to return results that don't include the search terms; that's why the '+' force-inclusion operator existed. It is not a new phenomenon for Google to return results that don't contain keywords. One of the defining features of Google vs Excite/Lycos/AltaVista et al. is that it wasn't a simple TF-IDF (term frequency-inverse document frequency) search, which many of the older search engines used a variant of. PageRank, even in Google 1.0, probably allowed a result with a very low TF-IDF rank (one that doesn't include most of the query) to be boosted, although I'm just speculating.
I'm just saying, in the same theme as my original 'young whippersnappers' post, that a lot of what people are calling new behavior is in fact old behavior with a different UX.
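For anyone who hasn't seen it, here's a minimal sketch of plain TF-IDF scoring on a made-up four-document corpus, followed by a purely hypothetical popularity multiplier to illustrate the speculation above. The `popularity` weights are invented numbers; this is not PageRank and not how any real engine combines signals.

```python
import math
from collections import Counter


def tfidf_scores(query, docs):
    """Score each document against the query with plain TF-IDF."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue  # a missing term contributes nothing
            idf = math.log(n_docs / df[term])
            score += (tf[term] / len(tokens)) * idf
        scores.append(score)
    return scores


docs = [
    "gulp task that unpacks archives with admzip",
    "admzip reads and writes zip files",
    "gulp is a streaming build tool",
    "notes on baking bread",
]
scores = tfidf_scores("gulp admzip", docs)

# Hypothetical link-based weights: the second page is far more popular.
popularity = [1.0, 5.0, 1.0, 1.0]
boosted = [s * p for s, p in zip(scores, popularity)]
```

With TF-IDF alone the document containing both terms ranks first and the unrelated one scores zero. Once the made-up popularity weight is multiplied in, the page that is missing "gulp" moves to the top, which is the kind of reordering being speculated about.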
Woah that's a weird test case! fwiw now that you've posted this, this link shows up for that search. But similar searches like "gulp adimzip" show similar issues. Is this simply a bug? Clicking on Missing: adimzip" | Must include: adimzip" now makes Google search for gulp "adimzip""